
The Association for Computing Machinery


1601 Broadway, 10th Floor
New York, New York 10019, USA

ACM COPYRIGHT NOTICE. Copyright © 2022 by the Association for Computing Machinery,
Inc. Permission to make digital or hard copies of part or all of this work for
personal or classroom use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies bear this notice and the
full citation on the first page. Copyrights for components of this work owned by others
than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to
republish, to post on servers, or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481,
or permissions@acm.org.

For other copying of articles that carry a code at the bottom of the first or last page,
copying is permitted provided that the per-copy fee indicated in the code is paid
through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923,
+1-978-750-8400, +1-978-750-4470 (fax).

ACM ISBN: 978-1-4503-9258-7

ASSETS 2022 Foreword

We are pleased to welcome you to the 24th International ACM SIGACCESS Conference on
Computers and Accessibility (ASSETS 2022). ASSETS is the premier computing research
conference exploring the design, evaluation, and use of computing and information technologies
with and for people with disabilities and older adults.

With the continuing COVID-19 pandemic, we have made changes to the conference format. We
are gathering in person in Athens, Greece, for the first time since 2019, with an additional
lightweight online component for virtual conference attendees. The conference was organized to
support accessibility and inclusivity given the technical and time-zone constraints of the
conference location. Anticipating limited infrastructure for fully streamed accessible content,
and acknowledging the reality of time-zone differences, we made pre-recorded, captioned videos
of all presentations available to both virtual and in-person conference attendees. There are
also creative opportunities for hybrid connection and socializing across modalities, including
online trivia and roundtable social networking events. Future ASSETS organizing committees will
need to make their own decisions about the length and format of the conference, so the choice
of an in-person conference with a lightweight online component is not permanent.

This year’s opening keynote, “I’m human, too: the biases driving accessibility solutions,” will
be given by lawyer and disability rights advocate Haben Girma. The second day of the conference
features a keynote by SIGACCESS Outstanding Contributions awardee Dr. Clayton Lewis.

This year’s technical program features 35 papers, selected by the program committee from 132
submissions (acceptance rate of 26.5%). The program also features six papers published in
TACCESS in the last year. These papers present work from the leading edge of accessible
computing research, including novel tools and interaction methods, advances in VR and AR for
accessibility, accessible technologies for independent living in everyday life, and the perception
and representation of disabilities by diverse groups.

This year was the first year of the Workshops track, led by Kyle Montague and Sowmya
Somanath, which featured five workshops held virtually in the weeks preceding the conference.
These five workshops were selected after a juried review process in which researchers and
practitioners working in the areas of accessibility, disability, and computing were invited to
comment on and discuss the workshop proposal submissions.

The Posters and Demos track, chaired by Taslima Akter and Hugo Nicolau, received 73
submissions. The program committee selected 43 posters and demos for presentation, resulting in
an acceptance rate of 59%. The accepted posters present late-breaking work from the research
and professional community. The accepted demos provide an opportunity for conference
attendees to experience new advances in accessible technology in-person and virtually.

The Experience Reports track documents authors’ personal or stakeholder experiences related to
the creation, use, and deployment of accessible technologies. Chaired by Katherine Ringland and
Garreth Tigwell, this track received 27 submissions, of which seven reports were selected for
presentation (acceptance rate of 26%).

ACM SIGACCESS is committed to developing the next generation of researchers in the field of
accessible computing. This year’s ASSETS continues the tradition of supporting and showcasing
the work of student researchers. The Doctoral Consortium, chaired by Aqueasha Martin-Hammond
and Katta Spiel, brought together 12 doctoral students to discuss their work with a
panel of established researchers in a one-day workshop on the Saturday before the main
conference. A special edition of the SIGACCESS newsletter will feature extended abstracts from
these doctoral students.

The ACM Student Research Competition (SRC), chaired by Mingming Fan and Roshan Peiris,
offers a unique forum for undergraduate and graduate students to present their original research
before a panel of judges and receive direct feedback on their work from experts. This year, three
entries were selected as SRC finalists to give poster presentations. All participants will give oral
presentations at the conference. Winners from the ASSETS 2022 SRC will go on to compete in
the ACM-wide grand finals.

For three consecutive years, the ASSETS Conference has highlighted submissions that include
components accessible to the public through the Best Artifact Award. The Best Artifact Award is
chaired this year by Dragan Ahmetovic.

The ASSETS 2022 program is organized as follows:

On Monday (Day 1), the program starts with a keynote presentation by Haben Girma. The
keynote will be followed by the first Poster Session, which will feature virtual Student Research
Competition poster presentations. Next, the first technical paper session is convened under the
theme of “AR, VR and Games”. After the lunch break, the second technical paper session
features topics within the theme of “Representation and Inclusion”. The afternoon continues
the morning Poster Session, followed by the technical paper session “Data for Modeling and
Recognition”. Day 1 concludes with an evening reception and demo session.

On Tuesday (Day 2), the SIGACCESS Outstanding Contributions Keynote address is presented
by Clayton Lewis. The break and second Poster Session are then followed by the technical paper
session on “Composition in Music, Programming and Design”. In the afternoon, a session is
dedicated to the Student Research Competition Finalists’ presentations. The fifth technical
paper session presents work in the area of “Communication”. After the afternoon break, during
which the second Poster Session continues, the final technical paper session for Day 2 is on
“Social Media and Media”. The SIGACCESS Town Hall will take place at the end of Day 2,
followed by an off-site reception.

On Wednesday (Day 3), we start with a technical session on “Tactile and Haptics”. After a short
break, the session on “Accessibility in Daily Living” is presented. After lunch, the final
technical paper session is dedicated to “Safety, Rehabilitation, and Transportation”. The
closing plenary concludes Day 3.

During the closing session, the ASSETS 2022 awards will be presented, and we will introduce
next year's chairs to make an announcement about ASSETS 2023.

The success of the ASSETS 2022 conference required a tremendous amount of work from
authors, reviewers, committee members, and many others. We thank the authors of all papers,
posters, demos, and experience reports, as well as the applicants to the Doctoral Consortium and
Student Research Competition. We thank the program committee and additional reviewers Sri
Kurniawan and Clayton Lewis for reviewing submissions and providing authors with helpful
feedback.

We also thank the members of the organizing committee: Treasurer and Registration Chair
Martez Mott, Hybrid Experience Chair Christian Vogler, Virtual Chairs Emma McDonnell,
Kelly Mack, Benjamin Tannert, Laurianne Sitbon, Accessibility Chairs J. Bern Jordan and
Lei Shi, Equity and Belonging Chairs Lou Anne Boyd, John A. Guerra Gomez, Proceedings
Chairs Raja Kushalnagar and Sergio Mascetti, Posters and Demos Chairs Taslima Akter and
Hugo Nicolau, Doctoral Consortium Chairs Aqueasha Martin-Hammond and Katta Spiel,
Student Research Competition Chairs Mingming Fan and Roshan Peiris, Experience Reports
Chairs Katherine Ringland and Garreth Tigwell, Web Design Chairs Liang He, Junhan (Judy)
Kong, Jaylin Herskovitz, Jason Wu, Mentoring Chairs Sayan Sarcar and Kirsten Ellis, Publicity
Chairs Alexa Siu and Arthur Theill, Student Volunteer Chairs Franklin Mingzhe Li and Maryam
Banduka, TACCESS Special Issue Chair Maria Wolters, and Best Artifact Award Chairs Dragan
Ahmetovic and Hugo Nicolau.

We thank the Best Paper Award committee: Anthony Hornof, Amy Hurst, Cynthia Putnam,
Sayan Sarcar, and Gerhard Weber. We thank the SIGACCESS Steering Committee for their
support and guidance: Matt Huenerfauth (Rochester Institute of Technology, USA), Chair;
Jeffrey Bigham (Carnegie Mellon University, USA); Tiago Guerreiro (University of Lisbon,
Portugal); Jonathan Lazar (University of Maryland, USA); Kathleen McCoy (University of
Delaware, USA); and Karyn Moffatt (McGill University, Canada).

Finally, we thank our Champion sponsors Fable and Google; our Gold sponsors Adobe, Amazon,
Apple, Meta, Microsoft, and UW CREATE; our Silver sponsors the National Institute on Disability,
Independent Living and Rehabilitation Research (NIDILRR), Twitter, and Volkswagen; our Bronze
supporters 2G3R, Intuit, Miraikan, and Studio Pacifica; our In-Kind supporter AIRA; our
Doctoral Consortium sponsor NSF; and the ACM and SIGACCESS for their very generous support.

Welcome to ASSETS 2022!

Jon Froehlich, ASSETS 2022 General Chair, University of Washington, USA
Kristen Shinohara, ASSETS 2022 Program Co-Chair, Rochester Institute of Technology, USA
Stephanie Ludi, ASSETS 2022 Program Co-Chair, University of North Texas, USA

Table of Contents

ASSETS 2022 Conference Organization .......... xix

ASSETS 2022 Sponsors and Supporters .......... xxi

SIGACCESS Outstanding Contributions Keynote

● Challenges and Opportunities in Technology for Inclusion .......... Article 1
Clayton Lewis

Paper Session 1: AR, VR and Games

● Access on Demand: Real-time, Multi-Modal Accessibility for the Deaf and Hard-of-Hearing Based on Augmented-Reality .......... Article 2
Roshan Mathew, Brian Mak, Wendy Dannels

● VRBubble: Enhancing Peripheral Awareness of Avatars for People with Visual Impairments in Social Virtual Reality .......... Article 3
Tiger Ji, Brianna R. Cochran, Yuhang Zhao

● “It’s Just Part of Me:” Understanding Avatar Diversity and Self-Presentation of People with Disabilities in Social Virtual Reality .......... Article 4
Kexin Zhang, Elmira Deldari, Zhicong Lu, Yaxing Yao, Yuhang Zhao

● SoundVizVR: Sound Indicators for Accessible Sounds in Virtual Reality for Deaf or Hard-of-Hearing Users .......... Article 5
Ziming Li, Shannon Connell, Wendy Dannels, Roshan Peiris

● Uncovering Visually Impaired Gamers’ Preferences for Spatial Awareness Tools Within Video Games .......... Article 6
Vishnu Nair, Shao-en Ma, Ricardo E. Gonzalez Penuela, Yicheng He, Karen Lin, Mason Hayes, Hannah Huddleston, Matthew Donnelly, Brian A. Smith

Paper Session 2: Representation and Inclusion

● Expressive Bodies – Engaging with Embodied Disability Cultures for Collaborative Design Critiques .......... Article 7
Katta Spiel, Robin Angelini

● Data Representativeness in Accessibility Datasets: A Meta-Analysis .......... Article 8
Rie Kamikubo, Lining Wang, Crystal Marte, Amnah Mahmood, Hernisa Kacorri

● Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People .......... Article 9
Kelly Mack, Emma J. McDonnell, Leah Findlater, Heather D. Evans

● Should I Say “Disabled People”, or “People with Disabilities”? Language Preferences of Disabled People Between Identity- and Person-First Language .......... Article 10
Ather Sharif, Aedan L. McCall, Kianna R. Bolante

● Assistive or Artistic Technologies? Exploring the Connections between Art, Disability and Wheelchair Use .......... Article 11
Giulia Barbareschi, Masa Inakage

● “Just Like Meeting in Person” – Examination of Interdependencies in Dementia-Friendly Virtual Activities .......... Article 12
Elaine Czech, Paul Marshall, Oussama Metatla

Paper Session 3: Data for Modeling and Recognition

● Performing Qualitative Data Analysis as a Blind Researcher: Challenges, Workarounds and Design Recommendations .......... Article 13
O. Aishwarya

● Blind Users Accessing Their Training Images in Teachable Object Recognizers .......... Article 14
Jonggi Hong, Jaina Gandhi, Ernest Essuah Mensah, Farnaz Zamiri Zeraati, Ebrima Haddy Jarjue, Kyungjun Lee, Hernisa Kacorri

● Challenging and Improving Current Evaluation Methods for Colour Identification Aids .......... Article 15
Connor Geddes, David R. Flatla

● ASL Wiki: An Exploratory Interface for Crowdsourcing ASL Translations .......... Article 16
Abraham Glasser, Fyodor Minakov, Danielle Bragg

Paper Session 4: Composition in Music, Programming and Design

● Empowering Blind Musicians to Compose and Notate Music with SoundCells .......... Article 17
William Payne, Fabiha Ahmed, Michael Zachor, Michael Gardell, Isabel Huey, R. Luke DuBois, Amy Hurst

● Designing Gestures for Digital Musical Instruments: Gesture Elicitation Study with Deaf and Hard of Hearing People .......... Article 18
Ryo Iijima, Akihisa Shitara, Yoichi Ochiai

● Accessible Blockly: An Accessible Block-Based Programming Library for People with Visual Impairments .......... Article 19
Aboubakar Mountapmbeme, Obianuju Okafor, Stephanie Ludi

● CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development .......... Article 20
Venkatesh Potluri, Maulishree Pandey, Andrew Begel, Michael Barnett, Scott Reitherman

● Designing a Customizable Picture-Based Augmented Reality Application for Therapists and Educational Professionals Working in Autistic Contexts .......... Article 21
Tooba Ahsen, Christina Yu, Amanda O’Brien, Ralf W. Schlosser, Howard C. Shane, Dylan Oesch-Emmel, Eileen T. Crehan, Fahad Dogar

Paper Session 5: Communication

● State of the Art in AAC: A Systematic Review and Taxonomy .......... Article 22
Humphrey Curtis, Timothy Neate, Carlota Vazquez Gonzalez

● AAC with Automated Vocabulary from Photographs: Insights from School and Speech-Language Therapy Settings .......... Article 23
Mauricio Fontana de Vargas, Jiamin Dai, Karyn Moffatt

● LaMPost: Design and Evaluation of an AI-Assisted Email Writing Prototype for Adults with Dyslexia .......... Article 24
Steven M. Goodman, Erin Buehler, Patrick Clary, Andy Coenen, Aaron Donsbach, Tiffanie N. Horne, Michal Lahav, Robert MacDonald, Rain Breaw Michaels, Ajit Narayanan, Mahima Pushkarna, Joel Riley, Alex Santana, Lei Shi, Rachel Sweeney, Phil Weaver, Ann Yuan, Meredith Ringel Morris

● Exploring Smart Speaker User Experience for People Who Stammer .......... Article 25
Anna Bleakley, Daniel Rough, Abi Roper, Stephen Lindsay, Martin Porcheron, Minha Lee, Stuart Nicholson, Benjamin Cowan, Leigh Clark

● Beyond Subtitles: Captioning and Visualizing Non-Speech Sounds to Improve Accessibility of User-Generated Videos .......... Article 26
Oliver Alonzo, Hijung Valentina Shin, Dingzeyu Li

Paper Session 6: Social Media and Media

● Nothing Micro About It: Examining Ableist Microaggressions on Social Media .......... Article 27
Sharon Heung, Mahika Phutane, Shiri Azenkot, Megh Marathe, Aditya Vashistha

● Authoring Accessible Media Content on Social Networks .......... Article 28
Letícia Seixas Pereira, José Coelho, André Rodrigues, João Guerreiro, Tiago Guerreiro, Carlos Duarte

● Support in the Moment: Benefits and use of Video-Span Selection and Search for Sign-Language Video Comprehension Among ASL Learners .......... Article 29
Saad Hassan, Akhter Al Amin, Caluã de Lacerda Pataca, Diego Navarro, Alexis Gordon, Sooyeon Lee, Matt Huenerfauth

● A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers .......... Article 30
Sanjana Chintalapati, Jonathan Bragg, Lucy Lu Wang

Paper Session 7: Tactile and Haptics

● Low-Cost Tactile Coloring Page Fabrication on a Cutting Machine: Assembly and User Experiences of Cardstock-Layered Tangible Pictures .......... Article 31
Nicole E. Johnson, Tom Yeh, Ann Cunningham

● Animations at Your Fingertips: Using a Refreshable Tactile Display to Convey Motion Graphics for People who are Blind or have Low Vision .......... Article 32
Leona Holloway, Swamy Ananthanarayan, Matthew Butler, Madhuka De Silva, Kirsten Ellis, Cagatay Goncu, Kate Stephens, Kim Marriott

● Quantifying Touch: New Metrics for Characterizing What Happens During a Touch .......... Article 33
Junhan Kong, Mingyuan Zhong, James Fogarty, Jacob O. Wobbrock

● Creating 3D Printed Assistive Technology Through Design Shortcuts: Leveraging Digital Fabrication Services to Incorporate 3D Printing into the Physical Therapy Classroom .......... Article 34
Erin L. Higgins, William Easley, Karen Gordes, Amy Hurst, Foad Hamidi

● BentoMuseum: 3D and Layered Interactive Museum Map for Blind Visitors .......... Article 35
Xiyue Wang, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa

Paper Session 8: Accessibility in Daily Living

● Depending on Independence – An Autoethnographic Account of Daily Use of Assistive Technologies .......... Article 36
Felix Fussenegger, Katta Spiel

● “I Used To Carry A Wallet, Now I Just Need To Carry My Phone”: Understanding Current Banking Practices and Challenges Among Older Adults in China .......... Article 37
Xiaofu Jin, Mingming Fan

● Mobile Phone Use by People with Mild to Moderate Dementia: Uncovering Challenges and Identifying Opportunities .......... Article 38
Emma Dixon, Rain Michaels, Xiang Xiao, Yu Zhong, Patrick Clary, Ajit Narayanan, Robin Brewer, Amanda Lazar

● Freedom to Choose: Understanding Input Modality Preferences of People with Upper-Body Motor Impairments for Activities of Daily Living .......... Article 39
Franklin Mingzhe Li, Michael Xieyang Liu, Yang Zhang, Patrick Carrington

Paper Session 9: Safety, Rehabilitation, and Transportation

● It’s Enactment Time!: High-Fidelity Enactment Stage for Accessible Automated Driving System Technology Research .......... Article 40
Aaron Gluck, Hannah Solini, Julian Brinkley

● Where Are You Taking Me? Reflections from Observing Ridesharing Use By People with Visual Impairments .......... Article 41
Earl W. Huff Jr., Robin N. Brewer, Julian Brinkley

● A Collaborative Approach to Support Medication Management in Older Adults with Mild Cognitive Impairment Using Conversational Assistants (CAs) .......... Article 42
Niharika Mathur, Kunal Dhodapkar, Tamara Zubatiy, Jiachen Li, Brian D. Jones, Elizabeth D. Mynatt

● Designing Post-Trauma Self-Regulation Apps for People with Intellectual and Developmental Disabilities .......... Article 43
Krishna Venkatasubramanian, Tina-Marie Ranalli

Posters and Demos

● “I Should Feel Like I’m In Control”: Understanding Expectations, Concerns, and Motivations for the Use of Autonomous Navigation on Wheelchairs .......... Article 44
JiWoong Jang, Yunzhi Li, Patrick Carrington

● “What Makes Sonification User-Friendly?” Exploring Usability and User-Friendliness of Sonified Responses .......... Article 45
Ather Sharif, Olivia H. Wang, Alida T. Muongchan

● “What’s Going on in Accessibility Research?” Frequencies and Trends of Disability Categories and Research Domains in Publications at ASSETS .......... Article 46
Ather Sharif, Ploypilin Pruekcharoen, Thrisha Ramesh, Ruoxi Shang, Spencer Williams, Gary Hsieh

● A Participatory Design Approach to Explore Design Directions for Enhancing Videoconferencing Experience for Non-Signing Deaf and Hard of Hearing Users .......... Article 47
Yeon Soo Kim, Sunok Lee, Sangsu Lee
● Co-designing a Bespoken Wearable Display for People with Dissociative Identity Disorder .......... Article 48
Patricia Piedade, Nikoletta Matsur, Catarina Rodrigues, Francisco Cecilio, Afonso Marques, Rings of Saturn, Isabel Neto, Hugo Nicolau

● Co-designing the Automation of Theatre Touch Tours .......... Article 49
Alexandra Tzanidou, Sami Abosaleh, Vasilis Vlachokyriakos

● Creating Personas for Signing User Populations: An Ability-Based Approach to User Modelling in HCI .......... Article 50
Amelie Nolte, Karolin Lueneburg, Dieter Wallach, Nicole Jochems

● Designing a Data Visualization Dashboard for Pre-Screening Hong Kong Students with Specific Learning Disabilities .......... Article 51
Ka Yan Fung, Zikai Alex Wen, Haotian Li, Xingbo Wang, Shenghui Song, Huamin Qu

● Digital Accessibility in Iran: An Investigation Focusing on Iran’s National Policies on Accessibility and Disability Support .......... Article 52
Laleh Nourian, Kristen Shinohara, Garreth W. Tigwell

● Exploring Accessibility Features and Plug-ins for Digital Prototyping Tools .......... Article 53
Urvashi Kokate, Kristen Shinohara, Garreth W. Tigwell

● Does XR Introduce Experience Asymmetry in an Intergenerational Setting? .......... Article 54
Vibhav Nanda, Hanuma Teja Maddali, Amanda Lazar

● Flexible Activity Tracking for Older Adults Using Mobility Aids — An Exploratory Study on Automatically Identifying Movement Modality .......... Article 55
Dimitri Vargemidis, Kathrin Gerling, Luc Geurts, Vero Vanden Abeele

● Improving Image Accessibility by Combining Haptic and Auditory Feedback .......... Article 56
Mallak Alkhathlan, ML Tlachac, Lane Harrison, Elke Rundensteiner

● Inter-rater Reliability of Command-Line Web Accessibility Evaluation Tools .......... Article 57
Eryn Rachael Kelsey-Adkins, Robert Thompson

● Investigating How People with Disabilities Disclose Difficulties on YouTube .......... Article 58
Shuo Niu, Jaime Garcia, Summayah Waseem, Li Liu
● ProAesthetics: Changing How We View Prosthetic Function .......... Article 59
Susanna Abler, Foad Hamidi

● Towards Visualization of Time–Series Ecological Momentary Assessment (EMA) Data on Standalone Voice–First Virtual Assistants .......... Article 60
Yichen Han, Christopher Bo Han, Chen Chen, Peng Wei Lee, Michael Hogarth, Alison A. Moore, Nadir Weibel, Emilia Farcas

● Understanding and Improving Information Extraction From Online Geospatial Data Visualizations for Screen-Reader Users .......... Article 61
Ather Sharif, Andrew M. Zhang, Anna Shih, Jacob O. Wobbrock, Katharina Reinecke

● Understanding Design Preferences for Sensory-Sensitive Earcons with Neurodivergent Individuals .......... Article 62
Lauren Race, Kia El-Amin, Sarah Anoke, Andrew Hayward, Amber James, Amy Hurst, Audrey Davis, Theresa Mershon

● Understanding How People with Visual Impairments Take Selfies: Experiences and Challenges .......... Article 63
Ricardo E. Gonzalez Penuela, Paul Vermette, Zihan Yan, Cheng Zhang, Keith Vertanen, Shiri Azenkot

● Voice-Enabled Blockly: Usability Impressions of a Speech-Driven Block-Based Programming System .......... Article 64
Obianuju Okafor, Stephanie Ludi

● The Landscape of Accessibility Skill Set in the Software Industry Positions .......... Article 65
Lilu Martin, Catherine M. Baker, Kristen Shinohara, Yasmine N. Elglaly

● How People who are deaf, Deaf, and Hard of Hearing use Technology in Creative Sound Activities .......... Article 66
Keita Ohshiro, Mark Cartwright

● Personalizable Alternative Mouse and Keyboard Interface for People with Motor Disabilities of Neuromuscular Origin .......... Article 67
Daniel Andreas, Hannah Six, Adna Bliek, Philipp Beckerle

● PigScape: An Embodied Video Game for Cognitive Peer-Training of Impulse and Behavior Control in Children with ADHD .......... Article 68
Yulia Gizatdinova, Vera Remizova, Antti Sand, Sumita Sharma, Kati Rantanen, Terhi Helminen, Anneli Kylliäinen
● Serenity: Exploring Audio-Based Gaming for Arm-Hand Rehabilitation after Stroke .......... Article 69
Yijing Jiang, Daniel Tetteroo

● UnlockedMaps: Visualizing Real-Time Accessibility of Urban Rail Transit Using a Web-Based Map .......... Article 70
Ather Sharif, Aneesha Ramesh, Trung-Anh H. Nguyen, Luna Chen, Kent R. Zeng, Lanqing Hou, Xuhai Xu

● Website Builders Still Contribute to Inaccessible Web Design .......... Article 71
Athira Pillai, Kristen Shinohara, Garreth W. Tigwell

● An Accessible Smart Kitchen Cupboard .......... Article 72
Marios Gavaletakis, Asterios Leonidis, Nikolaos-Menelaos Stivaktakis, Maria Korozi, Michalis Roulios, Constantine Stephanidis

● Social Access and Representation for Autistic Adult Livestreamers .......... Article 73
Terrance Mok, Anthony Tang, Adam McCrimmon, Lora Oehlberg

● Co-Designing Systems to Support Blind and Low Vision Audio Description Writers .......... Article 74
Lucy Jiang, Richard Ladner

● College Students’ and Campus Counselors’ Attitudes Toward Teletherapy and Adopting Virtual Reality (Preliminary Exploration) for Counseling Services .......... Article 75
Vanny T. Chao, Roshan L. Peiris

● Context-Responsive ASL Recommendation for Parent-Child Interaction .......... Article 76
Ekram Hossain, Merritt Cahoon, Yao Liu, Chigusa Kurumada, Zhen Bai

● Designing A Game for Pre-Screening Students with Specific Learning Disabilities in Chinese .......... Article 77
Ka Yan Fung, SIN Kuen Fung, Zikai Alex Wen, Lik-Hang Lee, Shenghui Song, Huamin Qu

● Exploring Motor-Impaired Programmers’ Use of Speech Recognition .......... Article 78
Sadia Nowrin, Patricia Ordóñez, Keith Vertanen
● Investigating Accessibility Challenges and Opportunities for Users with Low Vision Disabilities in Customer-to-Customer (C2C) Marketplaces .......... Article 79
Bektur Ryskeldiev, Kotaro Hara, Mariko Kobayashi, Koki Kusano

● Object Recognition from Two-Dimensional Tactile Graphics: What Factors Lead to Successful Identification Through Touch? .......... Article 80
Anchal Sharma, Srinivasan V., P.V.M. Rao

● Overcoming Barriers to an Accessible e-Learning Ecosystem for People on the Autism Spectrum: A Preliminary Design .......... Article 81
Yussy Chinchay, Javier Gomez, Germán Montoro

● Scaling Crowd+AI Sidewalk Accessibility Assessments: Initial Experiments Examining Label Quality and Cross-City Training on Performance .......... Article 82
Michael Duan, Shosuke Kiami, Logan Milandin, Johnson Kuang, Michael Saugstad, Maryam Hosseini, Jon E. Froehlich

● SpineCurer: An Inertial Measurement Unit Based Scoliosis Training System .......... Article 83
Eryuan Mai, Dahua Hu, Jiaming Li, Zhuo Yang

● Systematic Literature Review on Making and Accessibility .......... Article 84
Saquib Sarwar, David Wilson

● Understanding ASL Learners’ Preferences for a Sign Language Recording and Automatic Feedback System to Support Self-Study .......... Article 85
Saad Hassan, Sooyeon Lee, Dimitris Metaxas, Carol Neidle, Matt Huenerfauth

● Vibrotactile Navigation for Visually Impaired People .......... Article 86
Stephan Huber, Anastasia Alieva, Aaron Lutz

Student Research Competition Abstracts


● Challenges and Opportunities in Creating An Accessible Web
Application for Learning Organic Chemistry�������������������������������������� Article 87
Allyson Yu

● FootUI: Designing and Detecting Foot Gestures to Assist People


with Upper Body Motor Impairments to Use Smartphones
on the Bed���������������������������������������������������������������������������������������������� Article 88
Xiaozhu Hu, Jiting Wang, Weiwei Gao, Yongquan Hu

● Investigating Sign Language Interpreter Rendering and Guiding
  Methods in Virtual Reality 360-Degree Content ...................................... Article 89
  Craig Anderton

Doctoral Consortium Abstracts

● Accessible PDFs: Applying Artificial Intelligence for Automated
  Remediation of STEM PDFs ................................................................... Article 90
  Felix M. Schmitt-Koopmann, Prof. Dr. Elaine M. Huang,
  Prof. Dr. Alireza Darvishy

● Accessible Communication and Materials in Higher Education ............... Article 91
  Kelly Mack

● Exploring Collective Medical Knowledge and Tensions with
  Medical Systems in Online ADHD Communities ..................................... Article 92
  Tessa Eagle

● Understanding the Role of Socio-Technical Infrastructures on the
  Organization of Access for the Mixed-Ability Collaborators .................... Article 93
  Zeynep Şölen Yıldız

● Supporting Physical Activity in Later Life: Perspectives from
  Older Adults ......................................................................................... Article 94
  Muhe Yang

● Applying Technology in a Hybrid-Fashion to Create
  Dementia-Inclusive Community Spaces ................................................. Article 95
  Elaine Czech

● Learning Music Blind: Understanding the Application of
  Technology to Support BLV Music Learning .......................................... Article 96
  Leon Lu

● Evaluating Haptic Technology in Accessibility of Digital Audio
  Workstations for Visually Impaired Creatives ........................................ Article 97
  Christina Karpodini

● Lessons Learned from Designing, Deploying and Testing an
  Accessible BLE Beacon-Based Wayfinding System in a Multi-Floor
  Indoor Environment .............................................................................. Article 98
  Ajay Abraham
● Socially Connecting Adults with Intellectual Disabilities Through
  Inclusive Co-Design of Tangible and Visual Technology ......................... Article 99
  Manesha Andradi

● Understanding Social and Environmental Factors to Enable
  Collective Access Approaches to the Design of Captioning
  Technology ......................................................................................... Article 100
  Emma J. McDonnell

● Improving Web and Mobile Accessibility Resources for Iranian
  Designers ............................................................................................ Article 101
  Laleh Nourian

Workshop Abstracts

● The Future of Urban Accessibility for People with Disabilities:
  Data Collection, Analytics, Policy, and Tools ........................................ Article 102
  Jon E. Froehlich, Yochai Eisenberg, Maryam Hosseini, Fabio Miranda,
  Marc Adams, Anat Caspi, Holger Dieterich, Heather Feldner, Aldo Gonzalez,
  Claudina de Gyves, Joy Hammel, Reuben Kirkham, Melanie Kneisel,
  Delphine Labbé, Steve J. Mooney, Victor Pineda, Cláudia Fonseca Pinhão,
  Ana Rodríguez, Manaswi Saha, Michael Saugstad, Judy Shanley, Ather Sharif,
  Qing Shen, Cláudio T. Silva, Maarten Sukel, Eric K. Tokuda, Sebastian Felix Zappe,
  Anna Zivarts

● A Workshop on Disability Inclusive Remote Co-Design .......................... Article 103
  Maryam Bandukda, Giulia Barbareschi, Aneesha Singh, Dhruv Jain,
  Maitraye Das, Tamanna Motahar, Jason Wiese, Lynn Cockburn,
  Amit Prakash, David Frohlich, Catherine Holloway

● Including Accessibility in Computer Science Education ......................... Article 104
  Catherine M. Baker, Yasmine N. Elglaly, Anne Spencer Ross,
  Kristen Shinohara

● Multidisciplinary Perspectives on Designing Accessible Systems
  for Users with Multiple Impairments: Grand Challenges and
  Opportunities for Future Research ....................................................... Article 105
  Arthur Theil, Chris Creed, Mohammed Shaqura, Nasrine Olson,
  Raymond Holt, Sayan Sarcar, Stuart Murray

● Designing with and for People with Intellectual Disabilities .................. Article 106
  Leandro Soares Guedes, Ryan Colin Gibson, Kirsten Ellis, Laurianne Sitbon,
  Monica Landoni
ASSETS 2022 Conference Organization

General Chair: Jon E. Froehlich (University of Washington, USA)

Technical Program Chairs: Kristen Shinohara (Rochester Institute of Technology, USA)
Stephanie Ludi (University of North Texas, USA)

Proceedings Chairs: Raja Kushalnagar (Gallaudet University, USA)
Sergio Mascetti (University of Milan, Italy)

Treasurer/Registration Chairs: Kyle Rector (University of Iowa, USA)
Martez Mott (Microsoft Research, USA)

Hybrid Experiences Chair: Christian Vogler (Gallaudet University, USA)

Local Chairs: Alexandros Pino (National and Kapodistrian University of Athens, Greece)
Georgios Kouroupetroglou (National and Kapodistrian University of Athens, Greece)

Virtual Chairs: Emma McDonnell (University of Washington, USA)
Kelly Mack (University of Washington, USA)
Benjamin Tannert (City University of Applied Science Bremen, Germany)
Laurianne Sitbon (Queensland University of Technology, Australia)

Workshop Chairs: Kyle Montague (Northumbria University, England)
Sowmya Somanath (University of Victoria, Canada)

Posters and Demos Chairs: Hugo Nicolau (University of Lisbon, Portugal)
Taslima Akter (Indiana University - Bloomington, USA)

Doctoral Consortium Chairs: Katta Spiel (Vienna University of Technology, Austria)
Aqueasha Martin-Hammond (Indiana University - Purdue University Indianapolis, USA)

Student Research Competition Chairs: Mingming Fan (The Hong Kong University of Science and Technology, China)
Roshan Peiris (Rochester Institute of Technology, USA)

Experience Reports Chairs: Garreth Tigwell (Rochester Institute of Technology, USA)
Kathryn (Kate) Ringland (University of California - Santa Cruz, USA)

Mentoring Chairs: Sayan Sarcar (Birmingham City University, England)
Kirsten Ellis (Monash University, Australia)

Publicity Chairs: Alexa Siu (Adobe, USA)
Arthur Theil (Birmingham City University, England)

Web and Graphics Design Chairs: Liang He (University of Washington, USA)
Junhan (Judy) Kong (University of Washington, USA)
Jaylin Herskovitz (University of Michigan, USA)
Jason Wu (Carnegie Mellon University, USA)

Student Volunteer Chairs: Franklin Mingzhe Li (Carnegie Mellon University, USA)
Maryam Bandukda (University College London, England)

Local Accessibility Chairs: Paraskevi Riga (National and Kapodistrian University of Athens, Greece)
Ariadni Velissaropoulou (National and Kapodistrian University of Athens, Greece)

Accessibility Chairs: J. Bern Jordan (University of Maryland, USA)
Lei Shi (Google, USA)

ACM Partnerships Chair: Stacy Branham (University of California - Irvine, USA)

Global Outreach Chairs: Uran Oh (Ewha Womans University, Korea)
Yuhang Zhao (University of Wisconsin-Madison, USA)
Manohar Swaminathan (Microsoft Research, India)
Pin Sym Foong (National University of Singapore, Singapore)

Education Outreach Chairs: Anne Spencer Ross (Bucknell University, USA)
Catie Baker (Creighton University, USA)

Equity and Belonging Chairs: LouAnne Boyd (Chapman University, USA)
John Guerra Gomez (Northeastern University - Bay Area, USA)

TACCESS Special Issue Chair: Maria Wolters (University of Edinburgh, Scotland)

Best Artifact Chair: Dragan Ahmetovic (University of Milan, Italy)
ASSETS 2022 Sponsors & Supporters

Sponsor:

Champion:

Gold Level:

Gold Level (continued):

Silver Level:

Bronze Level:

Doctoral Consortium Sponsors:

NSF Award ID 2228013

In-kind Sponsors:
Challenges and opportunities in technology for inclusion
Clayton Lewis
University of Colorado, Boulder, CO, USA
clayton@colorado.edu

ABSTRACT
What will we be working on in the coming decade? This talk will consider a wide range of possibilities, ranging from predictable impacts of new technologies to difficult matters of perspective and policy that influence the development and deployment of technology, to quixotic ambitions. The opinions will be those of the presenter, if even he endorses them.

CCS CONCEPTS
• Social and professional topics; • People with disabilities;

KEYWORDS
Inclusion, accessible technology

ACM Reference Format:
Clayton Lewis. 2022. Challenges and opportunities in technology for inclusion. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 1 page. https://doi.org/10.1145/3517428.3566096

1 BIO
Clayton Lewis is Emeritus Professor of Computer Science at the University of Colorado Boulder. Lewis served previously as Co-Director for Technology for the Coleman Institute for Cognitive Disabilities, and Fellow of the Institute of Cognitive Science, at CU, and as technology advisor to the director of the National Institute for Disability and Rehabilitation Research, US Department of Education. He is well known for his research on evaluation methods in user interface design. Two methods to which he and his colleagues have contributed, the thinking aloud method and the cognitive walkthrough, are in regular use in software development organizations around the world. He has also contributed to cognitive assistive technology, to programming language design, to educational technology, and to cognitive theory in causal attribution and learning.

Before joining the University of Colorado, Lewis was Manager of Human Factors at IBM’s Watson Research Center, where he was a member of the research staff from 1970 to 1973 and 1979 to 1984. He holds degrees from Princeton, MIT, and the University of Michigan. He has been honored by appointment to the ACM SIGCHI Academy, by the SIGCHI Social Impact Award, and by the Strache Leadership Award (CSUN Assistive Technology Conference).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3566096
Access on Demand: Real-time, Multi-modal Accessibility for the Deaf and Hard-of-Hearing based on Augmented Reality

Roshan Mathew (School of Information, Rochester Institute of Technology, Rochester, New York, United States) rm1299@rit.edu
Brian Mak (College Of Liberal Arts, Rochester Institute of Technology, Rochester, New York, United States) bm1531@rit.edu
Wendy Dannels (NTID Center on Culture and Language, Rochester Institute of Technology, Rochester, New York, United States) w.dannels@rit.edu

ABSTRACT
In this experience report, two deaf researchers with varying expertise, communication preferences, and technological skills document their experiences using Access on Demand (AoD), an Augmented Reality (AR) based accessibility application that provides on-demand real-time captioning and sign language interpretation services using the Vuzix Blade AR smart glasses. The researchers report their observations regarding using remote real-time American Sign Language (ASL) interpreting, captioning, and auto-captions offered by the AoD platform. The authors discuss the benefits and limitations of using AoD as an assistive technology device and how it would benefit the deaf community from the perspective of Deaf and Hard-of-Hearing (DHH) users.

CCS CONCEPTS
• Human-centered computing → Accessibility; Accessibility technologies; Accessibility; Accessibility design and evaluation methods.

KEYWORDS
Augmented Reality, Smart glasses, DHH

ACM Reference Format:
Roshan Mathew, Brian Mak, and Wendy Dannels. 2022. Access on Demand: Real-time, Multi-modal Accessibility for the Deaf and Hard-of-Hearing based on Augmented Reality. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3517428.3551352

1 INTRODUCTION
There is a great need for the Deaf and Hard-of-Hearing (DHH) community to receive access services whenever possible to understand others who do not know or use American Sign Language (ASL) and only use spoken English. However, it is not always possible to avail of such services according to the needs and convenience of DHH individuals because there is insufficient existing infrastructure and processes to support the demand. Providing accessibility accommodations is either too costly, or there are too few interpreters or captioners available [1].

An example would be the case of a DHH student sitting in a classroom far away from the lecturer, unable to see what the interpreter is signing clearly. If the DHH student prefers captioning, the captioning will either be displayed on a projector display up front or on their laptop or phone. This situation is exacerbated when the lecturer is always on the move or demonstrating something, and the DHH student has to switch their focus between the lecturer and the interpreter/captions. It would be beneficial for the DHH student to see the interpreter closer or view the captions while also being able to focus on the lecturer. Undoubtedly, not all classes will or can always provide interpreting or captioning support, and the burden of getting that accessibility support rests on the DHH person. Moreover, there is always a possibility that the interpreter or captioner they requested does not show up. On-demand accessibility services using Augmented Reality (AR) smart glasses eliminate this inconvenience and help the DHH user get the accommodations they need reliably in situations where they are not provided one.

Augmented Reality is an interactive experience that provides a digital modification of the real world. There is a huge potential for AR smart glasses to be immensely useful for the DHH community in terms of accessibility to view captions or interpreting. There is a study that looked at the use of smart glasses to improve lecture comprehension in the classroom for DHH students, initially using Google Glass and then fully using EPSON Moverio BT-200 [5]. Another study evaluated a custom-made AR prototype with four modes that included detecting environmental sounds and emotion indicators, but their prototype was not tested by a DHH user [7].

In this experience report, we will discuss our evaluation of using the $999 Vuzix Blade smart glasses with a university-made software called Access on Demand (AoD) installed to enable interpreting and captioning in different real-world situations such as watching a niche stand-up comedian show in person or listening in to a speech during a wedding where the organizers have not considered providing access services. This experience report includes observations made by two DHH researchers with unique perspectives and varying experiences while using the Vuzix Blade AR smart glasses with AoD installed and shows that a platform/app that offers on-demand multi-modal accessibility tools should be able to meet different users’ needs by providing options to select access services based on their preferences (interpreting or captioning).

This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3551352
2 BACKGROUND
The next subsections provide an overview of the authors’ backgrounds, information on the processes and tools used, and the significant features offered by the tool and the platform.

2.1 Team Members
Mak: It should be noted that I am a hard-of-hearing individual whose main communication method is ASL, with some halting ability of spoken English here and there. I wear a set of Bluetooth-capable hearing aids, but I am not profoundly deaf. I was an alpha tester with the Vuzix Blade when AoD was still being developed two years ago. I am an undergraduate journalism major. I only have surface knowledge of Augmented Reality, Virtual Reality, Mixed Reality, and head-mounted displays. Now I am testing out the AoD app already installed in the smart glasses.

Mathew: I am deaf and use a cochlear implant (CI) on one ear and a hearing aid (HA) on the other. I am late-deafened, can speak, and have a good command of the English language. I attended mainstream schools and was exposed to sign language only relatively recently. Therefore, I rely on real-time captioning using human captioners or automatic speech recognition (ASR) apps for daily communication accessibility. As a graduate student majoring in a computing program, I have prior experience using Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) glasses and head-mounted displays (HMDs).

Dannels: I have profound hearing loss, do not use any assistive technology, and sign ASL. I am the director of XR Accessibility Solutions Laboratory (XR-ASL), homing in the Rochester Institute of Technology/National Technical Institute for the Deaf’s Research Center on Culture and Language (RIT/NTID CCL) [9]. For the month of June 2022, I am mentoring two university students in their summer experience as research assistants. I provided them with the resources allowing them to write their reflective paper.

2.2 Process
The XR-ASL team has developed an app that enables remote real-time captioning and ASL interpreting to be displayed on smart glasses. This innovative solution enables deaf individuals to look directly at people, especially during demonstrations, to review instructions and acquire important information instead of looking at different individuals and/or screens. Wendy invited two summer student participants to experiment with smart glasses. As a part of research and development, the Access on Demand platform and app were developed at the university to serve as a web portal for several purposes. Different users will have the ability to sign in and utilize the features. Sign-in will allow users to use different tools developed for their needs. Different sets of users are: deaf/hard-of-hearing, captioners, interpreters, and presenters. This paper focuses on DHH users’ perspectives only. Selecting two students with completely different backgrounds and experiences will provide the readers with two polarizing perspectives.

2.3 Tools
Vuzix Blade is currently the only United States made commercially available pair of AR smart glasses that are not cable linked to a smartphone, not tethered to a battery pack, and gives access to a Software Development Kit (SDK), which allows the developer to program and code an app for its use [11]. During their experience using Vuzix Blade, DHH users were not given the battery pack.

2.4 Access on Demand
This section describes the main features and functions of the Access on Demand (AoD) application.

2.4.1 Initial Set up. After you power on your Blade smart glasses, swipe through the app list and tap to launch into the Access on Demand app, then swipe down to enable QR scanning mode, and aim your smart glasses at the specific QR code on either the session page on the AoD website or in printed paper form. You will know that you are in once you see the session ID on the screen. All three modes below follow this initial setup.

2.4.2 Interpreting. After joining the session, you will see the interpreter on screen in the smart glasses when they get in the same session as you. If not, then you will get a message saying, ‘No interpreter is available right now.’ Now the next step is getting the interpreter to hear the outside world for them to interpret for you inside the glasses. AoD does not use the embedded microphone inside the smart glasses because it does not capture the outside audio well. Also, the stereo speakers embedded in the Vuzix Blade are not quite loud enough for everyday situations. Thus, you need to use your phone or any other external device with more powerful microphones and speakers. So, you need to log in to the AoD website from a phone or another external device to access ‘presenter’ mode. This mode enables you to use the microphone and camera on another device so the interpreter can listen to what the other person you are talking or listening to is saying so they can interpret. Finally, you enter the same session ID as the one you scanned the QR code for.

2.4.3 Real-time Captioning. Access on Demand has two modes for captioning: real-time captioning with a remote human captioner and auto-captioning using automatic speech recognition. After joining a session as described under the initial setup, a one-finger forward swipe gesture will toggle the interpreting portal with the captioning portal. To relay the audio to the captioner, the DHH user will have to log in to the AoD website using their phone or laptop to access the presenter mode. The presenter mode enables the camera and microphone on the user’s device to let the captioner listen to the conversation and view the event. AoD uses C-Print [10] to provide real-time captioning services, a speech-to-text technology service developed at RIT/NTID. The captioner assigned for the specific event connects their copy of C-Print software with the AoD platform and joins the same session. Simultaneously, they also need to log in to the AoD website to access the audio and video streams from the DHH user’s end. The captioner can then start listening to the conversation and provide captioning, which will be displayed through the Blade smart glasses. Whenever there is a pause in the conversation, and the captioner is not typing anything, the Blade will display a status message, "Captioner on standby."

2.4.4 Auto-captioning. The process for using auto-captions is quite similar to that of real-time captions on the Blade. The only difference is that instead of using the presenter mode on a phone or laptop,
the DHH user needs to use different login credentials to access the user interface for auto-captions on the AoD website. This page contains a display section for captions, a field to enter the session ID, and buttons to enable the microphone and auto-captions. Once the microphone and auto-captions are enabled, the captions will display both on the phone (or laptop) and the Blade. AoD uses Web Speech for automatic speech recognition. Auto-captioning was tested using the Blade and a laptop as there were technical glitches with using a phone.

2.4.5 Customization Features. The AoD application offers the following customization features through a discoverable horizontal menu bar: Brightness, Font Size, Font Style, and Font color. The QR Code Scanner described earlier is also accessed using the same menu. The brightness can be adjusted to three different levels. There are three different font sizes and four different font styles available for personalization, but the application does not display the names of the fonts and font styles. Font colors can be customized using six colors – white, yellow, green, blue, red, and purple. A one-finger tap gesture is used to access a customization feature, followed by another one-finger tap gesture to cycle through the available customization options.

3 DISCUSSION
In this section, the authors describe their experiences and observations of using AoD based on their individual communication needs and preferences for communication accessibility.

3.1 Real-time Interpreting - Mak – Journalist
3.1.1 Access on Demand. My experience with using the software inside the smart glasses is a mixed bag though I applauded that the software is developed by a tiny team consisting only of Deaf developers who are students working part-time while studying at RIT. I was assigned to test out the live interpreting feature from the perspective of a DHH client. Overall, the primary purpose of seeing live interpreting in the smart glasses worked though I encountered some usability problems, especially with presenter mode. It should be noted that the presenter mode is a recently added feature to the AoD platform. The purpose of this mode is for the interpreter to be able to hear the external audio with your phone’s microphone and enable them to do their job in real time as you observe inside the video feed. Thus, there are a few bugs to be expected, although the main function of this mode is to use an external microphone such as a phone that works well. The latency I experienced while field-testing the live interpretation is close to zero. I can clearly see the interpreter’s movements smoothly and understand them; at the same time, they can hear the external audio captured in presenter mode.

Limitations: From the perspective of a DHH user, however, it does not necessarily improve the communication right now as the presenter mode does not show the interpreter, and it also cannot emit audio for others to hear the interpreter. It only improves and enhances the DHH user’s ability to understand and learn without moving their head. Also, you limit yourself to signing only with one hand while the other hand is holding your phone in presenter mode. Depending on the situation or person, you might or might not give your phone to the other person you are talking to for a chance of them stealing it away from you, thus the need to hold on to your phone in presenter mode while it is unlocked. The presenter mode’s use of the camera is not helpful as it uses only the selfie camera of your phone; thus, you see only yourself, and also there is no bottom video feed to show the interpreter for the other person you are talking to for them to understand what they are speaking to from the phone. You cannot use your ASL skill when you give your phone away for the other person to speak into your phone’s microphone, and there will be moments when you cannot place your phone vertically and at some distance away from you for the interpreter to see what you are signing. Doing two-way communication is impossible unless you can put your phone down vertically or hold your phone in a way that lets the other person speak into the microphone. Thus, for now, the main purpose for live interpreting in AoD is for you to be more attentive to the presentation given by someone else without getting distracted or having to move your gaze back and forth, as Miller et al. demonstrated in their research back in 2017 [5].

3.1.2 Vuzix Blade. The Vuzix Blade is an exciting device to test AoD. Currently, it is not an electronic device aimed at consumers. Rather it is aimed toward business enterprises such as healthcare or manufacturing. I am surprised at its bulkiness, though I can understand why the design is the way it is based on the hardware features it has. It has the following features: noise-canceling microphones, a touchpad, High Definition (HD) camera (8MP), and microSD storage. It is a device that can be turned on and off and charged via USB. There is a slight learning curve in navigating the user interface with one or two fingers with taps and swipes. I learned rather quickly, but the touchpad on the right temple side is very sensitive, so you will make many accidental inputs while using the built-in or custom-installed apps.

Problems: The issues I have in using the Vuzix Blade are that it has a short battery life of about 1 hour because AoD requires a lot of live streaming, does get sustainably hot on both temple sides, is Wi-Fi dependent, and requires a nose clip if a person is already wearing prescription glasses (note there is a process where you can get and install the exact prescription lens from the Vuzix Blade. I did not have the luxury to do it). Also, what I am seeing in the smart glasses in terms of ‘augmented reality’ disappointed me, even if it is still in development. Watching live interpretation with the Vuzix Blade really consisted of a small 2D video feed on the right side of the smart glasses with no setting to change it except by brightness. My original expectation of the AR enabled by the Vuzix Blade was that it could provide a level of 3D interaction in the real world that is adjusted on the fly, as shown by Yi-Hao Peng et al. with their SpeechBubbles prototype [6]. However, the AR elements here can only be seen in a static feed and cannot be moved.

Conclusion: Overall, even though personally it is cool to experiment with this device and see how it works with Access on Demand, in real-world usage, I do not think it can be reliable when you cannot communicate with hearing people while you are dealing with the short battery life and connectivity issues. I do not think DHH people should be using the Vuzix Blade monocular version for live
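The tap-to-cycle customization menu described in 2.4.5 can be modeled in a few lines. The following TypeScript sketch is purely illustrative, not AoD's actual implementation: the class name `CaptionSettings`, the default selections, and the placeholder font-style labels are invented for this example; only the option counts and color list come from the report.

```typescript
// Illustrative model of AoD's customization menu (2.4.5):
// one tap selects a feature, each further tap cycles to the next option.

type Feature = "brightness" | "fontSize" | "fontStyle" | "fontColor";

// Option lists mirror the report: 3 brightness levels, 3 font sizes,
// 4 font styles (unnamed in the app, so placeholders here), 6 colors.
const OPTIONS: Record<Feature, string[]> = {
  brightness: ["low", "medium", "high"],
  fontSize: ["small", "medium", "large"],
  fontStyle: ["style-1", "style-2", "style-3", "style-4"],
  fontColor: ["white", "yellow", "green", "blue", "red", "purple"],
};

class CaptionSettings {
  // Assumed defaults: first option of each list.
  private current: Record<Feature, number> = {
    brightness: 0,
    fontSize: 0,
    fontStyle: 0,
    fontColor: 0,
  };

  // A one-finger tap advances the feature to its next option, wrapping
  // around to the first option after the last one.
  tap(feature: Feature): string {
    const opts = OPTIONS[feature];
    this.current[feature] = (this.current[feature] + 1) % opts.length;
    return opts[this.current[feature]];
  }

  get(feature: Feature): string {
    return OPTIONS[feature][this.current[feature]];
  }
}
```

With six font colors, six taps on the color feature return the user to their starting selection, which matches the wrap-around behavior a cyclic menu implies.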
interpretation specifically because it does not provide a different experience compared to seeing interpreters on Zoom on a flat laptop screen nor provides additional benefits other than it being portable. There still is a lot of coordination involved in getting an interpreter on board, and the biggest factor that can negatively affect this experience is Wi-Fi connectivity. Seeing an interpreter on your AR smart glasses is a novel idea, but in the near future, it still needs some improvements, especially on the hardware and software categories, such as making it 5G capable or able to use mobile data via your SIM card.

3.2 Real-time Captioning and Auto Captioning - Mathew – Geek
3.2.1 Method. Both captioning modes (real-time captioning and auto-captioning) were tested in separate scenarios where I use captioning daily for communication access. Real-time captioning was tested at one of the campus dining places to interact with the staff while buying lunch. Real-time captioning was preferred for this scenario because dining locations are usually noisy, and auto-captioning does not typically perform well in such situations because of the background noise. I have also used real-time captioning previously during trials at a local planetarium to watch their shows. Auto-captioning was tested in an office environment during a conversation with another individual. Auto-captioning was chosen for this scenario as it was a relatively quieter environment where automatic speech recognition would work well. I took detailed notes documenting these experiences on the same day of the testing sessions.

3.2.2 Findings. Overall, real-time captioning and auto-captions work as intended. Real-time captions offer the advantage of having the captions provided by a human captioner. Human captioners offer the benefit of better accuracy, the ability to understand different speech accents, and the capability to comprehend speech better in noisy environments. On the other hand, auto-captions offer more privacy when you do not prefer to have a third-party listen to your conversation and when there is not enough turn-around time to request and schedule a captioner. I would prefer one over the other depending on my specific accessibility needs and preferences. Following are some key observations regarding real-time captioning and auto-captions through the AoD platform.

Scrollability: The captions are displayed in multiple lines within the display. The number of lines and characters displayed will depend on the font size and the font style chosen. When testing using real-time captions, the contents did not automatically scroll to the latest input once it filled up the display container, and there was no gesture defined to let me manually scroll to the latest text. Consequently, the captioner had to continuously delete the previous content as soon as it reached the last line on the display so I could view the latest transcribed text. This issue has since then been rectified. However, with auto-captions, this was not a concern as I could view the latest text as the last line within the display.

Positioning: The captions are displayed to the left of the Blade’s right lens such that the captions are overlayed in the center of the three-dimensional space in binocular vision. When I look directly at the speaker, the captions are right in front of their face. While Blade’s native device settings allow you to move the display vertically, it does not offer the option to move it horizontally or diagonally. I prefer to have the captions displayed next to the speaker, on either side, so the ability to position the captions in 3D space, depending on my preferences, would have been an excellent functionality. As a workaround, I looked a few degrees to the left or right so that the captions did not block my view of the speaker’s face.

Captioning Fidelity and Accuracy: With real-time captioning, the captioning fidelity and accuracy depend on the captioner. Some captioners may transcribe speech exactly, while others may choose to provide a summary of the conversation. With auto-captioning, the speech is transcribed verbatim, and the accuracy depends on the clarity of speech, accents, background noise, etc. No formal tests were conducted to measure the Word Error Rates (WER) for real-time captioning or auto-captioning. Auto-captioning worked almost in real-time, but a slight delay was noticed occasionally, which might have been due to an unstable internet connection. Auto-captions failed to generate captions when the speaker’s rate of speech was faster than usual and also when the speaker was more than six feet away from the device with the microphone. Auto-captioning was found to be accurate with uncommon words.

Contextual Information: For real-time captions, the captioner could provide contextual information such as speaker names and other background noises. However, for auto-captions, additional contextual information is not available. AoD does not offer speaker identification or sound recognition. The captions are displayed
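The show-latest behavior that the rectified scrolling implies can be sketched as a fixed-height line buffer. This TypeScript sketch is an illustrative model only, not AoD's code: the class name `CaptionWindow` is invented, and `maxLines` stands in for the Blade display's line capacity, which the report notes varies with the chosen font size and style.

```typescript
// Illustrative model of a caption display that always shows the newest
// lines: once the window is full, the oldest lines are dropped, so the
// viewer never needs to scroll or wait for old text to be deleted.

class CaptionWindow {
  private lines: string[] = [];

  constructor(private maxLines: number) {}

  // Append a newly transcribed line, evicting the oldest lines as needed.
  append(line: string): void {
    this.lines.push(line);
    if (this.lines.length > this.maxLines) {
      this.lines.splice(0, this.lines.length - this.maxLines);
    }
  }

  // The lines currently visible on the display, oldest first.
  visible(): string[] {
    return [...this.lines];
  }
}
```

With this policy, the latest transcribed text always appears as the last visible line, which is the auto-caption behavior the report describes.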
Glanceability: One of the signifcant benefts of using AoD is that continuously without line breaks or punctuations, making it quite
I do not have to constantly switch my focus between the speaker and challenging to use auto-captions for group conversations.
another screen displaying captions, such as on a phone, laptop, or Learnability of Gestures: It takes some efort to learn all the pre-
tablet. I rely on speech reading while communicating, and viewing defned gestures for the Blade, so until I familiarized myself with
the captions on the Blade allowed me to maintain my gaze on the them, it was easy to use the wrong gesture and perform an undesired
speaker’s face to access non-verbal cues and facial expressions action. The Blade also does not ofer many customizable gestures.
that I would have otherwise missed. My conversation partners As a result, some of the custom gestures for AoD are not intuitive.
also pointed out that they thought the conversation was quite For example, I need to use a one-fnger swipe down gesture to access
natural as I looked directly at them instead of my phone or laptop. the horizontal menu at the bottom. However, this lacks external
Glanceability would be particularly benefcial in classrooms during consistency with other applications where such a gesture is used
lectures where instructors speak while referring to a presentation to pull down a menu at the top of the screen.
or demonstration, in a doctor’s ofce when they are explaining a Social Considerations: The Blade is bulky and noticeable com-
diagnosis while sharing a report, in a museum viewing an exhibit pared to regular eyeglasses, making the wearer stand out, especially
while listening to an audio description, or while engaging in a indoors. In addition, the status lights and the AR display’s backlight
hands-on activity listening to verbal instructions. make others curious if they are being recorded. While testing the
Access on Demand: Real-time, Multi-modal Accessibility for the Deaf and Hard-of-Hearing based on Augmented Reality ASSETS ’22, October 23–26, 2022, Athens, Greece

AoD in public spaces such as dining locations, I was asked whether I was capturing others on video or audio and had to reassure them that the platform does not save any data.

Wearability: As stated earlier, I wear a behind-the-ear (BTE) CI and a BTE HA. While the Blade's frame is compact when compared to other AR HMDs, its frame is still quite thick and is inconvenient to wear along with my CI and HA. Often, tilting or turning my head caused the Blade to knock my CI and HA out of position, so I had to keep adjusting them constantly so that everything stayed in place. The Blade also heats up with constant use, although it does not get too hot. I also noticed some eye strain with continual use for over an hour.

4 BENEFITS, LIMITATIONS, AND FUTURE DIRECTIONS

The AoD prototype is currently designed as an accessibility tool for situations where there will be a primary presenter, such as attending a lecture in a large classroom, a presentation or a talk at an event, or watching live shows at a museum. It enhances learning and comprehension as DHH users do not have to constantly turn their heads between the speaker and the interpreter or a captioning screen (on a phone or tablet), which would derail their train of thought. However, due to hardware and software limitations, the use of AoD in situations beyond the scenarios described in this paper is currently restricted, both for DHH users who prefer interpreting and for those who prefer captioning.

Mathew and Mak observed that having a microphone with better capabilities on the Blade would help avoid using a microphone on personal devices such as phones or laptops. In addition, an alternative method of capturing and relaying the DHH user's signs to the interpreters, along with better stereo speakers on the Blade, would help interpreters voice for the DHH user, resulting in possibilities for two-way communication. It should be noted that Mak could not really engage in a two-way conversation due to the limitations of holding the phone, as described earlier, while Mathew was able to use AoD during a two-way conversation with another individual because he could voice for himself. Should a DHH person who prefers captioning choose not to voice for themselves, they would need to write on a piece of paper or type on their phone to communicate with the other individual, as AoD can only be used to view the captions of the conversation partner. Doing away with presenter mode on personal devices such as phones or laptops would make the whole experience hands-free, which would be beneficial in many situations. Some other areas that need significant improvements are the Blade's battery life, its ability to provide stable Wi-Fi connectivity, and its form factor. For both interpreting and captioning, the ability to interact with the interpreter and caption containers in 3D space to reposition or customize them, as some researchers demonstrated using the Microsoft HoloLens [3], would also enhance the overall user experience.

In addition to the improvements described earlier, future work should also include streamlining the process of requesting interpreters or captioners before testing out the software rigorously in more real-world situations, such as during an in-person conference meeting or a panel at a convention. As AR technology and hardware mature, there is potential that AR smart glasses can be used as an assistive technology tool.

5 CONCLUSION

Our experiences with Access on Demand demonstrate that this prototype concept is promising and has the potential to fulfill the accessibility needs of people with hearing loss in contexts where the services of an in-person human captioner or interpreter are not available or feasible. However, there are several avenues for possible improvements, both with the AoD platform and the AR smart glasses. Other research articles provided a few paths we could take to improve our platform, especially the caption components [2, 3, 5–8]. There is already a technological-social model developed by Cesar Lozano and Rocio Maciel [4] that we can use to determine how effective the AoD platform on the Vuzix Blade will be once it has been fully developed and opened to the general public, and to measure the impact of social inclusiveness it will provide to the DHH community via the three modes. Of course, there will be some more field testing to iron out the bugs and glitches in the software, and the next-generation model of the Vuzix smart glasses, called the Shield™, might have the hardware to fully realize the main goal of AoD with its less bulky design and a better microphone [12], that is, to provide accessibility everywhere and anytime.

Overall, it was a rewarding experience to test the latest version of Access on Demand with the three assistive components and document our unique experiences in this report. We are looking forward to the future when this technology becomes mainstream.

ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under Award No. 1811509. Any opinions, findings and conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

[1] Hawa Allarakhia. 2022. Consider staffing options for ASL interpreters. Disability Compliance for Higher Education 27, 7 (2022), 1–5. DOI:https://doi.org/10.1002/dhe.31218
[2] Abraham Glasser. 2019. Automatic Speech Recognition Services: Deaf and Hard-of-Hearing Usability. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA '19), Association for Computing Machinery, New York, NY, USA, 1–6. DOI:https://doi.org/10.1145/3290607.3308461
[3] Dhruv Jain, Bonnie Chinh, Leah Findlater, Raja Kushalnagar, and Jon Froehlich. 2018. Exploring Augmented Reality Approaches to Real-Time Captioning: A Preliminary Autoethnographic Study. In Proceedings of the 2018 ACM Conference Companion Publication on Designing Interactive Systems (DIS '18 Companion), Association for Computing Machinery, New York, NY, USA, 7–11. DOI:https://doi.org/10.1145/3197391.3205404
[4] Cesar Lozano and Rocio Maciel. 2017. Technological-Social Model: Based in augmented reality platforms for the inclusion of deaf people in the university classroom and the cities. In Proceedings of the 8th Latin American Conference on Human-Computer Interaction (CLIHC '17), Association for Computing Machinery, New York, NY, USA, 1–4. DOI:https://doi.org/10.1145/3151470.3156648
[5] Ashley Miller, Joan Malasig, Brenda Castro, Vicki L. Hanson, Hugo Nicolau, and Alessandra Brandão. 2017. The Use of Smart Glasses for Lecture Comprehension by Deaf and Hard of Hearing Students. In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA '17), Association for Computing Machinery, New York, NY, USA, 1909–1915. DOI:https://doi.org/10.1145/3027063.3053117
[6] Yi-Hao Peng, Ming-Wei Hsi, Paul Taele, Ting-Yu Lin, Po-En Lai, Leon Hsu, Tzu-chuan Chen, Te-Yen Wu, Yu-An Chen, Hsien-Hui Tang, and Mike Y. Chen. 2018. SpeechBubbles: Enhancing Captioning Experiences for Deaf and Hard-of-Hearing People in Group Conversations. In Proceedings of the 2018 CHI Conference on
ASSETS ’22, October 23–26, 2022, Athens, Greece Roshan Mathew et al.

Human Factors in Computing Systems (CHI '18), Association for Computing Machinery, New York, NY, USA, 1–10. DOI:https://doi.org/10.1145/3173574.3173867
[7] Ali Mohammed Ridha and Wessam Shehieb. 2021. Assistive Technology for Hearing-Impaired and Deaf Students Utilizing Augmented Reality. In 2021 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), 1–5. DOI:https://doi.org/10.1109/CCECE53047.2021.9569193
[8] Phillip Ward, Ye Wang, Peter Paul, and Mardi Loeterman. 2007. Near-Verbatim Captioning Versus Edited Captioning for Students Who Are Deaf or Hard of Hearing: A Preliminary Investigation of Effects on Comprehension. American Annals of the Deaf 152, 1 (2007), 20–28. DOI:https://doi.org/10.1353/aad.2007.0015
[9] NTID Research Center on Culture and Language | National Technical Institute for the Deaf | RIT. Retrieved August 5, 2022 from https://www.rit.edu/ntid/nccl
[10] The C-Print® System | C-Print. Retrieved August 5, 2022 from https://www.rit.edu/ntid/cprint/
[11] Vuzix Blade Upgraded Smart Glasses. Vuzix. Retrieved August 5, 2022 from https://www.vuzix.com/products/vuzix-blade-smart-glasses-upgraded
[12] Vuzix Shield™. Vuzix. Retrieved August 5, 2022 from https://www.vuzix.com/pages/vuzix-shield
VRBubble: Enhancing Peripheral Awareness of Avatars for People
with Visual Impairments in Social Virtual Reality
Tiger Ji (tfji@wisc.edu), Brianna R. Cochran (bcochran2@wisc.edu), and Yuhang Zhao (yuhang.zhao@cs.wisc.edu)
University of Wisconsin-Madison, Madison, Wisconsin, USA
ABSTRACT

Social Virtual Reality (VR) is growing for remote socialization and collaboration. However, current social VR applications are not accessible to people with visual impairments (PVI) due to their focus on visual experiences. We aim to facilitate social VR accessibility by enhancing PVI's peripheral awareness of surrounding avatar dynamics. We designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Based on Hall's proxemic theory, VRBubble divides the social space with three Bubbles—Intimate, Conversation, and Social Bubble—generating spatial audio feedback to distinguish avatars in different bubbles and provide suitable avatar information. We provide three audio alternatives: earcons, verbal notifications, and real-world sound effects. PVI can select and combine their preferred feedback alternatives for different avatars, bubbles, and social contexts. We evaluated VRBubble and an audio beacon baseline with 12 PVI in a navigation and a conversation context. We found that VRBubble significantly enhanced participants' avatar awareness during navigation and enabled avatar identification in both contexts. However, VRBubble was shown to be more distracting in crowded environments.

CCS CONCEPTS

• Human-centered computing → Virtual reality; Accessibility technologies.

KEYWORDS

visual impairments, social virtual reality, proxemics, audio feedback

ACM Reference Format:
Tiger Ji, Brianna R. Cochran, and Yuhang Zhao. 2022. VRBubble: Enhancing Peripheral Awareness of Avatars for People with Visual Impairments in Social Virtual Reality. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3517428.3544821

1 INTRODUCTION

Social virtual reality (VR) refers to VR platforms that allow users to socialize with each other in the form of avatars in a virtual space [41]. More and more social VR platforms have been deployed to the market, such as VRChat, Rec Room, and Altspace [22, 43, 78]. Compared to 2D video conferencing systems that cause "Zoom fatigue" [47], social VR offers an immersive and engaging experience that can enhance interpersonal interaction and boost productivity. As a result, social VR has attracted increasing attention in recent years. The VR market was valued at $7.81 billion as of 2020 and is expected to grow 28.2% annually from 2021 to 2028 [54]. Meta, formerly known as Facebook, also re-branded in 2021 alongside its full commitment to producing a Metaverse, envisioning social VR to be the next generation of the Internet that connects everyone.

Unfortunately, similar to most mainstream VR applications, current social VR mainly targets sighted users by providing various visual avatar designs and supporting non-verbal social interactions, which poses barriers to people with visual impairments (PVI). With more than two billion people experiencing visual impairments worldwide [53], it is vital to provide PVI equal access to the emerging social VR as virtual collaboration and gathering increase, especially during the COVID-19 pandemic [50].

Researchers have started tackling the VR accessibility problems for PVI by enabling them to navigate and perceive VR scenes. Some research leveraged or created additional devices (e.g., PHANToM, game controller thumbsticks) to enable VR navigation by providing haptic feedback and/or audio feedback [7, 24, 30, 46, 49, 68, 85]. Others focused on software solutions, designing accessible interactions based on existing VR setups, such as keyboard-based interactions with spatial audio feedback for a virtual world [52, 59, 72, 74], and more intuitive interactions based on off-the-shelf VR controllers and headsets [86]. However, prior work mainly focused on basic VR tasks such as navigation and object perception. It does not address the unique barriers caused by the dynamic and multiplayer nature of social VR.

Different from a static VR scene with only system-generated objects, social VR is more complex and challenging—human-controlled avatars constantly move in the environment and interact with other avatars and objects. All the avatar dynamics are mainly presented visually in social VR and not accessible to PVI. To our knowledge, no existing techniques have focused on the avatar dynamics to support an accessible social VR experience for PVI.

We aim to fill this gap by enhancing PVI's awareness of surrounding avatars in social VR. Unlike prior work that required PVI to actively explore and query information from the environment [49, 85], we focus on peripheral awareness—the innate ability to unconsciously "maintain and constantly update a sense of our social
and physical context" [56]. People can usually maintain peripheral awareness effortlessly without being distracted from their main focus [6]. This ability is especially important in social and collaborative environments since it provides more context for one's activity. For example, when navigating to a specific location, a sighted person can easily notice who has passed by to decide whether to greet that person or start a quick conversation; when in the middle of a conversation, they can stay aware of who just joined the conversation or who is close enough to overhear the conversation.

We seek to facilitate the peripheral awareness of avatars for PVI in social VR. Via an iterative design with six PVI, we designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Following Hall's proxemic theory [18], we split the virtual space with three "bubbles" centered on the user—the Intimate Bubble, Conversation Bubble, and Social Bubble—to represent different social spaces that are suitable for different social interactions. VRBubble then provides three spatial audio alternatives (i.e., earcons, real-world sound effects, and verbal notifications) to convey the avatar information, such as names, relationships with the PVI (friends or strangers), and their interactions with the bubbles (entering or leaving). PVI can also flexibly select and combine their preferred audio alternatives for different bubbles and avatars to maintain awareness of the avatar dynamics.

We evaluated VRBubble with 12 participants with visual impairments in the context of a navigation task and a conversation task in social VR, with an audio beacon attached to each avatar as the baseline. Our study showed that VRBubble enhanced user avatar awareness while navigating and was effective at providing users with previously inaccessible identifying information about avatars. We also found that PVI generally favor receiving verbal descriptions while navigating, and more brief and intuitive sounds while conversing.

2 RELATED WORK

2.1 Accessibility of Virtual Environments

2.1.1 Audio Techniques for VR Accessibility. There has been extensive work that assisted PVI in exploring virtual environments via audio feedback. We summarize different types of audio techniques in prior work below.

Audio beacons. Audio beacons have been used to convey object positions [13, 35, 36, 79]. For example, Walker and Lindsay's [79] study utilized three different audio beacons in navigation guidance. They observed the impacts on PVI's navigation performance as they changed various parameters, such as timbre and distance to a waypoint to trigger the audio. Maidenbaum et al. [36] provided a beeping sound based on the distance between the PVI's avatar and the virtual object in front of them to facilitate navigation in a virtual space. As the avatar got closer to the object, the frequency of the beeps rose. Blind Swordsman [13] was a VR game on mobile devices, where a blind user can hear the spatial audio beacon from the enemies, physically turn in that direction, and tap the touchscreen to swing his sword in the direction he is facing.

Object sonification. Prior work has also used audio to identify objects within the virtual space [3, 10, 19, 42, 55, 57, 63–65]. For example, de Oliveira et al. [10] recreated a virtual stage and placed instruments in the environment that generated spatial music. Participants then listened to the music tracks to identify and locate instruments on the stage. Heuten et al. [19] presented a sonification interface to virtual maps for PVI. PVI can listen to the spatialized earcons specific to various geographic objects and landmarks on the map to construct a mental model of a geographical area. AudioDoom [63, 64] was an acoustic virtual environment designed for blind children. Virtual objects and events generated spatial sounds to help users identify objects, navigate the space, and improve cognitive skills.

Echolocation. Virtual echolocation has been emulated through signals and audio reflections [2, 80, 83]. For example, Andrade et al. [2] enabled PVI to use echolocation to navigate a desktop-based virtual world, where the user's avatar can produce mouth-click or clap sounds by pressing a key on a keyboard and hear the sound reflected in the environment. Waters and Abulula [80] presented a sonar system based on the reflectivity of ultrasound used in the echolocation of bats. The audio used was scaled within human hearing ranges, so that PVI can utilize the audio to navigate a VR environment. However, the echolocation method was only used by a small number of blind people.

User-queried verbal descriptions. Some works allowed users to select objects in the environment via a list or a grid and hear additional descriptions about the selected object [37, 49, 74, 82]. For instance, Terraformers [82] was a virtual world game designed to be accessible to PVI. It provided a menu for nearby objects. A user can thus navigate the menu and hear audio descriptions of selected objects. Nair et al. [49] also compared similar menu systems with their novel game controller-based navigation technique, allowing PVI to look around a virtual world by scrubbing the thumbstick on a game controller in different directions; the system then announced what was in that direction via spatial verbal descriptions.

2.1.2 Haptic Solutions for VR Accessibility. Prior work has also enhanced the accessibility of virtual environments for PVI by generating haptic feedback or creating haptic controllers [25, 33, 66–68, 75, 76, 81, 85]. For example, Jansson et al. [25] enabled PVI to use the stylus on a Phantom Premium device to "touch" a virtual space and receive force feedback to perceive different virtual surfaces and objects. Tzovaras et al. [76] leveraged the CyberGrasp haptic gloves to generate force feedback to a blind user's fingers, providing the illusion that they were navigating the virtual space with a cane. In the same vein, Zhao et al. created Canetroller [85], a wearable haptic VR controller that simulated white cane interaction for blind people in virtual reality. When a user swept the controller and hit a virtual object, physical resistance and a spatial audio sound effect were generated to simulate the feedback of a white cane hitting a real object. A follow-up work by Siu et al. [68] further improved the design of the controller by providing three-dimensional force feedback. Recently, Wedoff et al. designed a virtual reality game called Virtual Showdown [81]. PVI played the game by hitting a virtual ball into the opponent's goal across a virtual table using a bat. In this game, a Kinect was used to track the user's movement. A visually impaired user can then hold a Nintendo Switch controller as the bat and receive both audio and vibration feedback to perceive the relative position of the ball from his bat.
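The bubble mechanism summarized above, classifying each avatar by distance into the Intimate, Conversation, or Social Bubble and triggering audio feedback when an avatar crosses a bubble boundary, can be sketched as follows. This is a minimal illustration only: the radii, class names, and event strings are our own assumptions for the sketch, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical bubble radii in meters, loosely inspired by Hall's proxemic
# distances; the thresholds VRBubble actually uses are not specified here.
BUBBLES = [("intimate", 0.45), ("conversation", 1.2), ("social", 3.6)]

def classify(distance: float) -> str:
    """Return the innermost bubble whose radius contains this distance."""
    for name, radius in BUBBLES:
        if distance <= radius:
            return name
    return "outside"

@dataclass
class BubbleTracker:
    """Remembers each avatar's current bubble and reports boundary crossings."""
    current: Dict[str, str] = field(default_factory=dict)

    def update(self, avatar: str, distance: float) -> Optional[str]:
        new_bubble = classify(distance)
        old_bubble = self.current.get(avatar, "outside")
        self.current[avatar] = new_bubble
        if new_bubble != old_bubble:
            # In VRBubble, a crossing like this would trigger an earcon, a
            # real-world sound effect, or a verbal notification, depending
            # on the user's per-bubble and per-avatar preferences.
            return f"{avatar}: {old_bubble} -> {new_bubble}"
        return None  # no boundary crossed, so stay quiet

tracker = BubbleTracker()
tracker.update("stranger", 5.0)          # still outside: no event
print(tracker.update("stranger", 3.0))   # prints "stranger: outside -> social"
print(tracker.update("stranger", 1.0))   # prints "stranger: social -> conversation"
```

Reporting only boundary crossings, rather than continuous distance, mirrors the paper's emphasis on keeping peripheral awareness effortless and non-distracting.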
Prior research on VR accessibility focuses on space navigation and object perception. However, social VR introduces additional complications with the dynamic and non-uniform avatars that present social implications. No research has addressed the accessibility of avatars in social VR. Our research aims to fill this gap by facilitating PVI's awareness of avatars in social VR via customizable audio techniques.

2.2 Accessibility of the Real World

As with virtual environments, a myriad of prior work has designed audio techniques to sonify real-world environments, including using audio beacons to mark waypoints [20, 84], informing users about nearby objects and landmarks through verbal descriptions [15, 66], generating auditory icons or earcons to identify points of interest [39, 58], and providing echolocation or sonar systems to enable the exploration of surrounding environments [21, 77]. Similar to audio techniques for virtual worlds, these solutions also do not address the dynamic complexity of avatars in a social VR context.

2.2.1 Technologies to Facilitate Real-World Social Activities. More relevant research has been done to assist PVI in real-world social activities by enhancing their awareness of conversational partners and their non-verbal behaviors. Various wearable or handheld assistive technologies have been developed. These technologies came in a variety of forms, including smartphone applications [87], belts [5, 40], gloves [31, 32], headbands [45, 60], glasses [1, 73], and wearable cameras [12, 73]. Some research focused on haptic feedback. For example, Krishna et al. [32] created a haptic glove called VibroGlove that allowed PVI to understand the facial expressions of a conversation partner. The glove consisted of several vibration motors mounted on the back of each finger that were used to present six different emotions (i.e., anger, disgust, happiness, fear, surprise, and sadness) through vibration patterns. These patterns were designed based around the shape of the mouth and eye area of each facial expression. Another example was Tactile Band, a wearable headband prototyped by Qiu et al. [60], which used tactile feedback to provide the feeling of other people's gazes onto the PVI. The band had two vibration patterns based on whether a person glanced at the PVI or their gaze was currently fixated on them.

Some work focused on audio feedback. For example, Anam et al. [1] used Google glasses to track faces and communicate the social signals (i.e., facial features, behavioral expressions, body postures, and emotions) of surrounding people to the PVI verbally. Recently, Morrison et al. [45] designed PeopleLens, a head-mounted device that provided spatial identifying audio to assist blind children with gaze direction and mental mapping of surrounding people. Bump and woodblock sounds were used to guide the user's gaze to center on a face, while names would be read out for identified people in the surroundings as the user's gaze passed over them.

Prior work has focused on enhancing the primary social tasks, conveying information about the conversational partners. Besides the primary task, peripheral awareness (as a secondary task) is also important in social activities to unconsciously sense the surrounding dynamics and make ad hoc social decisions. However, this ability remains understudied for PVI.

2.2.2 Assistive Peripheral Awareness Technologies. There has been some prior work that facilitated peripheral awareness during collaborative tasks through visual [6, 61] or audio [9, 27, 29, 34] cues. For example, Cadiz et al. [6] created the Sideshow interface to support the user's peripheral awareness of information on their computer. The interface remained on the user's primary display and presented personalized information through visual notifications and summaries, such as the number of unread emails or the number of friends online. Sakashita et al. [61] enabled remote collaboration involving physical artifacts through the use of video conferencing and motion tracking. They placed video devices to represent each collaborator and automatically oriented the devices to emulate the gaze direction of the collaborator. This supported the user's peripheral awareness of what part of the physical artifact a collaborator was focusing on. However, these works focus on sighted people by providing visual feedback.

Some work designed audio feedback to enhance PVI's peripheral awareness in work collaboration scenarios [9, 27, 34]. For example, Lee et al. [34] designed the CollabAlly browser extension, which provided blind users with audio feedback to support the navigation of content or comment changes. Earcons were utilized to convey which part of the document was being edited, and different voices were used for text-to-speech to contextualize which collaborator was editing. Jung [27] used ambient music to convey notifications to individuals within a physical work space. Speakers were placed throughout the space to enable spatial audio, and unique musical notifications were then played at an individual's location when events relevant to that individual occurred, such as the reception of an email.

However, this work does not design for the unique challenges posed by a social context. Compared to collaborative tasks, which involve a small number of collaborators and allow for asynchronous interactions, a social context can involve a large number of moving avatars, generating more peripheral information and distraction. Thus, a different design is needed to adequately facilitate peripheral awareness for the social context.

3 DESIGN OF VRBUBBLE

We designed VRBubble, an audio-based VR interaction technique to enhance the peripheral awareness of avatars in social VR for PVI. Our design followed the method of user-centered design [51]. We first formulated a set of general design considerations based on prior literature to design an initial prototype. With a formative study with six PVI, we further iterated and improved our design [26]. We describe our design process and the final design below.

3.1 General Design Considerations

We formulated the following design considerations (C1-C3) based on insights from prior literature [17, 87].

(C1) Leverage mainstream platforms. We focus on desktop VR, where a user can see the virtual environment on the desktop screen, hear the spatial audio feedback via earphones or speakers, and interact with it via keyboard and mouse. While VR headsets are emerging, most people don't own a headset due to its cost [70], let alone PVI who cannot benefit from the visual feedback from the headset. Instead, desktop computers are more widely used, and
ASSETS ’22, October 23–26, 2022, Athens, Greece T. Ji et al.

many social VR platforms (e.g., VRChat, AltSpace) support desktop access. Moreover, following the VR office concept by Grubert et al. [17], stationary VR in front of a desk, controlled by a keyboard, can support more comfortable and efficient interaction for longer use and higher productivity in social and collaborative contexts.

(C2) Convey proper information about avatar dynamics. Prior research has explored PVI's information needs in real-world social activities and indicated that the two most important pieces of information are the identity of surrounding people and their relative location [87]. We thus translate these needs to the virtual world and provide the corresponding avatar information to support PVI in the social VR context.

(C3) Avoid intrusiveness and distraction. Since peripheral awareness is supposed to be effortless [6, 38], we seek to minimize the distraction caused by our design as well as the user's conscious effort. We thus use short but intuitive audio feedback at suitable timings to convey avatar information to PVI in an unobtrusive manner. Moreover, instead of having users actively query information, we focus on proactive notifications to reduce their interaction effort.

3.2 VRBubble based on Hall's Proxemic Theory
To convey surrounding avatars' location information (C2) to PVI without overwhelming them (C3), our design followed Hall's proxemic theory [18] to divide the virtual environment into different social spaces. Hall's proxemic theory correlates physical distances with the social interactions that typically happen within them, such as the distance for intimate interactions versus the distance for friendly conversation. Three distances are defined: intimate distance (1 foot), where people usually feel distress when this space is encroached upon unwillingly; personal distance (4 feet), where people interact with familiar people; and social distance (12 feet), where conversations with less familiar acquaintances or groups happen. The space outside the social distance is considered public space.

Figure 1: Conceptual diagram of bubbles.

We defined three virtual bubbles centered on the user based on the distance thresholds defined by Hall: the Intimate Bubble, Conversation Bubble, and Social Bubble (Figure 1). We describe the social indication of each bubble:

Intimate Bubble defines the space within the intimate distance. Avatars in this bubble are about to collide with the user's avatar. Since PVI cannot visually perceive whether they are too close to others or whether other people are invading their intimate space, it is important to alert them when avatars enter this bubble (C2).

Conversation Bubble represents the space between the intimate and personal distances. Avatars in this space are close enough to start a conversation with. Moreover, the user may need to pay attention to these avatars since they are close enough to overhear the user's ongoing conversation. We thus generate audio feedback to notify the user when any avatar enters or exits this space (C2).

Social Bubble defines the space outside the personal distance but still within the social distance. Avatars in this space are potential conversational partners but are not within immediate distance to start a conversation. The user can then decide whether she is interested in approaching such a person for a conversation. Compared to the Intimate and Conversation Bubbles, we expect avatar information in the Social Bubble to be less important. We thus designed more subtle audio feedback to inform the user of the avatars in this bubble (C3).

We do not consider avatars outside the social distance (i.e., in public space in Hall's theory) since they are much less relevant to the user's current social context (C3).

3.3 Audio Alternative Design via Iterations
Based on the three bubbles, we designed spatial audio feedback to convey surrounding avatar information, including avatar identity and motion dynamics between the bubbles (C2), thus enabling PVI to build suitable social interactions upon sufficient avatar awareness. We also sought to make our audio feedback as minimally distracting and invasive as possible (C3). To achieve these goals, we designed and iterated on different audio feedback alternatives via a formative study.

3.3.1 Formative Study. Following the method of user-centered design [51], we conducted a formative study [26] with six PVI (three male, two female, and one who preferred not to say) whose ages ranged from 22 to 58 (M = 44.167, SD = 13.348). All participants were legally blind.

Initial Design. In the formative study, we presented our initial design of VRBubble with the three bubbles described above. When an avatar entered or exited a bubble, the user heard an earcon with a verbal description of the avatar's information. We used earcons because they are brief, abstract, and distinctive (C3) sounds that encode particular information [4]. We used different earcons to represent an avatar's movement between different bubbles. A two-beat earcon with an increasing (or decreasing) tone was used to represent an avatar entering (or leaving) the Social Bubble; similar two-beat earcons with a different timbre were used for the Conversation Bubble; and a "bumping" earcon was used to indicate an avatar in the Intimate Bubble. All earcons were accompanied by a more informative verbal description (C2) reporting the avatar's name, relationship with the user (friend or stranger), and relative position ("nearby" for the Conversation Bubble, "in vicinity" for the Social Bubble). For example, the user heard "Friend Alice nearby" (or "no longer nearby") if Alice's avatar entered (or left) the Conversation Bubble. All audio feedback was spatial audio rendered from the avatar's position. We prototyped VRBubble using web-based VR (C1; details in Section 4.1.2).
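As a minimal sketch of the bubble logic (our own illustration, not the authors' code), Hall's distances quoted in Section 3.2 become thresholds around the user, and an enter/exit event can be derived by comparing which bubble an avatar occupied on consecutive frames; all names below are ours:

```javascript
// Distance thresholds in virtual feet, following Hall's proxemics
// as cited in the paper: intimate = 1 ft, personal = 4 ft, social = 12 ft.
const INTIMATE_FT = 1;
const PERSONAL_FT = 4;
const SOCIAL_FT = 12;

// Map an avatar's distance to a bubble name, or null for public
// space, which VRBubble deliberately ignores (C3).
function bubbleOf(distanceFt) {
  if (distanceFt <= INTIMATE_FT) return "intimate";
  if (distanceFt <= PERSONAL_FT) return "conversation";
  if (distanceFt <= SOCIAL_FT) return "social";
  return null;
}

// Report a boundary crossing between two frames, or null if the
// avatar stayed in the same bubble; the caller picks the feedback.
function bubbleTransition(prevFt, currFt) {
  const from = bubbleOf(prevFt);
  const to = bubbleOf(currFt);
  return from === to ? null : { from, to };
}
```

For instance, an avatar moving from 10 ft to 3 ft away yields `{ from: "social", to: "conversation" }`, which under the verbal alternative would trigger "nearby."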
Method and Findings. The formative study was hosted virtually through Zoom and took roughly two hours per participant. Participants were given access to our VR environment through a URL and were introduced to the initial VRBubble design through a short tutorial. We then asked them to experience VRBubble in two tasks: a navigation task and a conversation task. Both tasks simulated common social contexts they could encounter in VR. While experiencing the prototype, participants thought aloud, describing whether they liked each feature and why. At the end, participants discussed how they wanted to improve VRBubble in different social contexts.

We summarize the major findings from the formative study. (1) While most participants found the earcon design useful, understanding the abstract earcons could create a steep learning curve for PVI. Some participants suggested more intuitive sound effects to present avatar information. (2) Verbal description is clear and easy to understand, but it could be distracting, especially in a conversation context. Participants suggested shortening the verbal descriptions to reduce distraction. (3) Participants had different preferences for audio feedback and valued various avatar information differently. Our design should therefore give users the flexibility to customize their audio experience for different avatars and social contexts.

3.3.2 Three Audio Alternatives. Based on the findings of the formative study, we designed three spatial audio feedback alternatives: earcons, verbal notifications, and real-world sound effects. Each alternative presented similar avatar information, including avatar identity (name and/or relationship with the PVI) and motion dynamics between bubbles (C2). We describe the design of the three audio alternatives for each bubble:

Figure 2: Three audio alternatives for different bubbles: (a) earcons, (b) verbal notifications, (c) real-world sound effects.

Earcon. Given that earcons are abstract and have a high learning curve, we used earcons to present simple information, such as distinguishing different bubbles and different avatars (friend vs. stranger). Instead of completely different earcons, we designed related earcons with systematic distinctions to minimize the learning curve (C3). We used a two-beat earcon with the tone of the last beat increased¹ (or decreased²) to indicate an avatar entering (or leaving) the Social Bubble. For the Conversation Bubble, since avatars in this bubble probably required more immediate attention, we used four-beat earcons with the tone of the last beat increased³ or decreased⁴ to represent an avatar entering or leaving this bubble. To distinguish friend and stranger avatars, we adjusted the pitch and speed of the earcons, so that higher-pitched, faster earcons indicated friends, while normal pitch and speed indicated strangers⁵ ⁶ ⁷ ⁸. We used a game-like bump earcon⁹ to signify when an avatar was in the Intimate Bubble.

Verbal Notifications. We provided clear and short (C3) verbal notifications to present avatar information. To reduce distraction, we conveyed avatar identity by reporting only the name. We no longer reported whether the avatar was a friend or a stranger, since in the real world a person would know whether someone is a friend by their name. We also used verbal notifications to convey the avatar's general position based on its interaction with the bubbles. Specifically, we used "in the area"¹⁰ and "left the area"¹¹ to indicate when an avatar entered or left the Social Bubble, and "nearby"¹² and "no longer nearby"¹³ for entering or leaving the Conversation Bubble. If an avatar, Alice, entered the Conversation Bubble, the user would hear "Alice nearby." To signify that an avatar was in the Intimate Bubble, we used "collided with"¹⁴ followed by the avatar's name.

Real-world Sound Effect. This design alternative focused on providing realistic real-world sounds to intuitively support peripheral awareness and immersion, thus reducing the effort needed to comprehend the audio notifications (C3). We used crowd sound effects as the background sound, with different densities representing the total number of avatars in all three bubbles within the social distance (12 feet). Two levels of crowd sound effects with increasing volume and density indicated two levels of stranger avatar count: 1-5¹⁵ and more than 5¹⁶ strangers. Two levels of cheering crowd sounds were also used to highlight friend avatars due to their higher importance, indicating two levels of friend avatar

¹ Friend entering Social Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/f_soc_in.mp3?v=1639640993569
² Friend leaving Social Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/f_soc_out.mp3?v=1639640998113
³ Friend entering Conversation Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/f_convo_in.mp3?v=1643983723059
⁴ Friend leaving Conversation Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/f_convo_out.mp3?v=1643983757550
⁵ Stranger entering Social Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/s_soc_in.mp3?v=1649899925021
⁶ Stranger leaving Social Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/s_soc_out.mp3?v=1639641013488
⁷ Stranger entering Conversation Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/s_convo_in.mp3?v=1649900021917
⁸ Stranger leaving Conversation Bubble earcon: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/s_convo_out.mp3?v=1643983842245
⁹ Intimate Bubble earcon: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475%2F483602__raclure__game-bump.mp3?v=1636388046675
¹⁰ Entering Social Bubble verbal: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/area.mp3?v=1643807674537
¹¹ Leaving Social Bubble verbal: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/noarea.mp3?v=1643807963569
¹² Entering Conversation Bubble verbal: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/nearby.mp3?v=1643807938153
¹³ Leaving Conversation Bubble verbal: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/nonearby.mp3?v=1643807989565
¹⁴ Intimate Bubble verbal: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/collidedwith.mp3?v=1643983652403
¹⁵ Low ambient stranger crowd: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475/stranger_crowd_small.wav?v=1639761138064
¹⁶ High ambient stranger crowd: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475/stranger_crowd_medium.wav?v=1639761063563
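The earcon scheme of Section 3.3.2 can be sketched as a small lookup: the bubble picks the earcon shape, the movement direction picks a rising or falling last beat, and friend avatars get higher pitch and faster playback. This is our illustration, not the authors' implementation; the function name and the 1.25 playback rate are assumptions:

```javascript
// Pick earcon parameters for one avatar event (illustrative sketch).
function earconFor(bubble, entering, isFriend) {
  let earcon;
  if (bubble === "intimate") {
    earcon = { beats: 1, shape: "bump" };       // game-like bump sound
  } else if (bubble === "conversation") {
    earcon = { beats: 4, shape: entering ? "rising" : "falling" };
  } else {                                      // "social"
    earcon = { beats: 2, shape: entering ? "rising" : "falling" };
  }
  earcon.pitch = isFriend ? "high" : "normal";  // friend vs. stranger
  earcon.rate = isFriend ? 1.25 : 1.0;          // faster for friends (assumed value)
  return earcon;
}
```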
amount: 1-5¹⁷ and more than 5¹⁸ friends. Moreover, we put more emphasis on the avatars in the Conversation Bubble, since the user was more likely to converse with them: a spatial footstep sound effect was played regularly for every avatar within the Conversation Bubble. We generated different footstep sounds to distinguish friend avatars from strangers, with friend avatars presenting a higher-pitched heel sound¹⁹ and stranger avatars a boot stomp²⁰. We also generated a realistic bump sound²¹ to indicate an avatar entering the Intimate Bubble.

3.4 Customization
Customization was key to our design, since different PVI have different audio preferences and prioritize different information and goals. Instead of generating all audio feedback together (as in our initial design), we allowed users to select and combine different audio alternatives for different bubbles, avatars (e.g., friend vs. stranger), and social contexts to customize their experience. For example, a user could choose verbal notifications for friends but earcons for strangers in the Social Bubble, and combine both audio alternatives for both friends and strangers in the Conversation Bubble. Users could also select no audio feedback for a specific type of avatar or bubble (e.g., stranger avatars in the Social Bubble).

4 EVALUATION
We evaluated VRBubble with 12 PVI in different social contexts. We aimed to understand: (1) How effective is VRBubble in enhancing PVI's peripheral awareness of avatars in different social contexts? (2) How distracting is VRBubble? (3) How do PVI customize their audio experience for different avatars, bubbles, and social contexts?

4.1 Method
4.1.1 Participants. We recruited 12 participants with visual impairments (7 male, 5 female) whose ages ranged from 20 to 68 (M = 38.92, SD = 13.79; Table 1). No participants had been in the formative study. Participants were recruited through the National Federation of the Blind. We used a survey to check participants' eligibility: participants were eligible if they were at least 18 years old, legally blind, and capable of independent consent. We also asked about participants' VR experience in the survey and prioritized those who had experience with VR or virtual environments. Among the 12 participants, nine had VR experience, such as audio-based VR games, VR for vision therapy, and VR environment exploration. However, no participant had experience with social VR. Participants were compensated at a rate of $20 per hour.

Table 1: Demographic information of 12 participants.

ID   Age/Sex   Prior VR Experience
P1   31/M      Vision therapy
P2   57/M      3D virtual home design game
P3   37/M      Audio-based VR games; VR studies
P4   68/M      VR environments; VR studies
P5   44/F      Simulated street navigation VR app
P6   34/F      VR headset games
P7   49/F      Oculus Quest accessibility testing; VR games
P8   25/F      None
P9   33/F      None
P10  46/M      PVI accessible audio games
P11  23/M      Navigational VR study
P12  20/M      None

4.1.2 Apparatus. We built a custom VR environment with avatars and implemented both VRBubble and a baseline feature in it. To evaluate VRBubble in different contexts, we implemented two VR scenarios: a navigation scenario and a conversation scenario. We describe the design and implementation of our study environment below.

Baseline. We compared VRBubble's performance to a baseline feature. Our baseline was an audio beacon: a spatial beeping sound, played at intervals, attached to each avatar (except the participant's avatar) in the environment. We chose this baseline because audio beacons are an effective and common way of conveying the presence of people and objects for PVI in the real world [11]. Some current assistive technologies for PVI (e.g., Microsoft Soundscape [44]) also use audio beacons to label target destinations.

VR Environment and Two Scenarios. We built a custom VR environment with two key VR scenarios: a navigation scenario and a conversation scenario. In the navigation scenario, we generated a VR space plan with eight similar navigation routes (Figure 3(a)). All routes were approximately 60 ft long with one turn (either left or right). All routes were considered equivalent according to Ishikawa et al.'s study design [23], since they had the same length and the same number of turns. Avatars were rendered along the routes at various distances (Figure 3(a)).

In the conversation scenario, we rendered one avatar in front of the user, conversing with the user. Other avatars were rendered and moved around in the virtual space during the conversation (Figure 3(b)).

Features for Basic Movements. To enable participants to navigate in the virtual environment and experience the avatar awareness features, we added two fundamental features that allow PVI to move and turn: (1) Movement: PVI can control the movement of their own avatar via the arrow keys, for example, the up arrow to move forward and the left arrow to move left. Each key press moves the avatar one foot in the virtual space and plays a footstep sound to notify the PVI. This footstep sound²² is centered on the user's avatar and differs in pitch from the footsteps used for VRBubble. (2) Turning: A user can turn her avatar to the left (or right) by pressing the left (or right) arrow key while holding the

¹⁷ Low ambient friend crowd: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475/friend_crowd_small.wav?v=1639761145228
¹⁸ High ambient friend crowd: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475/friend_crowd_medium.mp3?v=1639761151401
¹⁹ Friend footstep: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/fstep.mp3?v=1643754264523
²⁰ Stranger footstep: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/sstep.mp3?v=1643754271035
²¹ Collision: https://cdn.glitch.global/11db17e0-58dc-4f24-8390-bb2e597d1475/mixkit-quest-game-heavy-stomp-v-3049.wav?v=1644177861935
²² User footstep: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475%2F571698__rainial-co__cotton-thud-7.wav?v=1636411987805
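The customization model of Section 3.4 can be sketched as a per-situation table: for each bubble and relationship (friend or stranger), the user selects any combination of the three audio alternatives, including none. The code below is our own hedged sketch (all names are ours), ending with the paper's example of verbal notifications for friends but earcons for strangers in the Social Bubble, and both alternatives for everyone in the Conversation Bubble:

```javascript
const ALTERNATIVES = ["earcon", "verbal", "soundEffect"];

// Build an empty profile: no feedback selected for any situation.
function makeProfile() {
  const profile = {};
  for (const bubble of ["intimate", "conversation", "social"]) {
    profile[bubble] = { friend: new Set(), stranger: new Set() };
  }
  return profile;
}

// Return the audio alternatives to play for one avatar event.
function feedbackFor(profile, bubble, isFriend) {
  const chosen = profile[bubble][isFriend ? "friend" : "stranger"];
  return ALTERNATIVES.filter((a) => chosen.has(a));
}

// The example configuration described in Section 3.4.
const profile = makeProfile();
profile.social.friend.add("verbal");
profile.social.stranger.add("earcon");
for (const who of ["friend", "stranger"]) {
  profile.conversation[who].add("earcon");
  profile.conversation[who].add("verbal");
}
```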
shift key. A ticking sound²³ plays for each 90-degree turn, giving the PVI a sense of how much they have turned.

Figure 3: Two tasks: (a) navigation, (b) conversation.

VR Environment Implementation and Setup. We implemented the VR environment and all the features with A-Frame [71], a web framework that generates 3D spaces via HTML and JavaScript. We hosted our prototype on Glitch [16], a browser-based platform that allows users to host and share applications as well as their source code, to enable a remote setup for the study. Participants and researchers could go to the same web address to join the same VR environment. Once connected, researchers could input unique command sequences via a keyboard to adjust the participant's client, such as changing the audio alternatives in VRBubble or moving the participant's avatar to a different VR scenario. Participants' behavioral data (e.g., their positions at different times) were then logged and output to the researchers' browser console.

We asked participants to share their screen with audio via Zoom to ensure that the researchers' commands were properly reflected on the participants' end and to confirm that the audio feedback was properly generated there.

4.1.3 Procedure. Our study consisted of a single session that lasted 2 hours for each participant. We conducted the study remotely via a Zoom video call. An initial interview first asked about participants' prior experience with VR, especially social VR. We then asked about their social experiences in real life.

We then asked participants to experience our VR environment with the awareness features. Participants were requested to use earphones for the study, since we generated spatial audio. We sent a URL to our VR environment through email before the study. If any participant had difficulty accessing their email, we sent the link again through the Zoom chat or read it aloud to them.

After the participant successfully joined the VR environment, we evaluated participants' experience and performance in two contexts: (1) a navigation context and (2) a conversation context. In each context, participants completed navigation or conversation tasks with two awareness features: VRBubble and the baseline (i.e., the audio beacon). We counterbalanced the awareness feature and the context across all participants. We elaborate the details of the remaining study in three phases: tutorial, navigation, and conversation.

Tutorial: Introducing the Features. We conducted a tutorial session to teach participants how to use our VR prototypes. We first introduced the interactions for basic movements (arrow keys to move and shift + left/right arrow key to turn, as explained in Section 4.1.2). We then rendered an empty VR space and encouraged participants to move around until they were familiar with the controls and the audio feedback produced by the avatar movement. We then introduced VRBubble and the baseline.

In the tutorial for VRBubble, we introduced each audio alternative one by one. For each alternative, we demonstrated the audio feedback for both friend and stranger avatars in each bubble. We then rendered avatars in the VR space and asked participants to freely explore the environment with this audio alternative until they felt fully familiar with it. During the exploration, participants were asked to think aloud, talking about their experiences with this audio design, including whether they liked the design and why. We counterbalanced the order of the audio alternatives using a Latin square [48]. We used a similar method to introduce the baseline.

Navigation. We evaluated the effectiveness of VRBubble in the navigation context. Each participant conducted 15 trials of navigation tasks along the routes in Figure 3a under different conditions. We randomly selected a route for each trial. To enable participants to follow the route and arrive at the destination, we generated audio-based verbal instructions: participants heard "turn left (or right)" when they needed to turn, and "arrived at destination" when they reached the end of the route. For each trial, the participant was first moved to the start point of a route. When the researcher announced "Start," they navigated by following the verbal instructions until arriving at the destination. We asked participants to complete each navigation task as quickly as possible.

Participants first conducted three trials of basic navigation tasks without any avatars around. These trials helped participants get familiar with the navigation tasks. We then asked participants to customize their audio feedback in VRBubble for the navigation task. To guide participants through the customization, we asked them to choose and combine audio feedback for the five following situations: (1) when a friend enters or leaves the Social Bubble; (2) when a stranger enters or leaves the Social Bubble; (3) when a stranger enters or leaves the Conversation Bubble; (4) when a friend enters or leaves the Conversation Bubble; and (5) when an avatar collides with the user in the Intimate Bubble. For each situation, participants could select any combination of audio feedback, from multiple alternatives to none at all (Figure 7). During the customization process, the researcher adjusted the audio alternatives in real time for the participant to experience the selected feedback and make immediate corrections. After customizing VRBubble, we asked about the reasons behind their choices and any suggestions they had for customization. Once the customization phase was completed, the researcher set the audio feedback in VRBubble to the participant's preferred combination.

After the customization, participants performed another 12 trials of navigation tasks with avatars around. Participants were asked to complete two tasks at the same time: a primary navigation task (i.e., navigating to the destination following the verbal instructions) and a secondary avatar awareness task (i.e., memorizing how many avatars they passed by and who they were). We asked participants to reach the destination as quickly as possible, but also to try their best to perceive and memorize the surrounding avatars. Before the navigation task, we informed participants ahead of time that we would ask them to report the avatar number and names they remembered after each trial. However, if the task was

²³ Turn audio: https://cdn.glitch.me/11db17e0-58dc-4f24-8390-bb2e597d1475%2F268108__nenadsimic__button-tick.wav?v=1636008594574
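The basic movement controls described above (one foot per arrow-key press with a footstep sound; shift + left/right for a 90-degree turn with a ticking sound) can be sketched as a key handler. This is a simplified illustration of ours, not the authors' A-Frame code; for brevity it ignores heading-relative movement and only records which sound the real system would play:

```javascript
// Advance the avatar state for one key event (illustrative sketch).
// state: { x, z, heading } with positions in virtual feet and heading
// in degrees.
function step(state, key, shiftHeld) {
  if (shiftHeld && (key === "ArrowLeft" || key === "ArrowRight")) {
    const delta = key === "ArrowLeft" ? -90 : 90;        // 90-degree turn
    const heading = (state.heading + delta + 360) % 360;
    return { ...state, heading, sound: "tick" };         // ticking sound
  }
  const moves = {
    ArrowUp: [0, 1], ArrowDown: [0, -1],                 // one foot per press
    ArrowLeft: [-1, 0], ArrowRight: [1, 0],
  };
  if (!(key in moves)) return { ...state, sound: null };
  const [dx, dz] = moves[key];
  return { ...state, x: state.x + dx, z: state.z + dz, sound: "footstep" };
}
```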
too difficult, we asked participants to prioritize the primary navigation task.

Participants completed the above tasks in two awareness conditions: using VRBubble with their customized audio alternatives, and using the baseline. Each condition included six navigation trials. To further understand the effectiveness of VRBubble in situations of different avatar complexity, we generated avatars at two amount levels: Low Amount (1-5 avatars) and High Amount (6-10 avatars). We decided randomly whether a trial had a low or high amount of avatars but ensured that participants experienced each avatar amount level for three trials per awareness condition. For each trial, we randomly generated avatars within the corresponding avatar amount range, and the avatars were randomly distributed along the navigation route. We guaranteed that all avatars were within the social distance (12 feet) of the route, so that all avatars were detectable by VRBubble. After each navigation trial, we asked participants to report their perceived avatar information, including the total avatar number, the number of friend avatars, the number of stranger avatars, and the avatar names they heard. We timed participants' navigation in all trials.

At the end, we asked about participants' general experience with VRBubble and the baseline. Participants rated the effectiveness, distraction, and immersion of both features via 5-point Likert scale scores. Participants also discussed their feedback and suggestions for improving VRBubble in the navigation context.

Conversation. The procedure of the conversation task was similar to that of the navigation task. This phase of the study was conducted in the conversation scenario (Figure 3b). Participants conducted three trials of the conversation task under different conditions. In each trial, the participant faced an avatar who asked 10 yes/no questions. Participants were asked to answer all questions as quickly and accurately as possible. A researcher on the team acted as the conversation avatar and asked the questions. The next question was asked immediately after the participant answered the prior one.

Participants first conducted one "dry-run" trial, answering 10 questions without any distraction to get familiar with the task. We then asked them to customize the VRBubble feedback again for the conversation context. Participants then performed two trials of conversation tasks with avatars around: one trial used the customized VRBubble and the other used the baseline. In each trial, participants conducted two tasks simultaneously: a primary conversation task (i.e., answering 10 yes/no questions) and a secondary awareness task (i.e., perceiving and memorizing surrounding avatars). The primary task was to be prioritized if the combined task was too difficult. After each trial, we asked participants to recall the avatar information, including the total avatar number, the number of friends and strangers, and the avatar names.

To balance the avatar complexity across trials, we pre-defined two similar groups of avatar movement dynamics. In both groups, eight avatars spawned in and moved around the user over a duration of 70 seconds during the conversation. Each group had four friend avatars and four stranger avatars; however, the order of the avatars and the times at which they spawned differed between the two groups. We counterbalanced the avatar groups with the awareness feature used across participants.

We pre-planned 30 yes/no questions. All were easy-to-answer questions about a specific fact about the participant, to minimize variance caused by participants' literacy and cognitive ability. Example questions included "Have you ever been to Italy?" and "Are you a coffee drinker?" We randomly pooled 10 questions for each trial and made sure not to repeat any question across the study for each participant. We measured participants' response time for each question.

After all trials, we asked all 30 yes/no questions again without any distractions and collected participants' answers as the correct answers. Participants then assessed the effectiveness, distraction, and immersion of VRBubble and the baseline for the conversation context via 5-point Likert scale scores. Finally, participants gave qualitative feedback and suggestions for VRBubble in the conversation context.

We ended the study by discussing with participants their general experiences with VRBubble compared to the baseline, as well as their desired awareness technologies for different contexts in social VR in the future.

4.1.4 Analysis. We detail our analysis methods for both the quantitative and qualitative data in our study.

Navigation Performance. We first analyzed the impact of VRBubble on participants' performance in the navigation task, including navigation time and the error rate when recalling the number of avatars they passed. We had three within-subject factors: Feature (two levels: VRBubble, Baseline), which defined the avatar awareness feature participants used; AvatarAmount (Low: 1-5, High: 6-10), which specified the amount of avatars in the space; and Trial (1-3), which defined each navigation task in a specific condition. We had two measures: NavigationTime, the time in seconds taken by the participant to reach the end of the navigation path, and AvatarErrorRate, the ratio of the difference between the reported avatar number and the actual number to the actual number of avatars. We also added a between-subject factor, Order (VRBubble-Baseline, Baseline-VRBubble), to our model for counterbalance validation.

The Shapiro-Wilk test showed that both NavigationTime (W = .867, p < .001) and AvatarErrorRate (W = .898, p < .001) deviated significantly from a normal distribution, so we used the Aligned Rank Transform (ART) for nonparametric factorial ANOVAs [28]²⁴ to model our data. We found no significant effect of Order (F(1,10) = 2.296, p = .161) on navigation time and no significant effect of the interaction between Order and the other factors on navigation time. Similarly, we found no significant effect of Order (F(1,10) = .0005, p = .982) on avatar error rate and no significant effect of the interaction between Order and the other factors on error rate.

Conversation Performance. We then analyzed the impact of VRBubble on participants' performance in the conversation task, including the response time per question, the accuracy of answering questions, and the error rate when recalling the number of passed-by avatars during the conversation. We had two within-subject factors: Feature (VRBubble, Baseline) and AvatarGroup (Group1, Group2), which indicated which pre-defined set of avatar dynamics was used during each conversation trial. We had three measures: AvatarErrorRate; ResponseTime, the mean response time for the

²⁴ ART is a non-parametric approach to factorial ANOVA or linear mixed effect models.
VRBubble ASSETS ’22, October 23–26, 2022, Athens, Greece

participant to answer a question, and IncorrectAnswers—the number of questions that participants answered inconsistently compared to the standard answers, or questions that needed to be repeated. A between-subject factor, Order (VRBubble-Baseline, Baseline-VRBubble), was added to validate our counterbalancing. The Shapiro-Wilk test showed that ResponseTime (W = .886, p = .011) and IncorrectAnswers (W = .654, p < .001) were non-normally distributed, while AvatarErrorRate (W = .972, p = .708) followed a normal distribution. We thus used ART to model ResponseTime and IncorrectAnswers, and used ANOVA [14] for AvatarErrorRate. We found no significant effect of Order on the response time (F(1,8) = .293, p = .603) or questions answered incorrectly (F(1,8) = .802, p = .397), and no significant effect of the interactions between Order and other factors on response time or questions answered incorrectly. Similarly, we found no significant effects of Order (F(1,8) = .702, p = .426) or its interactions on the error rate of recalling avatars.

Moreover, we found no significant effect of AvatarGroup or its interactions on ResponseTime (F(1,8) = .026, p = .876), IncorrectAnswers (F(1,8) = .82, p = .392), or AvatarErrorRate (F(1,8) = .157, p = .097). These results confirmed that the two pre-defined sets of avatars had similar complexity.

Post hoc Analysis. Based on the results of the ART and ANOVA models, we conducted post hoc tests to further investigate the relationship between the different levels of significant factors. Specifically, we used the paired t-test for normally distributed data, and the pairwise Wilcoxon signed-rank test for non-normally distributed data. We corrected the p-value threshold with Bonferroni Correction.

Qualitative Analysis. We video recorded the whole study and transcribed participants' verbal feedback and responses to the interview questions with an automatic transcription service, Otter.ai. Researchers on the team also read through the transcripts and corrected the automatic transcription errors. We analyzed the transcripts using the qualitative analysis method described by Saldaña [62]. Two researchers first independently coded the first four participants' transcripts. They then discussed and generated a codebook upon agreement. One researcher then followed the codebook and coded the rest of the data. When a new code emerged, the two researchers discussed again and updated the codebook. The codes were then organized and categorized into themes using an affinity diagram.

4.2 Results
We report the impact of VRBubble on participants' performance in different contexts, including both a navigation and a conversation task. We also report participants' experiences with different audio alternatives and their preferences when customizing VRBubble for different contexts.

4.2.1 VRBubble in Navigation. For the navigation task, we found that VRBubble provided the user with more detailed and accurate information about surrounding avatars, compared to the baseline. However, we also noticed that VRBubble could be more distracting than the baseline. We report the detailed results below.

Avatar Amount Estimation. We analyzed the effectiveness of VRBubble on enhancing PVI's awareness of the amount of avatars nearby in a navigation context. An ART analysis showed a significant effect of the awareness feature on participants' error rate when estimating the amount of avatars they passed by (F(1,110) = 33.54, p < .001). We then conducted a post hoc paired Wilcoxon signed-rank test and found that participants had a significantly lower error rate when using VRBubble than the baseline (W = 1566.5, p < .001; error rate with VRBubble: Mean = .239, SD = .26; baseline: Mean = .465, SD = .323). This result demonstrated that VRBubble significantly improved PVI's awareness of the surrounding avatar amount.

We examined the error rate more closely for environments with different avatar complexity (AvatarAmount): a low amount of avatars vs. a high amount of avatars. A post hoc Wilcoxon signed-rank test showed that VRBubble significantly decreased participants' error rate of avatar number estimation regardless of whether the avatar amount was high (W = 585, p < .001) or low (W = 259, p = .01 < .025 with Bonferroni Correction). This indicated that the effectiveness of VRBubble was consistent across environments with different avatar complexity.

Participants' qualitative feedback also explained the effectiveness of VRBubble. Ten participants expressed positive sentiment towards VRBubble, emphasizing its ability to provide distinguishable audio feedback for each avatar, with seven noting VRBubble created a better sense of distance for avatars via the different bubbles. In contrast, participants found the audio beacon baseline hard to use since all avatars shared the same beacon sound and it was difficult to distinguish different avatars, especially when avatars were close together. P6 "had a hard time differentiating if it was the beacon for the additional avatar or the same avatar. I had a hard time distinguishing how many people or avatars there were."

Avatar Identification. Compared to the baseline, VRBubble enabled participants to collect more detailed information about surrounding avatars, including avatar names and their relationship to the user.

When customizing VRBubble, nine participants selected the verbal notification that reported avatar names. With this feature, participants in general recalled the names of passed-by avatars accurately, with a mean accuracy of 0.715 (SD = .305). Six participants expressly liked VRBubble's capacity to provide information about each individual avatar. For example, P3 emphasized the importance of name identification in engaging in social interactions: "[The name] would potentially determine whether or not I would want to go and speak to them ... or even somebody I want to avoid ... I can see that having been named there is a useful feature." However, we noticed that the avatar identification accuracy of some participants dropped to below 0.5 in more crowded avatar environments. This was because the verbal notifications from different avatars would overlap and become confusing when the avatar amount became high. As P5 noted, "If it's not too many people happening at once then I can feel it out, but then when so many people are happening, then I can't really figure it out."

Ten participants' VRBubble customization supported friend identification (i.e., an earcon or real-world sound effect for the Conversation Bubble). Most participants distinguished friends and strangers accurately, with a mean accuracy of 0.658 for friends (SD = .367) and 0.656 for strangers (SD = .376). Ten participants agreed that discerning between friends and strangers was important for them
ASSETS ’22, October 23–26, 2022, Athens, Greece T. Ji et al.
to decide what social actions to take. As P2 said, "I might want to ask a stranger what time it is. But if I hear friendly footsteps coming towards me, I want to know who's approaching and I can strike up conversation." However, P4 was an outlier who consistently reported many more avatars than the actual friend and stranger numbers in the space. This was because P4 used the abstract earcons for the Conversation Bubble and found it difficult to discern the pitch changes between friends and strangers.

The two participants (P5, P10) who chose the verbal notification for the Conversation Bubble thus could not distinguish friends and strangers. They customized VRBubble based on their real-life experiences, as in real life they would recognize whether someone was a friend or stranger by name. We envisioned that users would be able to distinguish friends and strangers via verbally reported names in a real social VR setting after long-term use.

Impact on Navigation Time. We compared the effects of the awareness feature on participants' navigation time. With an ART model, we found a significant effect of the awareness feature on navigation time (F(1) = 36.708, p < .001). A post hoc paired Wilcoxon signed-rank test with correction showed that participants navigated significantly slower when using VRBubble than when using the beacon baseline (W = 482, p < .001; navigation time with VRBubble: Mean = 36.507, SD = 14.897; baseline: Mean = 31.293, SD = 11.033). The result indicated that VRBubble distracted participants more than the baseline in the navigation task.

We further investigated the effect of VRBubble under different avatar complexity (a low amount of avatars vs. a high amount). We found a significant effect of the interaction between Feature and AvatarAmount on participants' navigation time (F(1) = 13.676, p < .001). Using a post hoc Wilcoxon signed-rank test with Bonferroni Correction, we found that there was no significant difference in time between VRBubble and the baseline when the avatar number was low (W = 209, p = .052), but participants walked significantly slower with VRBubble than the baseline when the avatar number was high (W = 66, p < .001). This indicated that VRBubble was only more distracting when the environments became crowded.

The distraction of VRBubble could be caused by the more diverse audio feedback provided by VRBubble, where participants spent a longer time associating the sounds with different events. Six participants mentioned being distracted and slowed down since they tried to remember the meaning of the different sounds. As P5 explained, "If I go too fast, I might miss it ... I have to pay attention on top of all the other avatars leaving, coming in, and coming out."

Another reason that may have caused slower navigation was participants' interest. Since VRBubble enabled participants to receive more information about the surrounding avatars, it aroused participants' interest to explore the environment during the navigation. In our study, while we emphasized that navigation was the primary task, we still observed that some participants (e.g., P2, P5, P12) stopped, looked around, and even moved their avatar left and right to explore nearby avatars with VRBubble. In contrast, since the baseline did not provide much information, participants mostly focused on the navigation.

4.2.2 VRBubble in Conversation. Unlike in the navigation task, VRBubble generally did not perform significantly differently from the baseline in the conversation context. We report the detailed measures below.

Avatar Amount Estimation. We analyzed the effectiveness of VRBubble on enhancing PVI's awareness of the avatar amount nearby in a conversation context. An ANOVA analysis showed no significant effect of the awareness feature on participants' error rate when estimating the amount of avatars they passed by (F(1,8) = 1.431, p = .266). Compared to the navigation task, we found that participants made more errors when estimating the avatar number with VRBubble in the conversation task (Mean = .305, SD = .191). The error rates were especially high when there were more avatars around. Interestingly, we noticed that the avatar number reported by participants never exceeded five across all trials, even though there were usually more avatars present. Specifically, eight participants encountered more than five avatars in the conversation task, and the highest avatar number was eight. This might indicate a limit on how much audio feedback can be peripherally processed (or an upper bound on the number of avatars that can be perceived) while focusing on an audio-centered task such as conversation. As P4 noted, "Because you had to answer the question, so you couldn't put all your brainpower into [the surrounding avatars]. I just don't think the human brain is segmented enough to handle both those tasks at the same time."

Avatar Identification. Similar to the navigation task, participants appreciated VRBubble's ability to identify surrounding avatars in real time since it enabled them to dynamically adapt to the social environments. Ten participants indicated that knowing who was in the immediate surroundings had an impact on their social behaviours and what they would say during a conversation.

Seven participants customized VRBubble to allow for name identification. However, participants' avatar recognition accuracy was lower (Mean = .655, SD = .282) compared to the navigation task (Mean = .715, SD = .305). Moreover, ten participants customized VRBubble to distinguish friends from strangers. Similarly, they were less accurate compared to those in the navigation task, both when identifying friends (Mean = .567, SD = .323) and when identifying strangers (Mean = .467, SD = .302). The lower accuracy could be attributed to the more attention-demanding nature of the conversation task, and the direct conflict between verbal notifications and the conversation.

Impact on Conversation. We investigated the impact of VRBubble on participants' ability to answer questions in a conversation. With an ART model, we found no significant effect of the awareness feature on the number of questions answered incorrectly (F(1,8) = .647, p = .444). Most participants answered all questions correctly with both features. Four participants (P2, P4, P5, P12) gave incorrect answers while using VRBubble, and four participants (P4, P7, P10, P11) did so with the baseline. P2 was the only participant who answered two questions incorrectly, while everyone else answered one question incorrectly.

Moreover, we compared the effects of VRBubble and the baseline on participants' response time to questions in the conversation. With an ART model, we found no significant effect of Feature on response time (F(1,8) = .206, p = .662). These results showed that VRBubble and the baseline were generally at the same distraction level in the conversation task.
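The quantitative comparisons above follow one recipe: compute an error rate per trial, check normality (Shapiro-Wilk), then run a parametric or rank-based paired test at a Bonferroni-corrected threshold. The two simplest pieces can be sketched in stdlib Python; the numbers below are made up for illustration and are not study data, and the Shapiro-Wilk check and ART models themselves require a statistics package (e.g., SciPy, or ARTool in R) and are omitted here.

```python
def avatar_error_rate(reported, actual):
    # AvatarErrorRate as the ratio of the difference between the
    # reported number and the actual number to the actual number.
    return abs(reported - actual) / actual

def wilcoxon_statistic(xs, ys):
    # Paired Wilcoxon signed-rank statistic: rank the absolute
    # differences (dropping zero differences, averaging tied ranks)
    # and sum the ranks of the positive differences.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    ranked = sorted(abs(d) for d in diffs)
    def rank(v):
        positions = [i + 1 for i, r in enumerate(ranked) if r == v]
        return sum(positions) / len(positions)
    return sum(rank(abs(d)) for d in diffs if d > 0)

# Hypothetical per-trial recalled counts (true count 8) under each feature.
with_vrbubble = [avatar_error_rate(r, 8) for r in (7, 8, 6, 7)]
with_baseline = [avatar_error_rate(r, 8) for r in (5, 4, 6, 3)]
w_plus = wilcoxon_statistic(with_vrbubble, with_baseline)

# Two post hoc comparisons (low vs. high avatar amount) under
# Bonferroni correction are each tested at .05 / 2 = .025.
bonferroni_alpha = 0.05 / 2
```

In practice one would hand the error rates to a library routine (e.g., SciPy's `wilcoxon`) for the p-value; the statistic above only illustrates what is being ranked and summed.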
Figure 4: Likert scale score averages for each task: navigation task (nav) and conversation task (conv).

4.2.3 Perceived Experience with VRBubble. We compared participants' subjective experience with VRBubble and the baseline along three dimensions: effectiveness, distraction, and immersion. Figure 4 shows the mean scores given by participants for each dimension. Participants perceived VRBubble (Mean = 3.917, SD = .9) to be more effective than the baseline (Mean = 2.667, SD = 1.231) at enhancing avatar awareness in the navigation task, while perceiving VRBubble (Mean = 3.167, SD = 1.403) to be slightly less effective than the baseline (Mean = 3.25, SD = 1.485) for the conversation task. These scores matched our expectations given that the conversation task was more attention-demanding and participants had less cognitive capacity to use VRBubble.

Participants perceived more immersion in the virtual space while using VRBubble (Mean = 4.333, SD = .888) than the baseline (Mean = 2.667, SD = 1.557) during the navigation task. They also felt more immersed during the conversation task when using VRBubble (Mean = 3.583, SD = 1.443) than when using the baseline (Mean = 3.083, SD = 1.505). Six participants found the audio beacon baseline to be annoying or distracting, which diminished their immersive experience. In contrast, VRBubble provided a better sense of presence by dividing the social space and associating the avatar dynamics with social indications. As P3 mentioned, "The bubble system was much more immersive, just because it gives you a couple of clear lines between when you're entering and leaving someone's space. The beacon is just super impersonal."

Interestingly, participants perceived VRBubble (Mean = 1.75, SD = .754) to be less distracting than the baseline (Mean = 2.583, SD = 1.379) for the navigation task, despite taking significantly more time to complete the task with VRBubble. This further confirmed that the slower navigation could be caused by participants' increased interest in exploring the surroundings with VRBubble rather than by distraction. For the conversation task, participants perceived VRBubble (Mean = 3.167, SD = 1.467) to be slightly more distracting than the baseline (Mean = 3.083, SD = 1.443). Five participants expressed that the amount of information from VRBubble caused distractions during conversation, suggesting that simpler or reduced audio was preferred for audio-intensive tasks.

4.2.4 Experience with VRBubble and its Audio Alternatives. All participants agreed that the Bubble concept based on social distances was novel and effective, and expressed favor that VRBubble provided them with more information than the baseline did. Seven participants liked that VRBubble separated the space and conveyed a sense of distance to the user. P3 noted, "I like the three different distances. If I walk into a social situation, I wish I had a bubble radar just to know when people are in my immediate vicinity versus people a little ways away. It is just nice to have a concept of relative [distance]." Participants also liked the sense of avatar presence and movement that was conveyed by the feedback from the bubbles. We elaborate on participants' experiences with the different audio alternatives below.

Earcon. Four participants liked the brevity and non-intrusiveness of the earcons. Unlike the continuous audio feedback in the baseline, participants liked that the earcons discretely notified them every time an avatar entered a bubble. For example, P8 had difficulty with discerning spatial audio. She reported, "I like how concrete [the earcon] is. I have a really hard time with spatial awareness. So, without necessarily having to keep track of the space because there are those different [earcons], it's very helpful." Three participants noted that earcons for each avatar event could be easily noticed and distinguished, which could be used to reduce audio overloading in crowded environments. As P4 indicated, "To signify the person is leaving, I'd want the [earcon] sound that shows the person's going out. Because then you don't have too many verbal cues, you know, to not listen too much."

The main deterrent of earcons was the steep learning curve. Six participants found it hard to recognize different earcon sounds. P4 pointed out that the meanings of earcons would have to become second nature in order to be effective in a dynamic social environment: "You don't want to have to think about it, you want to immediately react." Five participants had issues with discerning the pitch difference between friend and stranger earcons. As P1 said, "there were more and more unexpected [earcons] that threw me off and, being not as familiar with the tones, second guessing if [the earcon] was higher [pitched] or lower, entering or exiting." P4 suggested that instead of using different pitches, completely different earcons should be used for different types of avatars.

Verbal Notification. Nine participants liked that the verbal notifications were informative and easy to understand. They expressed a desire for sufficient information about their surroundings to adjust their behaviors, especially in the context of conversation. P3 emphasized that distinguishing between friends and strangers can help protect his privacy: "there may be something that you're willing to share with a friend, but not necessarily with a complete stranger."

The drawback of verbal notifications was their verbosity, which caused more audio overlapping between avatars. Nine participants mentioned this problem in the study. Four participants also noted that hearing a verbal notification immediately drew their attention, which could be distracting in certain scenarios, such as conversation. Due to these issues, three participants cared only about verbal notifications for important people, such as friends, rather than being verbally informed about every avatar.

Real-world Sound Effect. Six participants liked the footstep and crowd noises since they were ambient and non-distracting. The audio had no sudden alerts and was quiet enough to tune out when focusing on the primary task. The crowd sounds were also immersive and helped convey a general sense of the social
environment, such as whether the user was at an informal party or a business meeting. For example, P4 preferred using real-world sound effects in the conversation task for an immersive experience. As he expressed, "I was not gearing [my customization] toward the best [avatar recognition] results but [was] gearing it toward getting a real feel for it." P3 even suggested providing "different types of ambient background noises that meet the theme of whatever rooms." Two participants (P3, P6) also felt that the two levels of ambient crowd sound with different loudness were sufficient, and any further levels might be too loud or unable to contribute any additional useful information. Moreover, two participants (P1, P2) liked the spatial footstep audio in the Conversation Bubble, which helped them constantly track important avatars in the conversational distance.

However, six participants disliked the constant real-world sound effects since it was hard for them to distinguish each avatar via the general crowd sounds. P6 thus suggested combining the crowd sound with an overall description of the avatar number. Two participants (P7, P10) had difficulty distinguishing friend and stranger avatars via the different footstep sounds, especially with other louder audio in the environment.

4.2.5 Customization of Audio Alternatives. All participants appreciated the flexibility of customizing the audio feedback in VRBubble. For example, P8 had an auditory processing disorder that caused large quantities of sound to be difficult to process. As she mentioned, "[I like] the idea that you can customize how you're going to handle an event or have part of an event made accessible to you via sound. It is not something most people think about and I really appreciate it."

Participants customized the audio alternatives for different social tasks, avatar roles, and bubbles (Figure 7). Figure 5 shows the distribution of audio alternatives chosen for the navigation task, and Figure 6 shows the distribution for the conversation task. While no participants selected exactly the same combinations, we saw some patterns across participants that we report below.

Figure 5: VRBubble customization for the navigation task: side-by-side comparison between the number of times that each audio alternative was selected for friends and strangers, at each bubble. The Intimate Bubble is shared between friends and strangers.

Figure 6: VRBubble customization for the conversation task: side-by-side comparison between the number of times each audio alternative is selected for friends and strangers, at each bubble. The Intimate Bubble is shared between friends and strangers.

Navigation vs. Conversation Task. In general, we found that verbal notification was most popular for the navigation task, while earcons and real-world sound effects were more popular in the conversation task (except for the Intimate Bubble), as shown between Figure 5 and Figure 6. While seven participants used the same combination for both navigation and conversation tasks, five participants changed their customization. We found that participants tended to simplify their audio combination in the conversation task, and all changes happened in the Social Bubble. The changes included removing an audio alternative (P4 and P11 removed the verbal notification, P5 removed the real-world sound effect for strangers, and P7 removed earcons for friends), and changing verbal notifications to briefer earcons (P3 for strangers, P11 for friends). This pattern suggested that users preferred shorter and non-verbal feedback during audio-focused tasks like conversation, whereas more informative feedback was preferred for exploration-based tasks like navigation.

Friends vs. Strangers. Participants had divergent preferences when distinguishing friends and strangers. Some participants (P2, P5, P7, P8) chose more detailed information for friends and felt that stranger information was less important, especially in the Social Bubble. P2 and P7 did not even want any audio feedback for strangers in the Social Bubble. On the other hand, some participants (P3, P11) wanted to know more about strangers: P3 selected earcons for friends and the more informative verbal notification for strangers in the Social Bubble in the navigation task, while P11 added earcons for the strangers to alert himself. They explained that in a real use case, users were likely to already know enough information about friends and would only need more detailed information about strangers. As P3 described, "The verbal would be useful because that might indicate to me it's a stranger I don't know. And this is their name."

Patterns across Bubbles. Most participants preferred the real-world sound effects for the Intimate Bubble, while some participants selected the abstract earcons. Participants mostly made the decision based on which audio feedback they found more pleasant. Interestingly, P5 was the only participant who expressed interest in knowing who they bumped into, thus choosing the verbal notification. This may suggest that in social VR, avatar collision was an event that PVI wanted to avoid but not important enough for PVI to care about the details.
Figure 7: Chart of participants’ VRBubble customization in navigation and conversation tasks.
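The customizations charted in Figure 7 are effectively a lookup table from (task, bubble, avatar role) to a set of audio alternatives. A minimal sketch of such a table follows; the entries are hypothetical, loosely echoing the reported patterns (verbal feedback favored in navigation, briefer non-verbal feedback in conversation) rather than any participant's actual settings:

```python
EARCON, VERBAL, SOUND_FX = "earcon", "verbal notification", "real-world sound"

# Hypothetical customization profile, keyed by (task, bubble, role).
# Illustrative only; not actual study data.
profile = {
    ("navigation", "social", "friend"): [VERBAL],
    ("navigation", "social", "stranger"): [VERBAL],
    ("navigation", "conversation", "friend"): [VERBAL, SOUND_FX],
    ("navigation", "conversation", "stranger"): [EARCON],
    ("navigation", "intimate", "any"): [SOUND_FX],
    ("conversation", "social", "friend"): [EARCON],
    ("conversation", "social", "stranger"): [],  # some users muted strangers here
    ("conversation", "conversation", "friend"): [SOUND_FX],
    ("conversation", "conversation", "stranger"): [EARCON],
    ("conversation", "intimate", "any"): [SOUND_FX],
}

def feedback_for(task, bubble, role):
    # The Intimate Bubble is shared between friends and strangers.
    key = (task, bubble, "any" if bubble == "intimate" else role)
    return profile.get(key, [])
```

Keeping one profile per task also accommodates the simplification pattern reported above, where users pared their Social Bubble feedback down for conversation.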
For the Social and Conversation Bubbles, we found that participants had different preferences for different bubbles without clear patterns, and only P10 used the verbal notification for both bubbles.

Audio Combinations. The most popular audio combination was verbal notifications with real-world sound effects, which was adopted by four participants (P4, P5, and P9 used the combination for the Social Bubble; P2 used it at the Conversation Bubble). Two participants (P7, P11) also combined earcons with verbal notifications. Participants used such combinations to add immersion to the detailed information provided by verbal notifications. As P5 described, "Because that will add more reality to what I'm doing, instead of just a boring computer thing." Only P12 used a combination of earcons and real-world footsteps. As he explained, "The abstract [earcon] is my favorite and that's why I chose most of the options here. But I thought it would be helpful to have the footsteps as well just so it's a little bit different [from just the earcon]."

5 DISCUSSION
With VRBubble, we contributed the first VR technique that leveraged the social distances (i.e., Bubbles) in Hall's proxemic theory [18] to enhance PVI's peripheral awareness of avatars in a complex and dynamic social VR context. VRBubble enabled user customization by providing three spatial audio alternatives—earcons, verbal notifications, and real-world sound effects—and allowing users to select and combine their preferred audio feedback for different bubbles, avatars, and social contexts.

Our evaluation with 12 PVI demonstrated the effectiveness of VRBubble: it significantly enhanced participants' awareness of the amount of avatars they passed by in a navigation task, and enabled participants to identify most avatars, including their names and relationship to the user, in both navigation and conversation tasks. However, we found that VRBubble could be distracting, especially in crowded environments with a high amount of avatars. Our research was a first step towards the general accessibility of dynamic social VR. Based on our exploration, we discuss the takeaways, design implications, technology generalizability, and the lessons we learnt from our remote study.

5.1 Audio Preferences for Peripheral Awareness
Besides examining VRBubble as a whole VR technique, our research also explored PVI's experiences and preferences in perceiving peripheral information via different audio modalities. Unlike prior work that studied the use of different audio feedback in various primary tasks (e.g., [1, 45, 87]), we focused on using audio to convey peripheral dynamics and investigated the feasibility of different audio modalities in two representative social contexts: navigation and conversation. Our study showed that, compared to a fixed design (e.g., nonadjustable audio beacons), flexible audio customization allows users to effectively control distraction from a given primary task. In our study, five participants changed their audio selection after experiencing a different social task (Table 1), highlighting the need for customization depending on the social context.

Our research identified PVI's customization patterns. First, participants' audio preferences were driven by the cognitive load of the primary task. They preferred a more detailed and descriptive audio modality (e.g., verbal notification) in less attention-demanding tasks, but could only handle brief and distinct audio feedback (e.g., earcons, sound effects) in more attention-demanding and audio-focused tasks. Second, the importance of the peripheral information was another factor that influenced PVI's audio selection. Specifically, more detailed verbal descriptions or combinations of multiple audio alternatives were selected for more important avatars.

Our results also suggested (though this needs further investigation) that people may have an upper limit on their capacity to perceive peripheral information in attention-demanding tasks (e.g., perceiving five avatars at most in a conversation task), so that they had to give up less important peripheral information, such as information on strangers in the Social Bubble. While our research focused on the social VR context, PVI's preferences for audio modalities to receive peripheral information could be applied to broader use cases, such as situational awareness systems for drivers.
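At its core, the bubble mechanism classifies each avatar's distance into one of the proxemic zones and fires feedback only on zone transitions. A sketch of that logic follows; the radii are hypothetical placeholders loosely based on Hall's distances, not the thresholds VRBubble actually used:

```python
# Hypothetical bubble radii in meters; VRBubble's actual values may differ.
INTIMATE_R, CONVERSATION_R, SOCIAL_R = 0.45, 1.2, 3.6

def bubble_of(distance):
    # Map a user-to-avatar distance to the innermost containing bubble.
    if distance <= INTIMATE_R:
        return "intimate"
    if distance <= CONVERSATION_R:
        return "conversation"
    if distance <= SOCIAL_R:
        return "social"
    return None  # outside all bubbles

def transitions(prev_dist, curr_dist):
    # Report (event, bubble) pairs when an avatar crosses a boundary;
    # this is where an earcon, verbal notification, or real-world
    # sound effect for that avatar's role would be triggered.
    before, after = bubble_of(prev_dist), bubble_of(curr_dist)
    if before == after:
        return []
    events = []
    if before is not None:
        events.append(("exit", before))
    if after is not None:
        events.append(("enter", after))
    return events
```

For example, an avatar moving from 2.0 m to 1.0 m away produces an exit from the Social Bubble and an entry into the Conversation Bubble, each of which can be sonified with the user's customized feedback for that bubble.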
5.2 Design Implications
Based on participants' experiences with VRBubble, we distill design implications to inspire more accessible and immersive social VR experiences for PVI.

Context-Adaptive Feedback. Our study indicated that participants preferred different audio modalities for different social contexts. However, social activity is a highly dynamic process where users' context and primary task can change quickly. For example, a user could be navigating, but stop to chat with a passing friend in the next moment. Thus, it could be difficult for PVI to manually adjust their audio combinations every time to match the fast-changing social contexts. Some participants suggested pre-defining multiple sets of audio combinations for different contexts and designing easy ways to toggle between them. As P9 suggested, "[You can have] a walking around mode, and a talking conversation mode. If you exit a conversation and you're walking around, switch back to names." To reduce the effort of active user control, researchers could consider a context-adaptive feedback mechanism, which recognizes users' social context with AI technology and automatically switches to users' preferred feedback based on the current context.

Balance Automatic and Manually-queried Feedback. VRBubble focused on conveying peripheral information, thus providing proactive feedback that was automatically triggered by environmental changes to reduce users' interaction efforts. However, in our study, several participants indicated the desire to actively query avatar information, since they may have missed some prior information or need more detailed information about particular avatars of interest. We acknowledge that both types of feedback have their value. For example, our study showed that while proac-

the users' effort and satisfaction. Online communities for avatar sonification could also be encouraged to provide PVI a place to search and find their preferred audio source, similar to Thingiverse [69], which serves as an online community that provides 3D model resources for makers.

5.3 Generalizability
We discuss the generalizability of our technique from both the platform and the users' perspectives. While implemented and evaluated on a desktop-based VR platform, VRBubble can be easily transferred to HMD-based VR as it focuses on proactive audio design that is not device or platform specific. It can be adapted to other hardware through combination with suitable input techniques (e.g., input via controllers or body/head movements).

Beyond PVI, VRBubble could also be applied for sighted users in social VR. Various real-world presence cues, such as breathing or wind movement, are not conveyed through visual information. Thus, providing sighted users with additional peripheral information via audio feedback could enhance immersion in social VR, as well as ensure users' privacy (i.e., being aware of potential out-of-sight eavesdroppers).

5.4 Evaluation via a Remote Study
We conducted our study remotely due to the restrictions of the pandemic. We summarize our experience and the lessons we learnt from the remote study. First, we found it difficult for participants to set up all the software and hardware required by our study independently. To guide participants through the whole setup process, we connected with the participants 30 minutes ahead of the study via
tive feedback were suitable for important and urgent information,
Zoom, sent them the url to our VR environment, and asked them to
too much feedback can distract and overwhelm the users. Mean-
share screen with us during the setup process. Participants gener-
while, manually-queried feedback can be a complement to provide
ally appreciated our assistance via Zoom and felt the use of Zoom
detailed information on demand and reduce feedback overload. It is
sharing function was efective for the remote study. As P7 noted,
thus important to consider how to balance proactive and manually-
“You’ve just proven a feature not in virtual reality right now—feature
queried feedback to optimize users’ experience in social VR, espe-
sharing. You just showed how important that is for someone with a
cially in crowded virtual environments or cognitively heavy tasks.
disability because accessibility [functionality] doesn’t always work.”
For example, proactive notifcations can be used for important and
Second, it is hard to guarantee the consistency of participants’
nearby avatars, and manually-queried feedback can serve to provide
experience in a remote study, depending on each individual’s de-
additional information based on users’ interest. Additionally, users
vices and environments. For example, they may receive diferent
should also have the option to manually repeat any information
audio quality due to the diferent headphones, and some of them
provided by proactive feedback.
may experience feedback delay due to unstable Internet access. For
Enable Third-Party Audio Design. While providing difer-
this study, we tried to ensure a certain level of consistency by check-
ent audio alternatives, VRBubble provided only pre-defned audio
ing that participants had the pre-requisite equipment, including a
efects, especially for earcons and real-world sound efects. Inter-
keyboard, a pair of headphones that supports spatial audio, and a
estingly, our study indicated that participants had distinct hearing
Microsoft Edge browser installed to access our VR environment.
abilities, audio preferences, and audio aesthetic. For example, some
Participants shared their screen and audio via Zoom, so that we can
participants (P2, P8) found the real-world footstep more pleasant
check whether our feature was functioning properly in real time.
to hear, while some (P4, P9) preferred the more abstract earcons.
However, the audio share feature in Zoom did not support spatial
Even for the same audio modality, participants had diferent pref-
audio, which made it impossible to fully confrm users’ experience.
erences. Some (P3, P6, P11, P12) liked the current earcon design,
In the future, how to setup a remote study platform and enable
while some (P1, P4, P5, P9, P10) found them difcult to distinguish.
researchers to easily control and confrm users’ experience is a vital
To better fulfll the users’ needs, one solution could be allowing
research direction to explore.
users to design and upload their own audio source to represent
particular avatars. However, designing and generating audio source
5.5 Limitations and Future Work
could be technically challenging, especially for people with visual
impairments. It is thus important to consider the trade-of between Our research has limitations. Since our study focused on the design
and evaluation of the audio feedback (i.e., the output) in VRBubble,
and evaluation of the audio feedback (i.e., the output) in VRBubble, we used a Wizard-of-Oz [8] setup to enable participants to customize the audio alternatives. Thus, how to enable PVI to efficiently and easily control and customize VRBubble’s audio modalities (i.e., the input) still remains unaddressed. A future work of interest would be to explore effective input modalities to support fast interaction for PVI in social VR.

Additionally, our current evaluation used mock-up social VR environments with system-generated avatars, which may not fully reflect the avatar dynamics in real social VR environments. For example, avatars in social VR may be conversing rather than standing silently, so that they might be located or identified by their voices. This can potentially change users’ experience with VRBubble. Future research should evaluate VRBubble in more realistic social VR scenarios and adapt the design to different situations; for example, a context-aware technique could adjust the audio feedback based on avatars’ voice activity (e.g., muting or reducing the volume of audio notifications for a speaking avatar).

6 CONCLUSION
We presented VRBubble, a social VR technique that augments peripheral awareness by utilizing Hall’s proxemic theory. VRBubble provided three audio alternatives—earcons, verbal notifications, and real-world sound effects—for PVI to select and combine to maintain awareness of different avatars at different social distances. We compared VRBubble to a standard audio beacon baseline via a user study with 12 participants with visual impairments. We assessed the effect of VRBubble on participants’ ability to identify avatars as well as its distraction in different primary social tasks. Participants found VRBubble effective at providing previously inaccessible avatar information, and expressed the need for customizing audio feedback in different social VR contexts. Our study contributed a novel accessibility technique to convey the social VR dynamics and provided implications for the future design of accessible VR for people with visual impairments.

ACKNOWLEDGMENTS
We thank the National Federation of the Blind for helping us recruit for our study, as well as the anonymous participants who provided their perspectives.

REFERENCES
[1] ASM Iftekhar Anam, Shahinur Alam, and Mohammed Yeasin. 2014. Expression: A dyadic conversation aid using Google Glass for people who are blind or visually impaired. In 6th International Conference on Mobile Computing, Applications and Services. IEEE, 57–64.
[2] Ronny Andrade, Steven Baker, Jenny Waycott, and Frank Vetere. 2018. Echohouse: Exploring a virtual environment by using echolocation. In Proceedings of the 30th Australian Conference on Computer-Human Interaction. 278–289.
[3] Matthew T Atkinson, Sabahattin Gucukoglu, Colin HC Machin, and Adrian E Lawrence. 2006. Making the mainstream accessible: redefining the game. In Proceedings of the 2006 ACM SIGGRAPH Symposium on Videogames. 21–28.
[4] Stephen A Brewster, Peter C Wright, and Alistair DN Edwards. 1993. An evaluation of earcons for use in auditory human-computer interfaces. In Proceedings of the INTERACT’93 and CHI’93 Conference on Human Factors in Computing Systems. 222–227.
[5] Hendrik P Buimer, Marian Bittner, Tjerk Kostelijk, Thea M Van Der Geest, Abdellatif Nemri, Richard JA Van Wezel, and Yan Zhao. 2018. Conveying facial expressions to blind and visually impaired persons through a wearable vibrotactile device. PloS one 13, 3 (2018), e0194737.
[6] J.J. Cadiz, Gina Venolia, Gavin Jancke, and Anoop Gupta. 2002. Designing and deploying an information awareness interface. CSCW 02 (November 2002). https://doi.org/10.1145/587078.587122
[7] Chetz Colwell, Helen Petrie, Diana Kornbrot, Andrew Hardwick, and Stephen Furner. 1998. Haptic virtual reality for blind computer users. ASSETS 98 (January 1998). https://doi.org/10.1145/274497.274515
[8] Nils Dahlback, Arne Jonsson, and Lars Ahrenberg. 1993. Wizard of Oz studies — why and how. Intelligent User Interfaces (1993).
[9] Maitraye Das, Thomas Barlow McHugh, Anne Marie Piper, and Darren Gergle. 2022. Co11ab: Augmenting Accessibility in Synchronous Collaborative Writing for People with Vision Impairments. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 196, 18 pages. https://doi.org/10.1145/3491102.3501918
[10] Patricia A. de Oliveira, Erich P. Lotto, Ana Grasielle D. Correa, Luis G. G. Taboada, Laisa C. P. Costa, and Roseli D. Lopes. 2015. Virtual Stage: An Immersive Musical Game for People with Visual Impairment. In 2015 14th Brazilian Symposium on Computer Games and Digital Entertainment (SBGames). 135–141. https://doi.org/10.1109/SBGames.2015.26
[11] Euan Freeman, Graham Wilson, Stephen Brewster, Gabriel Baud-Bovy, Charlotte Magnusson, and Hector Caltenco. 2017. Audible Beacons and Wearables in Schools: Helping Young Visually Impaired Children Play and Move Independently. CHI 17 (January 2017). https://doi.org/10.1145/3025453.3025518
[12] Lakshmi Gade, Sreekar Krishna, and Sethuraman Panchanathan. 2009. Person localization using a wearable camera towards enhancing social interactions for individuals with visual impairment. In Proceedings of the 1st ACM SIGMM International Workshop on Media Studies and Implementations that Help Improving Access to Disabled Users. 53–62.
[13] Evil Dog Games. 2014. Blind Swordsman. https://devpost.com/software/blind-swordsman last accessed 18 Jan 2014.
[14] Ellen R Girden. 1992. ANOVA: Repeated measures. Number 84. Sage.
[15] Cole Gleason, Alexander J. Fiannaca, Melanie Kneisel, Edward Cutrell, and Meredith Ringel Morris. 2018. FootNotes: Geo-Referenced Audio Annotations for Nonvisual Exploration. 2, 3, Article 109 (sep 2018), 24 pages. https://doi.org/10.1145/3264919
[16] Glitch. 2000. Glitch. https://glitch.com last accessed 12 January 2022.
[17] Jens Grubert, Eyal Ofek, Michel Pahud, and Per Ola Kristensson. 2018. The office of the future: Virtual, portable, and global. IEEE Computer Graphics and Applications 38, 6 (2018), 125–133.
[18] Edward T Hall. 1966. The Hidden Dimension. Doubleday.
[19] Wilko Heuten, Daniel Wichmann, and Susanne Boll. 2006. Interactive 3D sonification for the exploration of city maps. Nordic Conference on Human-Computer Interaction: Changing Roles (2006), 155–164.
[20] Simon Holland, David Morse, and Henrik Gedenryd. 2002. AudioGPS: Spatial Audio Navigation with a Minimal Attention Interface. Personal and Ubiquitous Computing 6 (09 2002). https://doi.org/10.1007/s007790200025
[21] Seyedeh Maryam Fakhr Hosseini, Andreas Riener, Rahul Bose, and Myounghoon Jeon. 2014. “Listen2dRoom”: Helping visually impaired people navigate indoor environments using an ultrasonic sensor-based orientation aid.
[22] Rec Room Inc. 2016. Rec Room. Oculus Quest, Microsoft Windows, PlayStation, iOS, Android, Xbox.
[23] Toru Ishikawa, Hiromichi Fujiwara, Osamu Imai, and Atsuyuki Okabe. 2008. Wayfinding with a GPS-based mobile navigation system: A comparison with maps and direct experience. Journal of Environmental Psychology 28, 1 (2008), 74–82.
[24] Gunnar Jansson. 1999. Can a Haptic Display Rendering of Virtual Three-Dimensional Objects be useful for People with Visual Impairments? 93, 7 (July 1999). https://doi.org/10.1177/0145482X9909300707
[25] Gunnar Jansson, Helen Petrie, Chetz Colwell, Diana Kornbrot, J Fänger, H König, Katarina Billberger, Andrew Hardwick, and Stephen Furner. 1999. Haptic virtual environments for blind people: Exploratory experiments with two devices. International Journal of Virtual Reality 4, 1 (1999), 8–17.
[26] Tiger F. Ji, Brianna R Cochran, and Yuhang Zhao. 2022. Demonstration of VRBubble: Enhancing Peripheral Avatar Awareness for People with Visual Impairments in Social Virtual Reality (CHI EA ’22). Association for Computing Machinery, New York, NY, USA, Article 401, 6 pages. https://doi.org/10.1145/3491101.3519657
[27] Ralf Jung. 2008. Smart sound environments: merging intentional soundscapes, nonspeech audio cues and ambient intelligence. The Journal of the Acoustical Society of America 123, 5 (2008), 3935–3935. https://doi.org/10.1121/1.2936002
[28] Matthew Kay, Lisa A. Elkin, James J. Higgins, and Jacob O. Wobbrock. 2021. ARTool: Aligned Rank Transform for Nonparametric Factorial ANOVAs. https://doi.org/10.5281/zenodo.594511 R package version 0.11.1.
[29] Fredrik Kilander and Pekka Loennqvist. 2002. A whisper in the woods - an ambient soundscape for peripheral awareness of remote processes.
[30] Julian Kreimeier and Timo Gotzelmann. 2019. First Steps Towards Walk-In-Place Locomotion and Haptic Feedback in Virtual Reality for Visually Impaired. CHI EA 19 (May 2019). https://doi.org/10.1145/3290607.3312944
[31] Sreekar Krishna, Shantanu Bala, Troy McDaniel, Stephen McGuire, and Sethuraman Panchanathan. 2010. VibroGlove: an assistive technology aid for conveying facial expressions. In CHI’10 Extended Abstracts on Human Factors in Computing
Systems. 3637–3642.
[32] Sreekar Krishna, Dirk Colbry, John Black, Vineeth Balasubramanian, and Sethuraman Panchanathan. 2008. A systematic requirements analysis and development of an assistive device to enhance the social interaction of people who are blind or visually impaired. In Workshop on Computer Vision Applications for the Visually Impaired.
[33] Orly Lahav and David Mioduser. 2004. Exploration of unknown spaces by people who are blind using a multi-sensory virtual environment. Journal of Special Education Technology 19, 3 (2004), 15–23.
[34] Cheuk Yin Phipson Lee, Zhuohao Zhang, Jaylin Herskovitz, JooYoung Seo, and Anhong Guo. 2022. CollabAlly: Accessible Collaboration Awareness in Document Editing. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 596, 17 pages. https://doi.org/10.1145/3491102.3517635
[35] Tapio Lokki and Matti Grohn. 2005. Navigation with auditory cues in a virtual environment. IEEE MultiMedia 12, 2 (2005), 80–86.
[36] Shachar Maidenbaum, Shelly Levy-Tzedek, Daniel-Robert Chebat, and Amir Amedi. 2013. Increasing accessibility to the blind of virtual environments, using a virtual mobility aid based on the "EyeCane": Feasibility study. PloS one 8, 8 (2013), e72555.
[37] Masaki Matsuo, Takahiro Miura, Masatsugu Sakajiri, Junji Onishi, and Tsukasa Ono. 2016. Audible Mapper & ShadowRine: Development of Map Editor Using only Sound in Accessible Game for Blind Users, and Accessible Action RPG for Visually Impaired Gamers. In International Conference on Computers Helping People with Special Needs. Springer, 537–544.
[38] Tara Matthews and Jennifer Mankoff. 2005. A toolkit for evaluating peripheral awareness displays. In From the Awareness Systems Workshop at CHI.
[39] Keenan R May, Brianna J Tomlinson, Xiaomeng Ma, Phillip Roberts, and Bruce N Walker. 2020. Spotlights and soundscapes: On the design of mixed reality auditory environments for persons with visual impairment. ACM Transactions on Accessible Computing (TACCESS) 13, 2 (2020), 1–47.
[40] Troy McDaniel, Sreekar Krishna, Vineeth Balasubramanian, Dirk Colbry, and Sethuraman Panchanathan. 2008. Using a haptic belt to convey non-verbal communication cues during social interactions to individuals who are blind. In 2008 IEEE International Workshop on Haptic Audio Visual Environments and Games. IEEE, 13–18.
[41] Joshua McVeigh-Schultz, Anya Kolesnichenko, and Katherine Isbister. 2019. Shaping pro-social interaction in VR: an emerging design framework. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
[42] Lotfi Merabet and Jaime Sanchez. 2009. Audio-based navigation using virtual environments: combining technology and neuroscience. AER Journal: Research and Practice in Visual Impairment and Blindness 2, 3 (2009), 128–137.
[43] Microsoft. 2013. Altspace. Oculus Quest, Microsoft Windows.
[44] Microsoft. 2022. Microsoft Soundscape. Microsoft Windows.
[45] Cecily Morrison, Ed Cutrell, Martin Grayson, Geert Roumen, Rita Faia Marques, Anja Thieme, Alex Taylor, and Abigail Sellen. 2021. PeopleLens. Interactions 28, 3 (2021), 10–13.
[46] Konstantinos Moustakas, Georgios Nikolakis, Konstantinos Kostopoulos, Dimitrios Tzovaras, and Michael G. Strintzis. 2007. Haptic Rendering of Visual Data for the Visually Impaired. IEEE (January 2007). https://doi.org/10.1145/274497.274515
[47] Robby Nadler. 2020. Understanding “Zoom fatigue”: Theorizing spatial dynamics as third skins in computer-mediated communication. Computers and Composition 58 (October 2020). https://doi.org/10.1016/j.compcom.2020.102613
[48] Gerhard Nahler. 2009. Latin square. Springer Vienna, Vienna, 105–105. https://doi.org/10.1007/978-3-211-89836-9_776
[49] Vishnu Nair, Jay L Karp, Samuel Silverman, Mohar Kalra, Hollis Lehv, Faizan Jamil, and Brian A Smith. 2021. NavStick: Making Video Games Blind-Accessible via the Ability to Look Around. In The 34th Annual ACM Symposium on User Interface Software and Technology. 538–551.
[50] Research Nester. 2021. Virtual Collaboration Market Segmentation by Tools (Instant Communication, Project Management Tools, Cloud-Based Tools, and Others); and by End-User (IT & Telecom, BFSI, Retail, Healthcare, Logistics & Transportation, Education, Manufacturing, and Others) – Global Demand Analysis and Opportunity Outlook 2021-2029. https://www.researchnester.com/reports/virtual-collaboration-market/2994 last accessed 10 January 2022.
[51] Donald A. Norman. 1986. User Centered System Design; New Perspectives on Human-Computer Interaction. L. Erlbaum Associates Inc.
[52] Bugra Oktay and Eelke Folmer. 2010. TextSL: a screen reader accessible interface for Second Life. W4A 10 (April 2010). https://doi.org/10.1145/1805986.1806017
[53] World Health Organization. 2021. Blindness and vision impairment. https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment#:~:text=Key%20facts,has%20yet%20to%20be%20addressed. last accessed 26 Jan 2022.
[54] Andrew Osterland. 2021. Virtual Reality Headset Market Size, Share & Trends Analysis Report By End-device (Low-end, High-end), By Product Type (Standalone, Smartphone-enabled), By Application (Gaming, Education), And Segments Forecasts, 2021 - 2028. https://www.grandviewresearch.com/industry-analysis/virtual-reality-vr-headset-market last accessed 26 Jan 2022.
[55] Jamie Pauls. 2020. Vintage Games Series, Part 4: Immerse Yourself in the World of Shades of Doom. https://www.afb.org/aw/21/12/17336 last accessed 18 Jan 2022.
[56] Elin R. Pedersen and Tomas Sokoler. 1997. AROMA: abstract representation of presence supporting mutual awareness. CHI 97 (March 1997). https://doi.org/10.1145/258549.258584
[57] Aaron Preece. 2018. A Review of A Hero’s Call, an Accessible Role Playing Game from Out of Sight Games. https://www.afb.org/aw/19/3/15113 last accessed 18 Jan 2022.
[58] Giorgio Presti, Dragan Ahmetovic, Mattia Ducci, Cristian Bernareggi, Luca Ludovico, Adriano Baratè, Federico Avanzini, and Sergio Mascetti. 2019. WatchOut: Obstacle Sonification for People with Visual Impairment or Blindness. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 402–413. https://doi.org/10.1145/3308561.3353779
[59] Radegast Project. 2009. Radegast. Microsoft Windows, Mac, Linux.
[60] Shi Qiu, Matthias Rauterberg, and Jun Hu. 2016. Designing and evaluating a wearable device for accessing gaze signals from the sighted. In International Conference on Universal Access in Human-Computer Interaction. Springer, 454–464.
[61] Mose Sakashita, E. Andy Ricci, Jatin Arora, and François Guimbretière. 2022. RemoteCoDe: Robotic Embodiment for Enhancing Peripheral Awareness in Remote Collaboration Tasks. Proc. ACM Hum.-Comput. Interact. 6, CSCW1, Article 63 (apr 2022), 22 pages. https://doi.org/10.1145/3512910
[62] Johnny Saldaña. 2021. The Coding Manual for Qualitative Researchers. Sage.
[63] Jaime Sanchez and Mauricio Lumbreras. 1997. Hyperstories: Interactive narrative in virtual worlds. Hypertextes et hypermédias (1997), 329–338.
[64] Jaime Sánchez and Mauricio Lumbreras. 1999. Virtual environment interaction through 3D audio by blind children. CyberPsychology & Behavior 2, 2 (1999), 101–111.
[65] JH Sánchez and MA Sáenz. 2006. Assisting the mobilization through subway networks by users with visual disabilities. Virtual Reality & Assoc. Tech (2006).
[66] Daisuke Sato, Uran Oh, João Guerreiro, Dragan Ahmetovic, Kakuya Naito, Hironobu Takagi, Kris M. Kitani, and Chieko Asakawa. 2019. NavCog3 in the Wild: Large-Scale Blind Indoor Navigation Assistant with Semantic Features. ACM Trans. Access. Comput. 12, 3, Article 14 (aug 2019), 30 pages. https://doi.org/10.1145/3340319
[67] David W Schloerb, Orly Lahav, Joseph G Desloge, and Mandayam A Srinivasan. 2010. BlindAid: Virtual environment system for self-reliant trip planning and orientation and mobility training. In 2010 IEEE Haptics Symposium. IEEE, 363–370.
[68] Alexa F Siu, Mike Sinclair, Robert Kovacs, Eyal Ofek, Christian Holz, and Edward Cutrell. 2020. Virtual reality without vision: A haptic and auditory white cane to navigate complex virtual worlds. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.
[69] Zach Smith and Bre Pettis. 2008. Thingiverse. https://www.thingiverse.com/ last accessed 12 April 2022.
[70] Statista. 2019. Leading barriers to mass adoption of VR according to XR professionals worldwide as of the 3rd quarter of 2019. https://www.statista.com/statistics/1099109/barriers-to-mass-consumer-adoption-of-vr/ last accessed 7 July 2015.
[71] Supermedium. 2015. A-Frame. Cross-platform.
[72] Manohar Swaminathan, Sujeath Pareddy, Tanuja S. Sawant, and Shubi Agarwal. 2018. Video Gaming for the Vision Impaired. ASSETS 18 (October 2018). https://doi.org/10.1145/3234695.3241025
[73] Juan R Terven, Joaquin Salas, and Bogdan Raducanu. 2014. Robust head gestures recognition for assistive technology. In Mexican Conference on Pattern Recognition. Springer, 152–161.
[74] Shari Trewin, Vicki L Hanson, Mark R Laff, and Anna Cavender. 2008. PowerUp: an accessible virtual world. In Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility. 177–184.
[75] Dimitrios Tzovaras, Konstantinos Moustakas, Georgios Nikolakis, and Michael G Strintzis. 2009. Interactive mixed reality white cane simulation for the training of the blind and the visually impaired. Personal and Ubiquitous Computing 13, 1 (2009), 51–58.
[76] Dimitrios Tzovaras, Georgios Nikolakis, George Fergadis, Stratos Malasiotis, and Modestos Stavrakis. 2002. Design and implementation of virtual environments training of the visually impaired. In Proceedings of the Fifth International ACM Conference on Assistive Technologies. 41–48.
[77] Pablo Vera, Daniel Zenteno, and Joaquín Salas. 2014. A Smartphone-Based Virtual White Cane. Pattern Anal. Appl. 17, 3 (aug 2014), 623–632. https://doi.org/10.1007/s10044-013-0328-8
[78] Inc VRChat. 2017. VRChat. Oculus Quest, Microsoft Windows.
[79] Bruce N. Walker and Jeffrey Lindsay. 2006. Navigation Performance With a Virtual Auditory Display: Effects of Beacon Sound, Capture Radius, and Practice. Human Factors 48, 2 (2006), 265–278.
[80] Dean A. Waters and Husam H. Abulula. 2001. The virtual bat: Echolocation in virtual reality.
[81] Ryan Wedoff, Lindsay Ball, Amelia Wang, Yi Xuan Khoo, Lauren Lieberman, and Kyle Rector. 2019. Virtual Showdown: An accessible virtual reality game with scaffolds for youth with visual impairments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–15.
[82] Thomas Westin. 2004. Game accessibility case study: Terraformers – a real-time 3D graphic game. In Proceedings of the 5th International Conference on Disability, Virtual Reality and Associated Technologies, ICDVRAT.
[83] Matt Wilkerson, Amanda Koenig, and James Daniel. 2010. Does a Sonar System Make a Blind Maze Navigation Computer Game More "Fun"?. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (Orlando, Florida, USA) (ASSETS ’10). Association for Computing Machinery, New York, NY, USA, 309–310. https://doi.org/10.1145/1878803.1878886
[84] Jeff Wilson, Bruce N. Walker, Jeffrey Lindsay, Craig Cambias, and Frank Dellaert. 2007. SWAN: System for Wearable Audio Navigation. In 2007 11th IEEE International Symposium on Wearable Computers. 91–98. https://doi.org/10.1109/ISWC.2007.4373786
[85] Yuhang Zhao, Cynthia L Bennett, Hrvoje Benko, Edward Cutrell, Christian Holz, Meredith Ringel Morris, and Mike Sinclair. 2018. Enabling people with visual impairments to navigate virtual reality with a haptic and auditory cane simulation. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–14.
[86] Yuhang Zhao, Edward Cutrell, Christian Holz, Meredith Ringel Morris, Eyal Ofek, and Andrew D Wilson. 2019. SeeingVR: A set of tools to make virtual reality more accessible to people with low vision. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[87] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2018. A face recognition application for people with visual impairments: Understanding use beyond the lab. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–14.
“It’s Just Part of Me:” Understanding Avatar Diversity and Self-presentation of People with Disabilities in Social Virtual Reality

Kexin Zhang, University of Wisconsin-Madison, Madison, Wisconsin, USA, kzhang284@wisc.edu
Elmira Deldari, University of Maryland, Baltimore County, Baltimore County, Maryland, USA, edeldar1@umbc.edu
Zhicong Lu, City University of Hong Kong, Kowloon, Hong Kong, China, zhicong.lu@cityu.edu.hk
Yaxing Yao, University of Maryland, Baltimore County, Baltimore County, Maryland, USA, yaxingyao@umbc.edu
Yuhang Zhao, University of Wisconsin-Madison, Madison, Wisconsin, USA, yuhang.zhao@cs.wisc.edu

ABSTRACT
In social Virtual Reality (VR), users are embodied in avatars and interact with other users in a face-to-face manner using avatars as the medium. With the advent of social VR, people with disabilities (PWD) have shown an increasing presence on this new social medium. With their unique disability identity, it is not clear how PWD perceive their avatars and whether and how they prefer to disclose their disability when presenting themselves in social VR. We fill this gap by exploring PWD’s avatar perception and disability disclosure preferences in social VR. Our study involved two steps. We first conducted a systematic review of fifteen popular social VR applications to evaluate their avatar diversity and accessibility support. We then conducted an in-depth interview study with 19 participants who had different disabilities to understand their avatar experiences. Our research revealed a number of disability disclosure preferences and strategies adopted by PWD (e.g., reflect selective disabilities, present a capable self). We also identified several challenges faced by PWD during their avatar customization process. We discuss the design implications to promote avatar accessibility and diversity for future social VR platforms.

CCS CONCEPTS
• Human-centered computing → Virtual reality; Accessibility technologies.

KEYWORDS
Social VR, avatar, self-perception, disability disclosure, visual impairments, d/Deaf and hard of hearing

ACM Reference Format:
Kexin Zhang, Elmira Deldari, Zhicong Lu, Yaxing Yao, and Yuhang Zhao. 2022. “It’s Just Part of Me:” Understanding Avatar Diversity and Self-presentation of People with Disabilities in Social Virtual Reality. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544829

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3544829

1 INTRODUCTION
With the rising popularity of virtual reality (VR), social VR is becoming the mainstream of the online social ecosystem. Social VR refers to VR platforms where people communicate and socialize in the form of avatars [26]. One iconic characteristic of social VR is the embodied “face-to-face” interaction from a first-person perspective through avatars. By customizing their avatars, users can craft and maintain any characters they prefer in a virtual world.

Due to the deep embodiment of a user with their avatar, to some degree, a customized avatar can be considered a proxy of the user themselves [56]. A myriad of research has explored how people use avatars to shape their self-images in virtual social spaces, including both PC-based virtual worlds [41] and the more immersive social VR [25, 26]. Research has found that while many people designed their avatars to reflect their physical appearance in the real world [57], some people leveraged the opportunity to experiment with different aspects of their personalities [18, 24] and even explored completely different identities [81]. However, most prior works focused on the self-presentation preferences of people without disabilities. People with disabilities (PWD), given their unique disability identity, may have different perceptions and preferences when constructing self-images via avatars in social VR.

Research on PWD’s self-presentation in social VR is in its infancy. On traditional social media platforms, PWD have a strong presence and use various strategies to construct their online images as others do [72, 85]. Unlike for people with typical abilities, disability disclosure is a unique consideration that PWD face during self-presentation. Many PWD are cautious about disclosing their disability-related vulnerabilities, and some intentionally hide their disability to avoid potential risks, such as loss of job opportunities [90] and cyber-bullying [46]. In contrast, some proactively disclose their disabilities, especially on dating platforms, to build trust and filter potential partners [65]. Unlike 2D text- or image-based social media, social VR provides a more embodied and immersive experience via avatars, which can potentially
ASSETS ’22, October 23–26, 2022, Athens, Greece Zhang et al.

change PWD’s self-presentation preferences and their willingness multiple online identities for diferent audience groups [17], and
to disclose their disabilities. To promote safe and inclusive social the stereotypical representation of genders [3].
VR experiences for PWD, the accessibility and VR communities are With the advances in computer graphic technology and hard-
in need of a thorough understanding of PWD’s avatar perceptions ware support, social media expands from 2D to 3D avatar-based
and disability disclosure preferences on the emerging social VR systems [25]. Instead of using text and images to construct self-
platforms. image, people can directly create and personalize an avatar as their
Our research aims to fll the gap by investigating how PWD de- virtual representation and manage the impression from others via
sign and use their avatars for disability disclosure and self-presentation the avatar. Many researchers explored users’ digital representations
in social VR. Given that social VR is an emerging but premature via avatars in PC-based virtual worlds, including both the game
medium that lacks sufcient accessibility support, we also explore worlds (e.g., World of Warcraft [21]) and the social worlds (e.g., Sec-
the status quo of avatar diversity and accessibility on commercial ond Life [48]). In the virtual game worlds where a limited number
social VR platforms to reveal the barriers faced by PWD in the of pre-defned avatars were provided, many users tended to select
avatar customization process. Specifcally, we aim to answer the avatars that stood out, followed a trend [18], or served the role they
following research questions: planned to play in the game [27]. In contrast, social worlds gave
• RQ1: Whether and how is avatar diversity supported on the users more fexibility to design their own avatars, for example, by
mainstream commercial social VR platforms? adjusting and customizing diferent body features (e.g., hair, eyes,
• RQ2: Whether and how do PWD disclose their disabilities skin color). Without restrictions from the game theme, many users
when presenting themselves via avatars? chose to create avatars that resembled themselves in the real world
• RQ3: What challenges do PWD face during the avatar design [18, 71, 82]. For example, Ducheneaut et al. [18] explored users’
and creation process? avatar customization across three virtual worlds (i.e., Maple Story,
To answer these research questions, we frst systematically re- World of Warcraft, and Second Life), and found that more users
viewed 15 commercial social VR applications to understand what preferred reproducing some of their physical characteristics in the
avatar features are supported for disability representation. We then social virtual worlds than in the game worlds. Koles et al. [44] also
conducted in-depth semi-structured interviews with eight visually showed that the majority of users in Second Life tended to use
impaired people, nine d/Deaf and hard of hearing people, and two their physical selves as the starting point to create their virtual
people with multiple disabilities to explore their avatar perceptions representation.
and customization experiences in the social VR context. Our fnd- With the high fexibility in avatar customization, people also
ings revealed a spectrum of disability disclosure strategies adopted have the freedom to create avatars that are diferent from their
by PWD for self-presentation, from accurately disclosing one’s dis- physical appearances, thus better shaping their online identities.
ability, to selectively presenting a particular aspect of the disability, As a result, some people used avatars to experiment with diferent
presenting the changes in one’s ability, to hiding one’s disability to aspects of their personalities [42, 78, 79]. For example, Kafai et
construct a diferent self. Our study highlighted the lack of avatar al. [42] surveyed 438 tween players in a virtual social community
diversity for disability representation and emphasized PWD’s needs named Whyville and found that most tweens did not construct
to fexibly control their disability disclosure in diferent social con- their avatar appearances based on their real selves. Instead, they
texts. We also identifed the avatar accessibility barriers faced by designed avatars to achieve recognition from others or to reveal
PWD and suggested design implications. specifc aspects of their “real” selves, including those they desired
This paper makes three contributions. First, to the best of our but could not present in real life. Via avatar creation, users can even
knowledge, this is the frst research that investigates social VR create and explore a completely diferent identity, such as an avatar
avatar diversity and self-presentation from the lens of disability. with a diferent gender [37] or a diferent race [62], which may help
Second, our research presents rich and in-depth data to uncover them discover their true selves and increase their self-esteem [5].
PWD’s diferent disability disclosure strategies and preferences. Most prior work on avatar perception focused on the majority
Third, design implications are derived to inspire a more accessible group. Limited attention has been paid to the self-presentation
and inclusive avatar experience for PWD in social VR. and identity disclosure of the under-representative user groups,
especially people with disabilities.

2 RELATED WORK
2.1 Self-presentation in Virtual Worlds
Self-presentation, also known as impression management, refers
2.2 Self-disclosure by People with Disabilities in
to the ways that a person “conveys an impression to others which Virtual Worlds
is in his interests to convey” [30]. Via self-presentation, people Diferent from self-presentation where people can construct and
can selectively craft, present, and maintain specifc facets of their present any images that they prefer to the audience, self-disclosure
identities based on diferent audiences and social settings [1, 22, refers to the act of revealing any messages about oneself to others
43, 63]. A myriad of research has explored how people manage [14, 29, 31]. While self-disclosure is involved in self-presentation
their identities on social media (e.g., Facebook, Twitter), revealing strategies to help build closer relationships [70], it also poses po-
the complexity of self-presentation from various aspects, such as tential risks of exposing one’s vulnerabilities, especially to the
beautifed real self on anonymous social platforms [29, 87, 88], unknown or anonymous audience online [53]. Prior research has
Avatar Diversity and the Self-presentation of People with Disabilities in Social VR ASSETS ’22, October 23–26, 2022, Athens, Greece
explored the online self-disclosure experiences of various underrepresented groups, such as racial minorities [36, 49, 50] and gender minorities [9, 11, 20, 60]. In this section, however, we focus on the work for people with disabilities (PWD).

Disability disclosure has always been an important topic for PWD, even before social media platforms emerged [38, 64, 68, 80]. Prior research focused primarily on the physical working context, showing that disability disclosure could potentially assure appropriate workplace accommodations and increase workplace diversity and inclusiveness for PWD [80]. However, it may also lead to negative employment consequences for PWD, such as lowered supervisor expectations, isolation from colleagues, and increased possibility of termination [38, 64]. Because of these, in working or daily living contexts, disability disclosure is highly contextualized and influenced by many factors, such as employers, managers, and workplace climate [64, 80].

The advent of online social platforms (e.g., social media, social virtual worlds) affords new forms of social interactions and activities, along with new social relationships. These online platforms bring PWD more opportunities to interact with others and get access to resources and communities [74, 76]. For example, Boellstorff [6] built "Ethnographia", a virtual island in Second Life, to investigate how digital spaces influenced PWD's experiences, and found that building virtual worlds enhanced PWD's sense of ability and helped them reconstruct identities. With these online social platforms, PWD were able to manage their disability disclosure, thus achieving a "levelling ground" where they could be treated on their merits as a person without being shadowed by their disabilities [7]. However, disability disclosure in online communities can also bring risks to PWD [69, 74]. For example, Ringland [69] conducted a 200-hour observation in Autcraft, an online community for children with autism, and found that the autistic identity in virtual worlds could be both a source of empowerment and a source of harassment and violence.

Researchers have explored PWD's self-disclosure strategies and preferences in different online communities [6, 28, 40, 65, 69]. For example, Porter et al. surveyed 91 adults with and without disabilities to understand the needs for disability disclosure on online dating platforms. They found a higher expectation of the disclosure of visible disabilities than invisible disabilities [65]. By designing a movement-based virtual game for young people using wheelchairs and exploring their avatar preferences, Gerling et al. found that while 6 out of 8 participants depicted wheelchairs as indispensable parts of their self-images, only two participants were willing to use avatars with mobility disabilities in the game [28]. Recently, Davis and Stanovsek conducted a 3-year ethnography study to explore how PWD customize their avatars in Second Life and found that many participants used non-human avatars to free themselves from their visible disabilities [16].

Compared to conventional social media and virtual worlds, the emerging social VR brings unique embodied experiences, which may potentially affect PWD's willingness to disclose their disabilities. However, little work has investigated the influence of social VR on PWD's self-presentation and disability disclosure preferences.

2.3 Avatar Design and Self-presentation in Embodied Social VR
In recent years, social VR has gained increasing attention. Unlike the PC-based virtual worlds, where users see avatars on a 2D screen and can only control the avatars via a keyboard and a mouse, social VR incorporates full-body tracking, providing a more embodied first-person avatar experience as well as richer and more immersive social interactions [25]. As a result, social VR provides new affordances for avatar design and interaction [34, 45, 59, 83]. For instance, Kolesnichenko et al. [45] interviewed industry experts who work for different commercial social VR platforms and uncovered the current avatar design practices from different aspects, such as locomotion, avatar aesthetics, and the avatar's relation to one's virtual identity. Menon [59] also researched the relationship between avatar customization and embodiment in a virtual job interview context, highlighting the need for basic avatar customization.

Some research specifically focused on avatar-based self-presentation in social VR and how it affected users' behaviors and interactions [25, 26, 55]. For example, Freeman et al. [25] interviewed 30 people about their avatar perception and social interaction experiences in social VR. They found that most people considered the avatars to be themselves and strived to make their avatars similar to their physical appearances. This research also revealed how avatar gender and race affected people's experience. For example, female avatars received better first impressions and nicer treatment but also led to more harassment, and non-white avatars could bring certain social stigma. Moreover, Maloney and Freeman [55] studied how and why people disclose their information in social VR. They argued that creating avatars that look like one's physical self can disclose important personal information such as gender, race, and appearance. While this helped build close connections with others, it can potentially lead to privacy risks.

With more realistic avatars and more embodied interactions, social VR can also affect PWD's self-presentation and disability disclosure preferences. Some researchers have noticed the importance of this research direction. Boellstorff [6] emphasized the importance of distinguishing "virtual world" and "virtual reality" for understanding PWD's avatar experience. Mott et al. [61] also stressed the need for supporting more diverse avatar representations for PWD in social VR. However, to our knowledge, no research has thoroughly explored PWD's avatar design experiences in embodied social VR. Our research contributes to this line of research by developing a deep understanding of current avatar diversity practices, as well as PWD's self-presentation challenges and strategies in social VR.

3 STUDY I: APPLICATION REVIEW: AVATAR DIVERSITY ON COMMERCIAL SOCIAL VR PLATFORMS
The goal of this study is to build a comprehensive understanding of the avatar diversity and accessibility practices on current commercial social VR platforms (RQ1). Specifically, we aim to uncover 1) general avatar design practices: what types of avatars are supported and how users can select or customize an avatar on different social VR platforms; 2) avatar diversity support: what avatar features are provided to enable disability disclosure for PWD; and 3) avatar
ASSETS ’22, October 23–26, 2022, Athens, Greece Zhang et al.

accessibility: what accessibility features are provided for PWD to perceive, use, and customize avatars.

3.1 Method
We conducted a systematic review of fifteen popular commercial social VR applications. To determine which social VR applications to review, we conducted an exhaustive search on three mainstream VR application stores: Oculus, Viveport, and Steam. Our search focused on applications available in the United States in November and December 2021. We first searched the keyword "social" in these stores and identified a total of 133 VR applications: 47 from Oculus, 54 from Viveport, and 50 from Steam (18 apps were available across multiple stores). To further narrow down the scope, we filtered the applications by checking the application descriptions and only focused on the applications whose descriptions clearly indicated a social or collaborative nature. Finally, we excluded VR applications that showed an intense gaming nature (e.g., sports, shooting games), which can distract users from socializing, such as Echo VR, Star Trek: Bridge Crew, and Spaceteam VR. We also consulted other related work and online articles about popular social VR applications to ensure we did not miss any mainstream social VR platforms [2, 52, 86]. This process resulted in fifteen social VR applications with a strong social nature: Rec Room, VRChat, Horizon Worlds, vTime XR, AltspaceVR, Bigscreen, Alcove, Half + Half, Horizon Venues, Villa, Arthur, ENGAGE, Multiverse, PokerStars VR, and Spatial (details can be found in Table 2 in the Appendix).

Two researchers on the team reviewed all applications independently. We adopted a depth-first traversal strategy, clicking all available buttons and menu items in the avatar customization process. During the review, the researchers video-recorded and took notes of all avatar options and the interaction process. The two researchers then discussed their results to ensure the reliability of the review. Following the same strategy, we reviewed the general settings in each social VR application to examine what accessibility features were supported. Any accessibility features that could influence or be applied to the avatar selection and customization process were documented. All applications were reviewed with an Oculus Quest 2.

3.2 Review Results
3.2.1 Avatar Creation Methods. We identified four ways to create and customize avatars in social VR: (1) Full Avatar Selection: selecting from a set of pre-determined avatars provided by the system; (2) Avatar Feature Customization: customizing different components of a human avatar (e.g., eyes, hair styles); (3) Photo-based Avatar Generation: uploading a photo to the system and automatically generating a corresponding avatar; and (4) Third-party Avatar Import: designing an avatar using a third-party platform and uploading it to the social VR application.

Most social VR platforms (12 out of 15) supported Avatar Feature Customization, allowing users to customize the avatar's physical features (e.g., skin tone, body shape, eyes), clothing (e.g., outfits, accessories), or both. Three platforms (VRChat, Half + Half, and Multiverse) employed the Full Avatar Selection model. Notably, VRChat offered a set of 80 pre-defined avatars for users to select and supported various avatar types, including both humanoid avatars and non-humanoid avatars, such as cartoons, robotic figures, animals, and objects. VRChat was also the only platform that enabled users to design and import their own avatars from third-party platforms. Additionally, three platforms (Villa, Arthur, and Spatial) employed Photo-based Avatar Generation. Table 2 in the Appendix lists the avatar customization methods supported by the different platforms.

Beyond individual applications, the Oculus system provided its own avatar system (i.e., Meta Avatars) that employed the Avatar Feature Customization model. When they first logged in to an Oculus device, users were automatically directed to Horizon Home¹ to customize their avatars for the whole VR system. Three platforms (Horizon Worlds, Horizon Venues, and Alcove) directly adopted Meta Avatars, so that a user could continue using the same avatar from the Oculus system.

Our review of the avatar customization process confirmed the results from Kolesnichenko et al.'s work [45] and further extended the scope to more widely used social VR applications.

3.2.2 Avatar Realism. Most social VR platforms provided only humanoid avatars. We investigated avatar realism by examining the completeness of the humanoid avatars. We found that only three platforms (VRChat, vTime XR, and ENGAGE) provided full-body avatars, including head, neck, torso, arms, hands, legs, and feet. All other 12 platforms focused on rendering the upper body of the avatars, presenting at least the avatars' head and hands. Specifically, five platforms did not show the avatars' arms but only presented flowing hands; avatars in Rec Room and Bigscreen did not have necks; and Villa's avatars did not even have a torso. In terms of avatar rendering details, except for Rec Room, all social VR platforms we reviewed rendered avatar fingers and tracked the users' finger movements via the VR controllers. Table 2 in the Appendix provides details of the avatar realism on all platforms.

3.2.3 Disability Representation in Avatars. Our review uncovered the limited avatar diversity support for PWD. We found that most commercial social VR platforms did not offer any disability-related avatar features. Meta Avatars was the only avatar system that provided hearing device features for people who are d/Deaf or hard of hearing. Two types of hearing devices were supported: cochlear implants and hearing aids. A user can put the hearing device on the avatar via three options: left ear only, right ear only, and both ears (Fig 1A). This feature was provided under the "Clothing" category. However, no other assistive devices or disabilities were supported. Additionally, we found some "near disability-related" features that may be used to indicate a disability. Specifically, Bigscreen provided an eye patch feature under the "Glasses" category, which users can add to their avatars to present their visual impairments or eye injuries. However, we acknowledge that this feature might be designed as an eye decoration (e.g., to imitate a pirate) instead of a disability feature.

Besides disabilities, some applications allowed users to customize wrinkles and face lines on avatars to reflect age and represent older adults. For example, Meta Avatars provided five levels of face lines with different depths and numbers of wrinkles on the avatar's face to

¹ A virtual home in Oculus Quest 2, which functions like the homepage of 2D platforms and enables certain social activities [66].
demonstrate different ages. vTime XR offered four sagging levels for avatars' facial skin to present age (Fig 1B).

Figure 1: Avatar Diversity: A. Hearing devices in Meta Avatars; B. Sagging facial skin in vTime XR Avatar.

3.2.4 Accessibility Features for Avatar Customization. The accessibility features on social VR platforms were limited. While some platforms supported standard accessibility features that are commonly used on computer and smartphone interfaces (e.g., magnification, color correction), no features were designed specifically for avatar customization. Specifically, AltspaceVR allowed users to adjust the scale of the avatar interface. The Oculus system also provided visual augmentations (i.e., color correction, text size adjustment) that can be applied to any application on Oculus devices. Moreover, most social VR platforms allowed adjustment of the audio and haptic feedback, which could be helpful for PWD.

By reviewing mainstream social VR platforms, we developed a comprehensive understanding of the current practices of avatar design, diversity, and accessibility. The results of this study inspired and served as a solid grounding for our following interview study (Study II). First, the review identified existing disability features for avatars and the social VR platforms that supported such features, which enabled us to better understand participants' social VR experiences and disability disclosure choices in an in-depth interview. Second, the limited disability features (i.e., hearing aids, eye patches) identified in this study helped us narrow down the focus to participants who were visually impaired and deaf or hard of hearing, since they were the only disability groups that had avatar representations in social VR.

4 STUDY II: INTERVIEWS: SELF-PRESENTATION OF PEOPLE WITH DISABILITIES VIA AVATARS
In this study, we investigated how PWD design and craft their avatars to disclose their disabilities and present themselves in social VR (RQ2), as well as their challenges and needs during the avatar creation and customization process (RQ3).

4.1 Method
We conducted semi-structured interviews with 19 PWD via Zoom in February and March 2022. The study was approved by the Institutional Review Board (IRB).

4.1.1 Participants. Our recruitment focused on two types of sensory disabilities: (1) d/Deaf or hard of hearing (DHH) people and (2) visually impaired (VI) people, since they are common and visible disabilities (or disabilities that can be indicated by visible assistive technologies). Both disabilities can be easily noticed in the real world and thus can be reflected via avatar design. For example, blind people may use a white cane, and d/Deaf people may wear hearing aids or may sign. We recruited 19 participants (11 female, 7 male, and 1 transgender). Their ages ranged from 20 to 58 (Mean = 33, SD = 11.57). Among the participants, eight had visual impairments (V-P1 to V-P8), nine were DHH (H-P1 to H-P9), and two had multiple disabilities (M-P1, M-P2). M-P1 was both blind and DHH, while M-P2 was blind and had a prosthetic leg. Table 1 shows participants' detailed information.

We spread our recruitment information via (1) the mailing lists of non-profit disability organizations, such as the National Federation of the Blind and the National Association of the Deaf, and (2) mainstream social media platforms, such as Facebook groups, Twitter, and Instagram. Interested participants could fill in a screening survey with their age, disability conditions, and general avatar experiences. Participants were eligible if they were over 18 and had visual impairments or were d/Deaf or hard of hearing. We limited the recruitment to individuals who spoke English. If selected to participate, participants were asked to sign a consent form prior to the interview. Upon completion, participants received $15 as compensation.

4.1.2 Interview Protocol. The interview included three phases. The first phase focused on participants' background information, including demographics (age and gender), self-reported disability, and experiences with VR and social VR.

The second phase focused on participants' self-presentation and disability disclosure via avatars. We first asked participants to send photos of their avatars or screen-share their avatars with us to demonstrate their avatar design. We asked about how they designed their avatars, why they customized their avatars in this way, and whether their avatars involved any features to disclose their disabilities. If participants disclosed their disabilities via their avatars, we further asked how they disclosed their disabilities, why they wanted to do so, and their experience in social VR after they disclosed their disabilities via avatars. If participants' avatars did not indicate their disabilities, we asked about their willingness to disclose their disabilities via avatars and the reasons. Additionally, we asked participants whether and how they had disclosed their disabilities through any other means or on any other online social platforms, and the rationales. Participants also discussed whether and how they wanted the social VR platforms to support disability representations via avatars.

The last phase of the interview focused on avatar creation accessibility. We asked about participants' experiences, difficulties, and strategies when creating and customizing avatars and the types of assistance they needed to complete the avatar creation process. We also asked about their suggestions for a more accessible avatar experience.

4.1.3 Data Recording and Analysis. Upon participants' consent, all 19 interviews were recorded and auto-transcribed by Zoom. Participants had the option to turn off their cameras during the interview, although it was not required. Two researchers manually cleaned all transcripts by checking the recorded interviews.
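The eligibility rule stated in Section 4.1.1 (over 18; visually impaired or d/Deaf or hard of hearing; English-speaking) amounts to a simple predicate over screening-survey responses. The sketch below is purely illustrative: the field names, sample data, and the idea of automating the screening are our assumptions, not something described in the paper, where screening was done by the researchers.

```python
# Illustrative only: the Section 4.1.1 eligibility criteria expressed as a
# filter over hypothetical screening-survey rows (field names are invented).
ELIGIBLE_DISABILITIES = {"visual impairment", "d/Deaf", "hard of hearing"}

def is_eligible(response: dict) -> bool:
    """Return True if a screening-survey response meets the stated criteria."""
    return (
        response["age"] > 18
        and response["disability"] in ELIGIBLE_DISABILITIES
        and response["speaks_english"]
    )

responses = [
    {"age": 29, "disability": "visual impairment", "speaks_english": True},
    {"age": 17, "disability": "hard of hearing", "speaks_english": True},
    {"age": 41, "disability": "mobility impairment", "speaks_english": True},
]

eligible = [r for r in responses if is_eligible(r)]
print(len(eligible))  # -> 1 (only the first response meets all three criteria)
```

Note that disability status in the actual study was self-reported free text rather than a fixed category set, so a real screening workflow would still require manual review.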
ASSETS ’22, October 23–26, 2022, Athens, Greece Zhang et al.

We conducted thematic analysis [8, 13] to identify repetitive patterns and themes in the interviews. First, we manually organized all qualitative data in an Excel sheet and selected five representative transcripts (two DHH participants, two VI participants, and one participant with multiple disabilities) as samples. Three researchers coded all samples independently at the sentence level with open coding. Then, they discussed and reconciled their codes to resolve any differences, and developed an initial codebook upon agreement. Next, two researchers divided the rest of the transcripts based on the participants' disabilities. Specifically, one researcher analyzed all VI participants' data (including the two participants with multiple disabilities) and the other researcher analyzed all DHH participants' data. During this process, the two researchers regularly checked each other's codes and discussed as needed to ensure consistency. New codes were added to the codebook based on the agreement between the two researchers. In the meantime, the third researcher oversaw all these activities to ensure a high level of agreement. The final codebook contained over 120 codes. We categorized all the codes into high-level themes using an affinity diagram and arrived at twenty themes.

4.2 Findings
4.2.1 Experience with Avatars and Social VR Platforms. Sixteen out of 19 participants had social VR experience. In general, we found that DHH participants had richer experience with social VR avatars than VI participants. Except for H-P3, all DHH participants had used social VR multiple times. Five DHH participants were frequent users with more than a year of experience. In contrast, VI participants had limited social VR experience, with most of them only trying it a few times. Three VI participants did not have social VR experience, but they used avatars on conventional social media, such as Snapchat Bitmoji and Meta Avatars for Instagram. Several VI participants mentioned that they attempted to customize their avatars on social VR several times but were blocked when setting up an account. We report the accessibility challenges faced by participants in Section 4.2.5.

The two most commonly used social VR platforms among our participants were VRChat (7 participants) and Rec Room (5). Most participants used social VR via head-mounted VR devices, and the most commonly used devices were the Oculus Quest, Oculus Rift, and Valve Index. However, three participants (H-P1, V-P2, V-P6) preferred using desktop-based social VR due to the accessibility issues of VR devices.

Participants used social VR for multiple reasons. Most participants used social VR to communicate with friends (7 participants) and play games (6). H-P4 specifically wanted to meet other DHH people and connect with the DHH community in social VR. Six participants used social platforms for professional purposes, such as hosting VR meetups (H-P1), promoting rights for PWD as accessibility activists (H-P1), and using social VR as an education platform (e.g., M-P1, V-P7, H-P4). Notably, three participants (H-P4, H-P6, H-P8) used social VR for American Sign Language (ASL) education. They learned or taught ASL in a VRChat community called "Helping Hands" [84], which was a community for DHH people to communicate via sign language (see details about ASL in VR in

4.2.2 Disability Disclosure via Avatars. Our study indicated that disability disclosure via avatars was an essential strategy of self-presentation for PWD. We identified participants' different disability disclosure preferences in social VR.

Reflect one's physical self. The majority of our participants (17 out of 19) designed avatars to reflect their physical appearances in real life, including both facial features and outfits. Some participants even hoped to craft the fine details (e.g., makeup, accessories) of the avatars to show their daily styles, habits, and values in real life.

As part of their physical appearance, eight participants (e.g., H-P3, H-P7, V-P1) expressed their willingness to disclose their disability via their avatars since they believed that the disability "is part of me" (H-P3). As H-P7 indicated, "I have [a cochlear implant] on [my avatar] all the time really, just because that's what I do in real life. I like my avatar to represent me as realistic as possible or as close to [myself], so if I have a cochlear implant I'm not ashamed of it." Two participants (M-P1, H-P8) specifically emphasized that when they disclosed their disabilities on social platforms, they never aimed to highlight their disability, since the disability was just like other physical features, such as hairstyle, skin color, and gender. As M-P1 stated, "Because that's who I am. I don't see any reason to not disclose [my disability]. It's a part of who I am. And I wouldn't like try to pretend that I was a different race or try to pretend that I was not female cisgender kind of person, so I wouldn't pretend that I didn't have a disability."

Interestingly, although most participants wanted their avatars to reflect their physical selves as much as possible, H-P2 did not want the details to be completely the same: "I just tried to make it look kind of like me, but not too much like me, because it's kind of creepy when they look like twins. [The avatars] don't ever look completely like you, especially the hair, you cannot replicate someone's hair [with the current VR technology]." With the limitation of current avatar realism and the Uncanny Valley effect², she discussed the boundary between the physical self in real life and the virtual image presented by avatars in the virtual world: "I don't want [avatars] to be too much like me, because it's not real life. Don't try to make it be real life when it's not like [real life]."

Reflect one's physical self with selective disabilities. Unlike most participants, who preferred disclosing their disabilities entirely, M-P1 selectively disclosed her disability. M-P1 experienced both visual and hearing loss, and she decided which disability to disclose based on the visibility of the disabilities. She used both a white cane and hearing aids in real life. However, she would only add a white cane to her avatar to signify her visual impairment, but not disclose her hearing loss. This is because her hearing aids were not visible to people in the real world in most cases. Interestingly, M-P1's consideration of disability visibility was to fulfill the expectation of the audience, specifically people who knew her in real life. As she explained,

"I do use hearing aids, but I don't think that they are visible particularly... so I was like, well I am not going to add a hearing aid, because a lot of people probably don't even know that I use [hearing aids], because I don't think they're that visible, whereas people would know that I use a cane. I was trying to make [my avatar] look like me, and

² Uncanny Valley effect refers to viewers' increased eerie feelings when an entity looks
Section 4.2.4). highly human-like [35].
Avatar Diversity and the Self-presentation of People with Disabilities in Social VR ASSETS ’22, October 23–26, 2022, Athens, Greece
Table 1: Participants' demographics and social VR experiences. * indicates avatar platforms demonstrated by our participants.

ID   | Age/Sex  | Self-reported Disability                             | Assistive Tech Used                  | Social VR Experience | Social VR Platforms Used
H-P1 | 52/F     | Profound deaf since birth                            | Cochlear implant                     | 12 months            | VRChat, AltspaceVR, Spatial, Mozilla Hubs, Rec Room, Meta Avatars*
H-P2 | 20/F     | Severe deaf in left ear                              | Hearing aid                          | Multiple times       | A social VR game, Snapchat*
H-P3 | 20/F     | 25% hearing loss in right ear                        | Hearing aid                          | Tried once           | VR Exhibition at a museum, Snapchat*, Meta Avatars*, Sims*, Memoji*
H-P4 | 23/Trans | Moderate hearing loss                                | Hearing aids                         | 24 months            | VRChat*
H-P5 | 29/F     | Moderate hearing loss                                | Hearing aids                         | Multiple times       | VRChat, Rec Room, Snapchat*, Meta Avatars*, Roblox*
H-P6 | 25/M     | Profound deaf                                        | N/A                                  | 12 months            | VRChat*, SteamVR
H-P7 | 30/M     | Profound deaf                                        | Cochlear implant                     | 12 months            | VRChat, Rec Room*, Lost Horizon, Meta Avatars*
H-P8 | 25/M     | Deaf                                                 | Hearing aids                         | 24 months            | VRChat*, SteamVR
H-P9 | 29/F     | 80% hearing loss                                     | Hearing aids                         | Multiple times       | A social VR app to play games with others, Snapchat*
V-P1 | 28/M     | Blind                                                | White cane                           | Few times            | Horizon Series, Meta Avatars*
V-P2 | 31/F     | Residual vision in one eye                           | N/A                                  | 2 months             | Cardboard, Meta Avatars*
V-P3 | 26/M     | Loss of side vision                                  | N/A                                  | 24 months            | BeanVR, Snapchat*
V-P4 | 38/M     | Blind                                                | White cane, guide dog                | Multiple times       | VRChat, Rec Room, Oculus Venues, Xbox*
V-P5 | 32/F     | Only have vision in one eye                          | White cane, guide dog                | N/A                  | Snapchat*
V-P6 | 25/F     | Blind                                                | White cane                           | N/A                  | Twitter avatar*
V-P7 | 49/F     | Blind                                                | White cane                           | Multiple times       | A social VR app with educational purpose, several social VR games, Meta Avatars*
V-P8 | 53/F     | Peripheral vision loss                               | White cane, guide dog                | N/A                  | Meta Avatars*
M-P1 | 58/F     | Blind since birth; hearing disability since recently | Hearing aids, white cane, guide dog  | Few months           | A social VR app for education, Meta Avatars*
M-P2 | 38/M     | Blind; has a prosthetic leg                          | White cane                           | Multiple times       | Rec Room*
I think that if I had hearing aids, a lot of people would be like, why do you have hearing aids, so why did you do that."

Moreover, M-P1's experience with different disabilities also influenced her choice of disability disclosure. She preferred to disclose only the "dominant disability" that could best represent her: "I have had the hearing disability for a lot less time than the [blindness], I mean I've been blind all my life, so pretty like used to [blindness]. I think that maybe if I had grown up hard of hearing or deaf that would be more important to represent for me."

Reflect changes of the physical self. Notably, participants who experienced acquired and progressive disabilities mentioned that they wanted to use their avatars to reflect their physical changes and inform people about their current abilities. For example, H-P2, who suddenly experienced one-sided hearing loss at the age of 20, emphasized her need to inform people of her current hearing ability via her avatar: "Because people don't know [my acquired disability], especially when [my hearing loss] happens so suddenly to me. Literally people are like 'Oh, you can still hear out with one ear, you're just fine.' But it's like, I have no ability to locate a noise. When there's any sort of background noise, I can't hear anything that you say. The whole world is just like quieter. People don't realize that."

Present a capable self. Uniquely for PWD, some participants wanted to present a capable self via avatars, demonstrating that they were as capable as people without disabilities in accomplishing tasks. However, participants employed divergent strategies to present their capabilities.

Four participants (H-P4, V-P3, V-P4, M-P2) chose to hide their disabilities in avatars so that their disabilities would not bias others and overshadow their capability in social interactions. For example, V-P3, who played multiplayer games in BeanVR, preferred not to disclose his visual disability because he was afraid of being judged as a weak player. As V-P3 described, "Sometimes there are challenges, when you are trying to make friends, people may like disregard your friend request because [your] visual impairment. Sometimes you have to keep it a secret. I was like looking for teammate on BeanVR, so that you can join a game, nobody wanted to be my teammates at the end because maybe my weakness."
ASSETS ’22, October 23–26, 2022, Athens, Greece Zhang et al.
Conversely, some participants (V-P1, V-P5) crafted their avatars to show their disabilities and used them as a way to prove their capability and independence in using this new technology. As V-P1 said: "I see that avatar can be a symbol of hope, a symbol of I can do this. For me, disability is not the end. I like to overcome my disability and show others that I can overcome it. I've thought of ways to overcome every obstacle. So, by [customizing my avatars], [I show that] every person with disability can become independent."

Present a professional self for disability education and awareness. Nine participants (4 DHH and 4 VI participants, and M-P1) disclosed their disabilities to trigger conversations and educate the public about disability and equity. As H-P7 expressed, "I think if people notice [the disability feature of my avatar], it can spark an interesting discussion about oh look they're wearing a cochlear implant, what is that." H-P1 further emphasized the importance of educating people about DHH people: "Avatars [with disabilities] were kind of cool, because that's me educating people that deafness is a spectrum, we don't offer in buckets."

V-P7 shared her disability to advocate for social justice. As V-P7 said, "Part of the reason [of disclosing my disability] is that, I am an advocate for social justice and all marginalized communities, and so I want people to know that I am an ally, and I am here to support and work for equitable opportunities for all individuals regardless of what their different diversity might be, and so having my disability out there on social media gives people a little more insight into me as a person."

Disclose disability selectively based on social contexts. Some participants decided whether to disclose their disabilities based on specific social circumstances. Seven participants (three DHH and four VI participants) considered the audience. Some preferred disclosing their disabilities only to audiences they knew in real life. For example, H-P1 did not want to disclose her disability to random strangers in social VR: "I think the only situation I can think of where I may not disclose is when there are other strangers, too many strangers. In a setting where you can just show up, and a stranger can come run after you, I would not want to [disclose my disability]."

V-P2 emphasized that her disability was private. She would not share information about her disability on every occasion. Instead, she would assess the situation and audience, then make the decision accordingly. As she indicated, "Well, in general, I am a private person. I don't like too many people knowing about my disability, because they haven't been through what I've been through, they judge before they even know. I am a really sensitive person, so somebody says that to me may not be hurtful to them, [but hurtful to me]."

Present a different self that is not defined by disability. While the majority of participants were willing to disclose their disabilities via avatars, three participants (H-P4, V-P4, and M-P2) held the opposite attitude. H-P4 preferred not disclosing her disability at all because she did not want her disability to become her representative identity and to overshadow her personality when socializing with others: "I don't like to disclose [my disability], because I don't want that to be the initial impression that people have, that this person is deaf. I just don't feel a need to say anything unless I have to. Because it might just be a personal thing, and I just don't want that to be characteristically associated. It's not a bad thing. But I've been that way (not disclosing disability unless necessary) all my life, not just in VR."

Moreover, some participants (H-P4, V-P4, and M-P2) viewed social VR as a world where they could express themselves freely and explore an ideal self that was different from the real world. As V-P4 said, "Because I feel like in virtual reality, your disability really shouldn't matter because it's virtual reality. And virtual reality you're not hampered or hindered or shouldn't be any way. So my case of blindness, plus a prosthetic leg, should not hold me back in virtual reality."

4.2.3 Avatar Customization for Disability Disclosure. While most participants indicated a strong desire to be able to show their disabilities via avatar design, they experienced various difficulties in practicing disability disclosure via avatars. In this section, we reveal participants' challenges in disability disclosure via avatars, the strategies they adopted, and their desired features to support disability representation.

Challenges in disclosing disability via avatars. We identified four main challenges that participants encountered when disclosing disability via avatars.

First, almost all participants (18 out of 19) complained about the lack of disability representations in avatar design. Seven participants mentioned not knowing or not being able to find any avatar features that represented disabilities on the social platforms they used. While some platforms supported limited disability features (e.g., hearing aids and cochlear implants in Meta Avatars), the choices and customization options were limited. For example, H-P2 mentioned that Snapchat only provided hearing aids for avatars, but people who used cochlear implants did not want to use them to represent their disabilities. Due to the lack of disability features, V-P5 used regular avatar decorations (i.e., sunglasses) to present her low vision. However, these could not clearly signify her disability, since people without disabilities also use them. As V-P5 said, "My avatar has sunglasses [since I'm light sensitive], but this doesn't really make sense because anybody can wear sunglasses. I honestly do wish that there is more representation of people with disabilities."

V-P4 further emphasized the importance of avatar diversity for disabilities by comparing it to other minority groups: "I think it's a very good idea to have [disability-related] features, I might even use them. Everyone's talking about diversity and it's all about race and gender. But one common thing that gets left out is disability. I think that you should have the option to represent, whether you want to use that or not it's up to you, but it never hurts to have."

Second, current disability signifiers for avatars were not always compatible with other avatar features. For example, H-P1 tried to put a cochlear implant on her avatar but ended up not using it since it was blocked by the avatar's hair: "The [Meta Avatars] do have a cochlear implant you can add. But you can't see it, because my hair kind of covered it. I tried to spare [the hair] in a pony tail to see if you could see the cochlear implants. But you couldn't, so I put it back down."

Moreover, the unrealistic size of the virtual hearing devices also prevented participants from applying them to their avatars. H-P5 explained her experience in Roblox: "The hearing aids are larger than the character, so if you try to put them on, it's seriously like
trying to put Barbie clothes on like a Little Tikes doll. They're just wrong, they're just floating."

Methods of disclosing disabilities via avatars. Participants mostly disclosed their disabilities by accessorizing their avatars with the assistive devices they used in real life. Three DHH participants added cochlear implants or hearing aids to their avatars (H-P1 and H-P7 on Meta Avatars, H-P3 on Snapchat). For VI participants, there were no avatar features designed to represent visual impairments on current social VR platforms. Two VI participants (V-P2 and V-P5) thus put dark glasses on their avatars to signify their visual impairments. "My avatar has sunglasses because when I go out, I always have sunglasses with me, because I'm light sensitive. So that is why my avatar is wearing sunglasses." (V-P5)

Moreover, H-P6 added an ID badge and a mini-status to his avatars to indicate his disability. As he described, "You can make your own ID badge to explain that you're Deaf or/and Hard of Hearing and have it on your avatar so you can let anyone read it and learn about you. And plus I have a little mini status that would show above my avatar head [if you enabled it in the menu], it'll say 'I'm Deaf :D.'" (Fig. 2A)

Figure 2: A. H-P6's mini-status in VRChat; B. H-P6's sign avatar with black-outlined hands and ID Badge

Alternative approaches to disclosing disabilities. Besides avatars, participants (nine VI and seven DHH participants) used other methods, such as bios, posts, photos, emoji, and YouTube videos, to present their disabilities on social media.

Compared to 2D social platforms, we found that many participants (six DHH and two VI) were more willing to disclose their disabilities via 3D avatars in social VR. The embodied experience made participants feel more comfortable disclosing their disabilities. In contrast, some participants (H-P4, H-P5) felt that expressing their disabilities in a less embodied way (e.g., via text on social media) highlighted their disabilities rather than representing themselves. As H-P5 noted, "The hearing aids are physical reality for me, so I just have them on [my avatar]. I wouldn't want to put it in text. I wouldn't want it to be [highlighted]. While [the hearing aid] is visible it feels more subtle [than disclosing my disability in text]." Meanwhile, some participants felt that the visual presentation of avatars was more noticeable than other mediums. For example, H-P2 claimed that the avatar would be "the first thing" that people see, and H-P7 added that "it's the easiest way of [disclosing my disability]. I don't think I would put it in like text information like oh I'm deaf. Because I think most of the time well, probably no one reads it."

In addition, H-P7 considered social VR to be a safer place to disclose disability since he could easily move away from toxic conversations. As he indicated, "the freedom of moving around [with avatars] makes you get the power to walk away [from offensive comments], you have a bit of control that you wouldn't have in social media."

However, some VI participants preferred using text-based methods to disclose their disabilities rather than avatars, since avatars were visual and not accessible. For instance, V-P7 wrote posts or articles to reveal her disability to the public. Five VI participants also used emojis (e.g., ) in their posts to present their disabilities. For instance, M-P1 utilized disability-related emojis in her everyday communications. As she mentioned, "Well, I didn't create [the emoji], but I like that [they] are already on the iPhone. There is a person with a white cane or whatever. I have used those before to represent me, if I'm writing something, posting a message, or something."

Desired Disability Representations in Avatar Design. Participants suggested various ways to facilitate disability representation in avatars. Most participants hoped that avatar interfaces could include more assistive devices for PWD (e.g., prosthetic limbs, wheelchairs, white canes). While some platforms already provided hearing aids and cochlear implants, DHH participants wanted more options presenting different details. For example, they wanted hearing aids with different colors, brands, and wearing methods (e.g., one side and both sides), so that users could freely choose their preferred appearance of assistive devices and feel more realistic. For VI participants, the two main desired features were the white cane and guide dog, which are commonly used by VI people in real life. Beyond simply incorporating assistive devices, participants also expressed the need to communicate their personalities through the assistive devices they used. Two participants (V-P1 and V-P6) desired to decorate their avatars' canes. As V-P6 explained, "[The cane] not only ties in with my disability, but like my person-hood as well around and through that disability." Moreover, M-P1, who used Braille, wished to have jewelry with Braille on it, which represented "her culture as a person with a disability."

Besides accessories, H-P1 emphasized the need for a customizable avatar body to accurately reflect PWD's physical appearance, including their disabilities. She used the Helen Keller doll as an example: "When the Helen Keller doll came out, Haben Girma, who's a very famous blind person, was not happy with that because [the doll's] eyes were proportional. They were the same. But Helen Keller in real life, her eyes are not. I mean, even Haben's eyes are not perfectly proportioned. So maybe that would be something worth improving, is to give people the ability to customize the eyes differently. Some people may want to be as accurate as possible."

However, one participant (H-P8) did not support providing disability features for avatars. He was concerned that these features could be abused by people without disabilities and suggested verifying the authenticity of one's disability before a user gains access to these features. As he described, "From my experience in VRChat, there are a lot of people who have faked disabilities. Do not trust anyone even your new online friends that you have met recently. I don't think it's necessary to show any disabilities on the avatars.
People have to show proof that they truly have disabilities. For example, I won't hold back and sign fast as possible to fake deaf people. They will suddenly show their ugly true colors."

4.2.4 Specialized Avatars and VR-ASL for the Deaf Community. Notably, three people in the Deaf community³ (H-P4, H-P6, H-P8) reported using specialized avatars to sign in social VR, specifically in VRChat (Fig. 2B). All three participants were members of the "Helping Hands" community in VRChat, where they were either learning or teaching American Sign Language (ASL). Instead of typical ASL, they used VR-ASL, a simplified version of ASL in VR (Fig. 3). We report participants' experiences with the specialized avatars and VR-ASL.

VR-ASL. ASL in real life involves rich movements of fingers, body gestures, and facial expressions. However, not all signs can be recognized in VR due to its limited tracking capabilities. Thus, Deaf users simplified or adjusted some signs so that they could be captured by current VR controllers, which resulted in VR-ASL (Fig. 3A). It is worth noting that all three participants who used VR-ASL adopted the Valve Index Knuckles controllers, which can capture more hand gestures and finger movements than other VR controllers [39].

Specialized sign avatars. To enable people to better perceive the signs, participants used specialized sign avatars in VRChat. The sign avatars had three unique characteristics. First, their hands usually had a skin color with high contrast to the avatars' clothes, which increased the visibility of the hand gestures. As H-P4 mentioned, "Oftentimes, people who use sign language, if their [avatar] skin is white, it's easier to see and contrast [with dark shirts color], so that helps people visually understand what you're saying." Second, many sign avatars had black-outlined hands to further enhance their visibility (Fig. 2B). Last, sign avatars were re-programmed to support richer gestures, so that users could perform more sophisticated signs in social VR (see the creation of specialized avatars below). For example, H-P4's avatar could perform the hand gesture that signified the letter "y" in ASL (Fig. 3B), while with the same controller input, typical avatars would only pose a "rock and roll" gesture (Fig. 3C) that did not represent any sign letters.

Creation of specialized sign avatars. Participants used different ways to acquire, create, and customize sign avatars. H-P4 usually imported publicly available avatars from VRChat into Unity and re-programmed them to support more hand gestures. H-P4 explained the detailed process of her avatar augmentation in Unity: "I was using custom animations, and I was able to modify the controller inputs to play animations, like I might be able to do 'e' hand or flat 'o' hand, otherwise you won't be able to have it. Here is a little chart with all the different combinations you can do (Fig. 4A), and I was able to modify the game to which I can have access to even more [gestures] with the combination." H-P8 also modified the avatars in VRChat, but he asked his friends to do it for him due to technical barriers (see Section 4.2.5).

Instead of creating his own avatars, H-P6, a teacher who taught VR-ASL in Helping Hands, received his avatars as a gift from "A9," a professional creator of sign avatars. A9 owned a warehouse in VRChat showcasing all the sign avatars he created for VR-ASL (Fig. 3D). Members of Helping Hands can choose sign avatars from his warehouse.

Unique disability disclosure via sign avatars. All three participants who used VR-ASL preferred not to use any hearing devices on their avatars, because these assistive devices cannot distinguish Deaf people from deaf or hard of hearing people. Instead, the best way to present Deaf culture was to sign via their avatars. As such, the sign avatars and the use of VR-ASL became a unique way of disability disclosure for Deaf people. As H-P6 explained, "I can convey and gesture [my disability] to anyone I meet, 'Sorry, me deaf, me sign' and they get the gist just fine anyway."

4.2.5 Avatar Creation and Accessibility. Avatar creation and customization posed barriers to some participants, especially those with visual impairments. We describe the different challenges DHH and VI participants faced.

Barriers to DHH people. For DHH participants, the avatar customization interface in social VR was accessible and easy to use. Six DHH participants reflected that they did not encounter any difficulties because all avatar features and customization steps were visual and rarely involved any audio information.

However, the major challenge occurred when creating the specialized sign avatars with third-party platforms, since it required programming skills (Section 4.2.4). Both H-P4 and H-P8 reported experiencing technical issues and highlighted the lack of support in solving them. As H-P4 emphasized, "Oh so many times, you don't even want to believe, it's always difficult because when the issue is about avatar creation, as of now is that there isn't a lot of resources about it. There isn't a lot of tutorials that are either very relevant or high quality for creating avatars. So that has a lot to be progressed. That's like my biggest issue. I don't know how to do something."

Although some third-party platforms offered tutorials for avatar customization, most tutorials focused on audio instructions and were not accessible to DHH users. As H-P6 complained, "I think there were some visual examples and others, which was very helpful to learn about, but for most of it was about audio stuff which I, unfortunately, can't hear them. I feel like they haven't thought this out very well."

Barriers to VI people. Unlike DHH participants, VI participants faced more significant challenges when creating avatars and using VR platforms in general. Without the support of a screen reader, VR was not accessible at all to blind users. For example, M-P2 purchased an Oculus Quest but decided to abandon it due to accessibility issues. As he explained, "I have tried, the Oculus Quest, totally inaccessible, because it does not speak. I see it has no screen reading functionality in it. Because I have absolutely no use for this thing unless Meta includes screen reading technology and text-to-speech technology in this. Still, it's sitting in my closet in its original box, it's never even been opened."

For low vision participants, similar to Zhao et al.'s results [89], we found that information readability was a big challenge. The text on some avatar interfaces was too small. Moreover, some interfaces appeared at a fixed distance, preventing users from getting closer to see. As V-P4 said, "For avatar creation, they'll use that sort of flat screen style interface, but then the texts are not large enough, well then you either guess, or you just don't interact with it. So that's where the problem is."

³ The Deaf community "views themselves as a unique cultural and linguistic minority who use sign language as their primary language" [12].
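H-P4's workflow in Section 4.2.4 amounts to a lookup: each controller input combination the hardware can distinguish is remapped to a handshape animation. A minimal sketch of that idea in Python (the actual implementation would be C# animation overrides in Unity; the five-boolean pattern encoding and the handshape names here are hypothetical stand-ins for the combination chart in Fig. 4A, not VRChat's or Unity's real API):

```python
# Sketch of remapping controller input combinations to handshape
# animations, in the spirit of H-P4's modified VRChat avatars.
# Assumptions: a pattern is five booleans (thumb..pinky, True =
# finger extended); handshape names are illustrative only.

from typing import Dict, Tuple

FingerPattern = Tuple[bool, bool, bool, bool, bool]

# A real chart (cf. Fig. 4A) would enumerate every pattern the
# controller can distinguish; two example entries:
HANDSHAPE_MAP: Dict[FingerPattern, str] = {
    (True, False, False, False, True): "y",       # thumb + pinky -> ASL "y"
    (True, True, True, True, True): "open_hand",  # all fingers extended
}

def handshape_for(pattern: FingerPattern) -> str:
    """Return the animation name for a pattern, falling back to the
    platform's stock pose when no custom mapping exists."""
    return HANDSHAPE_MAP.get(pattern, "default")
```

In this framing, "re-programming" an avatar means growing the map beyond the handful of stock poses a platform ships with, and hardware such as the Valve Index Knuckles matters because it can distinguish more finger patterns to serve as keys.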
Figure 3: A. Signing the alphabet using ASL vs. VR-ASL; B. H-P4's avatar signing the "y" hand; C. the "Rock and Roll" gesture; D. Warehouse with sign avatars created by "A9".

Figure 4: Creation of specialized sign avatars: A. Oculus Hand Chart; B. Unity interface for modifying hand gestures
Avatar creation on social media. Compared to VR, avatar creation on conventional social media was more accessible to VI participants, since they could access the interface with a phone or computer that incorporated a screen reader. Seven VI participants had experience creating avatars on social media, such as Instagram and Snapchat. However, VI participants still faced various challenges.

The biggest issue was the lack of sufficient descriptions across the whole avatar design process. When designing avatars, participants (e.g., V-P1, V-P7) had difficulty understanding each avatar feature, tracking their design progress, and confirming the final outcome. Without suitable descriptions and notifications, the avatars created by VI participants (e.g., V-P7, V-P8) did not match their desired avatar appearances. As V-P7 indicated, "Meta has a little bit of description, but not everything has a description. So the first [avatar] I made, [my friends] said it was terrible, didn't look anything like me. I didn't choose the right face, I didn't choose the right color skin tone. People told me the outfit I chose was ugly. I didn't get a lot of positive feedback." Since participants could not see and confirm the final look of their avatars, they did not build strong connections with the avatars and thus did not care much about them: "When I was designing [my avatar], I did have to have my daughter help. It wasn't very accessible and so really it doesn't mean a lot to me because I'm not completely invested in it, because I can't confirm that I like it" (V-P7).

Moreover, the avatar customization interfaces usually offered a long list of options, which was difficult for VI users to navigate with screen readers. As V-P6 said, "I think there were so many buttons to flick through, for example, I would choose my hair color and it will be like 60 of them. Those processes [pose] accessibility hazards. I am good with technology, but I think that someone who was not so good with technology would get really overwhelmed with that."

5 DISCUSSION

Our research has contributed the first exploration of avatar diversity and PWD's self-presentation in social VR. We answer the three research questions proposed in the Introduction. Firstly, our systematic review in Study I highlighted the lack of disability representation in avatar design on mainstream social VR platforms. Only the Meta Avatars (and the platforms that adopted the Meta Avatars) provided disability-related features, but the features were restricted to DHH people (RQ1). Secondly, our in-depth interviews in Study
II indicated that PWD employed a spectrum of self-presentation strategies via avatars in social VR (see details in Section 5.1), and the majority of them showed strong preferences for disclosing their disabilities via avatars by adding the assistive tools they used in real life (RQ2). We also uncovered the major differences between DHH and VI participants in avatar perception (Section 5.2). Lastly, the avatar creation and customization process posed barriers to PWD, especially VI participants (RQ3). In this section, we discuss the unique self-presentation strategies of PWD through the lens of disability disclosure, the avatar perception differences between DHH and VI users, and the design considerations we derive to inspire more accessible and inclusive avatar design.

5.1 Disability Disclosure via Avatars in Social VR

Our research identified some self-presentation patterns similar to those observed in people without disabilities. We confirmed that most people regarded the embodied avatars as themselves and designed avatars to reflect their physical selves [25], and that people may want their avatars to reveal a different self in online games rather than their physical selves [42].

Beyond the insights from prior research, our research uncovered PWD's unique avatar perceptions and self-presentation strategies in social VR. We found that PWD managed their disability disclosure to craft their self-images in social VR. Instead of a binary switch between disclosing and not disclosing [16], PWD adopted a spectrum of strategies to determine to what extent and from what aspect they would like to disclose their disability to shape their avatar figures. Some participants saw disability as part of their physical selves and wanted to reflect their disability as accurately as possible; some selectively disclosed a certain aspect of

...displaying their disabilities via avatars in social VR was more natural. The support of embodied interaction further enhanced PWD's engagement and attachment. Our Deaf participants could thus communicate in the manner they preferred in real life, ASL, and they even created VR-ASL to accommodate the limited capabilities of current VR technology. Our findings echoed the Embodied Social Presence Theory [58], which holds that embodied avatars and shared virtual spaces and activities can affect user perception and bring users to a higher level of engagement; we further expanded this theory by providing evidence from the disability perspective.

5.2 Avatar Perception of People with Different Disabilities

Our findings uncovered DHH and VI people's different avatar experiences and perceptions. Given the visual-driven nature of current VR technology, it is not surprising that VI people face tremendously more challenges than DHH people when designing and using avatars. Our study showed that DHH users had much more substantive avatar experiences than VI users. DHH users had already started forming communities in social VR (e.g., "Helping Hands"), whereas the majority of VI participants had only tried social VR a few times. Due to these distinct avatar experiences, DHH and VI people perceived avatars differently. Since VI users cannot see or even imagine the appearance of their avatars, they had weak attachment to their avatars, and some ended up not caring about their avatars at all in social VR. As V-P4 said, "[Avatar design] really is just for fun, I mean it's just kind of seeing what types of appearances, or what types of features that the avatars have. I really don't necessarily care about avatars all that much." In contrast, DHH people showed stronger attachment to their avatars and spent more effort customizing, specializing, and even re-programming their avatars for better self-presentation and communication.
their disabilities (e.g., the more visible disability, or the disability
While this research focused only on DHH and VI users, our
with stronger personal attachment) to signify their major disability
fndings indicated that people with diferent disabilities may have
identity; and some participants with acquired or progressive dis-
diferent avatar perception and self-presentation preferences. For
abilities employed avatars as a nuanced way to indicate their ability
example, compared to people who have visible disabilities (e.g., a
changes. Moreover, PWD disclosed their disabilities via avatars
person with an amputation) or who use apparent assistive tech-
to convey certain signals to the audience. For example, some par-
nologies (e.g., a VI person who has a cane), people with invisible
ticipants designed avatars with disability features to demonstrate
conditions and using invisible assistive technologies (e.g., a person
their capability and independence, while some participants used
who experiences chronic pain and uses a trigger tracker) may be
their avatars to increase disability awareness and advocate for di-
more reluctant to disclose their disabilities in social VR. Future
versity and equity. These patterns demonstrated the importance
research should consider other disabilities and explore diferent
of disability disclosure for PWD in self-presentation, suggesting
factors that may afect people’s disability disclosure decisions, such
the necessity of granting PWD sufcient fexibility to control their
as visibility of disability and visibility of assistive technology [23].
disability disclosure in the avatar creation process.
Our research also highlighted PWD’s diferent disability disclo-
sure preferences from other online social platforms (e.g., social
5.3 Design Implications for Avatar Diversity
media, virtual worlds). While avatars were also supported in some and Accessibility
2D/desktop-based social platforms (e.g., Instagram, Second Life), 5.3.1 Avatar Diversity. We drew three design implications to
PWD generally did not feel attached to their avatars due to the lack promote disability representations for avatars.
of embodiment, and many tended to hide their disabilities by using Ofer more assistive device representations. Most partici-
non-human avatars [16, 75]. In contrast, our study suggested that pants saw their assistive devices as the key signifer of their disabili-
the embodied nature of social VR enabled people to build strong ties and preferred adding these devices to their avatars for disability
attachment with their avatars, making them more willing to refect representation. Participants suggested various commonly-used as-
their disabilities in their avatars. Some participants (H-P4, H-P5) sistive technologies that can represent their disabilities, including
mentioned that they did not want to disclose their disabilities on a white cane and guide dog for VI people, cochlear implants and
social media since it felt like “showing of” their disabilities, but hearing devices for DHH people, and prosthetic limbs, wheelchair,
Avatar Diversity and the Self-presentation of People with Disabilities in Social VR ASSETS ’22, October 23–26, 2022, Athens, Greece

and walking aids for people with motor disabilities. Besides the basic assistive technology representations, participants wanted to further personalize their virtual assistive devices to reflect their aesthetics and personalities, for example by customizing the color and the brand. Adding these features to mainstream avatar systems would empower PWD to express and present themselves freely and equally in social VR. Designers should involve PWD throughout the whole avatar design process to ensure the suitability of the disability representations.

Support representations for people with invisible disabilities. Compared to visible disabilities, some invisible disabilities cannot be easily expressed, especially when no apparent assistive technologies are used to signify the disability. In our study, participants suggested using accessories, such as a necklace with Braille or an arm badge with their community logo, to present their disabilities. We suggest that designers consider such indirect mediums, for example outfits with disability signifiers (e.g., a T-shirt with the Autism Awareness Puzzle logo), to enable people with invisible disabilities to better manage their disability visibility in social VR.

Guarantee appropriate use of diversity features. Adding more avatar diversity features may also pose risks. Our participants indicated concerns about the inappropriate use of these features by people without disabilities. The abuse of diversity features can lead to cyberbullying and increase public misconceptions of disability. One participant (H-P8) further suggested strictly verifying the authenticity of one’s disability before granting access to the diversity features. How to guarantee the proper use of the diversity features is a vital issue that commercial social VR platforms should consider from both design and policy perspectives. Researchers and designers should investigate how to set up suitable rules that support interaction freedom without marginalizing or harming vulnerable populations on these new and emerging social platforms.

5.3.2 Avatar Accessibility. Our findings highlighted the difficulties faced by VI and DHH people when designing avatars. This echoed the general VR accessibility issues revealed by prior research [10, 47, 61, 89], and also extended those problems to a new perspective: avatar creation. We drew three design implications to enhance avatar accessibility.

Combine avatar automation with fine-grained customization. Avatar creation is not accessible to VI participants due to its heavily visual-driven nature and the complex steps to navigate. Our blind participants usually needed external human assistance in customizing their avatars. Instead of crafting avatars from scratch, some participants suggested automating the avatar generation process (M-P1, V-P5): the avatar system should automatically generate a baseline avatar based on the user’s photo and allow the user to adjust details based on their preferences. While not new, this feature is supported by a very limited number of social VR platforms; in our application review (Study I), only three out of 15 commercial social VR platforms supported avatar automation. We suggest that designers consider enabling multiple avatar creation methods, both auto generation and manual adjustment, to enhance the accessibility of avatar creation and customization for people with diverse abilities.

Convey avatar design outcomes for VI users. VI users rely on alternative text to perceive graphic information. Our research highlighted the lack of, and the need for, alternative text and screen reader support for avatars in social VR. While the avatar design systems of conventional social media (e.g., Facebook, Snapchat) can be accessed by VI users via alternative text and embedded screen readers, participants reported challenges in understanding their customization progress and final outcomes, which led to their emotional detachment from the avatars. Artificial intelligence (AI) technology could be considered to recognize the avatars and generate semantic confirmations about the holistic avatar customization outcome, such as whether the avatar looks like the user. However, this could be technically and ethically challenging, since categorizing and recognizing human facial features is difficult in both the technology [15, 67, 73] and sociology [19, 33] fields. For example, an algorithm may mis-recognize particular avatar or user faces and report inappropriate results. Moreover, how to suitably label and describe human appearance, especially for marginalized groups, is another important question to consider [4, 33, 40]. Besides refining the algorithms and datasets from the computer vision perspective [77], HCI solutions could also be adopted, for example, indicating the potential inaccuracy of the recognized results to users [54, 91], or leveraging human-AI collaboration to achieve more reliable results [32, 51].

Add closed captions for all audio information. Although the avatar creation and customization process is visual-driven, adding closed captions would still be valuable, especially for DHH users. We believe that, with proper caption positioning, closed captions would have a significantly positive impact on the social VR ecosystem.

6 LIMITATIONS AND FUTURE WORK

In this paper, we studied the avatar experiences and perceptions of PWD on social VR platforms. We focused on two disability groups: DHH and VI people. Future work should take into account more diverse types of disabilities to gain a more comprehensive understanding of PWD’s disability disclosure preferences, especially for people with invisible disabilities. Moreover, our application review study (Study I) focused on the Oculus Quest 2 platform. Although social VR applications mostly support the same avatar design and accessibility features across VR devices, we acknowledge that different VR devices may offer their own system-level avatars (e.g., Oculus offers the Meta Avatars), which may bring nuances to the review results. Thus, other mainstream VR devices should also be considered to achieve more comprehensive results. Finally, future work may complement our qualitative findings with quantitative analysis to investigate what factors may impact PWD’s disability disclosure behaviors, such as gender, age, and disability visibility.

ACKNOWLEDGMENTS

We thank the National Federation of the Blind for helping us recruit for our study, as well as the anonymous participants who provided their perspectives.

REFERENCES

[1] Dominic Abrams and Michael A Hogg. 2004. Metatheory: Lessons from social identity research. Personality and Social Psychology Review 8, 2 (2004), 98–106.
[2] Alvirmin. 2021. 7 Best Social Networking Apps in VR for Oculus Quest 2. https://allvirtualreality.com/review/best-social-networking-apps-vr-oculus-quest.html last accessed 5 July 2022.
[3] Jane Bailey, Valerie Steeves, Jacquelyn Burkell, and Priscilla Regan. 2013. Negotiating with gender stereotypes on social networking sites: From “bicycle face” to Facebook. Journal of Communication Inquiry 37, 2 (2013), 91–112.
[4] Cynthia L Bennett, Cole Gleason, Morgan Klaus Scheuerman, Jeffrey P Bigham, Anhong Guo, and Alexandra To. 2021. “It’s Complicated”: Negotiating Accessibility and (Mis)Representation in Image Descriptions of Race, Gender, and Disability. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–19.
[5] Katherine Bessière, A Fleming Seay, and Sara Kiesler. 2007. The ideal elf: Identity exploration in World of Warcraft. Cyberpsychology & Behavior 10, 4 (2007), 530–535.
[6] Tom Boellstorff. 2019. The Ability of Place: Digital Topographies of the Virtual Human on Ethnographia Island. Current Anthropology 61 (08 2019), S109–S122.
[7] Natilene Bowker and Keith Tuffin. 2002. Disability discourses for online identities. Disability & Society 17, 3 (2002), 327–344.
[8] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[9] Justin Buss, Hayden Le, and Oliver L Haimson. 2022. Transgender identity management across social media platforms. Media, Culture & Society 44, 1 (2022), 22–38. https://doi.org/10.1177/01634437211027106
[10] Diane Carr. 2010. Constructing disability in online worlds: conceptualising disability in online research. London Review of Education (03 2010), 51–61.
[11] Justine Cassell. 1998. Chess for girls?: Feminism and computer games. (1998).
[12] National Deaf Center. 2021. The Deaf Community: An Introduction.
[13] Victoria Clarke, Virginia Braun, and Nikki Hayfield. 2015. Thematic analysis. Qualitative Psychology: A Practical Guide to Research Methods 222 (2015), 248.
[14] Paul C Cozby. 1973. Effects of density, activity, and personality on environmental preferences. Journal of Research in Personality 7, 1 (1973), 45–60.
[15] Chirag Dalvi, Manish Rathod, Shruti Patil, Shilpa Gite, and Ketan Kotecha. 2021. A Survey of AI-Based Facial Emotion Recognition: Features, ML & DL Techniques, Age-Wise Datasets and Future Directions. IEEE Access 9 (2021), 165806–165840.
[16] Donna Z Davis and Shelby Stanovsek. 2021. The machine as an extension of the body: When identity, immersion, and interactive design serve as both resource and limitation for the disabled. Human-Machine Communication 2 (2021), 121–135.
[17] Joan Morris DiMicco and David R Millen. 2007. Identity management: multiple presentations of self in facebook. In Proceedings of the 2007 International ACM Conference on Supporting Group Work. 383–386.
[18] Nicolas Ducheneaut, Ming-Hui Wen, Nicholas Yee, and Greg Wadley. 2009. Body and mind: a study of avatar personalization in three virtual worlds. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 1151–1160.
[19] Yarrow Dunham, Elena V. Stepanova, Ron Dotsch, and Alexander Todorov. 2015. The development of race-based perceptual categorization: skin color dominates early category judgments. Developmental Science 18, 3 (2015), 469–483.
[20] Allison Eden, Erin Maloney, and Nicholas David Bowman. 2010. Gender attribution in online video games. Journal of Media Psychology: Theories, Methods, and Applications 22, 3 (2010), 114.
[21] Blizzard Entertainment. 2021. World of Warcraft. https://worldofwarcraft.com/en-us/ last accessed 5 July 2022.
[22] Shelly D Farnham and Elizabeth F Churchill. 2011. Faceted identity, faceted lives: social and technical issues with being yourself online. In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work. 359–368.
[23] Heather A. Faucett, Kate E. Ringland, Amanda L. L. Cullen, and Gillian R. Hayes. 2017. (In)Visibility in Disability and Assistive Technology. 10, 4 (2017), 17 pages.
[24] Katrina Fong and Raymond A Mar. 2015. What does my avatar say about me? Inferring personality from avatars. Personality and Social Psychology Bulletin 41, 2 (2015), 237–249.
[25] Guo Freeman and Divine Maloney. 2021. Body, avatar, and me: The presentation and perception of self in social virtual reality. Proceedings of the ACM on Human-Computer Interaction 4, CSCW3 (2021), 1–27.
[26] Guo Freeman, Samaneh Zamanifard, Divine Maloney, and Alexandra Adkins. 2020. My body, my avatar: How people perceive their avatars in social virtual reality. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. 1–8.
[27] James Paul Gee. 2003. What video games have to teach us about learning and literacy. Computers in Entertainment (CIE) 1, 1 (2003), 20–20.
[28] Kathrin Gerling, Kieran Hicks, Michael Kalyn, Adam Evans, and Conor Linehan. 2016. Designing Movement-Based Play With Young People Using Powered Wheelchairs. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. 4447–4458.
[29] Jennifer L Gibbs, Nicole B Ellison, and Rebecca D Heino. 2006. Self-presentation in online personals: The role of anticipated future interaction, self-disclosure, and perceived success in Internet dating. Communication Research 33, 2 (2006), 152–177.
[30] Erving Goffman. 1959. The presentation of self in everyday life. Bantam Doubleday Dell Publishing Group.
[31] Kathryn Greene, Valerian J Derlega, and Alicia Mathews. 2006. Self-disclosure in personal relationships. The Cambridge Handbook of Personal Relationships 409 (2006), 427.
[32] Anhong Guo, Anuraag Jain, Shomiron Ghose, Gierad Laput, Chris Harrison, and Jeffrey P Bigham. 2018. Crowd-ai camera sensing in the real world. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2, 3 (2018), 1–20.
[33] Margot Hanley, Solon Barocas, Karen Levy, Shiri Azenkot, and Helen Nissenbaum. 2021. Computer Vision and Conflicting Values: Describing People with Automated Alt Text. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. 543–554.
[34] Daniel Hepperle, Christian Felix Purps, Jonas Deuchler, and Matthias Wölfel. 2021. Aspects of visual avatar appearance: self-representation, display type, and uncanny valley. The Visual Computer (2021), 1–18.
[35] Daniel Hepperle, Hannah Ödell, and Matthias Wölfel. 2020. Differences in the Uncanny Valley between Head-Mounted Displays and Monitors. In 2020 International Conference on Cyberworlds (CW). 41–48. https://doi.org/10.1109/CW49994.2020.00014
[36] Tanner Higgin. 2009. Blackless fantasy: The disappearance of race in massively multiplayer online role-playing games. Games and Culture 4, 1 (2009), 3–26.
[37] Zaheer Hussain and Mark D Griffiths. 2008. Gender swapping and socializing in cyberspace: An exploratory study. CyberPsychology & Behavior 11, 1 (2008), 47–53.
[38] Nora F Huvelle, Milton Budoff, and Deidre Arnholz. 1984. To tell or not to tell: Disability disclosure and the job interview. Journal of Visual Impairment & Blindness 78, 6 (1984), 241–244.
[39] VALVE INDEX. 2022. Controllers. https://www.valvesoftware.com/en/index/controllers last accessed 11 April 2022.
[40] Emory James Edwards, Kyle Lewis Polster, Isabel Tuason, Emily Blank, Michael Gilbert, and Stacy Branham. 2021. "That’s in the Eye of the Beholder": Layers of Interpretation in Image Descriptions for Fictional Representations of People with Disabilities. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21). Article 19, 14 pages.
[41] Yasmin B Kafai, Deborah A Fields, and Melissa Cook. 2007. Your second selves: avatar designs and identity play in a teen virtual world. In Proceedings of DIGRA, Vol. 2007.
[42] Yasmin B Kafai, Deborah A Fields, and Melissa S Cook. 2010. Your second selves: Player-designed avatars. Games and Culture 5, 1 (2010), 23–42.
[43] Sanjay Kairam, Mike Brzozowski, David Huffaker, and Ed Chi. 2012. Talking in circles: selective sharing in google+. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 1065–1074.
[44] Bernadett Koles and Peter Nagy. 2012. Who is portrayed in Second Life: Dr. Jekyll or Mr. Hyde? The extent of congruence between real life and virtual identity. Journal For Virtual Worlds Research 5, 1 (2012).
[45] Anya Kolesnichenko, Joshua McVeigh-Schultz, and Katherine Isbister. 2019. Understanding emerging design practices for avatar systems in the commercial social vr ecology. In Proceedings of the 2019 on Designing Interactive Systems Conference. 241–252.
[46] Robin M Kowalski, Chad A Morgan, Kelan Drake-Lavelle, and Brooke Allison. 2016. Cyberbullying among college students with disabilities. Computers in Human Behavior 57 (2016), 416–427.
[47] Rachel L. Franz, Sasa Junuzovic, and Martez Mott. 2021. Nearmi: A Framework for Designing Point of Interest Techniques for VR Users with Limited Mobility. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–14.
[48] Linden Lab. 2013. 10 Years of Second Life. https://www.lindenlab.com/releases/infographic-10-years-of-second-life last accessed 4 Feb 2022.
[49] Jong-Eun Roselyn Lee. 2014. Does virtual diversity matter?: Effects of avatar-based diversity representation on willingness to express offline racial identity and avatar customization. Computers in Human Behavior 36 (2014), 190–197.
[50] Jong-Eun Roselyn Lee and Sung Gwan Park. 2011. “Whose second life is this?” How avatar-based racial cues shape ethno-racial minorities’ perception of virtual worlds. Cyberpsychology, Behavior, and Social Networking 14, 11 (2011), 637–642.
[51] Sooyeon Lee, Rui Yu, Jingyi Xie, Syed Masum Billah, and John M Carroll. 2022. Opportunities for human-AI collaboration in remote sighted assistance. In 27th International Conference on Intelligent User Interfaces. 63–78.
[52] Qiaoxi Liu and Anthony Steed. 2021. Social Virtual Reality Platform Comparison and Evaluation Using a Guided Group Walkthrough Method. Frontiers In Virtual Reality (2021), 1–15.
[53] Xiao Ma, Jeff Hancock, and Mor Naaman. 2016. Anonymity, Intimacy and Self-Disclosure in Social Media. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). Association for Computing Machinery, New York, NY, USA, 3857–3869. https://doi.org/10.1145/2858036.2858414
[54] Haley MacLeod, Cynthia L Bennett, Meredith Ringel Morris, and Edward Cutrell. 2017. Understanding blind people’s experiences with computer-generated captions of social media images. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 5988–5999.
[55] Divine Maloney, Samaneh Zamanifard, and Guo Freeman. 2020. Anonymity vs. familiarity: Self-disclosure and privacy in social virtual reality. In 26th ACM Symposium on Virtual Reality Software and Technology. 1–9.
[56] Tony Manninen and Tomi Kujanpää. 2007. The value of virtual assets: the role of game characters in MMOGs. International Journal of Business Science & Applied Management (IJBSAM) 2, 1 (2007), 21–33.
[57] Rosa Mikeal Martey and Mia Consalvo. 2011. Performing the looking-glass self: Avatar appearance and group identity in Second Life. Popular Communication 9, 3 (2011), 165–180.
[58] Brian E. Mennecke, Janea L. Triplett, Lesya M. Hassall, and Zayira Jordan Conde. 2010. Embodied Social Presence Theory. In 2010 43rd Hawaii International Conference on System Sciences. 1–10.
[59] Akash Menon. 2021. The Role of Avatar Creation and Embodied Presence in Virtual Reality Job Interviews.
[60] Helen Morgan, Amanda O’donovan, Renita Almeida, Ashleigh Lin, and Yael Perry. 2020. The Role of the Avatar in Gaming for Trans and Gender Diverse Young People. International Journal of Environmental Research and Public Health 17, 22 (2020), 8617.
[61] Martez Mott, Edward Cutrell, Mar Gonzalez Franco, Christian Holz, Eyal Ofek, Richard Stoakley, and Meredith Ringel Morris. 2019. Accessible by Design: An Opportunity for Virtual Reality. In 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). 451–454.
[62] Lisa Nakamura. 1995. Race in/for cyberspace: Identity tourism and racial passing on the Internet. Works and Days 13, 1-2 (1995), 181–193.
[63] Carman Neustaedter and Elena Fedorovskaya. 2009. Presenting Identity in a Virtual World through Avatar Appearances. In Proceedings of Graphics Interface 2009 (Kelowna, British Columbia, Canada) (GI ’09). Canadian Information Processing Society, CAN, 183–190.
[64] Veronica Pearson, Frances Ip, Heidi Hui, Nelson Yip, et al. 2003. To tell or not to tell; disability disclosure and job application outcomes. Journal of Rehabilitation 69, 4 (2003), 35.
[65] John R Porter, Kiley Sobel, Sarah E Fox, Cynthia L Bennett, and Julie A Kientz. 2017. Filtered out: Disability disclosure practices in online dating communities. Proceedings of the ACM on Human-Computer Interaction 1, CSCW (2017), 1–13.
[66] Mark Rabkin. 2021. Connect 2021 Recap: Horizon Home, the future of work, presence platforms, and more. https://www.oculus.com/blog/connect-2021-recap-horizon-home-the-future-of-work-presence-platform-and-more/ last accessed 13 April 2022.
[67] Inioluwa Deborah Raji, Timnit Gebru, Margaret Mitchell, Joy Buolamwini, Joonseok Lee, and Emily Denton. 2020. Saving Face: Investigating the Ethical Concerns of Facial Recognition Auditing. Association for Computing Machinery, New York, NY, USA, 145–151. https://doi.org/10.1145/3375627.3375820
[68] Sheila Riddell and Elisabet Weedon. 2014. Disabled students in higher education: Discourses of disability and the negotiation of identity. International Journal of Educational Research 63 (2014), 38–46.
[69] Kathryn E Ringland. 2019. “Autsome”: Fostering an Autistic Identity in an Online Minecraft Community for Youth with Autism. (2019), 142–143.
[70] Ann E Schlosser. 2020. Self-disclosure versus self-presentation on social media. Current Opinion in Psychology 31 (2020), 1–6. https://doi.org/10.1016/j.copsyc.2019.06.025 Privacy and Disclosure, Online and in Social Interactions.
[71] Zicodas Serapis. 2008. Coming of age in second life: an anthropologist explores the virtually human.
[72] Carmit-Noa Shpigelman and Carol J Gill. 2014. Facebook use by persons with disabilities. Journal of Computer-Mediated Communication 19, 3 (2014), 610–624.
[73] Luke Stark. 2019. Facial Recognition is the Plutonium of AI. XRDS 25, 3 (apr 2019), 50–55.
[74] Karen Stendal. 2012. How do People with Disability Use and Experience Virtual Worlds and ICT: A Literature Review. Journal of Virtual Worlds Research (05 2012), 1–17.
[75] Karen Stendal, Judith Molka-Danielsen, Bjørn Munkvold, and Susan Balandin. 2012. Virtual worlds and people with lifelong disability: Exploring the relationship with virtual self and others. ECIS (01 2012), 156–179.
[76] Stephanie Stewart, Terri S. Hansen, and Timothy A. Carey. 2010. Opportunities for People with Disabilities in the Virtual World of Second Life. Rehabilitation Nursing Journal 35 (2010), 254–259.
[77] Yaniv Taigman, Ming Yang, Marc’Aurelio Ranzato, and Lior Wolf. 2014. DeepFace: Closing the Gap to Human-Level Performance in Face Verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Sherry Turkle. 1999. Life on the Screen: Identity in the Age of the Internet. The Psychohistory Review 27, 2 (1999), 113.
[79] Sherry Turkle. 2005. The second self: Computers and the human spirit. MIT Press.
[80] Sarah Von Schrader, Valerie Malzer, and Susanne Bruyère. 2014. Perspectives on disability disclosure: the importance of employer practices and workplace climate. Employee Responsibilities and Rights Journal 26, 4 (2014), 237–255.
[81] Zach Waggoner. 2009. My avatar, my self: Identity in video role-playing games. McFarland.
[82] Paul Wallace and James Maryott. 2009. The impact of avatar self-representation on collaboration in virtual worlds. Innovate: Journal of Online Education 5, 5 (2009).
[83] Thomas Waltemate, Dominik Gall, Daniel Roth, Mario Botsch, and Marc Erich Latoschik. 2018. The impact of avatar personalization and immersion on virtual body ownership, presence, and emotional response. IEEE Transactions on Visualization and Computer Graphics 24, 4 (2018), 1643–1652.
[84] VRChat Legends Wiki. 2022. Helping Hands. https://vrchat-legends.fandom.com/wiki/Helping_Hands last accessed 9 April 2022.
[85] Shaomei Wu and Lada A Adamic. 2014. Visually impaired users on an online social network. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 3133–3142.
[86] XRTODAY. 2022. The Best Social Apps in VR. https://www.xrtoday.com/virtual-reality/the-best-social-apps-in-vr/ last accessed 5 July 2022.
[87] Jennifer Yurchisin, Kittichai Watchravesringkan, and Deborah Brown McCabe. 2005. An exploration of identity re-creation in the context of internet dating. Social Behavior and Personality: An International Journal 33, 8 (2005), 735–750.
[88] Shanyang Zhao, Sherri Grasmuck, and Jason Martin. 2008. Identity construction on Facebook: Digital empowerment in anchored relationships. Computers in Human Behavior 24, 5 (2008), 1816–1836.
[89] Yuhang Zhao, Edward Cutrell, Christian Holz, Meredith Ringel Morris, Eyal Ofek, and Andrew D Wilson. 2019. SeeingVR: A set of tools to make virtual reality more accessible to people with low vision. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[90] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2017. The effect of computer-generated descriptions on photo-sharing experiences of people with visual impairments. Proceedings of the ACM on Human-Computer Interaction 1, CSCW (2017), 1–22.
[91] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2018. A face recognition application for people with visual impairments: Understanding use beyond the lab. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–14.

A APPENDIX
Table 2: Overview of social VR platforms and their avatar creation options. In the original table, avatar types (humanoid, robot, cartoon, animal, object, abstract), avatar body coverage (full body including head, upper body, arms, hands, and legs; partial body including head, upper body, arms, and hands; floating head without neck; floating upper body without arms; floating hand; floating fingerless hand), and customization options (full avatar selection, avatar feature customization, photo-based avatar generation, third-party avatar import) are denoted by icons, which are not reproduced here. The recoverable columns are:

VR Platform      Disability Representation
Rec Room         N/A
VRChat           Any uploaded 3rd-party disability features
Horizon Worlds   Hearing aids; cochlear implants
vTime XR         N/A
AltspaceVR       N/A
Bigscreen        Eye patch
Alcove           Hearing aids; cochlear implants
Half+Half        N/A
Horizon Venues   Hearing aids; cochlear implants
Villa            N/A
Arthur           N/A
ENGAGE           N/A
Multiverse       N/A
PokerStars VR    Hearing aids; cochlear implants
Spatial          N/A
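The surviving "Disability Representation" column of Table 2 can be encoded as data, for example to tally how many of the 15 reviewed platforms expose any disability representation. This is an illustrative sketch only; the dictionary and helper function below are our own, not an artifact of the paper:

```python
# Disability representation per reviewed platform, transcribed from Table 2.
# None encodes "N/A" (no disability representation offered).
DISABILITY_REPRESENTATION = {
    "Rec Room": None,
    "VRChat": "Any uploaded 3rd-party disability features",
    "Horizon Worlds": "Hearing aids; cochlear implants",
    "vTime XR": None,
    "AltspaceVR": None,
    "Bigscreen": "Eye patch",
    "Alcove": "Hearing aids; cochlear implants",
    "Half+Half": None,
    "Horizon Venues": "Hearing aids; cochlear implants",
    "Villa": None,
    "Arthur": None,
    "ENGAGE": None,
    "Multiverse": None,
    "PokerStars VR": "Hearing aids; cochlear implants",
    "Spatial": None,
}

def platforms_with_disability_features(table):
    """Return the platforms whose avatar systems offer any disability representation."""
    return sorted(name for name, rep in table.items() if rep is not None)

print(len(DISABILITY_REPRESENTATION))  # 15 platforms reviewed
print(platforms_with_disability_features(DISABILITY_REPRESENTATION))
```

Counting this way shows that only 6 of the 15 reviewed platforms offer any disability representation, and most of those are limited to hearing aids and cochlear implants, consistent with the paper's Study I finding that the available features were restricted to DHH people.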
SoundVizVR: Sound Indicators for Accessible Sounds in Virtual Reality for Deaf or Hard-of-Hearing Users

Ziming Li (zl1398@rit.edu), School of Information, Rochester Institute of Technology, Rochester, New York, USA

Shannon Connell (sdcnai@rit.edu), National Technical Institute for the Deaf, Rochester Institute of Technology, Rochester, New York, USA

Wendy Dannels (w.dannels@rit.edu), Center on Culture and Language, National Technical Institute for the Deaf, Rochester Institute of Technology, Rochester, New York, USA

Roshan Peiris (roshan.peiris@rit.edu), School of Information, Rochester Institute of Technology, Rochester, New York, USA
ABSTRACT

Sounds provide vital information such as spatial and interaction cues in virtual reality (VR) applications to convey more immersive experiences to VR users. However, it may be a challenge for deaf or hard-of-hearing (DHH) VR users to access the information given by sounds, which could limit their VR experience. To address this limitation, we present "SoundVizVR", which explores visualizing sound characteristics and sound types for several types of sounds in the VR experience. SoundVizVR uses Sound-Characteristic Indicators to visualize loudness, duration, and location of sound sources in VR and Sound-Type Indicators to present more information about the type of the sound. First, we examined three types of Sound-Characteristic Indicators (On-Object Indicators, Full Mini-Maps, and Partial Mini-Maps) and their combinations in a study with 11 DHH participants. We identified that the combination of the Full Mini-Map technique and the On-Object Indicator was the most preferred visualization and performed best at locating sound sources in VR. Next, we explored presenting more information about the sounds using text and icons as Sound-Type Indicators. A second study with 14 DHH participants found that all Sound-Type Indicator combinations were successful at locating sound sources.

CCS CONCEPTS

• Human-centered computing → Accessibility technologies; User studies.

KEYWORDS

virtual reality, audio visualization, deaf and hard-of-hearing, accessibility

ACM Reference Format:
Ziming Li, Shannon Connell, Wendy Dannels, and Roshan Peiris. 2022. SoundVizVR: Sound Indicators for Accessible Sounds in Virtual Reality for Deaf or Hard-of-Hearing Users. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3517428.3544817

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544817

1 INTRODUCTION

Figure 1: SoundVizVR method that uses Mini-Maps and On-Object Indicators to present sound source characteristics and sound type information. The Mini-Map visualizes sounds in the VR environment while the On-Object Indicator displays the sound originating from an object ("a text that says 'Bark' is shown"). This example shows using icons in the Full Mini-Map and text on the object to describe the sound type.

Virtual reality (VR) technologies have the capability to deliver a completely immersive experience to the user. This experience allows the user to completely engulf or immerse their senses in the content and the interactions. Immersive VR experiences are typically dependent on the quality of the visual, sound, and interaction dimensions of the experience [35, 38]. As such, sounds in VR that enhance and complement the interactions are a crucial part of the "immersive" virtual reality experience. VR utilizes sounds in many forms, such as spatial audio, voice, interaction sounds,
ASSETS ’22, October 23–26, 2022, Athens, Greece Li, Connell, Dannels, and Peiris

rhythmic interaction, background sounds/music, etc., to enhance the interaction experience [18].

However, although sounds in VR play an essential role in creating the immersive experience, sounds can also be a limiting factor that prevents deaf or hard-of-hearing (DHH) individuals from having a fully immersive experience in VR due to limited sound accessibility [19, 26]. To address similar limitations in everyday situations (in the non-VR space), visualization-based [17, 20, 24] or haptics-based [15, 29, 30] methods have been proposed to make everyday sounds accessible for DHH persons. However, adapting the above practices to the VR space presents a set of unique challenges due to the fictional nature of the visual world and the novel interaction possibilities VR presents [25]. With these challenges in mind, a few pioneering research efforts have already begun exploring making sounds in virtual reality more accessible for DHH users. Here, Jain et al. defined a "Taxonomy of Sounds in Virtual Reality" to help future VR designers design accessible sounds in VR [18]. In addition, Jain et al. also presented an exploration of the design space for accessible VR sounds using visual and haptic-based methods [19], and another pioneering work, EarVR [26], presented a haptic-based method for presenting spatial sounds to DHH VR users.

Inspired by the above approaches, we present SoundVizVR, which explores visualizing VR sounds via sound indicators. Specifically, SoundVizVR uses Mini-Map Indicator and On-Object Indicator style interfaces as Sound-Characteristic Indicators and uses text and icons as Sound-Type Indicators to increase the ability to localize sound sources and visualize sound characteristics in VR (Figure 1). Our proposed sound visualization methods aim to enable users to visualize several types of sounds that originate from sound sources within the VR environment [18] by identifying sound characteristics (e.g., loudness, duration) and sound types (e.g., footsteps, gunshots). Influenced by Stockburger [42], we use the term diegetic sound for a sound that comes from an object in the VR world (e.g., a phone ringing). For the current scope of this work, we focus only on several diegetic sounds in VR [18, 42]: localized speech, inanimate objects, animate objects, and point ambience, as they are critical for the experience in VR [18]. We present our prototype as a generalizable and customizable plugin^1 that can be integrated into any VR software developed in the Unity game engine^2. Using this Unity plugin, we envision that VR designers and developers can promptly visualize sound source information to increase sound accessibility for DHH VR users.

We conducted two user studies to explore the usability of, firstly, the Sound-Characteristic Indicator visualization method and, secondly, the Sound-Type Indicator visualization method. In the first study, we conducted a preliminary evaluation with 11 DHH participants to identify the best-performing and most preferred Sound-Characteristic Indicator visualization technique for localizing sound sources from six design combinations of the Mini-Map based and On-Object Indicator based sound visualization methods. These methods assisted DHH VR users in locating sound sources and visualizing other characteristics of the sounds, such as loudness and duration. The best technique of the first study was selected for the second study based on the performance data and participants' feedback.

In the second study, we integrated the chosen visualization technique into a VR game scene to explore the best and most preferred type of indicator for presenting the sound types (e.g., footsteps, gunshots, etc.). We implemented combinations of icon and text representation methods into the visualization technique to present the different sound types. We conducted a user experiment with 14 DHH participants to collect user performance data and user feedback to evaluate these combinations.

In summary, our research contributions are as follows:

• A VR sound visualization prototype software that can be used as a Unity platform plugin to improve sound accessibility for DHH users in VR projects,
• A study with 11 DHH participants that identifies characteristics of the combinations of Mini-Map Indicators and On-Object Indicators to represent sound characteristics in VR, and
• A study with 14 DHH participants that discusses the preferences for presenting sound types through icons and text in the Full Mini-Map and on virtual objects.

2 RELATED WORK

2.1 Sound Visualization

Prior work has mainly explored visual and haptic methods aiming to make sounds more accessible for DHH individuals. Here, we focus on prior work closely related to ours that applied visual-based sound accessibility methods.

Sound visualization techniques used in music can convey music information (e.g., pitch, tempo, etc.) in real time with animated images [41]. Music visualization methods are widely supported by media player software, such as Windows Media Player, Foobar2000, and iTunes. They visualize music characteristics using spectrum-like or waveform-like 2D displays. There is also research on music visualization techniques conducted among DHH communities to evaluate DHH people's experience with these methods. For example, S. Nanayakkara et al. implemented a system combining a vibrating chair with a visual display to provide an enhanced musical experience to DHH people [29]. In addition, J. Mori and D. I. Fels conducted research with DHH people to investigate their emotional reactions to a song with different animated lyrics [28], which indicated that the animated text could provide the entertainment value of the music without losing the readability of the lyrics.

Similarly, sounds also make vital contributions to the experience designs in games. Some video games look into using sound visualization methods to make their content more accessible to DHH people. For example, the sandbox video game "Minecraft" [27] features "Subtitles" that use text labels with arrows to indicate the sound types and directions of in-game sounds (e.g., rain falling, zombies groaning, etc.) near the player avatar [47]. And the battle royale game "Fortnite" [7] uses a radar-like interface to assist DHH players in accurately locating vital sound effects in the game environment, like footsteps and gunfire. However, even with captions supported [23], the playing experience is still significantly reduced in many commercial video games once sounds are disabled. This limitation may be due to the lack of sound accessibility designs for their major game events (e.g., notifications of shootings from an enemy in a first-person shooter [FPS] game) [3, 48].

^1 https://github.com/ZimingLiii/SoundVizVR-Plugin
^2 https://unity.com/
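Direction indicators like Minecraft's subtitle arrows or Fortnite's radar ultimately reduce to computing a sound source's bearing relative to the player's heading. The following is a minimal illustrative sketch of that computation (in Python; the function name, the 2D ground-plane coordinates, and the angle conventions are our own assumptions, not taken from either game):

```python
import math

def relative_bearing_deg(player_pos, heading_deg, source_pos):
    """Bearing of a sound source relative to the player's heading.

    Positions are (x, y) ground-plane coordinates; `heading_deg` is measured
    counterclockwise from the +x axis. The result lies in (-180, 180], with 0
    meaning straight ahead, so it can drive an arrow or a radar blip.
    """
    dx = source_pos[0] - player_pos[0]
    dy = source_pos[1] - player_pos[1]
    world_deg = math.degrees(math.atan2(dy, dx))  # absolute direction to source
    # Normalize the difference into the (-180, 180] range.
    bearing = (world_deg - heading_deg + 180.0) % 360.0 - 180.0
    return 180.0 if bearing <= -180.0 else bearing
```

For example, with the player at the origin heading along +x, a source at (0, 1) yields a bearing of about +90° (90° counterclockwise of the heading), and a source at (-1, 0) is directly behind at about 180°.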
SoundVizVR: Accessible VR Sounds for DHH Users with Sound Indicators ASSETS ’22, October 23–26, 2022, Athens, Greece

Inspired by these prior works, we explore implementing sound visualization methods that can provide sound characteristic information. Our proposed methods adopt the interface components of Mini-Maps and environmental indicators, influenced by the accessibility designs of existing video games.

2.2 XR Sound Accessibility

Previous research primarily addressed accessibility for people with visual impairments [40, 46, 50] or mobility impairments [9, 10, 32] using XR technologies (including virtual reality, augmented reality, etc.). For DHH users, researchers have mainly investigated captioning, sound awareness, descriptions of sounds, and the ability to locate sounds in XR environments.

Prior work on sound accessibility in augmented reality (AR) explores speech captioning for talking people using wearable AR devices [14, 31, 34]. For example, D. Jain et al. [14] evaluated real-time captioning in an AR approach, which provided insights on the UI design of speech captioning systems in AR head-mounted displays (HMDs) and outlined the benefits of AR head-mounted display captioning. Another work, from Y. Peng et al. [34], proposed an AR captioning system called "SpeechBubbles" to address problems in the group conversation scenario, such as out-of-view captions and speaker association. Prior research in AR accessibility has also addressed sound awareness [16] and sound detection and localization [11], which provide glanceable sound information about the environment to DHH people.

In terms of VR, prior work mainly explores providing accessible acoustic cues using subtitles [1, 13, 36, 44]. Other research examines VR in the context of storytelling [4] for DHH individuals and learning sign language for DHH children [33]. Studies like EarVR [26] have also looked into conveying sound information in VR through haptics. Some VR games on the market adopt sound accessibility interfaces other than captions, for example, the enemy sound direction indicator in "The Persistence" [22].

Our work is influenced by prior work from D. Jain et al., which proposed a sound taxonomy in VR [18]. Our study is also inspired by research that presented an evaluation of prototypes developed using visual-based and haptic-based representation methods for VR sounds [19]. However, to the best of our knowledge, limited research examines sound visualization methods in VR in detail. Therefore, it would be beneficial to conduct user experiments to gain qualitative and quantitative insights, such as performance, usability, and user experience, into sound visualization method designs for VR.

3 SOUNDVIZVR SOUND VISUALIZATION SYSTEM

The scope of this work mainly focuses on diegetic types of sounds, specifically localized speech, inanimate objects, animate objects, and point ambience [18]. To visualize these sounds in VR, we focused on two aspects of the sound indicators: 1) the Sound-Characteristic Indicator and 2) the Sound-Type Indicator. The Sound-Characteristic Indicator was implemented to present information such as the location, loudness, and duration of sounds. The Sound-Type Indicator was used to convey more meaningful information about the sounds, such as a footstep sound from a person walking nearby.

3.1 Sound-Characteristic Indicator System Design

Figure 2: The design of the Sound-Characteristic Indicator. (Study 1 game scene is shown.) The smaller red object is the Sound-Characteristic Indicator, which appears when the sound source object makes a sound and dynamically changes its size during the sound. The waveform in the speech bubble shows a sound being played, and the circles above indicate how the indicator changes size based on the loudness characteristics of the sound wave.

The Sound-Characteristic Indicator was designed as a customizable 3D object (a Unity plugin). It is attached as a part of any sound source object in the VR design. The indicator appears in a Mini-Map and on the sounding object when the object starts making sounds and disappears at the end of the sound. As shown in Figure 2, it can visualize the current loudness of the sound by dynamically changing its size: when the sound gets louder, the indicator expands; when the sound is quieter, the indicator shrinks. In this way, the dynamic changes in the object's size indicate the sound's loudness, and the appearance and disappearance of the object relate to the duration of the sound.

In addition, to indicate the location of the sounding objects and convey the loudness of the sounds, we designed two components for our proposed sound visualization system: Mini-Maps and the On-Object Indicator.

Figure 3: Three types of Sound-Characteristic Indicators: (a) On-Object Indicators; (b) Full Mini-Map Indicator; (c) Partial Mini-Map Indicator
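The indicator behavior described in this section — an indicator that appears for the duration of a sound, scales with its momentary loudness, and is drawn on a circular Mini-Map (with the front sector hidden in the Partial variant of Figure 3c) — can be sketched as follows. All names and constants here (`indicator_scale`, `base_scale`, `gain`, `world_radius`, `map_radius`, `hidden_front_sector_deg`) are our own illustrative choices, not the SoundVizVR plugin's API; the actual system implements this inside Unity with an overhead camera, as described in Section 3.3.

```python
import math

def indicator_scale(samples, base_scale=0.2, gain=3.0):
    """Per-frame size of a Sound-Characteristic Indicator.

    `samples` is one frame of audio samples in [-1, 1]. The average absolute
    sample value is mapped linearly to the indicator's scale; `base_scale`
    and `gain` are illustrative constants. A zero result means the indicator
    is hidden (no sound playing this frame).
    """
    if not samples:
        return 0.0
    loudness = sum(abs(s) for s in samples) / len(samples)
    return 0.0 if loudness == 0.0 else base_scale + gain * loudness

def minimap_offset(player_pos, player_yaw_deg, source_pos,
                   world_radius=10.0, map_radius=100.0,
                   hidden_front_sector_deg=None):
    """Place a sounding object's indicator on a circular Mini-Map.

    Positions are (x, z) ground-plane coordinates; yaw is the player's
    heading in degrees (0 = facing +z). Returns (x, y) offsets from the map
    centre with the facing direction pointing up, or None when the source
    lies inside the hidden front sector (the Partial Mini-Map case).
    """
    dx = source_pos[0] - player_pos[0]
    dz = source_pos[1] - player_pos[1]
    yaw = math.radians(player_yaw_deg)
    # Rotate the offset so the facing direction points up, mimicking an
    # overhead camera that turns with the player.
    mx = dx * math.cos(yaw) + dz * math.sin(yaw)
    my = -dx * math.sin(yaw) + dz * math.cos(yaw)
    if hidden_front_sector_deg is not None:
        ahead_deg = math.degrees(math.atan2(mx, my))  # 0 = straight ahead
        if abs(ahead_deg) <= hidden_front_sector_deg / 2:
            return None  # inside the player's field of view: not drawn
    k = map_radius / world_radius
    return (mx * k, my * k)
```

For instance, a sounding object five units directly ahead of the player maps to the top of a Full Mini-Map, but is suppressed on a Partial Mini-Map with a 90° hidden front sector.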
ASSETS ’22, October 23–26, 2022, Athens, Greece Li, Connell, Dannels, and Peiris

3.1.1 On-Object Indicator. We wanted the user to be aware of the sound playing from a sounding object while looking at the object, which is an intuitive way to identify a sounding object. Hence, as seen in Figure 3a, we placed the Sound-Characteristic Indicator object hovering near the top of the object.

The appearance of the indicator is customizable. To make its design simple and visually apparent, we designed the appearance of the On-Object Indicator as a red-colored sphere for Study 1. However, designers may customize the indicator using any 3D objects or images based on their requirements.

3.1.2 Mini-Maps. Mini-Map systems are commonly used in many video games [49]. The Mini-Map design can present players with information about their surroundings, so the player can keep track of notable updates around them, especially those in their blind spots. Inspired by such map designs, we integrated a circular Mini-Map, called a "Full Mini-Map", into our proposed system to help users locate the sound source (Figure 3b). Through trial and error and feedback from pilot demos, we placed the Mini-Map on the screen's left side, similar to many video games. Furthermore, the same Sound-Characteristic Indicator described above is visualized to present sound source information on the Full Mini-Map.

In addition to the Full Mini-Map, we also present a Partial Mini-Map (Figure 3c). Here, the front sector of the Partial Mini-Map, which represents the user's current field of view, is hidden. We explored this design so that the user is only presented with sound information that is not in their field of view. In addition, this design aims to encourage users to pay more attention to the environment in front of them and observe potentially sounding objects instead of over-relying on the information shown on the Mini-Maps.

3.2 Sound-Type Indicator

Figure 4: The design of the Sound-Type Indicator. (a) The Sound-Type Indicator with text on the Full Mini-Map and an icon on the object. Here, in the player's view, the plushie rabbit makes a "talking sound", and its corresponding text and icon are shown. (b) Icons used for the presented sounds. (c) Text descriptions of the sounds.

Besides knowing the sound characteristics, it is essential to understand what types of sounds are present in a scenario. Thus, to assist DHH users in identifying the types of sounds in the VR environment, we explored customizing the sound indicator in the On-Object Indicator and Mini-Maps with icons and text (Figure 4a). The size of the icon (Figure 4b) and text (Figure 4c) changes along with the loudness of the indicator's corresponding sound. In this way, a sound indicator can convey the sound characteristics to the user and notify them of the sound type via its iconic or textual label at the same time.

3.3 System Implementation

Figure 5: The implementation of a game scene developed using the Unity game engine.

We developed two task-based VR game scenes with our proposed system integrated, using the Unity game engine version 2021.1.15f1. Both scenes can be deployed to Oculus's PC-powered VR headsets. In our case, an Oculus Rift S with dual VR controllers is used [6].

To implement the sound indicator, first, we obtain the sample data from the audio source for the current time frame. Then, we take the average absolute value of the sample data and map it to the scale value of the sound indicator object in each frame. As a result, the indicator's size changes according to the current loudness of the audio source (Figure 2) during the game engine's run-time.

In terms of the Mini-Maps' implementation, we use an additional virtual camera in the Unity game engine that is attached to the player at ceiling level (Figure 5). This virtual camera always faces the ground. It ignores all other objects in the view except for the sound indicators. By rendering this virtual camera's view into the Mini-Map's texture, the Mini-Maps can reflect the sound indicators around the player. After importing the plugin into the Unity editor, the designer can drag and drop the add-on prefab onto a sound-source object. Next, the designer can customize the 3D objects, icons, and text based on their preferences or design requirements. As mentioned above, the appearance of the sound indicator in the On-Object Indicator and Mini-Maps is customizable. Our prototype allows designers to replace the default red sphere object with a preferred icon object or a descriptive text object for each sound source to further visualize the sound type information in a game scene.

4 STUDY 1: EVALUATING THE SOUND-CHARACTERISTIC INDICATOR

The main goal of the first study is to evaluate the performance and user experience of our proposed Sound-Characteristic Indicator visualization methods for locating the sound source and visualizing sound characteristics like duration and loudness. For this purpose, we conducted a user experiment in which users were required to localize sound sources in VR as the main task, to identify the best-performing and most preferred Sound-Characteristic Indicator method from the possible design combinations. Both studies

presented in the article were approved by the institution's ethics review board.

4.1 Task Design

Figure 6: Study 1 task design - (a) VR environment of Study 1; (b) The participant's view of the scene while performing a task using the FM-OI visualization method.

Similar to the task design in EarVR [26], as the main task, the participant was required to locate the object (presented as spheres) from which the sound originated (Figure 6). In the experimental scene, we placed the player in the center of the stage with eight identical spheres evenly placed around the player in a circle. The spheres were labeled from 0 to 7 in counterclockwise order. The player could look around or turn in any direction but was not required to walk in the scene.

In the experiment, one of the eight spheres served as a sound source and started to play a random sound clip selected from an audio clip set. The audio clip set was formed based on four categories chosen from the taxonomy of sounds in VR proposed by Jain et al. that represented diegetic sounds [18]: Localized speech, Inanimate objects, Animate objects, and Point ambience. The selected audio clips were organized into two duration scopes: short duration sounds (a sound clip's length was shorter than 3 seconds) and long duration sounds (a sound clip's length was over 15 seconds). In total, eight sound clips were included in the set. The audio clip set used in the experiment is shown in Table 1. It was selected from the royalty-free sound clip website "SoundBible"^3 to represent a wide range of sounds from VR games. In addition, short sound clips were explored to identify the impact of the visualization techniques on locating quick sounds that may appear and disappear in the visualization.

Table 1: Study 1 sound selection

Category            Short Duration    Long Duration
Localized Speech    Old Man Laugh     Baby Talk
Inanimate Objects   Gun Shot          Phone Ringing
Animate Objects     Footstep          Barking Dog
Point Ambience      Fire Burning      Waterfall

In the task, the participant was asked to select the sounding sphere using a VR controller as a pointer. If the participant could not identify the sounding sphere within 7 seconds from the start of the sound, they were allowed to press a "skip" button on the VR controller to skip to the next task. As the localization accuracy rate, we calculated how many times a participant accurately selected the correct sounding sphere. The next random sphere started playing another random sound clip 3 seconds after the participant chose a sphere or pressed the "skip" button. One repetition of a condition consisted of eight such tasks (eight different sounds from eight spheres), and each condition was repeated three times. The participant had the right to stop the experiment at any point in time.

4.2 Study Design

Study 1 used a within-subject evaluation design that consisted of two independent variables: On-Object Indicator (Without or With On-Object Indicator) and Mini-Map Type (Full Mini-Map, Partial Mini-Map, or Without Mini-Map). The dependent variables were the sound source localization accuracy and the completion time of each task. In addition, we recorded additional data, such as the head rotation angles. In total, there were six conditions (Table 2): NON (no sound visualization technique was used), OI (On-Object Indicators only - Fig. 3a), FM (Full Mini-Map technique only - Fig. 3b), FM-OI (a combination of the Full Mini-Map technique and the On-Object sound indicator - Fig. 4a), PM (Partial Mini-Map technique only - Fig. 3c), and PM-OI (a combination of the Partial Mini-Map technique and the On-Object sound indicator). The NON condition was included as the baseline condition in which no Sound-Characteristic Indicator was presented, similar to existing VR experiences. Each task was repeated 3 times. Each participant faced a total of 144 tasks during the experiment (8 tasks x 3 repetitions x 6 conditions).

Table 2: Study 1 conditions

                          On-Object Indicator
                          Without    With
Mini-Map     Without      NON        OI
             Full         FM         FM-OI
             Partial      PM         PM-OI

4.3 Participants

We recruited 11 DHH participants for the study from the authors' institution (6 males, 2 females, 3 non-binary people; ages 18-45, Mean = 27.36, SD = 8.82). The participant group consisted of 4 deaf and 7 hard-of-hearing participants. Five of the participants had used VR devices before. The participants were recruited through flyers and word-of-mouth advertising in the institution. Each participant was paid $15 after completing the user experiment.

4.4 Procedure

After signing the informed consent form, the participant was given an introduction to the system and asked to fill out a demographic questionnaire. The information was provided to the participant through text and slides; however, a hard-of-hearing research team member used sign language to discuss additional details if and when necessary. Also, the participant was asked to take off their hearing aid, if they had one, to ensure the visualization was the focus in this controlled study. Next, the participant put on the VR headset and held a VR controller. Before a condition started, the participant was

^3 https://soundbible.com/
ASSETS ’22, October 23–26, 2022, Athens, Greece Li, Connell, Dannels, and Peiris

given sufficient time to get familiar with the VR device's controls, the VR game scene, and the Sound-Characteristic Indicator interface for the condition. Once the condition started, the participant was required to perform the tasks with the corresponding combination of Sound-Characteristic Indicator visualization techniques for that condition. The order of the conditions for each participant was assigned randomly. After completing a condition, the participant was asked to fill out a post-condition questionnaire. Then, the participant could take a 3-minute break if needed. After completing all six conditions, the participant was asked to fill out a post-test questionnaire. The experiment took approximately 75 minutes for each participant.

4.5 Results of Study 1

Figure 7: Results of Study 1 - (a) The average localization accuracy rate; (b) The average completion time (Sec) of each task; (c) The average localization accuracy rate on two types of sounds of Study 1. Error bars denote the standard deviation.

4.5.1 Localization accuracy and completion time. Figure 7a shows the localization accuracy rate of Study 1. The results were further analyzed using a repeated measures ANOVA with Greenhouse-Geisser correction. A significant main effect was found for Mini-Map Type (F(1.53, 15.37) = 41.618, p < 0.001) and for On-Object Indicator (F(1, 10) = 40.436, p < 0.001). Post-hoc comparisons between pairs of Mini-Map Types showed NON, FM: p < 0.001; NON, PM: p < 0.001; FM, PM: p < 0.001. The post-hoc comparison between the levels of On-Object Indicator showed NON, OI: p < 0.001. Post-hoc comparisons between the combinations showed significant differences between all pairs, except FM, OI: p = 0.540; FM, FM-OI: p = 1.000; FM, PM-OI: p = 1.000; OI, FM-OI: p = 0.258; OI, PM-OI: p = 0.540; FM-OI, PM-OI: p = 1.000.

Figure 7b shows the task completion time of Study 1. The results were further analyzed using a repeated measures ANOVA with Greenhouse-Geisser correction. A significant main effect was found for Mini-Map Type (F(1.591, 15.912) = 27.453, p < 0.001) and for On-Object Indicator (F(1, 10) = 31.983, p < 0.001). Post-hoc comparisons between pairs of Mini-Map Types showed NON, FM: p < 0.001; NON, PM: p < 0.001; FM, PM: p = 0.029. The post-hoc comparison between the levels of On-Object Indicator showed NON, OI: p < 0.001. Post-hoc comparisons between the combinations showed significant differences between all pairs, except FM, OI: p = 0.363; FM, FM-OI: p = 1.000; FM, PM-OI: p = 1.000; PM, OI: p = 0.843; OI, FM-OI: p = 0.085; OI, PM-OI: p = 0.146; FM-OI, PM-OI: p = 1.000.

Figure 7c shows the localization accuracy rate for the two types of sound duration in Study 1. The results were further analyzed using paired-samples t-tests. A significant effect was found for all six conditions - NON: p = 0.003; OI: p < 0.001; FM: p = 0.006; FM-OI: p = 0.003; PM: p = 0.006; PM-OI: p < 0.001.

4.5.2 System usability and subjective mental workload. Table 3 shows the System Usability Scale (SUS) results of Study 1. The SUS adjective rating is obtained from the 7-point adjective scale proposed by Bangor et al. [2].

Table 3: The System Usability Scale Scores and Adjective Ratings of Study 1

Condition    SUS Score (SD)    Adjective Rating
NON          52.05 (21.12)     Poor
OI           71.14 (18.45)     Good
FM           84.55 (15.24)     Excellent
FM-OI        84.77 (16.03)     Excellent
PM           51.82 (23.69)     Poor
PM-OI        76.36 (24.91)     Good

Figure 8: NASA TLX scores of Study 1. Lower ratings indicate lower task loads.

Figure 8 shows the subjective mental workload results collected from NASA Task Load Index (NASA TLX) questionnaires [12] across six dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration.

4.6 Discussion of Study 1

Overall, our results indicated that all the conditions with a Sound-Characteristic Indicator performed significantly better than the NON condition, with FM-OI, FM, and PM-OI achieving higher than 90% accuracy for sound source localization. The NON condition, which has no integrated sound visualization methods, similar to existing VR experiences, achieved the lowest localization accuracy (M:

24.24%, SD: 0.29) and the longest task completion time (M: 8.93s, SD: 2.21). During further analysis of the NON condition, we found that correct localization was achieved by a few hard-of-hearing participants, who nonetheless indicated that it was difficult for them to confidently locate the sound source without the assistance of visual cues. For example, one of the hard-of-hearing participants (F02) wrote in the feedback: “I could not identify any of the sources of sound securely, I could maybe localize the sound to 3 spheres [three spheres that present in the view], but past that it’s beyond me.” Similarly, the deaf participants reported that they could tell neither the location of sounds nor the existence of a sound without any visualization. The NASA TLX ratings for the NON condition were the highest (higher ratings indicate more cognitive load) in all categories except the temporal demand category (Figure 8).

In terms of the FM condition, which introduced a Full Mini-Map Sound-Characteristic Indicator component compared to the NON condition, localization accuracy reached 93.94% (SD: 0.09), and task completion time dropped to 4.73s (SD: 0.89). FM also achieved a high SUS score (with an “Excellent” adjective rating) and showed relatively low subjective mental workloads across all NASA TLX dimensions. This might indicate that the participants were able to identify the sound locations and perform tasks with the assistance of this method. The participants’ feedback supported this observation. For example, F12 wrote: “the Full Mini-Map helps me find the sound location more accurately.” Those who had wide experience in playing games indicated that they might get used to the Full Mini-Map method quickly, since the Full Mini-Map is commonly used in games [49]. For example, F03, who played games 4-6 times a week, wrote: “I think that because I play video games with full circle Mini-Map on the screen, it is very easy for me to engage in this system.” However, some participants reported that they had to rely on the map to perform tasks, without even paying attention to the environment. They also mentioned that, with the Full Mini-Map alone, they could fail to pinpoint the sound source between two close objects. For example, F02 added: “I’m not entirely sure how accurate I was because I often had to choose between two of the spheres.”

The FM-OI condition, which was developed based on FM, combined a Full Mini-Map component with On-Object Sound-Characteristic Indicators that also supported visualizing the sound of an object in the environment. FM-OI had similar yet marginally better performance compared to FM on the localization accuracy (M: 94.70%, SD: 0.07) and task completion time (M: 4.41s, SD: 0.93) metrics, although the differences were not statistically significant. Similar to FM, FM-OI achieved an “Excellent” adjective rating on the SUS and relatively low mental workloads across all six NASA TLX dimensions. The participants’ comments indicated that FM-OI enabled them to pay more attention to their surroundings with the assistance of the On-Object Indicators. “Since the notification appears on both map and game, I was able to focus on the game and use the map as a support.” (F01) They reported that the Full Mini-Map component was useful for identifying the sound direction, and the On-Object Indicator component was helpful when identifying the specific sounding sphere. “If there are two objects that are quite close to each other, knowing just the direction of the sound may not always be helpful. In such cases, having an environment sound indicator to differentiate between two objects in proximity would be an added advantage.” (F14)

In addition, we looked into OI to better understand the effect of the On-Object Indicator alone. Although there are no statistically significant differences when comparing its localization accuracy (M: 82.58%, SD: 0.15) and completion time (M: 5.71s, SD: 1.41) with FM and FM-OI, OI received significantly higher subjective mental workload ratings on all six NASA TLX dimensions compared to FM and FM-OI. The participants reported that they had to scan through all the objects to find the audio cues due to the lack of indicators of the sound direction. “The problem with this system is although it does indicate the sound, it doesn’t indicate the direction of the sound which could waste time in a high pressure gameplay.” (F07)

We looked into the results of PM and PM-OI to investigate whether the Partial Mini-Map worked in reducing the information and enabled participants to focus on the environment as we intended. PM achieved the lowest localization accuracy (M: 60.23%, SD: 0.26) and the longest task completion time (M: 6.38s, SD: 2.08) among the five proposed methods. In addition, the participants reported that it was difficult to identify the direction of the sounds with this method. “If the sound was in front of me, I had no idea which of the 2 spheres it could be unless I looked to the side to see the dot on the map.” (F02) This kind of feedback may explain PM’s low SUS adjective rating (“Poor”) and relatively high mental workload, especially in the mental demand, temporal demand, effort, and frustration dimensions. In terms of the PM-OI condition, with an On-Object Sound Indicator component integrated, PM-OI achieved similar performance on the localization accuracy (M: 91.67%, SD: 0.09) and task completion time (M: 4.55s, SD: 0.87) metrics compared to FM and FM-OI, with no statistically significant difference. The participants reported that the On-Object Sound Indicator enabled them to locate the sound source among multiple spheres in front of them, which overcame the disadvantages of PM alone. “The sound indicators help me find the sound location better when I can’t find [it] with the Partial Mini-Map. If I [was] unable to locate the sounds with the front cut up, the sound indicators helped me find it faster and accurately.” (F12) Also, the participants reported that PM-OI could assist them in focusing more on the game itself. For example:

“The dots over the top made things much easier, and having the radar showed what’s around me made it clear when to turn. This is by far the easiest way to achieve the tasks. And yet the only time I missed the target. I learned there is no replacement, meaning that after one came and went, it would not be there again. That led me to reorient to the unselected options to get a ‘jump’ on them. Fun combination.” (F08)

However, some participants indicated that the sound localization experience with PM-OI was ambiguous. As F07 said: “I can’t tell if I’m looking directly at the source of the noise in the Mini-Map with the notch and due to the nature of VR, I don’t have the same field of vision as I would have in real life. So I have to put in a little extra effort to identify the sphere.” This may explain PM-OI’s higher subjective mental workload result in the Performance dimension of the NASA TLX ratings, especially when compared to FM and FM-OI.

When considering the duration of the clips used, sound localization for long-duration sounds had significantly better localization
ASSETS ’22, October 23–26, 2022, Athens, Greece Li, Connell, Dannels, and Peiris

accuracy than in short-duration sounds in all conditions. This result was expected, as we identified that if a short-duration sound was not in the participant’s view, it had a higher chance of being missed in a majority of the conditions. Here, F01 indicated: “For sounds that disappeared too quickly, it’s difficult to notice if there were any sound at all. This is especially true for ‘no map with indicator’, ‘Partial Mini-Map without indicator’, and ‘Partial Mini-Map with indicator’.” About the OI condition, F14 discussed: “For this task, I had to scan through all the spheres until I found an audio indicator on top of one of the spheres. Sometimes, for short audio, I guess the audio indicator disappeared by the time I scanned through the spheres, which was not helpful in identifying the target... However, having an audio indicator was a lot better than having no visual cues at all!” Also, it should be noted that we used different sounds for the different durations of the same type of sounds (we did not use a short and a long version of the same clip) in order to represent a larger variety of sounds.

4.7 Summary of Study 1
Based on the analysis and observation of the results, we selected FM-OI as the Sound-Characteristic Indicator visualization method for Study 2. We chose FM-OI from the six conditions primarily because it had the best performance in both localization accuracy and task completion time. The results from the SUS and NASA TLX questionnaires and the participants’ comments reported above further supported our decision. Furthermore, in the post-test questionnaire, most participants (7/11) indicated that FM-OI was the sound visualization method they liked most among the five proposed methods, while preferences over the other methods were diverse (PM-OI: 2/11, OI: 1/11, FM: 1/11).

5 STUDY 2: EVALUATING THE SOUND-TYPE INDICATOR

Figure 9: VR environment of Study 2 - (a) An overview of the Study 2 scene. A participant stands at the location marked by the blue arrow; (b) The participant’s view of the scene before a task. Before the task, the participant was shown the sound source objects with their labels displayed.

Based on the results from Study 1, we selected the FM-OI Sound-Characteristic Indicator visualization method for the following studies. As such, the main goal of Study 2 is to evaluate the performance and user experience of the different Sound-Type Indicator visualization methods (text and icons).

5.1 Task Design
To present a more realistic VR scene for this experiment, we used the Kid’s Room model package [5] and several additional object models [43, 45] from the Unity Asset Store to build our VR game scene for the study (Figure 9a). The participant was placed near the center of the scene, and the objects that serve as speakers were distributed around the scene. Similar to Study 1, we did not require the participants to move in the game scene. However, the participants could still rotate physically in place with minor lateral movements (rotating their bodies or leaning towards a target) while wearing the VR headsets. Before a condition began, the participant was allowed to familiarize themselves with the sound source objects, which were indicated using a label showing the name (category) of the object (Figure 9b). In addition, we used the same set of sound clips from the previous study and the same text labels shown in Figure 4c. The icons used (Figure 4b) were selected from a free online icon database website called “Flaticon”. To prevent any biases (due to preferences and/or color blindness), we used black and white icons.

4: https://www.flaticon.com/

Table 4: The task list of Study 2

No.  Task Instruction
1    Find the radio making the flowing water sound
2    Search for the item making a sound
3    Find and identify the ringing telephone
4    Find the object making the baby talking sound
5    Find and identify the sounding plush toy
6    Find the radio making the gunshot sound
7    Find the object making the campfire sound
8    Find the plushie making the Santa Claus laughter sound

Figure 10: Study 2 task design - (a) Task instruction panels; (b) The participant’s view of the scene when performing a task. The TM-IO condition is shown.

As the main task, the participant was presented with a task from a list of task instructions (Figure 10a), as described in Table 4. It was presented in the main view for 7 seconds to ensure the participant had enough time to read it, and afterward, it was maintained in the upper region of the participant’s VR view. Next, the VR scene played three different sounds from at most three sound source objects at the same time (Figure 10b). The participant was required to select the sound source object that played the sound specified in the instruction. We designed the task this way to encourage the participant to focus more on the sound-type labels during the task. For example, during a given task, all three radios in the VR scene might make sounds, but if Instruction 1 was presented, the participant was required to identify the correct radio by looking for the “water flowing” icon or text label.
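As a concrete illustration of this task logic, the sketch below scores a single trial: several objects sound at once, and a selection counts as correct only if the chosen object is the one playing the instructed sound type. This is a minimal Python sketch with hypothetical names (`SoundingObject`, `Trial`, `is_correct`, the example labels); the actual system was built in Unity and its code is not shown in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SoundingObject:
    name: str        # e.g. "radio_1"
    sound_type: str  # label shown by the Sound-Type Indicator

@dataclass
class Trial:
    instruction: str   # instructed sound type, e.g. "flowing water"
    sounding: list     # objects playing sounds simultaneously (at most 3)

    def is_correct(self, selected: SoundingObject) -> bool:
        # Correct only if the selected object is currently sounding AND
        # its sound-type label matches the instruction.
        return selected in self.sounding and selected.sound_type == self.instruction

# Example: all three radios make sounds, but only one plays "flowing water".
radios = [SoundingObject(f"radio_{i}", t)
          for i, t in enumerate(["flowing water", "gunshot", "ringing"], start=1)]
trial = Trial(instruction="flowing water", sounding=radios)
```

With this setup, selecting `radios[0]` would be scored correct and selecting either other radio incorrect, which mirrors why the distractor radios force participants to read the sound-type labels rather than rely on object identity alone.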

The recording of task completion time began only at the point in time when the correct sound clip started to play from an object. For the localization accuracy rate, we calculated how many times a participant accurately selected the correct sounding object after the correct sound clip started to play.

Similar to Study 1, the participant was able to skip the task by pressing the skip button if they could not identify the correct target after 10 seconds. The participant also had the right to stop the experiment at any point in time. Eight such tasks (eight different sounds from eight objects) were included in each repetition of a condition, and each condition was repeated twice for each participant.

5.2 Study Design
The experiment of Study 2 used a within-subjects evaluation design. It consisted of two independent variables: Sound-Type Indicator on Objects (icon or text) and Sound-Type Indicator on the Full Mini-Map (icon or text). The dependent variables were sound source localization accuracy and the task completion time while searching for the correct sound source. Similar to Study 1, we recorded additional data such as the head rotation angles. In total, there were four conditions, as shown in Table 5: IM-IO (Icon indicators on the Full Mini-Map + Icon indicators on Objects), IM-TO (Icon indicators on the Full Mini-Map + Text indicators on Objects), TM-IO (Text indicators on the Full Mini-Map + Icon indicators on Objects), and TM-TO (Text indicators on the Full Mini-Map + Text indicators on Objects). Each participant faced 64 tasks in total (8 tasks x 2 repetitions x 4 conditions).

Table 5: Study 2 conditions

                                       Indicators on Objects
                                       Icon      Text
Indicators on Full Mini-Map   Icon     IM-IO     IM-TO
                              Text     TM-IO     TM-TO

5.3 Participants
We recruited 14 DHH participants for the study from the authors’ institution (8 males, 4 females, 2 non-binary people; ages 18-45, Mean = 26.21, SD = 8.31). The participant group consisted of 6 deaf and 8 hard-of-hearing participants. Six of the participants had used VR devices before. Ten of the participants had previously taken part in Study 1; however, it should be noted that the two research studies were conducted separately. The participants were recruited through advertising in the institution by flyers and word of mouth. Each participant was paid $25 after completing the user experiment.

5.4 Procedure
Similar to Study 1, after signing the informed consent form, the participant was given an introduction to the system and asked to fill out a demographic questionnaire. Next, after the participant put on the VR system, they were given sufficient time to familiarize themselves with the VR device’s controls and the VR game scene. In addition, the participants were allowed to familiarize themselves with the instructions and with the name (category) labels and locations of the sound source objects. As each condition started, the participant was required to perform the tasks with the sound visualization technique of that condition. The order of the conditions for each participant was assigned randomly. After completing a condition, the participant was asked to fill out a post-condition questionnaire. Then, the participant could take a 3-minute break if needed. After completing all four conditions, the participant was asked to fill out a post-test questionnaire. The experiment took approximately 45 minutes for each participant.

5.5 Results of Study 2

Figure 11: Results of Study 2 - (a) The average localization accuracy rate of each condition; (b) The average task completion time (Sec) of each condition. Error bars denote the standard deviation.

5.5.1 Localization accuracy and task completion time. Figure 11a shows the localization accuracy rate of Study 2. Repeated measures ANOVA revealed no significant difference for Indicators on the Full Mini-Map (F(1, 13) = 0.198, p = 0.664) or for Indicators on Objects (F(1, 13) = 2.021, p = 0.179).

Figure 11b shows the task completion time of Study 2. Repeated measures ANOVA revealed no significant difference for Indicators on the Full Mini-Map (F(1, 13) = 6.596e-4, p = 0.980) or for Indicators on Objects (F(1, 13) = 0.409, p = 0.553).

Table 6: The System Usability Scale Scores and Adjective Ratings of Study 2

        SUS Score (SD)    Adjective Rating
IM-IO   81.96 (14.01)     Excellent
IM-TO   73.39 (18.78)     Good
TM-IO   75.00 (15.81)     Good
TM-TO   76.25 (18.47)     Good

5.5.2 System usability and subjective mental workload. Table 6 shows the SUS results of Study 2. Figure 12 shows the subjective mental workload results of Study 2, collected from NASA TLX questionnaires across six dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration.

5.6 Discussion of Study 2
5.6.1 Sound Type Identification. Overall, the quantitative results indicate that DHH participants could perform the tasks with the assistance of all four evaluated Sound-Type Indicator visualization methods and achieved high localization accuracy at around 90%.
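The SUS scores in Table 6 follow the standard System Usability Scale scoring scheme: ten 1-5 Likert items, where odd (positively worded) items contribute (response - 1) and even (negatively worded) items contribute (5 - response), and the sum is scaled by 2.5 to a 0-100 range. A small illustrative Python helper (the function name and example responses are ours, not the paper's):

```python
def sus_score(responses):
    """Standard System Usability Scale scoring for ten 1-5 Likert responses.

    Odd-numbered items (1, 3, 5, 7, 9) are positively worded and score
    (response - 1); even-numbered items score (5 - response). The summed
    contributions (0-40) are scaled by 2.5 to the familiar 0-100 range.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten Likert responses in the range 1-5")
    odd = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return (odd + even) * 2.5

# A participant answering 4 on every positive item and 2 on every
# negative item contributes (4-1)*5 + (5-2)*5 = 30 points, i.e. 75.0.
```

The adjective ratings in Table 6 (“Excellent”, “Good”) then come from mapping these numeric scores through the adjective scale of Bangor et al. [2].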

Figure 12: NASA TLX scores of Study 2. Lower ratings indicate lower task loads.

Based on the analysis of the results, there is no significant difference among the tested Sound-Type Indicator visualization methods on either the localization accuracy or the task completion time metrics of Study 2. This indicates that the overall performance of the four tested Sound-Type Indicator visualization methods is similar.

The icons used in Study 2 were selected based on what a designer would pick as icons for the user interface, applying a nomic sound mapping method [8, 21]. Prior to the study, we did not present the icons used in the experiment or their descriptive text to the participants. This aimed to explore whether the participants could identify the unfamiliar icons shown during the experiment. In the post-test questionnaire, we presented two five-point Likert scale questions to investigate, respectively, whether participants could understand the meanings of the icons and of the text shown in the indicator. The results show that the icons (M: 3.93, SD: 0.83) and text (M: 3.93, SD: 0.92) have similar ratings, which indicates that the participants could understand the meanings of both the icons and the text. The results of the system usability evaluation support this conclusion. The usability results revealed that IM-IO, which used icons in both the Full Mini-Map and On-Object Indicators, achieved the “Excellent” adjective rating. In the subjective mental workload results (Figure 12), IM-IO performed best in five of the six dimensions: physical demand, temporal demand, performance, effort, and frustration. It also achieved a relatively low mental demand load. In the participants’ feedback, G06 said: “Iconic shows me a lot which allows me to identity easier and quicker.” G02 added: “It [the iconic representative method] is something to get use for a bit, but it is easy to understand.”

In terms of using descriptive text as Sound-Type Indicators in Study 2, TM-TO achieved the lowest mental demand on the subjective mental workload scale and a “Good” adjective system usability rating. The participants’ feedback supported that the text representation method is easier to understand in most cases. For example, G02 said: “Texts on the environment work well as it describes the sounds well good enough like phones ringing or waterfall.” “[Text has] Nothing lost in translation.” G14 added. However, when the descriptive text in the Sound-Type Indicators did not exactly match the instruction, the text Sound-Type Indicator visualization may cause a higher cognitive load. For example, the instruction of task No. 7 is “Find the object making the campfire sound” (Table 4), while the sound type label of the correct sound source shows “Crackling Fire” (Figure 4c). G16 noted: “Inconsistency in the wording of objects (prior to task vs during task) made it more difficult to identify the appropriate object.”

5.6.2 Sound Localization. Unlike Study 1, in which the sound sources were evenly organized around the player in a circle, we randomly distributed the sound sources in the VR environment in Study 2 to mimic the sound design of an actual VR game scene. There were sound sources that stayed close to each other, and one sound source was placed right above another sound source (same azimuth angle, but different altitude). Here, although the Mini-Map Indicator only indicated the directions of the sound sources in a 2D plane and did not show the altitude information of the sound sources in 3D space, we believe that the On-Object Indicator overcame this issue. Unfortunately, however, no comment on this aspect was received from the participants.

In Study 2, some tasks had multiple (at most three) sounds happening simultaneously. The tasks of Study 2 with multiple sounding objects were designed to explore if the DHH participants could localize the correct sound source among many with the help of the visualization methods. Based on the analysis of the qualitative results, most of the participants (11/14) indicated that the visualization methods worked well in assisting them to identify the sound source when multiple sounds were happening in the scene. For example, when talking about TM-TO, which used text to indicate sound types, G10 said: “It [TM-TO] works well in cases where there are multiple different types of noise and we need to identify a specific sound. This is dependent on how descriptive the text is however.” However, participants also mentioned that when the simultaneous sounds contained short-duration sounds, they would still struggle to localize the correct sound source, even with the assistance of the visualization methods. For example, G05 noted: “Works well when there are various sounds going on. Doesn’t work well when it quickly flashed and went away when I was trying to read multiple texts to identify the object making the noise.”

5.6.3 Preferences. In terms of the procedure of searching for the sound source using the visualization methods, most of the participants (10/14) indicated that they first used the Full Mini-Map to identify the direction of the correct sound clip, and then used the On-Object Indicator to locate the specific sound source. For example, G06 said: “I looked at the map where the icon is, then I search around the room to find the source of the sound where the icon marked on the map.” This largely followed the pattern of locating the sound source using FM-OI that the participants described in Study 1.

In addition, we investigated the participants’ preferences for the four tested Sound-Type Indicator visualization methods, especially preferences for using text or icons to indicate the sound types. For the overall preferred sound visualization method of Study 2, the participants provided diverse feedback - IM-IO: 4/14, IM-TO: 3/14, TM-IO: 3/14, TM-TO: 4/14. The participants who preferred icons indicated that icons had a lower cognitive load, allowing them to focus on the game content. For example, G16 mentioned: “I thought that iconic indicators are easier/quicker to understand than texts. I do not prefer reading a lot when I play VR games.” The participants who tended to use text as Sound-Type Indicators voiced that text is more noticeable in the scene and can clearly convey its meaning. As noted by G10, “Text is easier for me to notice on the map, and also to identify the specific type of noise.”

While we could not come to a consensus on what might be the most preferred method for indicating sound types, this also indicates that future designs of Sound-Type Indicators may allow the user to customize what is displayed based on their own preferences.

6 DISCUSSION, LIMITATIONS AND FUTURE WORK
The results of Study 1 show that FM-OI had the best performance in localization accuracy and task completion time. The results from the questionnaires reveal that FM-OI has good system usability and lower subjective mental workloads. Also, the participants’ feedback shows that FM-OI allowed them to pay more attention to the game content while spending less effort in locating the sound source.

The results of Study 2 show that the four tested Sound-Type Indicators have similar performance on localization accuracy and task completion time. The participants could identify the sound type of a sound source and complete the tasks with the assistance of the tested sound visualization methods. Moreover, the participants had diverse preferences among the four tested Sound-Type Indicators in terms of using text or icons to indicate the sound types in the visualization methods, indicating that Sound-Type Indicators should be customizable based on individual preferences.

There are several limitations we wish to address.

6.1 Limitations
Our current study addressed four sound categories from the sound taxonomy in VR proposed by Dhruv Jain et al. [18], while another five sound categories were not explored here. These unaddressed categories were Non-localized speech, Notification sounds, Interaction sounds, Surrounding ambience, and Music. To provide a fully immersive and accessible sound experience, our future studies will explore these unaddressed categories.

We evaluated our sound visualization methods in a scenario with at most three synchronous sound sources in our Study 2. The performance of the sound visualization methods in scenarios with more than three concurrent sound sources (e.g., a large crowd of talking people) was untested. In that case, filtering strategies for sounding objects might need to be investigated.

The Full Mini-Map component in our proposed sound visualization methods indicated sounds within its circular range. For sounds that were outside the range, DHH users can only identify those that were within their field of view, with the assistance of On-Object Sound Indicators. We aim to explore visualizing the distant sounds on the Full Mini-Map in our future study.

In our second study, as a starting point, we selected the sound type icons and text descriptions (and their display parameters such as the size, black and white color, etc.) based on trial and error with a few pilot studies. However, such content can be changed based on user preferences such as the icon types, shorter/longer descriptions, text and icon sizes/colors, etc. Therefore, these parameters should be investigated in the future from a user preference perspective as well as from a content designer’s perspective.

6.2 Future Works
In our future studies, we aim to explore integrating SoundVizVR into different genres of VR applications, especially VR games, to further evaluate its performance. In addition, we aim to explore SoundVizVR in 3D space [39] and how it will affect the immersive experience, and to explore other visualization techniques. Options that enable customization of the integrated SoundVizVR components (e.g., changing the position, size, opacity, or background of the Mini-Maps, or changing the size or color of the icon and text in the Sound-Type Indicator) will also be explored in our future studies.

We also intend to explore using the SoundVizVR plugin with VR content designers. This would enable us to determine any preferences regarding the method and the plugin from a game designer’s perspective, such as how easy it is to add the plugin to the workflow, how customizable the experiences are, etc.

In addition, to gain a better understanding of SoundVizVR, we look forward to applying eye-tracking devices in our user experiments to see how participants perceive information from the user interface. We also look forward to evaluating SoundVizVR with other user groups (e.g., hearing people) to explore whether it could address other accessibility issues, such as situational impairments among hearing people while they use VR applications [37]. This knowledge may help further improve our designs.

7 CONCLUSION
In this paper, we proposed SoundVizVR, which aims to advance sound accessibility in VR environments for DHH users. We conducted a user experiment with 11 DHH participants to identify the best performing and most preferred Sound-Characteristic Indicator method from six design combinations. Furthermore, to evaluate the performance and user experience of four different Sound-Type Indicator visualization methods, we conducted another user study with 14 DHH participants. Participants’ task performance and feedback indicated that SoundVizVR can assist them in locating sound sources, identifying sound characteristics, and identifying the sound types of in-game sound effects in VR environments.

REFERENCES
[1] Belén Agulló, Mario Montagud, and Isaac Fraile. 2019. Making interaction with virtual reality accessible: rendering and guiding methods for subtitles. Artificial Intelligence for Engineering Design, Analysis and Manufacturing 33 (11 2019), 1–13. https://doi.org/10.1017/S0890060419000362
[2] Aaron Bangor, Philip Kortum, and James Miller. 2009. Determining What Individual SUS Scores Mean: Adding an Adjective Rating Scale. J. Usability Studies 4, 3 (May 2009), 114–123.
[3] Flávio Coutinho, Raquel O. Prates, and Luiz Chaimowicz. 2011. An Analysis of Information Conveyed through Audio in an FPS Game and Its Impact on Deaf Players Experience. In 2011 Brazilian Symposium on Games and Digital Entertainment. 53–62. https://doi.org/10.1109/SBGAMES.2011.16
[4] Sigal Eden and Sara Ingber. 2014. Enhancing Storytelling Ability with Virtual Environment among Deaf and Hard-of-Hearing Children. In Computers Helping People with Special Needs, Klaus Miesenberger, Deborah Fels, Dominique Archambault, Petr Peňáz, and Wolfgang Zagler (Eds.). Springer International Publishing, Cham, 386–392.
[5] 3D Everything. 2021. Kids Room | 3D Interior | Unity Asset Store. Retrieved April 13, 2022 from https://assetstore.unity.com/packages/3d/props/interior/kids-room-48596
[6] Facebook. 2022. Oculus Rift S: PC-Powered VR Gaming Headset | Oculus. Retrieved April 13, 2022 from https://www.oculus.com/rift-s/
[7] Epic Games. 2022. Fortnite – a free-to-play Battle Royale game and more. Retrieved April 12, 2022 from https://www.epicgames.com/fortnite/en-US/home

[8] William W. Gaver. 1987. Auditory Icons: Using Sound in Computer Interfaces. SIGCHI Bull. 19, 1 (Jul 1987), 74. https://doi.org/10.1145/28189.1044809
[9] Kathrin Gerling, Patrick Dickinson, Kieran Hicks, Liam Mason, Adalberto L. Simeone, and Katta Spiel. 2020. Virtual Reality Games for People Using Wheelchairs. Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3313831.3376265
[10] Kathrin Gerling and Katta Spiel. 2021. A Critical Examination of Virtual Reality Technology in the Context of the Minority Body. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 599, 14 pages. https://doi.org/10.1145/3411764.3445196
[11] Ru Guo, Yiru Yang, Johnson Kuang, Xue Bin, Dhruv Jain, Steven Goodman, Leah Findlater, and Jon Froehlich. 2020. HoloSound: Combining Speech and Sound Identification for Deaf or Hard of Hearing Users on a Head-Mounted Display. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 71, 4 pages. https://doi.org/10.1145/3373625.3418031
[12] S. G. Hart. 2006. Nasa-Task Load Index (NASA-TLX); 20 Years Later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 50 (2006), 904–908.
[13] Chris J. Hughes, Marta B. Zapata, Matthew Johnston, and Pilar Orero. 2020. Immersive Captioning: Developing a framework for evaluating user needs. In 2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). 313–318. https://doi.org/10.1109/AIVR50618.2020.00063
[14] Dhruv Jain, Bonnie Chinh, Leah Findlater, Raja Kushalnagar, and Jon Froehlich. 2018. Exploring Augmented Reality Approaches to Real-Time Captioning: A Preliminary Autoethnographic Study. In Proceedings of the 2018 ACM Conference Companion Publication on Designing Interactive Systems (Hong Kong, China) (DIS ’18 Companion). Association for Computing Machinery, New York, NY, USA, 7–11. https://doi.org/10.1145/3197391.3205404
[15] Dhruv Jain, Brendon Chiu, Steven Goodman, Chris Schmandt, Leah Findlater, and Jon E. Froehlich. 2020. Field Study of a Tactile Sound Awareness Device for Deaf Users. In Proceedings of the 2020 International Symposium on Wearable Computers
[25] Paul Milgram and Fumio Kishino. 1994. A taxonomy of mixed reality visual displays. IEICE TRANSACTIONS on Information and Systems 77, 12 (1994), 1321–1329.
[26] Mohammadreza Mirzaei, Peter Kán, and Hannes Kaufmann. 2020. EarVR: Using Ear Haptics in Virtual Reality for Deaf and Hard-of-Hearing People. IEEE Transactions on Visualization and Computer Graphics 26, 5 (2020), 2084–2093. https://doi.org/10.1109/TVCG.2020.2973441
[27] Mojang. 2022. Minecraft Official Site | Minecraft. Retrieved April 13, 2022 from https://www.minecraft.net/en-us
[28] Jorge Mori and Deborah I. Fels. 2009. Seeing the music can animated lyrics provide access to the emotional content in music for people who are deaf or hard of hearing?. In 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH). 951–956. https://doi.org/10.1109/TIC-STH.2009.5444362
[29] Suranga Nanayakkara, Elizabeth Taylor, Lonce Wyse, and S H. Ong. 2009. An Enhanced Musical Experience for the Deaf: Design and Evaluation of a Music Display and a Haptic Chair. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 337–346. https://doi.org/10.1145/1518701.1518756
[30] Suranga Chandima Nanayakkara, Lonce Wyse, S. H. Ong, and Elizabeth A. Taylor. 2013. Enhancing Musical Experience for the Hearing-Impaired Using Visual and Haptic Displays. Human–Computer Interaction 28, 2 (2013), 115–160. https://doi.org/10.1080/07370024.2012.697006
[31] De Xing Ong, Kai Xiang Chia, Yi Yi Huang, Jasper Teck Siong Teo, Jezamine Tan, Melissa Lim, Dongyu Qiu, Xinxing Xia, and Frank Yunqing Guan. 2021. Smart Captions: A Novel Solution for Closed Captioning in Theatre Settings with AR Glasses. In 2021 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI). 1–5. https://doi.org/10.1109/SOLI54607.2021.9672391
[32] Shanmugam Muruga Palaniappan, Ting Zhang, and Bradley S. Duerstock. 2019. Identifying Comfort Areas in 3D Space for Persons with Upper Extremity Mobility Impairments Using Virtual Reality. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19).
(Virtual Event, Mexico) (ISWC ’20). Association for Computing Machinery, New Association for Computing Machinery, New York, NY, USA, 495–499. https:
York, NY, USA, 55–57. https://doi.org/10.1145/3410531.3414291 //doi.org/10.1145/3308561.3353810
[16] Dhruv Jain, Leah Findlater, Jamie Gilkeson, Benjamin Holland, Ramani Du- [33] David Passig and Sigal Eden. 2001. Virtual reality as a tool for improving spatial
raiswami, Dmitry Zotkin, Christian Vogler, and Jon E. Froehlich. 2015. Head- rotation among deaf and hard-of-hearing children. CyberPsychology & Behavior
Mounted Display Visualizations to Support Sound Awareness for the Deaf and 4, 6 (2001), 681–686.
Hard of Hearing. In Proceedings of the 33rd Annual ACM Conference on Hu- [34] Yi-Hao Peng, Ming-Wei Hsi, Paul Taele, Ting-Yu Lin, Po-En Lai, Leon Hsu,
man Factors in Computing Systems (Seoul, Republic of Korea) (CHI ’15). As- Tzu-chuan Chen, Te-Yen Wu, Yu-An Chen, Hsien-Hui Tang, and Mike Y. Chen.
sociation for Computing Machinery, New York, NY, USA, 241–250. https: 2018. SpeechBubbles: Enhancing Captioning Experiences for Deaf and Hard-
//doi.org/10.1145/2702123.2702393 of-Hearing People in Group Conversations. In Proceedings of the 2018 CHI
[17] Dhruv Jain, Khoa Huynh Anh Nguyen, Steven M. Goodman, Rachel Grossman- Conference on Human Factors in Computing Systems (Montreal QC, Canada)
Kahn, Hung Ngo, Aditya Kusupati, Ruofei Du, Alex Olwal, Leah Findlater, and Jon (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–10.
E. Froehlich. 2022. ProtoSound: A Personalized and Scalable Sound Recognition https://doi.org/10.1145/3173574.3173867
System for Deaf and Hard-of-Hearing Users. In Proceedings of the 2022 CHI [35] Sandra Poeschl, Konstantin Wall, and Nicola Doering. 2013. Integration of spatial
Conference on Human Factors in Computing Systems (New Orleans, LA, USA) sound in immersive virtual environments an experimental study on efects of
(CHI ’22). Association for Computing Machinery, New York, NY, USA, Article spatial sound on presence. In 2013 IEEE Virtual Reality (VR). 129–130. https:
305, 16 pages. https://doi.org/10.1145/3491102.3502020 //doi.org/10.1109/VR.2013.6549396
[18] Dhruv Jain, Sasa Junuzovic, Eyal Ofek, Mike Sinclair, John Porter, Chris Yoon, [36] Sylvia Rothe, Kim Tran, and Heinrich Hussmann. 2018. Positioning of Subtitles
Swetha Machanavajhala, and Meredith Ringel Morris. 2021. A Taxonomy of in Cinematic Virtual Reality. https://doi.org/10.2312/egve.20181307
Sounds in Virtual Reality. In Designing Interactive Systems Conference 2021 (Virtual [37] Zhanna Sarsenbayeva, Niels van Berkel, Chu Luo, Vassilis Kostakos, and Jorge
Event, USA) (DIS ’21). Association for Computing Machinery, New York, NY, Goncalves. 2017. Challenges of Situational Impairments during Interaction with
USA, 160–170. https://doi.org/10.1145/3461778.3462106 Mobile Devices. In Proceedings of the 29th Australian Conference on Computer-
[19] Dhruv Jain, Sasa Junuzovic, Eyal Ofek, Mike Sinclair, John R. Porter, Chris Yoon, Human Interaction (Brisbane, Queensland, Australia) (OZCHI ’17). Association
Swetha Machanavajhala, and Meredith Ringel Morris. 2021. Towards Sound for Computing Machinery, New York, NY, USA, 477–481. https://doi.org/10.
Accessibility in Virtual Reality. In Proceedings of the 2021 International Conference 1145/3152771.3156161
on Multimodal Interaction (Montréal, QC, Canada) (ICMI ’21). Association for [38] Stefania Serafn and Giovanni Serafn. 2004. Sound Design to Enhance Presence
Computing Machinery, New York, NY, USA, 80–91. https://doi.org/10.1145/ in Photorealistic Virtual Reality. In ICAD.
3462244.3479946 [39] Lichao Shen, MHD Yamen Saraiji, Kai Kunze, Kouta Minamizawa, and
[20] Dhruv Jain, Kelly Mack, Akli Amrous, Matt Wright, Steven Goodman, Leah Roshan Lalintha Peiris. 2020. Visuomotor Infuence of Attached Robotic Neck
Findlater, and Jon E. Froehlich. 2020. HomeSound: An Iterative Field Deployment of Augmentation. In Symposium on Spatial User Interaction (Virtual Event, Canada)
an In-Home Sound Awareness System for Deaf or Hard of Hearing Users. Association (SUI ’20). Association for Computing Machinery, New York, NY, USA, Article 14,
for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/ 10 pages. https://doi.org/10.1145/3385959.3418460
3313831.3376758 [40] Alexa F. Siu, Mike Sinclair, Robert Kovacs, Eyal Ofek, Christian Holz, and Edward
[21] Peter Keller and Catherine Stevens. 2004. Meaning From Environmental Sounds: Cutrell. 2020. Virtual Reality Without Vision: A Haptic and Auditory White Cane
Types of Signal-Referent Relations and Their Efect on Recognizing Auditory to Navigate Complex Virtual Worlds. Association for Computing Machinery, New
Icons. Journal of experimental psychology. Applied 10 (04 2004), 3–12. https: York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376353
//doi.org/10.1037/1076-898X.10.1.3 [41] Sean M. Smith and Glen N. Williams. 1997. A Visualization of Music. In Proceed-
[22] Firesprite Limited. 2021. The Persistence - Games | PlayStation®. Retrieved ings of the 8th Conference on Visualization ’97 (Phoenix, Arizona, USA) (VIS ’97).
April 12, 2022 from https://www.playstation.com/en-us/games/the-persistence/ IEEE Computer Society Press, Washington, DC, USA, 499–f.
[23] Carme Mangiron. 2016. Reception of game subtitles: an empirical study. The [42] Axel Stockburger. 2003. The game environment from an auditive perspective.
Translator 22, 1 (2016), 72–93. https://doi.org/10.1080/13556509.2015.1110000 Level Up (2003), 4–6.
arXiv:https://doi.org/10.1080/13556509.2015.1110000 [43] Valday Team. 2016. Vintage 50s Radio and Phone | 3D Electronics | Unity Asset
[24] Tara Matthews, Janette Fong, F Wai-Ling Ho-Ching, and Jennifer Mankof. 2006. Store. Retrieved April 13, 2022 from https://assetstore.unity.com/packages/3d/
Evaluating non-speech sound visualizations for the deaf. Behaviour & Information props/electronics/vintage-50s-radio-and-phone-71657
Technology 25, 4 (2006), 333–351. [44] Mauro Teóflo, Alvaro Lourenço, Juliana Postal, and Vicente Lucena Jr. 2018.
Exploring Virtual Reality to Enable Deaf or Hard of Hearing Accessibility in Live
SoundVizVR: Accessible VR Sounds for DHH Users with Sound Indicators ASSETS ’22, October 23–26, 2022, Athens, Greece

Theaters: A Case Study. 132–148. https://doi.org/10.1007/978-3-319-92052-8_11 [48] Bei Yuan, Eelke Folmer, and Frederick C. Harris. 2011. Game Accessibility: A
[45] vUv. 2021. Speakers PBR | 3D Electronics | Unity Asset Store. Retrieved April 13, Survey. Univers. Access Inf. Soc. 10, 1 (mar 2011), 81–100. https://doi.org/10.1007/
2022 from https://assetstore.unity.com/packages/3d/props/electronics/speakers- s10209-010-0189-5
pbr-111606 [49] Veronica Zammitto. 2008. VISUALIZATION TECHNIQUES IN VIDEO GAMES.
[46] Ryan Wedof, Lindsay Ball, Amelia Wang, Yi Xuan Khoo, Lauren Lieberman, https://doi.org/10.14236/ewic/EVA2008.30
and Kyle Rector. 2019. Virtual Showdown: An Accessible Virtual Reality Game [50] Yuhang Zhao, Edward Cutrell, Christian Holz, Meredith Ringel Morris, Eyal
with Scafolds for Youth with Visual Impairments. In Proceedings of the 2019 Ofek, and Andrew D. Wilson. 2019. SeeingVR: A Set of Tools to Make Virtual
CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Reality More Accessible to People with Low Vision. In Proceedings of the 2019
Uk) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–15. CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland
https://doi.org/10.1145/3290605.3300371 Uk) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–14.
[47] Minecraft Wiki. 2022. Subtitles – Minecraft Wiki. Retrieved April 12, 2022 from https://doi.org/10.1145/3290605.3300341
https://minecraft.fandom.com/wiki/Subtitles
Uncovering Visually Impaired Gamers’ Preferences for Spatial
Awareness Tools Within Video Games
Vishnu Nair (Columbia University, New York, New York, USA)
Shao-en Ma (Columbia University, New York, New York, USA)
Ricardo E. Gonzalez Penuela (Cornell University, Cornell Tech, New York, New York, USA)
Yicheng He (Columbia University, New York, New York, USA)
Karen Lin (Columbia University, New York, New York, USA)
Mason Hayes (Rochester Institute of Technology, Rochester, New York, USA)
Hannah Huddleston (Stanford University, Stanford, California, USA)
Matthew Donnelly (Bowdoin College, Brunswick, Maine, USA)
Brian A. Smith (Columbia University, New York, New York, USA)

Figure 1: Illustrations of the four spatial awareness tools we implemented within Dungeon Escape. These approaches — the smartphone map, the whole-room shockwave, the directional scanner, and the simple audio menu — represent a broad range of designs for facilitating spatial awareness for VIPs. However, we do not yet understand what the relative merits and limitations of each approach are.
ABSTRACT
Sighted players gain spatial awareness within video games through sight and spatial awareness tools (SATs) such as minimaps. Visually impaired players (VIPs), however, must often rely heavily on SATs to gain spatial awareness, especially in complex environments where using rich ambient sound design alone may be insufficient. Researchers have developed many SATs for facilitating spatial awareness within VIPs. Yet this abundance disguises a gap in our understanding about how exactly these approaches assist VIPs in gaining spatial awareness and what their relative merits and limitations are. To address this, we investigate four leading approaches to facilitating spatial awareness for VIPs within a 3D video game context. Our findings uncover new insights into SATs for VIPs within video games, including that VIPs value position and orientation information the most from an SAT; that none of the approaches we investigated convey position and orientation effectively; and that VIPs highly value the ability to customize SATs.
CCS CONCEPTS
• Human-centered computing → Auditory feedback; Accessibility technologies; Accessibility systems and tools.

KEYWORDS
Audio navigation tools; spatial awareness tools; blind-accessible games; visual impairments

ACM Reference Format:
Vishnu Nair, Shao-en Ma, Ricardo E. Gonzalez Penuela, Yicheng He, Karen Lin, Mason Hayes, Hannah Huddleston, Matthew Donnelly, and Brian A. Smith. 2022. Uncovering Visually Impaired Gamers' Preferences for Spatial Awareness Tools Within Video Games. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544802

1 INTRODUCTION
Mainstream 3D video games are largely inaccessible to visually impaired players (VIPs) because they often lack crucial accessibility tools [5, 46]. Although some recent mainstream games [16] have made strides in making certain in-game abilities accessible to VIPs, many crucial abilities remain inaccessible. Among these is the ability for players to gain spatial awareness of their surroundings, which prior work has established is crucial for granting VIPs an enhanced sense of space and presence within the game world [2, 4].

Sighted players gain spatial awareness using a combination of vision and auxiliary tools such as minimaps and world maps [7, 13, 14, 58, 67], which we will refer to as spatial awareness tools (SATs). VIPs, however, are not able to benefit from vision in the same way as sighted players. Although rich environmental sound design can ensure some level of accessibility and spatial awareness for VIPs, these elements can prove to be insufficient, especially in complex virtual environments. As such, SATs are often an indispensable view of the game world for VIPs, having an immense impact on their experience within a game.

Researchers have developed several types of SATs which represent very different approaches for facilitating a representation of the game world for VIPs. These include touchscreen maps, shockwave-like systems, directional scanners, and audio-based menus. This relative abundance, however, disguises a gap in our understanding about how exactly these approaches assist VIPs in gaining spatial awareness and what the relative merits and limitations of each approach are. It is not yet clear to developers, for example, whether it is best to use a shockwave-like system like The Last of Us Part 2 did [16] or audio-based menus like Terraformers did [45, 63], and blind gamers are ultimately the ones who suffer. As such, we take a step back and ask two important research questions:

RQ1: What aspects of spatial awareness do VIPs find important within games?
RQ2: How well do today's differing SAT approaches facilitate each aspect of spatial awareness, and why?

In this work, we investigate RQ1 and RQ2 by implementing four leading approaches to facilitating spatial awareness for VIPs and investigating their merits and limitations. The four tools, illustrated in Figure 1, are a smartphone map, a whole-room shockwave, a directional scanner, and a simple audio menu of points-of-interest. Together, they represent a broad range of design choices, including touchscreen-based vs. game controller interaction and "all-at-once" (collective) overviews vs. pointer-based scanning.

In order to investigate these research questions, we conducted a user study in which nine visually impaired participants played multiple levels of a 3D adventure video game using our tools.

For RQ1, we evaluated how important VIPs consider six different types of spatial awareness — that have been found to be important in physical world settings [20, 22, 24, 29, 31, 34, 49, 65] — to be in comparison to each other within a video game context. Section 2.1 identifies the six types. We observed that participants considered position and orientation to be the most important type of spatial awareness and that they considered the scale and shape of an area to be the least important aspects of spatial awareness. These particular findings reflect existing work within the physical world that highlights the importance of position and orientation for VIPs [24, 31, 49], but differ from existing work within the physical world that has also found knowledge of area scale and shape to be an important component of better understanding an area, especially when freely exploring it [4, 30].

With respect to RQ2, we observed that each tool had its own strength for VIPs: the directional scanner communicated the arrangement of items very well; the simple audio menu communicated the presence of items very well; the smartphone map communicated the shape of an area very well; and the whole-room shockwave communicated the scale of the area well. Importantly, however, we also discovered significant deficiencies in today's spatial awareness tools. No tool excelled across the board, and in particular, none of the tools communicated position and orientation very well despite our finding from RQ1 that position and orientation is the most important type of spatial awareness to VIPs within games. Furthermore, we found issues with the tools that influenced how effectively they communicated spatial awareness to players, including that some of the tools provided too much information.

Together, our findings from RQ1 and RQ2 reveal important design implications for future spatial awareness tools for VIPs within video games, and we present these in our Discussion. We also discuss the potential for developing purpose-built hardware for spatial awareness and how our findings within virtual worlds can inspire further research in physical world navigation and exploration.

2 BACKGROUND AND RELATED WORK
Our work is built upon a rich history of prior work on facilitating spatial awareness for VIPs, both within the physical world and within video games.

We begin this section by explaining what we mean by "spatial awareness" — in particular, by reviewing aspects of spatial awareness that are known to be important to VIPs within physical world contexts (Section 2.1). RQ1 will investigate the relative importance of these aspects for VIPs within games. We then review existing techniques for facilitating spatial awareness for VIPs, both in the physical world and within video game environments — and review the tradeoffs inherent within the design of these tools (Section 2.2). Through RQ2, we take a step back and investigate the relative merits and limitations of these approaches.
2.1 What do we mean by "spatial awareness"?
Spatial awareness, as used in this work, refers to a user's awareness of their surrounding environment and of their own state within the environment [35, 64]. Past literature within physical world contexts has shown spatial awareness to be multifaceted. Thus, in this work, we investigate RQ1 and RQ2 with respect to six distinct aspects of spatial awareness we identified through prior work. Specifically, we looked through existing research in cognitive map formation and spatial awareness for VIPs within the physical world and looked for explicit information on what aspects of spatial awareness are most important to VIPs. We chose to investigate the following six types of spatial awareness since they were mentioned as important across a breadth of prior research [20, 22, 24, 29, 31, 34, 49, 65]:

Types 1 & 2: Scale and shape of the area. Prior work — mainly in tactile maps [30, 49] and echolocation [4] — has found area shape to be important to VIPs in obtaining a general impression of the area, which can be especially crucial when exploring and trying to learn about the environment.

Type 3: Position & orientation. Researchers have found that understanding where one is within a mental map of the area (for example, their Cartesian coordinates or their heading direction in degrees) — that is, within an allocentric [34], map-like mental representation of the environment — is vital to continuously updating their own current state within the environment and thus effectively moving through it [20, 22, 34]. Yet, prior work in physical world contexts [24] has shown that obtaining this understanding is especially demanding for VIPs.

Types 4 & 5: Presence and arrangement of items. Researchers have emphasized that providing VIPs with the information necessary to perceive the locations of objects can allow them to infer spatial relationships between objects and can lead to increased spatial awareness and more accurate cognitive maps [24, 29].

Type 6: Areas adjacent to the player's current area. Prior work with physical world tactile maps [49] and mobile-based spatial tactile feedback for communicating geographical information [65] has underscored the importance of understanding the global structure of the world — general overviews of an area and spatial relationships between multiple areas — for VIPs, which can help them plan out routes and backtrack as needed.

Although prior work has determined these six aspects of spatial awareness to be important to VIPs in the physical world, video games are very different from the physical world. Within the physical world, practicality and physical safety are extremely important factors [6], while in video games, agency and pleasure (fun) are very important, and VIPs' in-game "safety" may not always be a major concern. It is possible that, due to these differences, VIPs may find certain aspects of spatial awareness more or less important within games when compared to the physical world. We thus use RQ1 to explore these preferences.

2.2 How do games supplement spatial awareness?
Games made for VIPs often use ambient signals to provide implicit spatial awareness to players. These ambient signals usually take the form of environmental audio cues that communicate information about the player's immediate environment. For example, hearing running water may indicate that there is a waterfall or stream near the player. When environmental sounds reverberate, the player may realize that they are inside a cave or tunnel, and the extent of the reverberation can indicate the size of the cave or tunnel. The use of 3D sound can additionally communicate the relative direction that the source of sound is in with respect to the player.

Although ambient signals may be sufficient for simple environments, they can become less useful to players as environments become more complex, as is typical for many mainstream 3D games. Ambient cues can become overwhelming when there are too many items in the environment, and they may also be vague, giving players little information about what the sounds they are hearing actually represent. As a result, using ambient signals alone as a means to facilitate spatial awareness for players limits the complexity of games that accessible game designers are able to make. Accessible game designers thus face a tradeoff between designing environments that are interesting and designing games that are still accessible and playable by VIPs [2, 52].

Given the limitations of implicit forms of spatial awareness, accessible game designers often turn to creating tools that explicitly communicate spatial awareness information to players. These spatial awareness tools (or SATs) — which include (but are not limited to) tactile maps, radar systems, and grid systems — supplement implicit spatial awareness cues by clarifying environmental information and affording players greater control over what information they hear and when they hear it.

Table 1 shows an overview of SATs from prior work. Below we review some explicit approaches for facilitating spatial awareness in games and in the physical world.

2.2.1 Facilitating spatial awareness for VIPs within video games. Tools that explicitly communicate spatial awareness information to VIPs are not commonplace within mainstream video games. Most examples, instead, come from "audio games" (audio-based games created for VIPs), which generally provide players with spatial awareness by presenting environments in the form of lists and grids that players can query. This technique is employed by many well-known audio games, including Terraformers [45, 63], A Hero's Call [44], and ShadowRine [39]. These representations may communicate the presence and arrangement (Types 4 & 5 from Section 2.1) of items and points-of-interest and are sometimes further supplemented by additional tools such as radars and compasses.

Several examples of SATs have come from the research community as well. A notable example is NavStick [41, 42], which repurposes a game controller's right thumbstick to allow VIPs to "look around" their in-game surroundings via line-of-sight. A directional scanning system like NavStick could allow VIPs to determine the presence and spatial arrangement of objects around them (Types 4 & 5) as well as their relative position and orientation within the game world (Type 3).
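A directional scanner of this kind reduces to a simple geometric query: given the bearing the thumbstick points at, report the nearest object within a narrow cone of that bearing. The sketch below is our own illustration of that idea, not NavStick's actual implementation; the function name, cone width, and object list are assumptions made for the example.

```python
import math

def scan_direction(player_pos, stick_angle_deg, objects, cone_deg=15):
    """Return the nearest labeled object within a narrow cone of the
    bearing the thumbstick points at, or None if nothing is there.

    objects: list of (label, (x, y)) world positions.
    stick_angle_deg: 0 = east, 90 = north (counterclockwise).
    """
    px, py = player_pos
    best, best_dist = None, math.inf
    for label, (ox, oy) in objects:
        dx, dy = ox - px, oy - py
        dist = math.hypot(dx, dy)
        if dist == 0:
            continue  # skip objects at the player's own position
        bearing = math.degrees(math.atan2(dy, dx))
        # Smallest signed difference between the two angles, in (-180, 180].
        diff = (bearing - stick_angle_deg + 180) % 360 - 180
        if abs(diff) <= cone_deg / 2 and dist < best_dist:
            best, best_dist = label, dist
    return best

# Example: a door due north and a chest due east of the player.
objects = [("door", (0, 5)), ("chest", (4, 0))]
print(scan_direction((0, 0), 90, objects))  # prints "door"
print(scan_direction((0, 0), 0, objects))   # prints "chest"
```

In a real game engine the distance computation would be replaced by a raycast, so that walls occlude objects behind them; the angular-cone test above captures only the "point to hear" interaction.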
Artifact | Ambient spatial awareness cues | Explicit spatial awareness tools (SATs) | SAT(s) in our study
ShadowRine (virtual) [39] | Audio cues in environment (e.g., enemy & object sounds) | Tactile display showing top-down view of player's location | Smartphone map
The Last of Us Part 2 (virtual) [16] | Audio cues in environment (with "audio cue glossary") | "Enhanced listen mode" (shockwave-like tool) | Whole-room shockwave
NavStick (virtual) [41, 42] | Audio cues in environment (e.g., enemy & checkpoint sounds) | "NavStick" (direction-based object scanner) | Directional scanner
A Hero's Call (virtual) [44] | Audio cues in environment | Menu of nearby points of interest | Simple audio menu
Swamp (virtual) [32] | Audio cues in environment (e.g., zombie growls) | "Radar" (beeps based upon empty space or solid walls); audio menu (with compass- & tile-based guidance) | Whole-room shockwave; simple audio menu
Terraformers (virtual) [45, 63] | Audio cues in environment | "Sonar" (provides distance to object in current facing direction); "GPS" (audio-based menu of nearby objects & positions) | Directional scanner; simple audio menu
SmartTactMaps (physical) [28] | Sounds from physical environment | Smartphone-based augmentation of physical tactile map | Smartphone map
Timbremap (physical) [56] | Sounds from physical environment | Touchscreen-based 2D map exploration | Smartphone map
Echolocation (physical) [33, 43, 57] | Sounds from physical environment | Behavior of reflected sounds within the environment | Whole-room shockwave
Talking Points 3 (physical) [64] | Sounds from physical environment | "Directional Finder" (direction-based landmark scanner) | Directional scanner
MS Soundscape (physical) [40] | Sounds from physical environment | Selection of points of interest from a menu; notifications about nearby landmarks using 3D sound | Simple audio menu
SpaceSense (physical) [65] | Sounds from physical environment | Vibration cues indicating the direction of a location selected from a menu | Simple audio menu
NavCog3 (physical) [50] | Sounds from physical environment | Audio notifications of immediate surroundings | N/A

Table 1: An overview of prior work in communicating spatial awareness to VIPs within both virtual and physical contexts. These artifacts represent a variety of ideas ranging from audio-based solutions to tactile solutions. For each, we present the ambient/implicit spatial awareness cues that it provides, the explicit spatial awareness tools (SATs) it introduces, and the corresponding SAT(s) in our study. The four SATs we implement collectively represent a significant portion of prior work.
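The list-based menus that several of the systems in Table 1 use (e.g., the "GPS" in Terraformers or the points-of-interest menu in A Hero's Call) can be thought of as a distance-sorted query over nearby points-of-interest. The following sketch is our own hypothetical illustration of that idea; the POI names, range cutoff, and spoken-line format are assumptions, not taken from any of the systems above.

```python
import math

def audio_menu(player_pos, points_of_interest, max_range=20):
    """Build a simple audio menu: nearby points-of-interest sorted by
    distance, each rendered as the line a screen reader would speak.

    points_of_interest: list of (name, (x, y)) world positions.
    """
    px, py = player_pos
    entries = []
    for name, (x, y) in points_of_interest:
        dist = math.hypot(x - px, y - py)
        if dist <= max_range:  # out-of-range POIs are omitted entirely
            entries.append((dist, name))
    entries.sort()  # nearest first
    return [f"{name}: {dist:.0f} meters" for dist, name in entries]

pois = [("fountain", (3, 4)), ("exit door", (10, 0)), ("tower", (90, 0))]
for line in audio_menu((0, 0), pois):
    print(line)
# fountain: 5 meters
# exit door: 10 meters
```

The design choice worth noting is the range cutoff: unlike a shockwave or scanner, a menu has no inherent spatial limit, so the designer must decide how far away an item can be and still appear in the list.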

A notable exception to the lack of SATs in mainstream games is The Last of Us Part 2, a 3D action-adventure game released in 2020 [16], which introduced an "enhanced listen mode" for VIPs. The enhanced listen mode provides spatial awareness to players by placing 3D audio beacons at the locations of nearby enemies and other points-of-interest on the press of a button. The beacons may give players a sense of the spatial arrangement of items in the area (Type 5) as well as a sense of the surrounding area's scale (Type 1).

2.2.2 Facilitating spatial awareness for VIPs in the physical world. Some audio-based tools within the physical world have features that explicitly provide VIPs with spatial awareness information and can thus inform the design of SATs for game worlds. NavCog3 [50], a turn-by-turn indoor navigation system for VIPs, for example, emits notifications about nearby landmarks and points-of-interest to promote awareness in the user of their presence (Type 4). Similarly, Microsoft Soundscape [40], an audio-based wayfinding system that can be used by VIPs, uses 3D sound to communicate the presence and relative direction (i.e., arrangement, Type 5) of nearby landmarks. The spatial awareness that these systems provide is, however, limited. For example, they do not provide any information about the area's shape and size (Types 1 & 2).

Tactile-based systems provide spatial awareness by providing overviews of areas [30, 49], which may include the scale and shape of an area (Types 1 & 2), the presence and arrangement of landmarks and other points-of-interest (Types 4 & 5), and even what areas may be adjacent to a given area (Type 6). These not only include physical tactile maps but also mobile-based tactile systems, such as Timbremap [56] and SmartTactMaps [28], which can allow VIPs to survey the area they are in using a commodity smartphone.

Echolocation, which has been explored for both physical [33, 43, 57] and virtual [2, 3] environments, is another technique that VIPs may use to gain spatial awareness within environments. Using the acoustic properties of the environment can allow individuals to learn about the structure of the area they are in, including the scale and shape of the area (Types 1 & 2), as well as the presence and arrangement of nearby objects (Types 4 & 5) [33, 57].

3 FOUR SPATIAL AWARENESS TOOLS
In order to investigate RQ1 and RQ2, we implemented four existing approaches for giving VIPs spatial awareness of their surroundings that represent a wide range of possible designs. Figure 1 depicts the four approaches. They include a smartphone map, a whole-room shockwave, a directional scanner, and a simple audio menu. We limited our exploration to just four tools to avoid fatiguing our user study participants while still effectively evaluating the tools. Regardless, Table 1 shows that these four tools collectively represent a significant portion of approaches from prior work on explicitly communicating spatial awareness information to VIPs.

We were able to replicate two of the tools (the directional scanner and the simple audio menu) concretely from existing work [41, 42,
45, 63]. For the smartphone map and the whole-room shockwave, however, we went through multiple design iterations because their implementation included many open design decisions that were not fully specified by prior work. In our design iterations, which we describe in Sections 3.1 and 3.2, our focus was to polish the tools' designs and ensure that they showcased the potential of the two SAT approaches in the best possible way so that our results would not be confounded by a potentially bad design.

In order to ensure that our tools most accurately represented current approaches and to ensure that our study procedure was sound, we conducted pilot tests with two visually impaired and eight sighted-but-blindfolded people. We intended for the testing phase that included the sighted-but-blindfolded participants to be a naïve-yet-useful way of catching any low-hanging fruit with respect to procedural, game-related, or tool-related issues before piloting with our visually impaired team members. Our visually impaired team members — whom we hosted as part of the research team during the project — then provided feedback that was critical to developing the final designs of the tools.

3.2 Whole-Room Shockwave
The whole-room shockwave, depicted in the upper-right corner of Figure 1, uses an acoustic shockwave that the player triggers to communicate information about their surroundings. When the shockwave hits anything in the room, an announcement and/or sound effect emanates from that object via 3D sound. The shockwave corresponds to real-world physics in that closer objects will emanate their sounds back to the player before objects that are further away. If the player moves while the shockwave is active, the rate of expansion will match the player's speed.

The whole-room shockwave originated from our explorations in echolocation, which has been shown to promote spatial awareness in VIPs by communicating physical properties of the room and nearby objects [33, 43, 57]. Our initial echolocation prototype had players press a button on their game controller to emit a click sound originating from the player's position, similar to how some VIPs use echolocation within the physical world [33, 36]. Our echolocation prototype was similar to virtual echolocation techniques used in prior work [2] that used Steam Audio's built-in head-related transfer
In the following subsections, we describe the design and imple- function [62] to generate sound refections based on the physical
mentation process of the four SATs. We direct readers to the accom- structure of each room.
panying video fgure for a demonstration of all four tools. We cre- In our pilot tests, however, we found that echolocation by itself
ated all four tools using the Unity game engine (v2020.3.16f1) [59]. was not equivalent to the other tools. While echolocation commu-
nicates the raw layout of an area, the other tools can communicate
raw layout in addition to specifc object information through sound
efects and text-to-speech. Furthermore, our visually impaired pilot
participants were not at all experienced in echolocation and did
3.1 Smartphone Map
not know how to decode and interpret the sound echoes in our
The smartphone map interface, shown in the upper-left corner of game environment; they could only interpret broad qualities of the
Figure 1, uses a smartphone-based touchscreen map that works in area such as how large it was. Although users could learn to use
tandem with the game. The player can use their fnger to survey echolocation, prior work has indicated that it may take weeks for
the map. As the player moves through the level, the map will auto- users to learn how to use click-based echolocation efectively [43].
matically pan and rotate in real-time, allowing users to explicitly We, thus, made modifcations to the initial echolocation design
keep track of their own position and orientation, respectively. and created the whole-room shockwave — a refned and more com-
The smartphone map interface represents prior work in tactile- prehensible version of echolocation. In its frst iteration, the shock-
based maps to support spatial awareness. Tactile maps in the phys- wave announced every item that it hit, which proved to be auditorily
ical world have been shown to support spatial awareness in VIPs overwhelming. Furthermore, both visually impaired participants
by providing general overviews of spaces and landmarks [30, 49]. found the shockwave to be too fast. As a result, our second and
Our work with video games necessitates a digital solution; as such, fnal version halved the speed of the shockwave and implemented
the smartphone map interface we implemented also derives from a fltering mechanism. Players can press the right button on the
prior work in touchscreen-based accessible graphics, particularly D-pad to cycle through four fltering options — all objects, mission-
in presenting foor plans and other maps to VIPs [23, 25, 26, 28, 56]. critical points-of-interest, non-mission-critical (decorative) objects,
When a player places their fnger on the screen, they will begin and walls only. Only items within the selected category will emit
surveying at their position, regardless of where on the screen they sounds during a shockwave.
are touching. As they move their fnger, they will survey the map
relative to their position, with the app announcing anything that
the player touches. The app will announce all items in the world 3.3 Directional Scanner
(as well as the player’s position) using sound efects and/or text- The directional scanner, illustrated in the lower-left corner of Figure
to-speech. The app only reacts to touches within the current room 1, allows players to survey in any direction using the right thumb-
that the player is in — if the player drags their fnger outside the stick. Players use the tool by tilting the thumbstick in any direction.
room, a continuous warning tone will play. This triggers an announcement naming the frst object that lies in
In the frst version of this tool, players started surveying at the that direction via line-of-sight with respect to the player’s current
portion of the map where their fnger touched the screen; however, position and orientation. The announcement is made via 3D sound
our visually impaired pilot participants ended up spending large from the point of the object in space. If the frst object in a direction
amounts of time searching for their current position, which frus- being pointed at is not an object of interest (i.e., a wall or other
trated them. As a result, our second and fnal version registers a generic obstruction), the scanner will emit a 440 Hz sine tone from
player’s initial touch at their current position. the direction of the obstruction.
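The whole-room shockwave's distance-ordered timing and category filter (Section 3.2) can be sketched as below. This is a minimal 2D stand-in, not the paper's Unity implementation; the object names, coordinates, wave speed, and category labels are all illustrative:

```python
import math

def shockwave_schedule(player, objects, wave_speed, category="all"):
    """Return (delay_in_seconds, name) pairs for one shockwave: each
    object sounds when the expanding wavefront reaches it, so closer
    objects are heard before farther ones. Only objects in the selected
    filter category emit sound."""
    px, py = player
    events = []
    for name, x, y, cat in objects:
        if category != "all" and cat != category:
            continue  # filtered out: this category stays silent
        delay = math.hypot(x - px, y - py) / wave_speed
        events.append((round(delay, 2), name))
    return sorted(events)  # nearer objects fire earlier

# Hypothetical room contents; categories loosely mirror the four filters.
room = [("crate", 1.0, 0.0, "decorative"),
        ("key", 3.0, 4.0, "mission"),
        ("north wall", 0.0, 2.0, "wall")]
print(shockwave_schedule((0.0, 0.0), room, wave_speed=2.0))
# -> [(0.5, 'crate'), (1.0, 'north wall'), (2.5, 'key')]
print(shockwave_schedule((0.0, 0.0), room, wave_speed=2.0, category="mission"))
# -> [(2.5, 'key')]
```

Note that halving `wave_speed` doubles every delay — essentially the "too fast" fix that the visually impaired pilot participants prompted.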
ASSETS ’22, October 23–26, 2022, Athens, Greece Nair et al.
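The directional scanner's first-hit query (Section 3.3) can be sketched as follows — a hedged 2D approximation of what is, in the actual game, a Unity line-of-sight raycast; the object names, coordinates, and cone width are invented for illustration:

```python
import math

OBSTRUCTION_TONE_HZ = 440  # generic obstructions get a sine tone, not a name

def scan(player, heading_deg, objects, cone_deg=15.0):
    """Return what the scanner announces for a thumbstick tilt toward
    `heading_deg`: the nearest object inside a narrow cone around that
    direction, or None if nothing lies that way."""
    px, py = player
    best = None
    for name, x, y, is_poi in objects:
        dx, dy = x - px, y - py
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue
        bearing = math.degrees(math.atan2(dy, dx))
        # Smallest signed angle between the pointed direction and the object.
        offset = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
        if offset <= cone_deg / 2 and (best is None or dist < best[0]):
            best = (dist, name, is_poi)
    if best is None:
        return None
    _, name, is_poi = best
    # Points of interest are announced by name; walls and other generic
    # geometry are rendered as a 440 Hz tone from their direction instead.
    return name if is_poi else f"{OBSTRUCTION_TONE_HZ} Hz tone"

objects = [("key", 5.0, 0.0, True), ("wall", 2.0, 0.2, False), ("door", 0.0, 4.0, True)]
print(scan((0.0, 0.0), 0.0, objects))   # wall is the first hit -> "440 Hz tone"
print(scan((0.0, 0.0), 90.0, objects))  # -> "door"
```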

This tool represents prior work that has sought to replicate the act of "looking around" (or directionally scanning an area) to promote spatial awareness for VIPs. We take particular inspiration from NavStick [41, 42], which introduced the concept of directional scanning within game worlds and showed how VIPs enjoyed the ability to survey their game environments directly by "looking around." Some prior work with directional scanning also exists in the physical world. Talking Points 3 [64] is one such example: It features a "Directional Finder" that provides a list of landmarks that lie in the general direction that a VIP points their mobile device.

Our implementation of the directional scanner was derived from NavStick, and we did not implement any major changes to it as a result of our pilot tests.

Figure 2: Overhead views of Dungeon Escape's trial level and four main levels. All levels have a common set of points-of-interest: a start point, a key, an obstacle that the key affords passage through, and a goal checkpoint. Each level, however, possesses a unique layout, allowing us to evaluate the four spatial awareness tools within a variety of layouts.

3.4 Simple Audio Menu

The simple audio menu, shown in the lower-right corner of Figure 1, represents the idea of using a list to promote spatial awareness — in particular, by allowing VIPs to learn about the contents of the area they are currently in. Many audio games made for VIPs, such as Terraformers [45, 63] and A Hero's Call [44], use list- and grid-based representations to present the world to VIPs.

The simple audio menu we implemented exposes an audio-based list of points-of-interest (POIs). Players use the tool by pressing the left bumper button to open a list of POIs within the room they are currently in. As the player scrolls through the list using the D-pad, they will hear each item's associated sound effect and text-to-speech announcement. The simple audio menu is modeled after list interfaces used in some audio games as well as prior research [41, 45, 63] in that it employs an alphabetical ordering of items. Previous research has suggested that, for a linear menu, a stable alphabetical ordering is less confusing than a proximity-based or direction-based ordering, both of which can change as the player moves [41].

Similar to the directional scanner, we did not implement any major changes to the simple audio menu as a result of our pilots.

4 USER STUDY

We performed a user study to investigate two important research questions about SATs within video games for VIPs:

RQ1: What aspects of spatial awareness do VIPs find important within games?

RQ2: How well do today's differing SAT approaches — as represented by the four tools we implemented — facilitate each aspect of spatial awareness, and why?

We created a 3D adventure game called Dungeon Escape to investigate these two research questions. We included the four tools within Dungeon Escape and used the game to run a user study with VIPs. In this section, we describe Dungeon Escape and our user study.

4.1 Game: Dungeon Escape

Dungeon Escape is a 3D third-person adventure game set in a fantasy world, in which the player must escape small dungeons by finding objects that allow them to clear obstacles. We chose to create Dungeon Escape to address RQ1 and RQ2 because the game requires players to use the tools they are given to search for and understand where objects are located and how the rooms in each level are laid out in order to succeed — thus testing how well they are able to gain spatial awareness using those tools.

We created Dungeon Escape using the Unity game engine [59], and we designed the dungeon's layout using the Dungeon Architect Unity asset [12]. Figures 1 and 3 show views from Dungeon Escape. Dungeon Escape consists of four levels (small dungeons), which allowed us to study the four SATs within separate dungeon layouts. Figure 2 shows overhead views of all four main levels and the trial level. In each main level, the player must reach a goal area by gaining passage through an obstacle: either a locked door, a cracked wooden door, a spider web, or a dog blocking the exit. To do so, the player must find a relevant object in another room: a key, an axe, a burning torch, or a bone, respectively. Each level consists of several rooms scattered with decorative objects such as crates and barrels.

We generated the four level layouts by deriving them from a single Dungeon Architect "grid flow." This grid flow defined basic parameters from which Dungeon Architect would generate levels. Each level consisted of:

• A "start room" within which the player first spawns.
• An "obstacle room" containing the obstacle.
• A "key room" containing the object that clears the obstacle.
• A "main hall" connecting the start, key, and obstacle rooms.
• A "final hall" containing the goal checkpoint.

We then fed random seed values into this grid flow to generate the final layouts. This allowed us to have unique level layouts while keeping them equivalent in terms of difficulty and structure. The trial level followed a similar conceptual structure but was much smaller, consisting of a start room, a combined key-and-obstacle room, and a final hall with the checkpoint.

Players move the main character with the left thumbstick. Tilting it forward and backward will move the character forward and backward. Tilting it left and right will rotate the character left and right. This control scheme reflects controls found in mainstream 3D games such as Tomb Raider [15], Resident Evil 1-5 [1, 8–11], Metroid Prime 1-3 [53–55], Heavy Rain [17], and Silent Hill [51], which use a fixed over-the-shoulder camera and use left/right on the left thumbstick to rotate the character.
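The seed-driven level generation described in Section 4.1 can be sketched as below. This is a loose analogy and not Dungeon Architect's actual algorithm: one fixed "grid flow" (the five required rooms, named as in the paper) plus a per-level seed that varies only where rooms land; the 3×3 grid and the placement scheme are invented for illustration:

```python
import random

# The five rooms every main level must contain (the paper's "grid flow").
GRID_FLOW = ["start room", "obstacle room", "key room", "main hall", "final hall"]

def generate_level(seed, grid_size=3):
    """Deterministically place the required rooms on a small grid.
    The same seed always yields the same layout, while the room set —
    and hence the level's structure and difficulty — never changes."""
    rng = random.Random(seed)  # same seed -> same shuffle -> same layout
    cells = [(x, y) for x in range(grid_size) for y in range(grid_size)]
    rng.shuffle(cells)
    return dict(zip(GRID_FLOW, cells))

layout_a = generate_level(seed=1)
layout_b = generate_level(seed=2)
# Different seeds typically rearrange the rooms, but the room set is constant.
print(sorted(layout_a) == sorted(layout_b))  # -> True
```

Feeding a handful of seed values through one shared flow is what lets the four main levels differ in layout while staying structurally equivalent.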
The right thumbstick is used by the directional scanner condition; thus, to eliminate a confound, we removed right thumbstick controls from all other conditions. Players can press the bottom face button to pick up an object or to use an object to remove the relevant obstacle.

Players hear a scraping sound if they physically hit an obstruction; the sound will be situated in the direction of contact. Keys, obstacles, and checkpoints play a relevant sound once the player is within two meters of the object. Players will hear the name of the room (for example, "Start Room" or "Key Room") announced on entry, and they can also press the right face button to hear the room name on-demand. We integrated these sounds to allow VIPs to be informed of these events — i.e., hitting a wall or entering a room — when they occur. Sighted players can perceive these events solely via sight, but VIPs require notifications via other means.

We also implemented a "rotation indicator" utility that helps players understand how much they are rotating when they turn left or right using the left thumbstick. As the player rotates, a click sound will be played at 15° increments via 3D sound only in the direction of the player's objective (i.e., the obstacle that must be cleared). The rotation indicator mimics snap rotation controls found in many games created for VIPs [32, 45, 63], which allow players to snap to pre-defined angle increments. In order to bring Dungeon Escape's controls closer to the free movement of mainstream 3D games, we gave players full analog control via the left thumbstick but maintained the feedback afforded by snap rotation via the rotation indicator. The rotation indicator was available across all four tools and pointed in the direction of the objective regardless of any intervening obstacles. Similar utilities have also been implemented in prior work that has investigated navigation by VIPs within virtual environments [2, 4, 41].

Additionally, players could place looping audio beacons on objects of interest so that they could lock onto and keep targets within their "field of view." Once placed, these beacons emit a looping sound, which players can use to orient themselves and move towards the target. With NavStick, players point at a target with the right stick and press the right bumper button to place the beacon. With the simple audio menu, players scroll to a target and press the left bumper. With the smartphone map, players tap on the upper one-fifth of the screen to place a beacon at the last announced target. There was no mechanism for beacon placement with the whole-room shockwave. We added the beacons exclusively for guidance purposes to speed up the process of walking toward a target — players still need to use an SAT to find objects and other targets before they can place a beacon at that object/target.

Figure 3: Remote study session with a participant and two facilitators. The participant is currently sharing their screen. Within the game, a blue key is situated to the participant's right. The participant will need to collect that key to progress through the level. (Faces obscured to protect anonymity.)

4.2 Participants

We recruited nine participants for this study. In our pre-study questionnaire, eight described themselves as being completely blind and one (P1) described themselves as having light perception only. All participants were male and have had their vision impairments from birth. Six participants were 18–25 years old; two (P5 & P9) were 26–35 years old; and one (P3) was 36–45 years old. In addition to having vision impairments, two participants (P3 & P6) reported having slight hearing loss in one of their ears.

We recruited participants through posts on the AudioGames.net Forum,¹ an online discussion board that centers around audio-based games and is frequented by VIPs. Six of our participants reported themselves as being very experienced with video and other electronic games (4+ on a 5-point Likert scale), while the other three (P2, P6, & P8) reported themselves as being moderately experienced with games (3 on a 5-point Likert scale).

¹ https://forum.audiogames.net/

4.3 Technical Setup & COVID-19 Challenges

We performed this study remotely due to the COVID-19 pandemic and the difficulties that VIPs may face in travelling to our institution. We sent each participant an executable of our game for them to download to their computer before their study appointment. The game included all of the SATs except for the smartphone map. We distributed that tool as both iOS and Android apps using the Google Firebase App Distribution service [27]. We designed both Dungeon Escape and the smartphone map to connect with a cloud backend, which allowed both components to synchronize with each other, and allowed us to remotely observe and control the runtime state of participants' games using a custom-built control panel.

We held the study appointments over Zoom and asked participants to share their computer audio (and, optionally, video) with us. Although there was no way for us to see the smartphone map during the study, most participants' microphones picked up the sound from the app. Figure 3 shows a study session in progress. The study and our data collection efforts were approved by the Columbia University Institutional Review Board (IRB).

4.4 Procedure

To address RQ1, we began the session by administering a two-part pre-study questionnaire. The first part requested demographic information alongside information about participants' existing experience with video games and physical world navigation. The second part directly asked participants about how important they find each of the six types of spatial awareness — that we identified in Section 2.1 — within a video game context.
For each type, responses were given on a 5-point unipolar Likert scale where 1 indicated that the type of spatial awareness was not-at-all important and 5 indicated that it was extremely important. Afterwards, we placed participants in a room within the game where we introduced basic movement and interaction controls.

For each tool, we first placed participants into the trial level. We explained how to use the tool and afterwards allowed participants to traverse the trial level at their own leisure. The trial level was the same across all tools. After the trial level, we placed participants into one of the four main levels. Although all participants played the four levels in the same order, we counterbalanced the order of the tools themselves via a Latin square design to reduce any variations caused by order effects.

In order to address RQ2, we administered a two-part post-level questionnaire; we did this after participants traversed a level with a tool. In the first part, we asked participants to elaborate on their impressions of the tool, what they think is missing, and in what game situations they might use the tool. In the second part, we gauged how well participants thought the tool satisfied each of the six types of spatial awareness. Responses were given on five-point scales, where 1 indicated that the tool facilitated that type not-at-all well and 5 indicated that it facilitated that type extremely well. Participants were encouraged to elaborate on all questions.

After completing all four levels, we administered a two-part post-study questionnaire. In the first part, we asked participants to consider a scenario where they were able to play the levels in Dungeon Escape using more than one tool at once; we did this in order to determine if using multiple tools at once could have improved participants' spatial awareness in any way. As part of this section, participants were asked to provide two combinations of two tools each that they would have liked to use if they were given the chance to do so. (We should note that Dungeon Escape is capable of activating two tools at once; however, in our initial pilot tests, including additional game levels to test these combinations made study sessions well exceed our limit of two hours.) In the second part, we asked participants how likely they were to recommend each individual tool to a friend or colleague, assuming they had the same visual impairments as the participant. Responses were given on a 10-point net promoter score scale [48], where 1 indicated they were very unlikely to recommend it and 10 was very likely.

4.5 Data Collection & Analysis

We administered all questionnaires by having the facilitator read out each question and input the participant's response into an internal Google Form. For all choice- and rating-based questions, we asked for participants' open-ended opinions via the questionnaire itself by explicitly following up on their responses. The facilitator was also encouraged to follow up on any other points they found interesting throughout the session — though they were not allowed to disturb the participant while a game level was in progress. We have included the questionnaires as part of our supplementary material.

We recorded all sessions with participants' permission for transcription purposes. We also obtained raw data of participants' actions within the game by capturing in-game logs.

To analyze sessions, we followed an inductive coding process that involved five members of the research team. Individual coders went through session transcripts and coded quotes and other events. Then, all five coders iterated on the codes together until there was unanimous agreement that they could not iterate further.

5 RQ1 RESULTS: ASPECTS OF SPATIAL AWARENESS IMPORTANT TO VISUALLY IMPAIRED PLAYERS

In this section, we report our findings regarding our first research question (RQ1): learning what aspects of spatial awareness VIPs find important within games. We captured these opinions in the second questionnaire that we administered as part of our pre-study procedure.

Figure 4: The importance of the six aspects of spatial awareness within games for VIPs (RQ1). Responses were given on a five-point unipolar Likert scale. Red lines indicate median ratings within this box plot. Rankings based on median ratings are shown to the right.

Figure 4 shows box plots of participants' importance ratings for the six types of spatial awareness. Note that the median rating for each measure is highlighted in red in each type's box plot.

The data suggest that the six types of spatial awareness can be divided into three levels of importance for VIPs. The most important aspect of spatial awareness to VIPs within games is position and orientation (Type 3) awareness, which received a median importance rating of 5 on a 5-point unipolar Likert scale. Below that, three spatial awareness aspects — presence of items (Type 4), arrangement of items (Type 5), and adjacent areas (Type 6) — tied each other with a median importance rating of 4. The least important aspects of spatial awareness to VIPs within games are scale (Type 1) and shape (Type 2) awareness, both of which received median importance ratings of 3.

The following subsections dive deeper into participants' reasoning behind how important they rated each aspect of spatial awareness to be within games. All quotes come from participants' open-ended responses while completing this questionnaire. In Section 7, we discuss how these findings and the findings from RQ2 (in Section 6) collectively reveal new design considerations and research opportunities for spatial awareness tools.

5.1 Rank 1: Position and Orientation [Type 3]

Participants found position and orientation to be the most important aspect of spatial awareness within a video game context. Six participants explicitly affirmed this aspect of spatial awareness as
the most important because it was crucial to determining their current state within the game world:

"You have an idea of how fast you're turning and in what direction. I would say it's the most important thing." — P3

Another participant who echoed this sentiment, P9, recounted extensive experience with shooter audio games, such as Swamp [32], that require players to move through a complex environment. P9 affirmed position & orientation awareness — and thus, awareness of their current state — as extremely important to helping them plan out future actions, which is a crucial aspect of shooter-type games:

"You have to know where you are at and where you are oriented to in order to know where to go and what to do next." — P9

These opinions reflect work within the physical world that has found position and orientation to be important to VIPs [22, 24]. They also establish that SATs for VIPs within video games must satisfy a high bar in terms of communicating position and orientation information. In Section 6, we determine if any of the four tools we implement for this study satisfy this high bar.

5.2 Rank 2 (three-way tie): Presence, Arrangement, and Adjacent Areas [Types 4, 5, and 6]

After position and orientation, participants found the next most important aspects of spatial awareness to be the presence and arrangement of items within the space (Types 4 & 5) and information about areas adjacent to their current area (Type 6).

These aspects are all important to participants in certain contexts, but not in all situations like position and orientation is. As P3 and P9 implied in their quotes in the previous subsection, position and orientation awareness grants players a sense of their current state within the world. Prior work in the physical world has found ascertaining this knowledge to be cognitively demanding for VIPs as they move through an environment [24, 38]. This increased cognitive load can interfere with VIPs' ability to understand other aspects of spatial awareness, making position and orientation awareness essential.

Seven participants found presence to be very important for spatial awareness because if an SAT did well at communicating presence, then they could be confident that they would not miss finding anything within the game:

"If you hear [a familiar object], you know you are relatively in the right place and you can search the area specifically." — P7

Five participants clarified why the importance of knowing the arrangement of items is heavily context-dependent. For example, when faced with objectives that involve finding a specific item, participants believed that knowing the arrangement of items was very helpful because it would help them figure out where to go first. However, participants noted that having knowledge about items' arrangement may be detrimental in less restrictive, exploration-oriented tasks since that knowledge may reveal too much information and rob players of the enjoyment of discovering items for themselves:

"[Knowing arrangement] depends on what the task is [at hand]. It's especially [important] if it involves triggering certain things in certain orders, finding an item then finding a person, or facing off against a challenger then finding an NPC." — P7

Six participants felt that having an SAT communicate which regions are adjacent to the one they are currently in would be beneficial as it would make navigation through the game world easier:

"To be honest, I'll say [having an SAT communicate adjacencies is] extremely important because it makes it much easier for the player to move from one area to another without moving through the whole map." — P5

However, three others feared that having this information presented outright may make exploration and discovery less fun. One such participant was P4, who was a fan of games that required a high level of strategy. P4 asserted that the game should preserve a level of challenge and instead convey connections to other navigable places using plot and contextual cues such as dialogue or readable signs in the world itself:

"If it's one of those strategy games where you have to discover it on your own, it's not important. Let's say it's a hidden area, [...] it should stay hidden." — P4

The other two participants, P2 and P6, echoed similar sentiments. This finding is quite surprising: These three participants thought that an SAT did not necessarily need to communicate adjacencies despite us identifying this as a basic aspect of spatial awareness. This points to the importance that participants place in their experience within the game over the actual information they receive, implying that VIPs may be willing to sacrifice receiving some pieces of information for the sake of a more interesting gaming experience.

5.3 Rank 5 (two-way tie): Shape and Scale [Types 1 & 2]

Participants generally found scale and shape information about an area to be the least important aspects of spatial awareness. VIPs' opinions generally revolved around the sentiment that, unlike the other types of spatial awareness, scale and shape information may be outright unnecessary much of the time.

Seven participants stated that having a sense of the room's scale was not important to them and that SATs should focus on conveying information about the presence and location of nearby objects instead:

"I feel when you are navigating in games you don't really need to know how big the area is as long as you know where the objects in that area are." — P1

Seven participants thought that an SAT should only convey information about the surrounding area's shape when absolutely necessary and that communicating shape information is too much for an SAT to do, possibly resulting in information overload. However, these participants also thought that knowing shape information in some situations may make navigation more efficient — for example, in a situation where the room does not have a circular or rectangular shape:

"If the room is an odd shape — every time I play a game, I assume the room is like a square, but that's not always
the case, sometimes rooms may have [many] sides, parts that jut out — so I believe it's a consideration." — P3

In a situation where a room is not rectangular, knowing the room's shape could help players plan out their movements more carefully. Otherwise, players may resort to hugging walls to traverse the room and search for doors, which may become frustrating.

6 RQ2 RESULTS: COMPARISON OF SPATIAL AWARENESS TOOLS FOR VIRTUAL WORLDS

In this section, we report our results regarding our second research question (RQ2): determining how well the four tools we implemented facilitate the various aspects of spatial awareness. Combined with our RQ1 results, these results will shed light on how SATs should be designed to best facilitate the aspects of spatial awareness that are most important to VIPs.

Figure 5: Participant ratings for how well each SAT approach facilitates each type of spatial awareness. Large text indicates median values, and small text indicates mean plus/minus standard deviation. Responses were given on a five-point unipolar Likert scale. Blue and red cells indicate the best and worst performing SAT approaches for each spatial awareness type (column), respectively. The importance rankings from our RQ1 analysis are shown again at the bottom.

Figure 5 shows an overview of the "winners" and "losers" in terms of participants' post-level responses on how well each tool facilitated the various aspects of spatial awareness. We determined these by looking at the median ratings for all four tools for a given spatial awareness type. The smartphone map and the whole-room shockwave scored the highest in one type each (area shape and tied with the directional scanner on area scale, respectively). The directional scanner did not score the lowest on any type.

The results presented in this section are organized thematically, with each theme representing opinions shared by a majority of our participants.

6.1 Since participants could trace out the contours of a room with their finger, the smartphone map communicated the shape of an area better than other tools.

Figure 5 shows that participants' ratings on how the smartphone map communicated the shape (Type 2) of the room were generally the highest out of all four tools. Five participants resonated with the following sentiment:

"[The smartphone map] gives you a general idea of where openings and spaces are. It [also] gives you an idea of how far each edge is from your center point which tells you, 'OK, [the wall] angles a bit.'" — P6

One participant even used this ability to their advantage. For example, in the irregularly-shaped final room of Level 3, P8 used the smartphone interface to trace out the walls of the room and determine that they needed to turn a corner to reach the goal checkpoint:

"Having that memory of 'Oh, I know a little bit more about the shape of the room than I had previously with the other tools.' — that really helped me get a better sense of exactly where I needed to go in terms of [knowing that] I have to round a corner instead of trying to run directly forward for the target." — P8

6.2 The whole-room shockwave allowed participants to quickly ascertain a general overview of an area, especially with respect to its scale.

Figure 5 shows that the whole-room shockwave tied the directional scanner for being the best tool at communicating an area's scale (Type 1). Participants found that the distance-based volume attenuation afforded by Dungeon Escape's 3D sound system and the delayed timing of objects' sounds during a shockwave helped them approximate how far away objects were and, thus, how big the room was.

P8 was one such participant; they described themselves as "not-at-all experienced" with echolocation techniques in the pre-study
spatial awareness aspect. A tool “wins” an aspect of spatial aware- questionnaire. (Recall from Section 3.2 that we derived the whole-
ness if it has the highest median rating out of all the tools; a tools room shockwave from our explorations in echolocation.) Yet, they
“loses“ an aspect of spatial awareness if it has the lowest median relayed the following positive sentiment which was shared by many
rating. We broke any ties using the mean rating. other participants despite their inexperience with echolocation:
We see that the directional scanner scored the highest in terms “Even though doors and objects were further away from
of facilitating three aspects of spatial awareness (scale, position & me, I was still able to know that they are still in fact
orientation, and arrangement), meaning that participants thought there. [The shockwave] helped me quickly gauge ’OK,
it was the best tool for facilitating those aspects. The simple audio cool. I know I’m in a corridor [...] and I know there is a
menu scored the highest in two types (presence and adjacencies), door in the far end, and so this helps me determine on a
and the smartphone interface and whole-room shockwave scored higher level how big the room might be.’” — P8
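The distance cues described above — farther objects sounding both later and quieter — can be sketched as follows. This is a minimal illustration only, not Dungeon Escape's actual implementation; the wavefront speed, the inverse-distance falloff model, and the function name are assumptions:

```python
import math

# Hypothetical sketch of a whole-room "shockwave" schedule: each object's
# sound plays later and quieter the farther it is from the player, so both
# delay and volume encode distance. Constants are illustrative only.
WAVE_SPEED = 10.0  # how fast the virtual wavefront travels (units/sec)
REF_DIST = 1.0     # distance at which volume is full (1.0)

def shockwave_schedule(player, objects):
    """Return (name, delay_sec, volume) tuples in the order they play."""
    schedule = []
    for name, pos in objects.items():
        dist = math.dist(player, pos)
        delay = dist / WAVE_SPEED                          # farther -> later
        volume = min(1.0, REF_DIST / max(dist, REF_DIST))  # farther -> quieter
        schedule.append((name, delay, volume))
    return sorted(schedule, key=lambda entry: entry[1])

# A barrel 5 units away sounds after 0.5 s; a door 20 units away after 2.0 s.
schedule = shockwave_schedule((0.0, 0.0),
                              {"barrel": (3.0, 4.0), "door": (0.0, 20.0)})
```

Hearing the nearby barrel almost immediately and the far door much later (and fainter) is the kind of cue that let participants judge both a room's contents and its rough size in one sweep.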
Uncovering Visually Impaired Gamers’ Preferences for SATs Within Video Games ASSETS ’22, October 23–26, 2022, Athens, Greece

The quick nature of the shockwave particularly advantaged participants within areas with many items. One such area within Dungeon Escape was the irregularly shaped room mentioned in Section 6.1, which contained obstacles in the form of barrels and crates. Participants who used the whole-room shockwave hit the checkpoint in that room much faster (M = 19 sec., SD = 2.8 sec.) than those who used the directional scanner (M = 51 sec., SD = 17.7 sec.), simple audio menu (M = 123 sec., SD = 5.0 sec.), and smartphone map (M = 135 sec., SD = 15.5 sec.). The shockwave provided participants with an almost-instant overview of what obstacles were in the room. However, those who used the other tools spent a much longer time searching for these very obstacles — scrolling through each item (using the menu), trying to point at them (using the directional scanner), or trying to find them on a map (using the smartphone).

6.3 Participants made extensive use of the whole-room shockwave’s filters.

All nine participants made use of the whole-room shockwave’s filtering mechanism. Six of them explicitly mentioned that this ability was extremely important to them and that they highly valued it even in applications outside of games. One participant invoked the customizability of screen readers as an example:

“Everybody has different needs and wants so I really believe in allowing information to be filtered in such a way where you just get the information you need as you need it like in the shockwave. [...] Screen readers have settings like this for a reason.” — P3

6.4 The physical use of the right stick in the directional scanner meant that participants could obtain a clear idea of how items were arranged around themselves.

Our findings with the directional scanner provided insights into how physically moving a joystick to survey an environment might provide players with an enhanced sense of its layout. Five participants explicitly mentioned that moving the joystick to “look around” allowed them to understand how objects were arranged:

“Because of where I had to put the stick to see stuff around me, it really helped. It was easier to tell what was behind me, what was in front of me, or what was in any other direction because I knew where my stick position was.” — P6

P6 went on to say that surveying with the joystick “felt natural” and compared the directional scanner to a camera which they could use to “look” for objects. This sentiment is further reflected in Figure 5, which shows that the directional scanner scored the highest out of all four tools in terms of communicating the arrangement of items (Type 5).

6.5 The simple audio menu’s straightforward presentation of items meant that participants received information about the presence of items extremely well.

Participants found that the simple audio menu clearly communicated the presence of every item in the room. As seen in Figure 5, the simple audio menu received an average/median rating of 5.0 with a standard deviation of zero on communicating the presence of items (Type 4) — every participant gave the menu the maximum possible score on this aspect of spatial awareness. Participants unanimously agreed that they were able to obtain a clear idea of what was in the room because of its straightforward presentation:

“You know everything that’s there because it’s in a menu. There’s nothing hidden. It doesn’t matter if you’re far away from it. If you’re in the same room as it, it’s on that list. That’s something I really like.” — P3

Yet, some participants complained that the simple audio menu provided too much information. Within games, a certain degree of surprise and exploration — that is, the ability to “discover” aspects of the game world — is a core element of making a game fun for players [37]. Knowing the presence of objects so easily can remove this aspect of discovery from the game. Indeed, five participants felt that the simple audio menu did not promote exploration and made the game less enjoyable:

“I thought that I was using a shortcut. [...] I like that it’s faster, but it takes something out of the game experience.” — P2

Within Dungeon Escape itself, participants tended to avoid deviating from the task at hand while using the simple audio menu, going directly to the POIs they needed to reach. Figure 6 plots paths taken by participants within the key room in Level 2. Note how P3 went straight to the key when using the simple audio menu in Level 2, while participants who used other tools in the same level roamed around the room in an effort to survey their surroundings more thoroughly. This behavior is also visible in the raw time data we collected within the room: Those who used the menu collected the key much faster (M = 17 sec., SD = 4.2 sec.) than those who used the shockwave (M = 84 sec., SD = 6.5 sec.), smartphone (M = 93 sec., SD = 17.5 sec.), and directional scanner (M = 105 sec., SD = 5 sec.). Although players completed the levels with the simple audio menu (and often did so quickly), it remains an open question whether players’ increased focus on objectives and lack of exploration is a net positive for the game experience.

6.6 No tool excelled at communicating position and orientation.

As we found in Section 5, participants rated position and orientation (Type 3) as the most important aspect of spatial awareness to them. However, post-level ratings indicate that participants perceived all four tools to be mediocre at facilitating position and orientation information. As Figure 5 shows, the average score that each SAT received in terms of affording position and orientation information was a low-to-mid three (“moderately well”). Our results indicate that these four tools may not meet the high bar required for such an important aspect of spatial awareness.
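The “wins”/“loses” bookkeeping behind ratings summaries like these — highest and lowest median per aspect, with means breaking ties, as described at the start of Section 6 — can be sketched as follows. The ratings below are made up purely for illustration:

```python
from statistics import mean, median

# Hypothetical five-point Likert ratings for ONE spatial awareness aspect;
# the tool names mirror the four SATs compared in the study.
ratings = {
    "directional scanner":  [5, 4, 5, 4],
    "simple audio menu":    [3, 3, 4, 3],
    "smartphone map":       [2, 3, 2, 3],
    "whole-room shockwave": [4, 4, 4, 5],
}

def rank_key(tool):
    # The median decides a "win"/"loss"; the mean breaks median ties.
    return (median(ratings[tool]), mean(ratings[tool]))

winner = max(ratings, key=rank_key)  # highest median (mean as tiebreak)
loser = min(ratings, key=rank_key)   # lowest median (mean as tiebreak)
```

Repeating this per column of Figure 5 yields one winning and one losing tool for each of the six spatial awareness aspects.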

Figure 6: Illustration of paths taken by participants within the Level 2 key room. The subfigures depict four different participants traversing the same room with different tools; the key is located in the lower left corner, and participants enter through the door at the bottom. Paths are divided into segments — the end of a segment represents a point where the participant paused to survey using the tool.

6.7 Participants disliked having to juggle multiple pieces of hardware when using the smartphone map.

Five participants mentioned that they found the smartphone map cumbersome to use. Participants often found themselves needing to physically switch between their controller and their smartphone when they wanted to explore the map. Furthermore, at least six participants used noise-cancelling headphones during their sessions and had to adjust them as well to hear the smartphone’s audio output. These experiences annoyed some participants:

“I have mixed emotions [about the smartphone map] because I have to do one thing on one device and then move with the other device [...] That made things a bit confusing and annoying.” — P2

6.8 The simple audio menu did not communicate scale and shape well.

As Figure 5 shows, participants thought that the simple audio menu communicated the scale (Type 1) and shape (Type 2) of areas quite poorly, with average ratings of around 2 out of 5. The simple audio menu did not explicitly communicate boundaries or other characteristics of the room itself. As such, many participants could not definitively determine the structure (scale and shape) of the surrounding area using the menu:

“I could probably use [the simple audio menu’s] 3D sounds to assume that, say, a bunch of items were against a wall if they’re coming from the same general side relative to me [...] but that’s an educated guess.” — P6

6.9 Participants found the whole-room shockwave to be overwhelming, which negatively affected their spatial awareness.

Five participants felt that despite communicating scale relatively well, the whole-room shockwave provided too much information, which negatively affected their sense of spatial awareness. We were surprised by this finding since we designed the shockwave to emit slowly for better intelligibility, and every participant used the filters to make the shockwave easier to understand.

Yet, despite these improvements, participants still felt that the shockwave was too information-dense, making it difficult for them to ascertain information about their environment. Our conversations with them yielded insights into how VIPs view similar echolocation-inspired tools within other games. P3, who described themselves as having played “everything” when asked about their gaming experience during the pre-study, was especially vocal:

“Everybody thinks you can just send out a sonar ping and get information about an environment. [...] Echolocation is very overwhelming, especially in a game. Trying to hone in on an item that is far away and being masked by another item is ludicrous.” — P3

6.10 Participants preferred combinations of tools that excelled across multiple spatial awareness aspects.

Figure 5 gives us an interesting perspective on how combinations of tools can best facilitate multiple aspects of spatial awareness jointly. As stated in Section 4.4, we asked participants to state their two most preferred combinations of tools. The (directional scanner + simple audio menu) combination was one of the most selected combinations, with five participants selecting it. This combination “wins” in four out of the six aspects of spatial awareness: position & orientation, presence of items, arrangement of items, and communicating adjacent areas. Furthermore, this combination is “tied” with the whole-room shockwave at being the best at conveying area scale. Five participants also selected the (directional scanner + whole-room shockwave) combination, which wins at three of the six aspects of spatial awareness. We discuss these selections further in Section 7.

7 DISCUSSION: TAKEAWAYS FROM RQ1 AND RQ2 TOGETHER

In Sections 5 and 6, we reported our findings about what aspects of spatial awareness VIPs find important (RQ1) and how well current SAT approaches facilitate the various aspects of spatial awareness (RQ2). In this section, we now synthesize these findings together to

form broader takeaways for how SATs should be designed. We hope that game designers can use these takeaways to decide which SAT is best for them to incorporate into their game to make it accessible, if they only have the resources to implement one or two of them.

7.1 Position and orientation is the most important type of spatial awareness for VIPs, yet is not well-served by current tools.

Our results indicate that communicating position and orientation well is a crucial challenge that must be addressed when designing future SATs. As we reported in Section 6.6, participants rated position and orientation as the most important aspect of spatial awareness to them. Yet, they also felt that all four tools were mediocre at facilitating position and orientation information. Surprisingly, this includes the smartphone map, which was the tool that most explicitly communicated position and orientation information, as we described in Section 3.1. This indicates that communicating position and orientation information to VIPs is harder than researchers assume and that a major opportunity for future research is to develop better indicators of VIPs’ position and orientation.

Previous research has shown that VIPs rely heavily on landmarks and other environmental features to determine their position and orientation, which, in turn, allows them to navigate through environments [21, 66]. These landmarks include walls and other boundaries dictating the area’s scale and shape (Types 1 and 2) as well as the layout of items within the space (Type 5). From our findings, however, we see that VIPs do not find it suitable to merely infer their position and orientation from these other cues and would rather benefit from having it communicated more explicitly.

Design Implication #1: VIPs will benefit greatly from a purpose-built tool for communicating position and orientation in real time.

7.2 The four most important aspects of spatial awareness are covered by two tools.

If we consider the four most important aspects of spatial awareness from our RQ1 findings — position and orientation (Type 3), item presence (Type 4), item arrangement (Type 5), and adjacent areas (Type 6) — we can see that the combination of the directional scanner with the simple audio menu “wins” at communicating all four of these types. We do not consider area scale (Type 1) and area shape (Type 2) because participants found them to be the least important; however, we can also see that the directional scanner is tied for “winning” scale as well. From a theoretical standpoint, this implies that VIPs would most gravitate toward this combination, and indeed, we saw precisely this during our study.

The fact that participants are excited about the (directional scanner + simple audio menu) combination makes sense. It seems that, with this combination of tools, participants gravitated toward a combination that facilitates the greatest number of spatial awareness aspects well.

Figure 7: Box plot of net promoter score responses for all four tools. Red lines indicate median. The whole-room shockwave received some of the lowest scores out of all four tools.

Design Implication #2: Of today’s SATs, the combination of the directional scanner and simple audio menu gives VIPs the greatest spatial awareness.

7.3 VIPs highly value the ability to customize SATs.

In addition to the (directional scanner + simple audio menu) combination, five participants also picked the (directional scanner + whole-room shockwave) combination as one of their favorite combinations. Unlike the former combination, however, the latter combination only “wins” at three of the six types of spatial awareness (i.e., the two tools cover winning values for three columns in Figure 5). Additionally, Figure 7 shows that the shockwave had some of the lowest net promoter scores out of all of the tools, and participants even complained about the tool being overwhelming. This finding implies that there exists a consideration that VIPs may find even more important than raw spatial awareness.

One possible explanation lies in the fact that the whole-room shockwave was the only tool whose behavior participants could change — in this case, selecting the type of information they wanted to hear. Participants’ enthusiasm for customizable tools — especially evident in their comparisons with other tools such as screen readers — shows that SATs should implement similar capabilities, allowing VIPs to take control of what they hear.

Design Implication #3: SATs should embrace customizability, allowing VIPs to customize and filter the information communicated.

8 FUTURE WORK

The findings from our study revealed several avenues for future work, which we propose in this section.

8.1 Toward optimally communicating each spatial awareness type.

Our findings showed that communicating some aspect of spatial awareness well is not simply about doing so to the maximum extent possible. For example, when it comes to conveying the presence of items (Type 4), the simple audio menu facilitated it perfectly (receiving perfect Likert scores), but many participants disliked how it listed every item within the room they were currently in. They thought that the menu communicated too much information — enough to affect how much fun they had playing the game.

These findings indicate that — particularly within video games — communicating a specific type of spatial awareness optimally does not necessarily mean communicating it at the maximum possible level. Future work should address what “optimal” really means in terms of communicating each type of spatial awareness. For example, in the case of item presence information (Type 4): What is the proper level of item presence that should be communicated to the player, and what factors — such as game objectives — may influence the level of item presence a tool should communicate? Similar questions can be extended to the other types as well.

8.2 Toward purpose-built hardware.

Participants revealed that they disliked juggling multiple pieces of hardware while using the smartphone map. Future touch-based SATs should reduce the number of devices required. One possibility involves using touchpads found on game controllers such as the DualShock 4 [18], DualSense [19], and Steam Controller [60]. Hybrid touchscreen controller devices, such as the Nintendo Switch [47] and the Steam Deck [61], are also promising alternatives.

8.3 Applications for physical world navigation.

In Section 5, we addressed RQ1 by reporting participants’ preferences for the six types of spatial awareness within a video game context. Future work could explore VIPs’ preferences within the physical world and see how they differ from their preferences within video games. We found that some of our results resemble prior work in the physical world: Participants generally agreed that position & orientation was extremely important — in our study, they collectively saw it as the most important type of spatial awareness. In a similar vein, much physical world work for both visually impaired and sighted people has found position & orientation awareness to be very important [20, 22, 24, 31, 34]. Participants also generally found knowledge of item presence, item arrangement, and adjacent areas to be relatively important as well — reflecting prior work that has echoed the importance of inter-object and inter-area relationships in promoting spatial awareness [24, 29, 49, 65].

We were surprised, however, when we found that participants did not find scale and shape awareness to be very important within video games. This differs from much prior work from physical world contexts — especially in the realm of tactile maps and echolocation — that has found general overviews of spaces, including information such as scale and shape, to be crucial for spatial awareness [30, 49].

An interesting direction for future work may involve repeating the study presented in this paper, but in the physical world, to enable a direct comparison. The physical world presents its own challenges and circumstances. SATs’ accuracy within physical environments and VIPs’ physical safety considerations [6], for example, could influence how important VIPs find the various types of spatial awareness and even how they wayfind and explore using the tools.

A direct comparison can help the community establish a hierarchy of the spatial information that we know is important to VIPs. It can also help the community establish formal principles for prioritizing the display of different types of information during physical world navigation.

9 LIMITATIONS

As with many studies that involve people with visual impairments, we had a low number of participants. The preferences for spatial awareness that we found are based on the perspectives of our nine participants and may differ for other VIPs. Although the four SATs we implemented covered a broad range of design possibilities, there may be other designs that we did not consider that could reveal more insights into what VIPs value in a spatial awareness tool for virtual worlds. Furthermore, while we are grateful to the nine VIPs who participated in our study, we regret not being able to recruit a more diverse group of participants. Finally, our work focused on 3D adventure video games with large worlds that players can traverse, and our testbed did not feature any moving objects. As such, additional work is needed to investigate SATs that may assist VIPs within other types of video games, especially those that feature moving objects (for example, enemies and projectiles).

10 CONCLUSION

In this work, we explore the merits and limitations of existing approaches to facilitating spatial awareness for VIPs within video game worlds, in order to give designers of accessible games a better understanding of which spatial awareness tools are best to include in games to make them accessible to VIPs. Through a user study, we investigated four leading approaches to facilitating spatial awareness for VIPs in an effort to understand what aspects of spatial awareness VIPs find important within games (RQ1), and to determine how well today’s differing SAT approaches facilitate the various aspects of spatial awareness (RQ2).

Regarding the first question, we found that participants considered position and orientation to be the most important aspect of spatial awareness and scale and shape to be the least important. Regarding the second question, participants found the directional scanner to communicate the arrangement of items very well, the simple audio menu to communicate the presence of items very well, the smartphone map to communicate the shape of areas very well, and the whole-room shockwave to communicate the scale of the area very well. Our findings also revealed deficiencies in current SAT approaches, including that some tools tend to provide too much information and that no tool excels at communicating position and orientation information — despite it being the most important type of spatial awareness to participants.

We hope that better understanding VIPs’ preferences for spatial awareness, as well as how today’s SATs work, can open up access to more mainstream 3D video games, granting VIPs the same gaming experiences that sighted players are so often afforded.

ACKNOWLEDGMENTS

We would like to thank Michael Malcolm and Sebastián Mercado for their assistance during our pilot tests. We would also like to extend our sincere gratitude toward our study participants for their participation and to the anonymous reviewers for their helpful feedback. Mason Hayes, Hannah Huddleston, and Matthew Donnelly were funded by National Science Foundation Grants 2051053 and 2051060. The opinions, findings, conclusions, and/or recommendations expressed are those of the authors and do not necessarily reflect the views of the National Science Foundation.

REFERENCES
[1] Capcom Production Studio 4. 2005. Resident Evil 4. Capcom, Osaka, Japan.
[2] Ronny Andrade, Steven Baker, Jenny Waycott, and Frank Vetere. 2018. Echohouse: exploring a virtual environment by using echolocation. In Proceedings of the 30th Australian Conference on Computer-Human Interaction (OzCHI ’18). Association for Computing Machinery, 278–289. https://doi.org/10.1145/3292147.3292163
[3] Ronny Andrade, Melissa J. Rogerson, Jenny Waycott, Steven Baker, and Frank Vetere. 2020. Introducing the Gamer Information-Control Framework: Enabling Access to Digital Games for People with Visual Impairment. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). Association for Computing Machinery, 1–14. https://doi.org/10.1145/3313831.3376211
[4] Ronny Andrade, Jenny Waycott, Steven Baker, and Frank Vetere. 2021. Echolocation as a Means for People with Visual Impairment (PVI) to Acquire Spatial Knowledge of Virtual Space. ACM Transactions on Accessible Computing 14, 1 (Mar 2021), 4:1–4:25. https://doi.org/10.1145/3448273
[5] Dominique Archambault, Thomas Gaudy, Klaus Miesenberger, Stéphane Natkin, and Rolland Ossmann. 2008. Towards Generalised Accessibility of Computer Games. Lecture Notes in Computer Science, Vol. 5093. Springer Berlin Heidelberg, 518–527. https://doi.org/10.1007/978-3-540-69736-7_55
[6] Nikola Banovic, Rachel L. Franz, Khai N. Truong, Jennifer Mankoff, and Anind K. Dey. 2013. Uncovering information needs for independent spatial learning for users who are visually impaired. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’13). Association for Computing Machinery, 1–8. https://doi.org/10.1145/2513383.2513445
[7] Felix Bork, Ulrich Eck, and Nassir Navab. 2019. Birds vs. Fish: Visualizing Out-of-View Objects in Augmented Reality using 3D Minimaps. In 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). 285–286. https://doi.org/10.1109/ISMAR-Adjunct.2019.00-28
[8] Capcom. 1996. Resident Evil. Capcom, Osaka, Japan.
[9] Capcom. 1998. Resident Evil 2. Capcom, Osaka, Japan.
[10] Capcom. 1999. Resident Evil 3: Nemesis. Capcom, Osaka, Japan.
[11] Capcom. 2009. Resident Evil 5. Capcom, Osaka, Japan.
[12] Code Respawn. [n.d.]. Dungeon Architect. https://assetstore.unity.com/packages/tools/utilities/dungeon-architect-53895
[13] Sebastiano M. Cossu. 2019. Designing Platformers. Apress, 367–380. https://doi.org/10.1007/978-1-4842-5010-5_10
[14] Sebastiano M. Cossu. 2019. Metroidvania (Part 2). Apress, 437–497. https://doi.org/10.1007/978-1-4842-5010-5_12
[15] Core Design. 1996. Tomb Raider. Eidos Interactive, London, England.
[16] Naughty Dog. 2020. The Last of Us Part II. Sony Interactive Entertainment.
[17] Quantic Dream. 2010. Heavy Rain. Sony Computer Entertainment, Tokyo, Japan.
[18] Sony Interactive Entertainment. 2013. DualShock 4. Sony Corporation.
[19] Sony Interactive Entertainment. 2020. DualSense. Sony Corporation.
[20] Russell A. Epstein, Eva Zita Patai, Joshua B. Julian, and Hugo J. Spiers. 2017. The cognitive map in humans: Spatial navigation and beyond. Nature Neuroscience 20, 11 (Oct 2017), 1504–1513. https://doi.org/10.1038/nn.4656
[21] Navid Fallah, Ilias Apostolopoulos, Kostas Bekris, and Eelke Folmer. 2012. The user as a sensor: navigating users with visual impairments in indoor spaces using tactile landmarks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, 425–432.
https://doi.org/10.3389/fnhum.2020.00087
[24] Nicholas A. Giudice and Gordon E. Legge. 2008. Blind Navigation and the Role of Technology. John Wiley and Sons, Ltd, 479–500. https://doi.org/10.1002/9780470379424.ch25
[25] Cagatay Goncu, Anuradha Madugalla, Simone Marinai, and Kim Marriott. 2015. Accessible On-Line Floor Plans. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15). International World Wide Web Conferences Steering Committee, 388–398. https://doi.org/10.1145/2736277.2741660
[26] Cagatay Goncu and Kim Marriott. 2011. GraVVITAS: Generic Multi-touch Presentation of Accessible Graphics. In Human-Computer Interaction – INTERACT 2011 (Lecture Notes in Computer Science), Pedro Campos, Nicholas Graham, Joaquim Jorge, Nuno Nunes, Philippe Palanque, and Marco Winckler (Eds.). Springer, 30–48. https://doi.org/10.1007/978-3-642-23774-4_5
[27] Google. [n.d.]. Firebase App Distribution. https://firebase.google.com/docs/app-distribution
[28] Timo Götzelmann and Klaus Winkler. 2015. SmartTactMaps: a smartphone-based approach to support blind persons in exploring tactile maps. In Proceedings of the 8th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA ’15). Association for Computing Machinery, 1–8. https://doi.org/10.1145/2769493.2769497
[29] E.W. Hill, J.J. Rieser, M.-M. Hill, M. Hill, J. Halpin, and R. Halpin. 1993. How Persons with Visual Impairments Explore Novel Spaces: Strategies of Good and Poor Performers. Journal of Visual Impairment & Blindness 87 (Oct 1993), 295–301. https://doi.org/10.1177/0145482X9308700805
[30] Emily Holmes and Aries Arditi. 1998. Wall versus Path Tactile Maps for Route Planning in Buildings. Journal of Visual Impairment & Blindness 92, 7 (Jul 1998), 531–534. https://doi.org/10.1177/0145482X9809200713
[31] Hernisa Kacorri, Sergio Mascetti, Andrea Gerino, Dragan Ahmetovic, Hironobu Takagi, and Chieko Asakawa. 2016. Supporting Orientation of People with Visual Impairment: Analysis of Large Scale Usage Data. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’16). Association for Computing Machinery, 151–159. https://doi.org/10.1145/2982142.2982178
[32] Jeremy Kaldobsky. 2011. Swamp. https://www.kaldobsky.com/ssl/audiogames.php
[33] Daniel Kish. 2009. Human echolocation: How to “see” like a bat. New Scientist 202, 2703 (Apr 2009), 31–33. https://doi.org/10.1016/S0262-4079(09)60997-0
[34] Roberta L. Klatzky. 1998. Allocentric and Egocentric Spatial Representations: Definitions, Distinctions, and Interconnections. Springer, 1–17. https://doi.org/10.1007/3-540-69342-4_1
[35] Alexander Klippel, Stephen Hirtle, and Clare Davies. 2010. You-Are-Here Maps: Creating Spatial Awareness through Map-like Representations. Spatial Cognition & Computation 10, 2–3 (Jun 2010), 83–93. https://doi.org/10.1080/13875861003770625
[36] Andrew J. Kolarik, Silvia Cirstea, Shahina Pardhan, and Brian C. J. Moore. 2014. A summary of research investigating echolocation abilities of blind and sighted humans. Hearing Research 310 (Apr 2014), 60–68. https://doi.org/10.1016/j.heares.2014.01.010
[37] Marc LeBlanc. 2008. The collected game design rants of Marc LeBlanc. http://algorithmancy.8kindsofun.com/
[38] Laura Lewis, Sarah Sharples, Ed Chandler, and John Worsfold. 2015. Hearing the way: Requirements and preferences for technology-supported navigation aids. Applied Ergonomics 48 (May 2015), 56–69. https://doi.org/10.1016/j.apergo.2014.11.004
[39] Masaki Matsuo, Takahiro Miura, Masatsugu Sakajiri, Junji Onishi, and Tsukasa Ono. 2016. Audible Mapper & ShadowRine: Development of Map Editor Using only Sound in Accessible Game for Blind Users, and Accessible Action RPG for Visually Impaired Gamers. In Computers Helping People with Special Needs (Lecture Notes in Computer Science), Klaus Miesenberger, Christian Bühler, and Petr Penaz (Eds.). Springer International Publishing, 537–544. https://doi.org/10.1007/978-3-319-41264-1_73
[40] Microsoft Research. 2018. Microsoft Soundscape - Microsoft Research. https://www.microsoft.com/en-us/research/product/soundscape/
[41] Vishnu Nair, Jay L Karp, Samuel Silverman, Mohar Kalra, Hollis Lehv, Faizan Jamil, and Brian A. Smith. 2021. NavStick: Making Video Games Blind-Accessible via the Ability to Look Around. In The 34th Annual ACM Symposium on User Interface Software and Technology (UIST ’21). Association for Computing Machinery, 538–551. https://doi.org/10.1145/3472749.3474768
[42] Vishnu Nair and Brian A. Smith. 2020. Toward Self-Directed Navigation for People
https://doi.org/10.1145/2207676.2207735 with Visual Impairments. In Adjunct Publication of the 33rd Annual ACM Sympo-
[22] Nicholas A. Giudice, Benjamin A. Guenther, Nicholas A. Jensen, and Kaitlyn N. sium on User Interface Software and Technology (UIST ’20 Adjunct). Association
Haase. 2020. Cognitive Mapping Without Vision: Comparing Wayfnding Per- for Computing Machinery, 139–141. https://doi.org/10.1145/3379350.3416156
formance After Learning From Digital Touchscreen-Based Multimodal Maps [43] Liam J. Norman, Caitlin Dodsworth, Denise Foresteire, and Lore Thaler. 2021.
vs. Embossed Tactile Overlays. Frontiers in Human Neuroscience 14 (2020), 87. Human click-based echolocation: Efects of blindness and age, and real-life im-
https://doi.org/10.3389/fnhum.2020.00087 plications in a 10-week training program. PLOS ONE 16, 6 (Jun 2021), e0252330.
[23] Nicholas A. Giudice, Benjamin A. Guenther, Nicholas A. Jensen, and Kaitlyn N. https://doi.org/10.1371/journal.pone.0252330
Haase. 2020. Cognitive Mapping Without Vision: Comparing Wayfnding Per- [44] Out of Sight Games. 2017. A Hero’s Call.
formance After Learning From Digital Touchscreen-Based Multimodal Maps [45] Pin Interactive. 2003. Terraformers.
vs. Embossed Tactile Overlays. Frontiers in Human Neuroscience 14 (2020).
ASSETS ’22, October 23–26, 2022, Athens, Greece Nair et al.

[46] John R. Porter and Julie A. Kientz. 2013. An empirical study of issues and barriers [57] Lore Thaler and Melvyn A. Goodale. 2016. Echolocation in humans: an overview.
to mainstream video game accessibility. In Proceedings of the 15th International WIREs Cognitive Science 7, 6 (2016), 382–393. https://doi.org/10.1002/wcs.1408
ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’13). Associ- [58] Ross Thorn. 2018. How to Play With Maps. Ph.D. Dissertation. https://minds.
ation for Computing Machinery, 1–8. https://doi.org/10.1145/2513383.2513444 wisconsin.edu/handle/1793/78913 Accepted: 2019-01-17T20:40:13Z.
[47] Nintendo PTD. 2017. Nintendo Switch. Nintendo. [59] Unity Technologies. 2020. Unity. Unity Technologies.
[48] Frederick F. Reichheld. 2003. The one number you need to grow. Harvard Business [60] Valve. 2015. Steam Controller. Valve Corporation.
Review 81, 12 (2003), 46–55. [61] Valve. 2022. Steam Deck. Valve Corporation.
[49] Jonathan Rowell and Simon Ungar. 2005. Feeling our way: tactile map user [62] Valve Software. 2019. Steam Audio. https://valvesoftware.github.io/steam-audio/
requirements-a survey. In International Cartographic Conference. [63] T. Westin. 2004. Game Accessibility Case Study: Terraformers – a Real-Time 3d
[50] Daisuke Sato, Uran Oh, Kakuya Naito, Hironobu Takagi, Kris Kitani, and Chieko Graphic Game. In In Proc. of the The Fifth International Conference on Disability,
Asakawa. 2017. NavCog3: An Evaluation of a Smartphone-Based Blind Indoor Virtual Reality and Associated Technologies. 95–100.
Navigation Assistant with Semantic Features in a Large-Scale Environment. In [64] Rayoung Yang, Sangmi Park, Sonali R. Mishra, Zhenan Hong, Clint Newsom,
Proceedings of the 19th International ACM SIGACCESS Conference on Computers Hyeon Joo, Erik Hofer, and Mark W. Newman. 2011. Supporting spatial awareness
and Accessibility - ASSETS ’17. ACM Press, Baltimore, Maryland, USA, 270–279. and independent wayfnding for pedestrians with visual impairments. In The
https://doi.org/10.1145/3132525.3132535 proceedings of the 13th international ACM SIGACCESS conference on Computers
[51] Team Silent and Konami Computer Entertainment Tokyo. 1999. Silent Hill. and accessibility (ASSETS ’11). Association for Computing Machinery, 27–34.
Konami, Tokyo, Japan. https://doi.org/10.1145/2049536.2049544
[52] Brian A. Smith and Shree K. Nayar. 2018. The RAD: Making Racing Games [65] Koji Yatani, Nikola Banovic, and Khai Truong. 2012. SpaceSense: represent-
Equivalently Accessible to People Who Are Blind. In Proceedings of the 2018 CHI ing geographical information to visually impaired people using spatial tac-
Conference on Human Factors in Computing Systems. Association for Computing tile feedback. In Proceedings of the SIGCHI Conference on Human Factors in
Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3174090 Computing Systems. Association for Computing Machinery, 415–424. https:
[53] Retro Studios. 2002. Metroid Prime. Nintendo, Kyoto, Japan. //doi.org/10.1145/2207676.2207734
[54] Retro Studios. 2004. Metroid Prime 2: Echoes. Nintendo, Kyoto, Japan. [66] Xunyi Yu and Aura Ganz. 2012. Audible vision for the blind and visually im-
[55] Retro Studios. 2007. Metroid Prime 3: Corruption. Nintendo, Kyoto, Japan. paired in indoor open spaces. In 2012 Annual International Conference of the IEEE
[56] Jing Su, Alyssa Rosenzweig, Ashvin Goel, Eyal de Lara, and Khai N. Truong. 2010. Engineering in Medicine and Biology Society. 5110–5113. https://doi.org/10.1109/
Timbremap: Enabling the Visually-Impaired to Use Maps on Touch-Enabled EMBC.2012.6347143
Devices. In Proceedings of the 12th International Conference on Human Computer [67] Krzysztof Zagata, Jacek Gulij, Łukasz Halik, and Beata Medyńska-Gulij. 2021.
Interaction with Mobile Devices and Services (Lisbon, Portugal) (MobileHCI ’10). Mini-Map for Gamers Who Walk and Teleport in a Virtual Stronghold. ISPRS
Association for Computing Machinery, New York, NY, USA, 17–26. https://doi. International Journal of Geo-Information 10, 22 (Feb 2021), 96. https://doi.org/10.
org/10.1145/1851600.1851606 3390/ijgi10020096
Expressive Bodies
Engaging with Embodied Disability Cultures for Collaborative Design Critiques
Katta Spiel
Robin Angelini
katta.spiel@tuwien.ac.at
robin.angelini@student.tuwien.ac.at
HCI Group – TU Wien
Vienna, Austria
ABSTRACT
In our experience as researchers engaging with non-academic audiences, we observed that it remains a challenge to receive direct and critical feedback from participants. This is particularly amplified in the context of disabilities, even if the researchers identify themselves as disabled, given that the interaction is governed by social status and material power dimensions, to say the least. To work productively with these power dynamics, we explored embodied approaches to articulating critique, acknowledging the different ways of knowing stemming from different bodyminds. Here, we outline two exploratory cases illustrating how physical bodies can be directly attended to in order to express critiques in more direct ways than participants might be used to on a language based level (spoken or signed). We show how communication and critique can take on many forms, encouraging us to broaden our methodological toolset to incorporate practices common in disability cultures. Our experiences show that we need to embrace crip approaches to knowledge production to receive more actionable and useful feedback in developing technologies with disabled communities.

CCS CONCEPTS
• Social and professional topics → People with disabilities; • Human-centered computing → Accessibility design and evaluation methods.

KEYWORDS
disability cultures, Deaf cultures, Autism, Neurodivergence, crip methodologies, embodied critique, critical bodyminds

ACM Reference Format:
Katta Spiel and Robin Angelini. 2022. Expressive Bodies: Engaging with Embodied Disability Cultures for Collaborative Design Critiques. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3517428.3551350

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3551350

1 BRINGING BODIES BACK INTO CRITIQUE
Methods involving people when evaluating, testing and assessing technologies in Human-Computer Interaction (HCI) come with a range of expectations as to which skills these people have to bring to the interaction to be considered as suitable partners. This becomes particularly relevant when designing and developing technologies with and for disabled people [23], although the notion that bodies are relevant to the technological research we conduct has potentials reaching beyond these populations [40]. Here, the suggestion is often to adapt methods to make them and the overall research environment, be it virtual or physical, accessible to participants, while leaving the methods themselves untouched [23]. Such adaptations are becoming more and more commonly reported on, such as in work by Dingman et al. on adapting interview practices for Deaf and Hard of Hearing (DHH) populations [7]. How easily methods can be adapted to specific populations might also explain why some disabilities are catered to more in technological accessibility research than others [25]. Overall, within communities researching technologies for and with disabled people, we further notice an emerging trend of moving towards 'cultivating access' [26]. However, in general purpose technological contexts, the ableist paradigms of Western societies [3] seep into the design and development as well as the methods for assessment, allowing us to understand their potentially exclusive character [41], which has been illustrated in detail for virtual reality technologies [12].

In thinking with disability cultures [36], crip theories [30] and notions of disability justice [33], we draw on an understanding of situated knowledges [14] to probe how we might think about not only making existing methods accessible, but, in a notion of cripistemologies [16], developing complementary methods arriving at different kinds of knowledges. Doing so, we aim at acknowledging how bodyminds in their particularities [4] aid us in honouring a range of ways of knowing about and understanding technologies. A first step here is to involve disabled people in research about them due to their individual and relevant expertise [32], which has been argued prominently for HCI specifically in previous publications [28, 44]. However, a recent survey by Sarsenbayeva et al. illustrated that 32% of articles discussing motor disabilities and technologies are still published without indicating the involvement of disabled participants at all [38]. Similarly, even in games, we observe a dominance of the medical, i.e., individualising model of disability governing research questions, technological design and
development as well as assessment [43]. To attend to the particularities of disabled knowledges, recent years have seen an uptake of first person research methods, so far mostly in the context of autoethnographies detailing travel experiences of hard of hearing [15] or blind travellers [46]. Regardless, the expertise of disabled researchers remains, at least partially, subject to epistemic violence [49]. This makes it difficult to consider methodological approaches with even less foundation in more classical understandings of what methods can and should do in HCI and what principles they should be anchored on.

To explore what this might look like, we reflect on two exploratory case studies that illustrate the potentials of understanding critical feedback not just in ways of direct language based engagements or detached sensor measurements (see also, [2]). In both, we collaborated with disabled people in different settings at different times, with a focus on what we may achieve by attending to what we call expressive bodies, i.e. the use of the body-based expressiveness of a range of different bodyminds beyond direct language based communication – spoken or signed. With these, we hope to show possible steps of how we might go beyond the cultivation of access by specifically positioning our research not just in the context of disabilities but of disability cultures and the situated practices and relations therein as they pertain to the construction and negotiations of different knowledges [16].

In this experience report, we start by probing the two different ways of embodying critique we observed, first in the context of participatory design with autistic children, then in the context of observing actors in the production of a short movie by Deaf filmmakers. We show initial steps as to how bodies can express themselves in different ways, with different functions of what critique might be as contextually required. In reflecting on these, we discuss and speculate on the potentials of expressive bodies as a way of moving towards critical crip methodologies in HCI [48] more broadly. We choose the form of an experience report as a non-anonymous venue to openly and transparently engage with our own positionalities and the particularities that come with our research contexts and inquiries.

2 TWO WAYS OF EMBODYING CRITIQUE
In our initial probing of an understanding of embodied critique as relevant to the contexts of disabilities and technologies, we draw on two case studies that allow us to speculate on the potentials of such approaches. Both lie multiple years apart: the first occurred in 2015, whereas the latter happened at the end of 2021. Furthermore, both of these situations come with entirely different positionalities of participants and, due to the large temporal gap between them, partly also of the researcher involved in both. Our intent is to show how embodied critique can have different functions even though there are shared aspects as to what can be the base of a conversation stemming from attending to expressive bodies in addition to language based communication.

In that, we need to acknowledge the epistemological limitations of our own embodiments. Katta is a hearing, neurodivergent, non-binary, white researcher with chronic illnesses from central Europe and the main person conducting the research inquiries presented here, which all occurred in Austria. They have taken classes in Austrian Sign Language (Österreichische Gebärdensprache – ÖGS) regularly during the past two years. Robin is a Deaf, cis male, white graduate student also from central Europe studying in Austria. He is a native signer of German Sign Language (Deutsche Gebärdensprache – DGS¹). Both of them have collaborated on this work by drawing on their researched and lived experiences across and within their respective communities. However, given the specificity of our own embodiments as well as those of our partners, we can only offer the start of a broader conversation towards critical crip methodologies. We do not stake the claim to offer any finalised insights on these matters, but rather intend to present an additional “articulation towards Crip HCI” [48], an invitation to a conversation about how we might consider methods grounded in disability cultures to understand technologies with disabled communities differently.

2.1 Taking on Emotions like Capes
The first project we draw on was concerned with the participatory design of technologies for the holistic wellbeing of autistic children, called OutsideTheBox [10]. As part of this project, which ran from 2014 through 2017, Katta was involved in eight different case studies where a team of designer-researchers developed eight functional prototypes based on year-long participatory engagements with individual (or, in one case, a pair of) autistic children. Our overall methodological approaches and practices have been described elsewhere (e.g., [ibid]).

Here, we report on our collaboration with Dean², specifically from a session during which we aimed at evaluating the technological artefact we previously designed and built together after a longer break from meeting each other due to the summer holidays. While we later developed a participatory form of evaluation [45], this was one of the sessions that inspired us to do so, occurring in 2015, when Dean was about eight years old. In building up a collaborative relationship with Dean in the months prior, we had to first figure out how to build up an environment of trust in which Dean would feel comfortable voicing his perspective instead of trying to figure out what might be the specific thing we would like to hear at a specific moment.

This was particularly pronounced for Dean, which we deem likely to be based in the use of Applied Behavioural Analysis (ABA) in his education and family. Subsequently, Dean's parent implied that they expected us to follow the structural approach of ABA. The approach requires a child to be under a near-constant therapeutic setting ('intense' treatments expect 36 hours per week [8]). At the time of our collaboration, it was not yet academically discussed, but studies showed later that exposure to ABA in early education potentially leads to the development of Post-Traumatic Stress Disorder for individuals [21] and is predominantly assessed as "detrimental" later in life [29]. This has led to quite a movement of professionals leaving the field of ABA [22], with some psychologists even going as far as characterising the involved practices, particularly those around operant conditioning, as abuse [37]. Subsequently, we did not feel comfortable complying with the parent's request.

¹ ÖGS and DGS are, in contrast to the shared spoken language between Austria and Germany, entirely different languages, even from different language families.
² We have altered his name to protect his privacy in this publication.
Yet, the daily use of ABA still influenced the communication between design-researchers and Dean, requiring a base of trust and a constant, fairly explicit encouragement towards being critical or even just silly. In addition, with the long break between prior engagements and the different context of use of Dean's technology, away from interactions with us towards use in a family context, we were worried about how we could encourage Dean to voice critical feedback, where appropriate.

Figure 1: Interaction with Dean during the evaluation session, while talking about the technology with the artefact (to the left) and while playing out emotional responses to the artefact.

Hence, about a month before the session, we had organised a social outing during which Dean and Katta visited a movie theatre to watch the movie "Inside Out" (Pixar, 2015). Inspired by the five emotions in the movie (Joy, Sadness, Anger, Fear, Disgust), we supplied five chairs with five coloured cloths as props. We further provided three different scenarios that were familiar to Dean. He could pick any emotion for each scenario and show us how he would interact with his object in that context. Through that, we could identify core emotions affecting the experiences Dean had with the artefact.

In embodying these emotions, Dean became much more expressive and direct in his assessment of the technology. During an early conversation in the session, the comments Dean made were somewhat descriptive and his motions more illustrative of what you could see (cf. Figure 1 on the left), but less assessing or critiquing of the object or the interactions he had with it. Using the chairs and using his body to express different aspects the interaction could take on, matching specific scenarios (i.e., family, school, friends) to distinct emotions and playing out their reactions, made Dean loosen up and share more insights into his nuanced assessment of his artefact, which was context dependent and layered. Whereas previously he presented the object in a matter-of-fact way as globally "good", taking on the emotional capes, in a way, encouraged him to be more expressive with his body in articulating these critiques. This is further illustrated in the close and animated interaction between Dean and Katta shown to the right in Figure 1.

This physical closeness was accompanied and potentially even fuelled by the simultaneously established aesthetic distance through acting things out [19]. Taking on a personified embodiment of an emotion and playing it through creates this distance and, with it, creates a space of plausible deniability. In light of insecure power positions and the need for re-establishing previously held relationships which had a precarious nature at best, Dean could enter a space where he could express critique through his body while also having the fallback option of declaring this 'just play' or 'a joke' in case we would respond negatively to his articulations.

Conceptually, this is supported by the notion of a surrogate body position [42], one where Dean's body becomes the expression of the emotive potential. This means the self and the embodied persona (in this case, the specific emotion with their known characteristics) create a new melange of critical potential that can be abandoned at a moment's notice in case this becomes necessary due to the social circumstances turning out to be precarious or unsafe for whatever reason. Through that, this procedure offers an option for establishing the trust and safety net that is necessary when power dimensions are complex and fraught. Or, phrased a bit more bluntly: if you are used to critique being discouraged, taking it up again needs to happen in a structural form that allows for plausible deniability to at least partially remove the stress this form of communication tends to bring along in your everyday experiences.

2.2 Acting out Technology
Our second case revolves around the notion of sign language avatars and the critique from Deaf individuals. Particularly, native signers are already used to utilising their bodies more in language based interactions, including literally embodying different perspectives in constructed dialogues (i.e., the reporting of a dialogical situation) [31]. However, this is an implicit language feature that is applied semi-automatically and with little explicit reflection on the topic discussed (akin to Schön's concept of reflection-in-action). By deliberately thinking through making different choices in expressing critical perspectives through one's body, signers enter a state of actively reflecting on their critique beyond the direct relationality to language (akin to reflection-on-action [39]).

The topic of sign language avatars is a highly controversial one within Deaf communities globally [6] as well as locally [11]. Deaf representatives argue that sign language avatars reduce the complexity of both written and signed languages, potentially contributing to language deprivation for younger signers or those acquiring sign languages later in life. Further, they are less likely to understand the register of communication required for specific audiences, only operate one-way from written to signed content, and should be carefully used in specific contexts only [20]. Recent research by Quandt et al. illustrates further that native signers require a higher degree of quality regarding the overall motion capabilities of sign language avatars for these to be sufficiently acceptable to them [34]. Additional research suggests that Deaf populations prefer if language parameters for avatars differ from human signing in some aspects, for example signing speed and timing [1].

Methodologically, within HCI, there are specific recommendations as to how to include Deaf communities in research on sign language avatars. Kipp et al. suggest focus group interviews and
online surveys [18]. However, group dynamics in these research settings and the often heavily text-focused modes of online surveys come with their own problems and exclusions, privileging people who feel comfortable providing feedback in groups. Surveys, even with questions signed, come with the issue of providing limited options for answers, which often still have to be provided as text, and, subsequently, limited critical nuance. Open questions are only available to those who feel confident in using written English (or, as is the case with our reference, German), which to many native signers is a language acquired later in life, or to those with sufficient technological savvy to provide a link to a video responding in sign. Hence, these approaches only present a starting point for including Deaf people as experts, but they do not orient themselves on Deaf cultures and linguistic styles; rather, they nominally adapt existing methods (following a hearing logic) [47]. Further, there is often an increased strain on organisational capacities and budgets due to the hiring of interpreters necessary if not all participants are appropriately fluent in a shared sign language – though Mack and Tian suggest that researchers working with Deaf communities need to acquire proficiency in ASL (or rather, the local sign language) to adequately understand cultural differences and nuances [27]. Similarly, while we know that different people and different embodiments result in different assessments [17], we need to go beyond just aggregating those and aiming at a general view, instead allowing for methods attending to the particular and specific – as driven by the situated interests in how critique towards technological artefacts might be articulated by disabled communities themselves.

For this case study, Katta did not plan any sessions or invite participants; rather, they were invited by a Deaf filmmaker to observe practice sessions and the shooting of a short film which was conceptualised entirely in Austrian Sign Language. The story of the movie concerns itself with the technical hubris and glitches involved if there were a car navigation application that included a sign language avatar to replace spoken instructions. The film pokes fun at technology developers, startup cultures and the limited capabilities of sign language avatars. Instead of using an actual avatar, the director decided to have the avatar embodied by a Deaf actor.

We report here from the practice session in November 2021³ during which Katta closely observed how the actor, a native signer, actively worked on figuring out their embodiment of the avatar along with the director's instructions. These observations were recorded as notes, which illustrated different styles of signing, comparing the actor's general conversational style with how they went about embodying a sign language avatar. They then discussed their notes with the actor (in Austrian Sign Language), which prompted corrections and additional emphasis on certain aspects from the actor but also aided them in reflecting on how their acting is perceived and what it communicates, making subtle changes for the second practice run, after which we discussed additional observations and reflections. That way, both Katta and the actor could profit from the interaction with their respective interests, be they observing embodied critique or refining the performance for the movie.

³ Everyone present was recently PCR tested (as was freely available to everyone in Austria at the time) and fully vaccinated against COVID-19.

Figure 2: The actor signing as the avatar during a practice session (left) and in the final movie (right).

From these observations and reflective conversations, we identified a strong emphasis in the perspective of both the filmmaker and the actor on sign language avatars, as they experience them in their environment, being ridiculously absurd to some extent. The director kept instructing the actor to reduce their facial expressions even further, until they essentially removed this language feature, which serves both affective and grammatical functions [35], entirely. At the same time, the actor overemphasised mouth actions (or visemes, which are used to different degrees in different sign languages [5]) to a point of them becoming meaningless and void of any information. Additionally, the actor pointed out during the first reflection that he deliberately kept his shoulders almost entirely motionless, which additionally restricted the expressiveness of his hands and overall use of the upper body in communication (see also the additional stiffness on the right of Figure 2 in the final movie compared to the initial practice runs depicted to the left). During the practice run it became clear that three-dimensional characteristics of Austrian Sign Language, particularly as they pertained to the use of classifiers [9], ended up being difficult to translate to a two-dimensional plane, especially in the context of providing directional information. This difficulty could explain why sign language avatars rarely use classifier constructions.

Essentially, the perspective of the director and the actor towards sign language avatars, as they expressed it collaboratively through embodiment and instruction as well as active linguistic reflection, illustrates similar critical aspects as Krausnecker and Schügerl identified in their research based on different focus groups with Deaf and hearing participants separately [20]. “In all focus groups with deaf participants, it was noted that the avatar ‘closely follows the German syntax’, which was described as unpleasant, tiring, not mature, as a ‘gimmick’, ‘nice experiment’ and even as a ‘botch-up’” [20, p.5]. Hence, critique is available in other, more classical settings as well and, content wise, our approach comes to similar conclusions. However, methodologically, our approach is oriented on mutual exchange and presents a collaborative process instead of one shaped solely by researchers. Even if it is more
difficult to systematise what we found by doing so and make it reproducible for other contexts, together with Dean's case study, we find that there is potential for this concept of expressive bodies to be useful in involving disabled participants by honouring their respective cultural, personal and communicative styles and preferences.

3 EXPRESSIVE BODIES – TOWARDS A CRITICAL CRIP METHOD(OLOGY)
Across these two case studies, we could see that turning to expressive bodies allows us to understand more about the mental models that people hold about technologies. We aimed at illustrating that the embodiment of critique as a mode of attending to disability cultures allows us to 1) fleetingly create safer spaces when the interactions with researchers might be unclear regarding their implications for power dimensions; and 2) engage in a mutually reflective dialogue that has an explicitly reciprocal character compared to the more extractive tendencies occurring in more traditional methods for knowledge acquisition.

ACKNOWLEDGMENTS
This work would not have been possible without the support of a lot of people. Among those are the original project team members of the OutsideTheBox project, Christopher Frauenberger and Julia Makhaeva, as well as, of course, Dean, who showed us how tender research relationships can be. Further, Katta needs to thank Christoph Kopal, Brato Avramovic and Joanna Kinberger for letting them join their movie endeavour and being patient with their limited signing capabilities. Kathrin Gerling provided invaluable feedback on earlier drafts of this manuscript. Finally, this work received financial support from the Austrian Science Funds (FWF) through a Hertha-Firnberg Scholarship T 1146-G.

REFERENCES
[1] Sedeeq Al-khazraji, Becca Dingman, Sooyeon Lee, and Matt Huenerfauth. 2021. At a Different Pace: Evaluating Whether Users Prefer Timing Parameters in American Sign Language Animations to Differ from Human Signers' Timing. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 40, 12 pages. https://doi.org/10.1145/3441852.3471214
[2] Christine E. Ashby. 2011. Whose "voice" is it anyway?: Giving voice and qualitative research involving individuals that type to communicate. Disability Studies Quarterly 31, 4 (2011).
Turning to expressive bodies does not mean abandoning lan- [3] Fiona Campbell. 2009. Contours of ableism: The production of disability and
guage based methods, but using bodies actively as a way to attend abledness. Springer.
[4] Eli Clare. 2015. Exile and pride: Disability, queerness, and liberation. Duke Univer-
to the particular and the situated assessment of disabled peoples sity Press.
along their lines of preferred communication and cultural conven- [5] Onno A Crasborn, Els Van Der Kooij, Dafydd Waters, Bencie Woll, and Johanna
tions. That way, they complement existing methods and present a Mesch. 2008. Frequency distribution and spreading behavior of diferent types of
mouth actions in three sign languages. Sign Language & Linguistics 11, 1 (2008),
way of drawing on individual experiential knowledge, such as is al- 45–67.
ready done by autoethnographic and frst-person research methods, [6] World Federation of the Deaf and World Association of Sign Language In-
terpreters. 2019. WFD and WASLI Statement on Use of Signing Avatars.
in a relational and collaborative way. We deem these particularly http://wfdeaf.org/news/resources/wfd-wasli-statement-use-signing-avatars/
useful to understand situated nuances of marginalised perspectives [7] Becca Dingman, Garreth W. Tigwell, and Kristen Shinohara. 2021. Interview and
on technologies in a mutually respectful manner. Think Aloud Accessibility for Deaf and Hard of Hearing Participants in Design
Research. In The 23rd International ACM SIGACCESS Conference on Computers
In a light analogy to the distinction between expressive and in- and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing
strumental technologies in a queering approach to HCI [24], we Machinery, New York, NY, USA, Article 71, 3 pages. https://doi.org/10.1145/
suggest that there might be a distinction to make between expres- 3441852.3476526
[8] Sigmund Eldevik, Richard P. Hastings, J. Carl Hughes, Erik Jahr, Svein Eikeseth,
sive and instrumental methods in the ways we assess technologies and Scott Cross. 2010. Using Participant Data to Extend the Evidence Base
in HCI more generally and in disability contexts specifcally. Orient- for Intensive Behavioral Intervention for Children With Autism. American
Journal on Intellectual and Developmental Disabilities 115, 5 (2010), 381–405.
ing ourselves towards the expressiveness of diferent bodyminds https://doi.org/10.1352/1944-7558-115.5.381 arXiv:https://doi.org/10.1352/1944-
and the associated cultural aspects means using the research en- 7558-115.5.381 PMID: 20687823.
deavours not in an instrumental way with the intent to answer [9] Karen Emmorey. 2003. Perspectives on classifer constructions in sign languages.
Psychology Press.
specifc research questions that are predominantly shaped by re- [10] Christopher Frauenberger, Katta Spiel, and Julia Makhaeva. 2019. Thinking
searchers and their institutional contexts but instead appreciating outsideTheBox-designing smart things with autistic children. International Jour-
and cherishing the relationships and interactions that might arise nal of Human–Computer Interaction 35, 8 (2019), 666–678.
[11] Österreichischer Gehörlosenbund. 2019. Stellungnahme zum Thema
in collaboration. Gebärdensprach-Avatare. https://www.oeglb.at/wp-content/uploads/2021/05/
Hence, the approach is based on practices of cripping. “Cripping Avatare_OeGLBOeGSDV_Stellungnahme-2019.pdf
[12] Kathrin Gerling and Katta Spiel. 2021. A Critical Examination of Virtual Reality
spins mainstream representations or practices to reveal able-bodied Technology in the Context of the Minority Body. In Proceedings of the 2021 CHI
assumptions and exclusionary efects. Both queering and cripping Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21).
expose the arbitrary delineation between normal and defective and Association for Computing Machinery, New York, NY, USA, Article 599, 14 pages.
https://doi.org/10.1145/3411764.3445196
the negative social ramifcations of attempts to homogenize human- [13] Aimi Hamraie and Kelly Fritsch. 2019. Crip technoscience manifesto. Catalyst:
ity, and both disarm what is painful with wicked humor, including Feminism, Theory, Technoscience 5, 1 (2019), 1–33.
camp” [4]. Our second case study illustrates how these practices [14] Donna Haraway. 1988. Situated Knowledges: The Science Question in Feminism
and the Privilege of Partial Perspective. Feminist Studies 14, 3 (1988), 575–599.
might look like, by actively making fun of and exaggerating the [15] Dhruv Jain, Audrey Desjardins, Leah Findlater, and Jon E. Froehlich. 2019. Au-
embodiment of the sign language avatar, and how they lead to the toethnography of a Hard of Hearing Traveler. In The 21st International ACM
SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (AS-
construction of insights and situated knowledges. Subsequently, SETS ’19). Association for Computing Machinery, New York, NY, USA, 236–248.
we position our approach within cripistemologies [16] and crip https://doi.org/10.1145/3308561.3353800
technoscience [13] and hope that the concept of expressive bodies [16] Merri Lisa Johnson and Robert McRuer. 2014. Cripistemologies: introduction.
Journal of Literary & Cultural Disability Studies 8, 2 (2014), 127–147.
can function as an additional “articulation” towards what might at [17] Hernisa Kacorri, Matt Huenerfauth, Sarah Ebling, Kasmira Patel, and Mackenzie
some point be called Crip HCI [48]. Willard. 2015. Demographic and Experiential Factors Infuencing Acceptance of
ASSETS ’22, October 23–26, 2022, Athens, Greece Katta Spiel and Robin Angelini

Sign Language Animation by Deaf Users. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS '15). Association for Computing Machinery, New York, NY, USA, 147–154. https://doi.org/10.1145/2700648.2809860
[18] Michael Kipp, Quan Nguyen, Alexis Heloir, and Silke Matthes. 2011. Assessing the Deaf User Perspective on Sign Language Avatars. In The Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (Dundee, Scotland, UK) (ASSETS '11). Association for Computing Machinery, New York, NY, USA, 107–114. https://doi.org/10.1145/2049536.2049557
[19] Jeanne Klein and Shifra Schonmann. 2009. Theorizing aesthetic transactions from children's criterial values in theatre for young audiences. Youth Theatre Journal 23, 1 (2009), 60–74.
[20] Verena Krausneker and Sandra Schügerl. 2021. Best Practice Protocol on the Use of Sign Language Avatars. https://avatar-bestpractice.univie.ac.at/en/english/
[21] Henny Kupferstein. 2018. Evidence of increased PTSD symptoms in autistics exposed to applied behavior analysis. Advances in Autism (2018).
[22] Henny Kupferstein. 2019. Why caregivers discontinue applied behavior analysis (ABA) and choose communication-based autism interventions. Advances in Autism (2019).
[23] Jonathan Lazar, Jinjuan Heidi Feng, and Harry Hochheiser. 2017. Research methods in human-computer interaction. Morgan Kaufmann.
[24] Ann Light. 2011. HCI as heterodoxy: Technologies of identity and the queering of interaction with computers. Interacting with Computers 23, 5 (2011), 430–438.
[25] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI '21). Association for Computing Machinery, New York, NY, USA, Article 371, 18 pages. https://doi.org/10.1145/3411764.3445412
[26] Kelly Mack, Emma McDonnell, Venkatesh Potluri, Maggie Xu, Jailyn Zabala, Jeffrey Bigham, Jennifer Mankoff, and Cynthia Bennett. 2022. Anticipate and Adjust: Cultivating Access in Human-Centered Methods. In CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 603, 18 pages. https://doi.org/10.1145/3491102.3501882
[27] Kelly Mack and Sophie Tian. 2020. Why Researchers Working with the Deaf Community Should Learn ASL. CHI 2020 Workshop "Nothing About Us Without Us – Investigating the Role of Critical Disability Studies in HCI" (2020).
[28] Jennifer Mankoff, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability Studies as a Source of Critical Inquiry for the Field of Assistive Technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (Orlando, Florida, USA) (ASSETS '10). Association for Computing Machinery, New York, NY, USA, 3–10. https://doi.org/10.1145/1878803.1878807
[29] Owen McGill and Anna Robinson. 2020. "Recalling hidden harms": autistic experiences of childhood applied behavioural analysis (ABA). Advances in Autism (2020).
[30] Robert McRuer. 2006. Crip theory: Cultural signs of queerness and disability. NYU Press.
[31] Melanie Metzger. 1995. Constructed dialogue and constructed action in American Sign Language. Sociolinguistics in Deaf communities 1 (1995), 255.
[32] Damian EM Milton. 2014. Autistic expertise: A critical reflection on the production of knowledge in autism studies. Autism 18, 7 (2014), 794–802.
[33] Leah Lakshmi Piepzna-Samarasinha. 2018. Care work: Dreaming disability justice. Arsenal Pulp Press, Vancouver.
[34] Lorna C Quandt, Athena Willis, Melody Schwenk, Kaitlyn Weeks, and Ruthie Ferster. 2022. Attitudes Toward Signing Avatars Vary Depending on Hearing Status, Age of Signed Language Acquisition, and Avatar Type. Frontiers in Psychology 13 (2022), 730917. https://doi.org/10.3389/fpsyg.2022.730917
[35] Judy Reilly and Diane Anderson. 2002. The acquisition of non-manual morphology in ASL. Directions in sign language acquisition (2002), 159–182.
[36] Sheila Riddell and Nick Watson. 2014. Disability, culture and identity. Routledge.
[37] Aileen Herlinda Sandoval-Norton and Gary Shkedy. 2019. How much compliance is too much compliance: Is long-term ABA therapy abuse? Cogent Psychology 6, 1 (2019), 1641258.
[38] Zhanna Sarsenbayeva, Niels van Berkel, Eduardo Velloso, Jorge Goncalves, and Vassilis Kostakos. 2022. Methodological Standards in Accessibility Research on Motor Impairments: A Survey. ACM Comput. Surv. (May 2022), Just Accepted. https://doi.org/10.1145/3543509
[39] Donald A Schön. 1986. The Reflective Practitioner: How Professionals Think in Action. Taylor & Francis.
[40] Katta Spiel. 2021. The Bodies of TEI – Investigating Norms and Assumptions in the Design of Embodied Interaction. In Proceedings of the Fifteenth International Conference on Tangible, Embedded, and Embodied Interaction (Salzburg, Austria) (TEI '21). Association for Computing Machinery, New York, NY, USA, Article 32, 19 pages. https://doi.org/10.1145/3430524.3440651
[41] Katta Spiel. 2022. Transreal tracing: Queer-feminist speculations on disabled technologies. Feminist Theory 23, 2 (2022), 247–265.
[42] Katta Spiel and Kathrin Gerling. 2019. The Surrogate Body in Play. In Proceedings of the Annual Symposium on Computer-Human Interaction in Play (Barcelona, Spain) (CHI PLAY '19). Association for Computing Machinery, New York, NY, USA, 397–411. https://doi.org/10.1145/3311350.3347189
[43] Katta Spiel and Kathrin Gerling. 2021. The Purpose of Play: How HCI Games Research Fails Neurodivergent Populations. ACM Trans. Comput.-Hum. Interact. 28, 2, Article 11 (April 2021), 40 pages. https://doi.org/10.1145/3432245
[44] Katta Spiel, Kathrin Gerling, Cynthia L. Bennett, Emeline Brulé, Rua M. Williams, Jennifer Rode, and Jennifer Mankoff. 2020. Nothing About Us Without Us: Investigating the Role of Critical Disability Studies in HCI. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA '20). Association for Computing Machinery, New York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3375150
[45] Katta Spiel, Laura Malinverni, Judith Good, and Christopher Frauenberger. 2017. Participatory Evaluation with Autistic Children. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI '17). Association for Computing Machinery, New York, NY, USA, 5755–5766. https://doi.org/10.1145/3025453.3025851
[46] Kate Stephens, Matthew Butler, Leona M Holloway, Cagatay Goncu, and Kim Marriott. 2020. Smooth Sailing? Autoethnography of Recreational Travel by a Blind Person. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS '20). Association for Computing Machinery, New York, NY, USA, Article 26, 12 pages. https://doi.org/10.1145/3373625.3417011
[47] Amelie Unger, Dieter P. Wallach, and Nicole Jochems. 2021. Lost in Translation: Challenges and Barriers to Sign Language-Accessible User Research. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 37, 5 pages. https://doi.org/10.1145/3441852.3476473
[48] Rua M Williams, Kathryn Ringland, Amelia Gibson, Mahender Mandala, Arne Maibaum, and Tiago Guerreiro. 2021. Articulations toward a crip HCI. Interactions 28, 3 (2021), 28–37.
[49] Anon Ymous, Katta Spiel, Os Keyes, Rua M. Williams, Judith Good, Eva Hornecker, and Cynthia L. Bennett. 2020. "I Am Just Terrified of My Future" – Epistemic Violence in Disability Related Technology Research. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA '20). Association for Computing Machinery, New York, NY, USA, 1–16. https://doi.org/10.1145/3334480.3381828
Data Representativeness in Accessibility Datasets: A Meta-Analysis

Rie Kamikubo, College of Information Studies, University of Maryland, College Park, United States (rkamikub@umd.edu)
Lining Wang, Department of Computer Science, University of Maryland, College Park, United States (lwang0@umd.edu)
Crystal Marte, College of Information Studies, University of Maryland, College Park, United States (cmarte@umd.edu)
Amnah Mahmood, Department of Mathematics, University of Maryland, College Park, United States (amahmoo1@umd.edu)
Hernisa Kacorri, College of Information Studies, University of Maryland, College Park, United States (hernisa@umd.edu)
ABSTRACT
As data-driven systems are increasingly deployed at scale, ethical concerns have arisen around unfair and discriminatory outcomes for historically marginalized groups that are underrepresented in training data. In response, work around AI fairness and inclusion has called for datasets that are representative of various demographic groups. In this paper, we contribute an analysis of the representativeness of age, gender, and race & ethnicity in accessibility datasets (datasets sourced from people with disabilities and older adults) that can potentially play an important role in mitigating bias for inclusive AI-infused applications. We examine the current state of representation within datasets sourced by people with disabilities by reviewing publicly available information on 190 datasets; we call these accessibility datasets. We find that accessibility datasets represent diverse ages, but have gender and race representation gaps. Additionally, we investigate how the sensitive and complex nature of demographic variables makes classification difficult and inconsistent (e.g., gender, race & ethnicity), with the source of labeling often unknown. By reflecting on the current challenges and opportunities for representation of disabled data contributors, we hope our effort expands the space of possibility for greater inclusion of marginalized communities in AI-infused systems.

CCS CONCEPTS
• Human-centered computing → Human computer interaction (HCI); Accessibility; • Social and professional topics → People with disabilities; Age; Gender; Race and ethnicity.

KEYWORDS
AI FATE; datasets; inclusion; diversity; representation; accessibility; aging

ACM Reference Format:
Rie Kamikubo, Lining Wang, Crystal Marte, Amnah Mahmood, and Hernisa Kacorri. 2022. Data Representativeness in Accessibility Datasets: A Meta-Analysis. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3517428.3544826

© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-9258-7/22/10. $15.00.

1 INTRODUCTION
As AI-infused systems¹ become ubiquitous, ensuring that they work for a diversity of groups is vital [29, 56, 108]. Performance disparities in these systems could lead to unfair or discriminatory outcomes for historically and culturally marginalized groups, such as on the basis of gender, race, or disability [12, 18, 44, 149, 162, 172]. One fundamental source of disparities is the lack of representation in datasets used to train machine learning models and benchmark their performance [108, 162, 179]. A notable example comes from Treviranus [166], where during a simulation, she found that machine learning models for autonomous vehicles would run over someone who propels themselves backward in a wheelchair. Merely adding training examples of people using wheelchairs did not have the intended effect in this case; the algorithm failed with a higher confidence [166]. Treviranus suspected 'backward propelling' was still an outlier.

¹ A term used by Amershi et al., 2019 [4] to indicate "systems that have features harnessing AI capabilities that are directly exposed to the end user."

In this important discussion on AI fairness and inclusion, tensions around data representativeness involving disability [60, 79, 118] have also arisen. Data sourced from accessibility datasets can help AI-infused systems work better when deployed in real-world scenarios, both for assistive and general-purpose contexts [29, 75, 169]. However, privacy and ethical concerns are especially pronounced in this community, as disclosure of disability can pose risks associated with re-identification and further discrimination, e.g., for one's healthcare and employment [169, 179]. People who have distinct data patterns, like in the case of disability, are also more susceptible to data abuse and misuse [1, 60, 167]. In addition,
ASSETS ’22, October 23–26, 2022, Athens, Greece Kamikubo et al.

even if AI-infused systems are trained with diverse data, this does not inherently challenge the power structures in which these systems are embedded, which may be the actual source of harm and marginalization for disabled people [7]. For example, a more equitable AI-infused system for diagnosing autism does not necessarily correspond to greater well-being of autistic people, because it may cement the power that medical institutions have to diagnose and gatekeep [7].

We contribute to these discussions via our exploration of representation in accessibility datasets, which reveals nuanced patterns of representation and marginalization along intersectional lines. In this work, we conducted a metadata analysis of existing accessibility datasets (1984–2021, N=190) spanning multiple communities of focus and data types to understand the representation and reporting of demographic attributes including age, gender, and race & ethnicity of data contributors. We used the publicly available documentation and resources of these datasets to explore the potential opportunities and limitations for increasing data representativeness.

Our analysis shows mixed results for diverse representation of age, gender, and race & ethnicity. For age, we found that older adults are particularly well-represented, but this did not apply across all communities of focus (with the Autism, Developmental, and Learning communities being notable exceptions). Gender representation skewed towards men/boys being more represented overall, but varied widely by community of focus. We also found that well-documented structural marginalization in certain communities is reflected in accessibility datasets. For example, women/girls are underrepresented in Autism datasets, corresponding to existing diagnosis gaps [55, 130]. Marginalization is further embedded on a meta level, such as the case of binary categories for gender classification in the collection and reporting of gender data within datasets. Furthermore, we did not find consistent norms for reporting data, with the lack of standardized documentation, evolving practices, and variability of categories used across age, gender, and race & ethnicity.

The contributions of this work are 1) a systematic examination of whether those sourcing data from the disability community are succeeding in representing diverse demographics, via an intersectional analysis along the axes of age, gender, and race & ethnicity as well as a meta-analysis of reporting methods; 2) codes of 190 existing accessibility datasets annotated with demographic metadata²; and 3) connections to larger conversations about the implications of representation, data stewardship, and epistemological challenges of data collection. We contend that data representativeness must be analyzed contextually using a critical lens, to accurately assess the potential and implications of greater inclusion of marginalized communities in AI-infused systems.

2 RELATED WORK
Sociocultural diversity has received attention in a wide range of disciplines, such as encouraging gender or ethnic diversity in teams or communities [21, 41, 74], with different concepts of diversity applied in research and applications [159]. More so, AI research has adopted diversity considerations deeply in the ongoing challenge of responsible and ethical AI [24, 42, 113]. Much conversation has been associated with the concepts around balanced representation of sub-groups (e.g., equal participation of racial sub-groups within a focal group) [47]. A growing number of studies have explored bias and performance disparities of AI systems concerning representation [38, 108], especially influenced by demographic attributes like age [36, 97, 124], gender [18, 83, 142, 162], race [18, 96], socioeconomic status [34], and disability status [56, 179]. Often such evaluations found the source of concerns to be the under-representation of certain demographic groups in the training data underlying predictive and inferential algorithms [108, 162, 179], calling for action to create more balanced datasets across different demographics. In response, we have seen efforts like constructing image datasets balanced in race, gender, and age (the FairFace dataset [80]) or text corpora with gender-balanced labels (GAP [175]).

In support of the current discourse around diversity in AI data, researchers have argued that datasets sourced from people with disabilities and older adults can play an important role [75, 79, 118], such as improving speech recognition with stammering data [40] and object recognition with photos taken by blind people [75]. Calls for action from this community often center around including disability in AI fairness discussions as it pertains to model performance, data excellence, and privacy [48, 77, 126, 168]. Increasing disability representation, however, is complex; there are myriad challenges in collecting and sharing datasets from this group [1, 143]. Consent and disclosure can be problematic regarding sensitive disability status. Ethical concerns also arise given that datasets collected to mitigate AI bias for people with disabilities can be used against them by detecting their disabilities, leading to further discrimination risks [118]. There are also existing social biases and stereotypes reflected in data representing disability (e.g., [63, 70]), which may produce AI-infused systems that reinforce greater harms and marginalization of people with disabilities [7]. Efforts aiming to increase inclusion thus need to be carefully considered [163].

To recognize the opportunities and limitations of accessibility datasets in the conversation of diversity in broader AI, we first need to understand the current status of representation in accessibility datasets. Prior work investigating issues associated with diversity in AI datasets has mostly focused on examining differences in model performance across pre-defined demographic attributes to draw implications for diversity [18, 34, 162]. This often leaves inquiries about the benefits and appropriate implementation of diversity in data unanswered [47], except for a few exceptions (as shown in Table 1) that explicitly analyzed datasets, or issues related to datasets, in terms of demographic representation like gender and other sociocultural attributes (e.g., language) to explore the root causes of bias and misrepresentation. These studies concluded that such AI datasets (often image datasets) are skewed towards certain demographics, uncovering under-representation of older adults [109, 128], darker skin tones, and females [109, 185], and a lack of geographical diversity [148].

While representation has been discussed broadly across HCI and accessibility [1, 100] or within specific communities [114, 138], we have only seen a few studies analyzing representation and characteristics pertaining to AI training datasets in related work [15, 82].

² Data codes available at https://www.openicpsr.org/openicpsr/project/174761/version/V1/view.
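The notion of "balanced representation of sub-groups" invoked in this section can be made concrete with a small sketch. Everything below is illustrative: the labels, the parity baseline, and the tolerance are our own assumptions for exposition, not metrics used by any of the cited analyses:

```python
from collections import Counter

def representation_shares(labels):
    """Share of a dataset's contributors falling into each sub-group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def is_balanced(shares, tolerance=0.1):
    """Treat a dataset as 'balanced' when every sub-group's share lies
    within `tolerance` of perfect parity (1 / number of groups)."""
    parity = 1 / len(shares)
    return all(abs(share - parity) <= tolerance for share in shares.values())

# Hypothetical gender labels for one dataset's 100 contributors.
labels = ["woman"] * 30 + ["man"] * 60 + ["non-binary"] * 10
shares = representation_shares(labels)
print(shares)               # {'woman': 0.3, 'man': 0.6, 'non-binary': 0.1}
print(is_balanced(shares))  # False: men are over-represented
```

Real audits are subtler: parity is not always the right target (matching population base rates may be more appropriate), and the choice of categories itself shapes the outcome, which is precisely the reporting problem examined in this paper.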

Table 1: Prior work on analysis of broader AI and accessibility datasets with varying sample sizes.

| Data                                          | # of Datasets | Age | Gender | Race | Skin Color | Geography | Sociocultural |
|-----------------------------------------------|---------------|-----|--------|------|------------|-----------|---------------|
| Accessibility                                 |               |     |        |      |            |           |               |
| Bragg et al. [15], Sign Language Datasets     | n=NA          |     |        |      |            |           | •             |
| Kaushal et al. [82], Clinical Image Datasets  | n=74          |     |        |      |            | •         |               |
| Broader AI                                    |               |     |        |      |            |           |               |
| Dodge et al. [39], C4 Webtext Corpora         | n=1           |     |        |      |            | •         | •             |
| Merler et al. [109], Face Image Datasets      | n=7–8         | •   | •      |      | •          |           |               |
| Park et al. [128], Face Image Datasets        | n=92          | •   |        |      |            |           |               |
| Scheuerman et al. [142], Face Image Datasets  | n=92          |     | •      | •    |            |           |               |
| Shankar et al. [148], Open Images, ImageNet   | n=2           |     |        |      |            | •         |               |
| Yang et al. [185], ImageNet                   | n=1           | •   | •      |      | •          |           |               |

They are, however, constrained to very specific tasks and applications. Additionally, discussions of biases against people with disabilities are found to be manifested in complex ways that require intersectional attention [63, 150]. This research complements prior work by analyzing existing accessibility datasets across communities, to encourage holistic, societal implications for data representativeness including people with disabilities and older adults.

3 METHOD
Our aim is to conduct a broad investigation of what and how demographic attributes are represented in accessibility datasets, not only in terms of disability representation but also age, gender, and race. To this end, we leverage a recently compiled collection of accessibility datasets, sourced from people with disabilities and older adults. We analyze any available information on the data contributors' demographics in associated academic publications, sharing sites, and documentation. Here, we discuss the dataset collections, explain our coding and analysis approach, and reflect on our method and limitations. Reflecting on author positionality, we note that this research was conducted by Asian, Afro-Latina, and white scholars, four of whom identified as women, one identified as non-binary, and two identified as disabled. Experience in accessibility research ranged from first-year grad students to a professor who has been publishing accessibility research for about thirteen years.
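As a rough sketch of this kind of metadata review, one might represent each dataset's publicly reported demographics as a small record and tally which attributes are reported at all. The records and field names below are invented placeholders for illustration, not actual entries from the collection:

```python
from collections import Counter

# Invented placeholder records; an actual coding pass would draw on each
# dataset's publications, sharing sites, and documentation.
datasets = [
    {"name": "dataset-a", "age": "18-65", "gender": "W/M", "race": None},
    {"name": "dataset-b", "age": None, "gender": "W/M/NB", "race": None},
    {"name": "dataset-c", "age": "60+", "gender": None, "race": "census categories"},
]

ATTRIBUTES = ("age", "gender", "race")

def reporting_tally(records):
    """Count how many dataset records report each demographic attribute."""
    tally = Counter()
    for record in records:
        for attr in ATTRIBUTES:
            if record.get(attr) is not None:
                tally[attr] += 1
    return tally

print(reporting_tally(datasets))  # Counter({'age': 2, 'gender': 2, 'race': 1})
```

Distinguishing "not reported" from "reported as unknown" matters in such a tally; collapsing the two hides exactly the inconsistent reporting practices this paper documents.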

3.1 Accessibility Datasets in Our Collection
Recently, Kacorri et al. (2020) launched a data surfacing repository, called IncluSet, as a result of putting together a collection of datasets sourced from people with disabilities and older adults that were manually located over a multi-year period [76]. An underlying promise of these datasets is their potential for training, testing, or benchmarking machine learning models. The work was later extended to investigate the risks and benefits of collecting, reporting, and sharing accessibility datasets, analyzed in terms of 10 communities of focus, 7 data formats, and 3 data access methods [79]. We leveraged the accessibility datasets (1984–2021, N=190) included in the existing collection of IncluSet and their groupings (i.e., communities of focus) as the basis for our investigation. Figure 1a illustrates the distribution of the datasets across the communities of focus. The datasets, including their annotations, are of different data types, as shown in Figure 1b. For example, there are voice recordings of people with speech impairments [25], video recordings of Deaf signers [69], text written by people with dyslexia [134], stroke gestures by people with motor impairments [171], photos of everyday objects taken by blind people [88], eye-tracking data from autistic children [43], and activity data from older adults [91].

Figure 1: Distribution of accessibility dataset count across all communities of focus (a) and data types (b).

Identifying publicly available documentation for these datasets often depended on how they were shared. Out of 190 datasets, about 84 can be downloaded directly and 41 can be accessed upon request, e.g., through a webpage from the dataset creators or an online repository with a summary of the dataset. Summaries vary
ASSETS ’22, October 23–26, 2022, Athens, Greece Kamikubo et al.

highly from a few lines to detailed descriptions of the contents of the dataset and how it was collected. Even though none of the datasets had explicitly adopted standardized documentation such as datasheets for datasets [54], some followed a systematic documentation dictated by the platforms where the datasets were stored, such as Synapse.org. Associated academic publications were often referred to in the web documentation to link more detailed information about the data collected, though these sources did not always come with consistent information such as the number of data contributors, which could be easily updated on the web documentation. Dataset downloads sometimes came with relevant summary files, including a spreadsheet listing demographic information about people represented in the data. The remaining 65 datasets in the collection did not include any sharing intent, with no sources available other than their academic publications. We still include these datasets in our analysis, in accordance with prior work analyzing accessibility datasets [77, 79].

3.2 Manual Coding and Analysis

We conducted an exploratory analysis where our formulation of what-to-code was based on (a) whether demographic information about the data contributors is available, (b) how it is collected and reported, and (c) how accessibility datasets are distributed among demographic groups within communities of focus.

Specifically, beyond the existing codes in Kamikubo et al. [79], we extracted information related to demographic attributes following prior surveys on datasets and studies in accessibility and AI that examined diversity and representation (summarized in Section 2.3). A total of three annotators (a PhD student in Information Studies, a Master’s student in HCI, and an undergraduate student in Math) were involved in the process, where at least two reviewed the documentation for each dataset and discussed to correct any disagreement and error. They had different levels of familiarity with accessibility and AI. We extracted the following diversity-related information from the documentation, when available:

Age. We note how any age-related information is obtained (e.g., self-reported, inferred, or unknown), reported (e.g., individual level, year of birth, age bins, and/or aggregate statistics), and shared (e.g., a separate file). We only calculate aggregated statistics from individual-level data when reporting findings and plotting distributions.

Gender. We note the labels used (e.g., sex, gender), if any; the categories used; the number of data contributors that belong to the categories used; and how metadata was obtained (e.g., self-reported or inferred) and shared (e.g., spreadsheet or publication). In response to concerns raised by trans and information science scholars that the sex/gender distinction can invalidate trans and intersex identities while veiling the socially constructed nature of sex categories, for this paper we use the term “gender” to refer to discussions of characteristics of data contributors (that may be labeled by researchers as either gender or sex) [46, 142, 146].

Race and ethnicity. Race is a multidimensional and complex concept, not a singular, biological construct with distinct limits into which people can be classified. Alone, race and ethnicity do not reveal much about an individual’s experiences. As race and ethnicity can be viewed through multiple socially constructed lenses [17], we started with broad coding techniques to identify any information that pertains to these demographic attributes, including potential ethnic and cultural descriptors like geography and language. Manly [104] suggests that these attributes are proxies for or interrelated with unexamined variables, such as education and socioeconomic status. To better our understanding of race/ethnicity, it is central to deconstruct and examine the confounding influences of ethno-racial factors. We note any categories used to refer to data contributors’ racial groups, such as those defined in the census [19], and group ethnic and cultural metadata like nationality, geography, and language under other sociocultural information. Based on the metadata identified, we update the annotation scheme by specifically going over how this information is obtained and shared. Metadata related to education included information in terms of how it is obtained, reported, and shared; language included information on dialect and skills earned, which may interact with education; geography included information on data contributors’ birthplaces and the recruitment location; and other information such as nationality or socioeconomic status when available.

3.3 Reflections on Limitations

Annotation consistency. Annotation tasks are notably difficult, especially if they involve manual inspection of large data requiring particular skills and knowledge. Given that we inspected both dataset documentations and scholarly articles from various publication venues across many research disciplines and sub-disciplines (e.g., Linguistics, Acoustics, Physiology, Computer Vision, HCI, Accessibility), it was unavoidable to go through a messy process to correct errors and disagreement in our codes. The annotators’ varying levels of familiarity with accessibility and AI were also sources of difficulty. This is not a surprise. Even similar annotation tasks that were more limited in scope (i.e., within the field of accessibility) were characterized as “challenging and effortful” [100]. To address the challenges, as the coding process initially started with two annotators (PhD and undergraduate level), we invited a third member (Master’s level) to have a detailed pass. The PhD student took a final pass to ensure that the annotations were agreed upon by at least two annotators.

We also experienced difficulty in programmatically extracting demographic-related metadata. This often created disparities among the annotators in identifying the relevant information from the documentation. We did not find a consistent, standardized method. For example, some methods we used included manually reviewing web documentation that provided summary statistics in writing [135] or table [2] formats; downloading files containing participants’ demographic data (e.g., age, gender) together with collected data points [164] or a separate csv file on participant demographics [6]; or extracting metadata from filenames [65]. Without standardized documentation and with evolving practices, whether datasets contained demographic-related metadata was often unknown prior to downloads. In addition, without proper explanation of the labels used for demographic categories, such as in one dataset [6] that provided a supplementary spreadsheet with a label ’1’ under the Race column for each participant, we could not find the meaning of this information.
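The kind of ambiguity described above—a participant-demographics file whose Race column holds a bare code ’1’ with no codebook—can be illustrated with a small sketch in plain Python. The file contents and column names here are hypothetical:

```python
import csv
import io

# Hypothetical participant-demographics CSV, mimicking a supplementary
# spreadsheet whose Race column holds an unexplained numeric code.
demo_csv = """participant,age,gender,race
P01,67,F,1
P02,71,M,1
P03,64,F,1
"""

rows = list(csv.DictReader(io.StringIO(demo_csv)))

# Flag columns (other than inherently numeric ones like age) whose values
# are all bare digit codes: without a codebook, an annotator cannot
# recover what the codes mean.
undocumented = [
    col for col in rows[0]
    if col != "age" and all(row[col].isdigit() for row in rows)
]
print(undocumented)  # ['race']
```

A check like this only flags the problem; resolving it still requires documentation from the dataset creators.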
Data Representativeness in Accessibility Datasets: A Meta-Analysis ASSETS ’22, October 23–26, 2022, Athens, Greece

Lack of documentation. As discussed in the Results, information on age, gender, and race/ethnicity was in many cases sparse. When available, it was often unclear how the demographic-related metadata was obtained. Thus, we could not verify the source of classifications (such as for gender). Few datasets explicitly documented that the reported information was, e.g., “according to self-reports” [191]. Even fewer made inferences on these demographics, e.g., “using proprietary classifiers” [177] or “based on visual inspection” [151]; typically these inferences were employed on data collected over the web. Specifically, we observed that three datasets indicate estimations on data contributors’ age; all three are solicited from user interactions with a web search engine, with users’ age reported being “over the age of 40 years inferred from their date of birth as reported at registration to Bing” [189] or “inferred using proprietary Bing classifiers” [177, 178].

White et al. [177, 178] employed a similar approach for gender, whereas Shi et al. [151, 152] determined the gender of individuals by visually inspecting sign language videos from YouTube and the signers’ social media; they used the code “Other” for videos including people whose gender was deemed unknown or where there were multiple signers. While we have included the codes for these datasets in our collection as a reference for future researchers, we don’t include them in our analysis of ‘reported’ demographics; inferences can be inaccurate, perpetuate bias, and perpetuate exclusion (e.g., via binary classification of nonbinary individuals).

None of the datasets in the collection inferred or estimated demographics that pertain to race/ethnicity or other metadata related to nationality, geography, language, and education. Yet, this part of our analysis is the weakest one, as it solely relies on a small number of datasets where the race/ethnicity information was specifically ‘reported’; the majority (8) came from US institutions and one from the UK, even though the institutions of data stewards in the collection spanned across 42 countries from Asia, Africa, North America, South America, Europe, and Australia. Thus, our analysis of this demographic is inherently limited. The limited reporting of race/ethnicity may be due to a number of factors, such as differences in census reporting among Western and non-Western countries, a prevailing consensus that racial designations do not identify genetically distinct populations, and the likelihood of misuse (e.g., privacy risks for disabled people) [84, 122, 147]. Cooper et al. suggest that “the correlation between the use of unsupported genetic inferences and the social standing of a group is glaring evidence of bias and demonstrates how race is used both to categorize and to rank order subpopulations” [31]. However, since federal and state legislation in the US have established evident discriminatory practices against African Americans, Hispanics, Asians, and other groups, racial categorization can be utilized to reflect intersectional gaps that are a product of racial stratification practices. Thus, considering the sociocultural and political contexts of different regions to further understand the decision to utilize racial categories is critical. We did not see within the scope of this paper a systematic way to report the somewhat sparse metadata across codes related to data contributors’ nationality, geography, language, and education and tie them to the sociocultural and political contexts of different regions. Nonetheless, we include these codes in our annotations for future reference.

Non-exhaustive collection. One of the main limitations of this work remains the fact that the list of datasets in the collection is not exhaustive. While somewhat systematic, the identification of these samples is itself noisy and prone to cascading biased decisions from the researchers collecting them and those that opt/know to include their datasets in the IncluSet repository. The lack of inclusion criteria related to when these datasets were introduced, or whether and to what extent they are currently in use, could lead to systematic misalignment between current efforts and past trends. This is exacerbated by the fact that many datasets that are actually employed currently in commercial AI-infused products are not accessible for this type of analysis; representation of different demographic groups could perhaps be deduced via biased performance results (e.g., [18]), but that is beyond the scope of this work. Thus, any insights from our analysis may not be generalizable beyond the research community.

4 RESULTS

Of 190 datasets whose publication and documentation we reviewed, the most commonly found types of demographic-related metadata are age (46.8%) and gender (54.2%), followed by few datasets reporting race (4.7%) and education (12.1%). We find that 71 datasets (37.4%) did not include any information related to the aforementioned types of metadata. These numbers differ from publications that also focus on health, wellness, accessibility, and aging, where few share data; when looking at 792 HCI studies, Abbott et al. (2019) found a distribution of 69.7%, 67.3%, and 6.6% on age, gender, and ethnicity, respectively [1]. This difference could be due to tensions inherent in collecting “sensitive attribute data” [1, 11, 16] and concerns related to participant consent and re-identification risks [1]. A similar trend is seen among available metadata with respect to how others can access the datasets. Among those that are not publicly shared, 69.2% reported at least one of the demographics, compared to 57.1% for publicly shared and 53.7% for shared upon request.

In this section, we present our findings surrounding such “sensitive attribute data” in accessibility datasets across communities of focus (Figure 2). To better understand the current status in terms of reporting and including different demographic groups and variables, we focus on the following demographics: age, gender, and race and ethnicity. In our analysis, we compare with existing categories used to represent demographic variables in social data collection (e.g., racial categories in the census [174]), and investigate representativeness within accessibility datasets.

4.1 Age

A total of 6050 people within the communities of focus contributed data to the 89 datasets whose information on age was included. Their weighted average age was 43.6 (std=26.3). For the remainder of the report, statistics are reported at the dataset level (i.e., sampling distribution of the mean), even though the sample size across datasets varies highly from 1 to 990 people (mean=66.8, std=144.5). Data on age from control groups are not included in the analysis.

4.1.1 What Is Reported. Datasets mostly reported such information in aggregate, though some (36.0%) reported age at an individual level. Aggregate information includes minimum age (1.1%), range (15.7%), median (1.1%), average (20.2%), or a combination (25.8%).
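The dataset-level convention used here can be sketched in plain Python. The per-dataset numbers below are hypothetical; the point is the contrast between the person-level weighted mean (each contributor counts once) and the sampling distribution of the dataset means (each dataset counts once):

```python
from statistics import mean, stdev

# Hypothetical per-dataset summaries: (number of contributors, mean age).
datasets = [(25, 34.2), (120, 61.7), (8, 24.0), (60, 48.9)]

# Person-level weighted mean: large datasets dominate.
total_n = sum(n for n, _ in datasets)
weighted_mean = sum(n * m for n, m in datasets) / total_n

# Dataset-level view (sampling distribution of the mean):
# every dataset counts equally, regardless of its sample size.
dataset_means = [m for _, m in datasets]
mean_of_means = mean(dataset_means)
spread = stdev(dataset_means)

print(f"weighted mean: {weighted_mean:.1f}")  # pulled up by the n=120 dataset
print(f"mean of dataset means: {mean_of_means:.1f} (std={spread:.1f})")
```

Reporting at the dataset level keeps a single very large dataset from dominating the community-level picture, at the cost of ignoring sample-size differences.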

Figure 2: Proportion of accessibility datasets across all communities including metadata related to the age, gender, race,
education, or other sociocultural factors about their data contributors. Many datasets (e.g., in the Hearing group) did not contain
any metadata.

Typically, age was reported separately for target (i.e., disability) and control groups (e.g., [45]), contributors’ gender (e.g., [170]), and dataset purpose (e.g., training versus validation [86]). Few report on all groups together (e.g., [22]). Data anonymization is a core component of data management to minimize risk of disclosure while preserving its utility for analysis [81]. However, we find that a majority of the datasets did not incorporate these strategies. For example, bucketing by age groups (e.g., 18-30, 31-45, 46-60 years [107]) was only found in 7 datasets (7.9%).

Only 5 datasets reported median and 3 datasets reported both mean and median. More than half (58.4%) indicate standard deviation, including those reporting age at the individual level for which it can be calculated. All three—mean, standard deviation, and range—can be found for less than half (42.7%) of the datasets (e.g., “The mean age of the subjects was 54.9 ± 13.4 (SD) yr (range 36–70 yr)” [64]). Meanwhile, some documentation noted only the minimum (e.g., “participants aged 50 or older” [180]) or the age requirement for participation (e.g., “18 or older” [13]).

4.1.2 Why Is It Reported. Most often datasets did not specify why the ages were obtained and reported. It could be an effect of perceived norms and standards for questionnaires within the research community, which often include age questions [68, 161]. Age is an established variable that helps understand the general characteristics of participants. Its distribution may reflect the quality of data collection and analysis [5]; not accounting for age can threaten the generalizability of the work, especially when there is a treatment effect heterogeneity in age or other factors that may covary with age (e.g., [121]). Some datasets mention efforts to match age between target and control groups (e.g., [26, 160]) or note age matching as not feasible (e.g., [111]). Others mention age as a confounding variable, e.g., for early detection of Parkinson’s disease based on touchscreen typing patterns [72]. Some datasets mentioned the goal of including data from diverse age groups to assess age-related decline of cognitive or mobility performance [91, 116]. For example, in a dataset acquiring age-related pen-based performance [116], participants were grouped based on cognition changes (‘young’ for 18-55, ‘pre-old’ for 56-75, and ‘old’ for 75+). Grouping varies across communities; in an attempt to build a diverse sign language corpus, researchers binned groups as 18-35 years, 36-50 years, 51-64 years, and 65+, rationalizing their decision based on language transmission variability within the Deaf community [141].

4.1.3 Representation Across Communities of Focus. Figure 3 illustrates with violin plots the sampling distribution of mean age in datasets across communities, where the white dot represents the median, the thick gray bar in the center indicates the interquartile range, and the thin gray line shows the rest of the distribution, except for points that are determined to be “outliers.” Kernel density estimations on each side of the gray lines show the distribution shape. Wider sections indicate a higher probability that datasets will have a mean age of the given value; the skinnier sections indicate a lower probability. We note that datasets vary in their sample size, which is not accounted for by this visualization.

We find that mean age in datasets differs across communities, with some communities particularly inclining towards samples with a certain target age (e.g., children, older adults). To better understand the age representation exhibited in accessibility datasets, the remainder of the section follows age groups discussed or referred to in prior literature in terms of technology (e.g., ‘older adults’ as 65+, ‘oldest-old adults’ as 85+) [128], disability-related policies (e.g., ‘children’ between 3 to 21 covered in IDEA [94]), and the communities of focus (e.g., ‘toddlers’ of 18 to 36 months in developmental assessment [30]). Of course, variations exist across studies [154] as there is no rigid definition for these groupings.

Figure 3: Sampling distribution of ’reported’ mean age, which differs across communities. Means are calculated on varying sample sizes.

Older adults. Many accessibility datasets represent older adults. Among the datasets that contained some form of age-related information, 48.3% included at least one older adult (65+), and 6.7%

at least one oldest-old adult (85+). The highest proportion of older adults was in the Cognitive and Health groups, reporting at least one older adult in 83.8% and 73.3% of their datasets, respectively. This may not be surprising, as these groups focus on cognitive and physical decline that can relate to age—e.g., the risk of onset of dementia (e.g., Alzheimer’s disease) increases with older age [131]. Specifically, the Cognitive group had datasets with the highest mean of mean age (mean=61.7, std=12.4), which were often cross-listed with the Mobility and Speech groups, including speech or motion data of patients with Parkinson’s disease (e.g., [71, 140]). The oldest participant, aged 89, was reported in the Cognitive and Health groups in the image dataset capturing daily activities of those with episodic memory impairment [89]. Communities that lack older adult representation are Autism, Developmental, and Learning, reflecting a broader gap in research pertaining to these groups [66, 73, 130, 139]. This can be due to many factors; for example, many autistic older adults experienced a severely delayed diagnosis [102]. Many adults with learning disabilities live in institutions such as nursing and residential homes, in which they arrive “before their 65th birthday” with “few opportunities to get out” [165].

Children and youth. Children and youth are also represented in accessibility datasets; about a quarter (24.7%) of the datasets whose information on age was included contained data sourced by at least one person younger than 18 years old. It increases to 33.7% when including those 21 or younger, as the age criteria for study participation is often noted as 18 or older [13, 45]. Perhaps this reflects some of the ethical challenges in collecting data from children [32], as the process for obtaining consent, assent, or parental permission is more complex for those under the legal age [112]. While overall there are few datasets sourced from youth, they tend to concentrate in the Developmental (85.7% of datasets in this group include at least one person <18) and Learning (100.0%) groups. Datasets in the Learning group often focus on dyslexia (e.g., [53, 115]), where diagnosis is critical at early ages. Data from toddlers (18 to 36 months old) are typically seen in the Developmental group for the purpose of developmental assessment (e.g., [30]). They mostly involve speech data, sourced by stuttering children [58, 182] or late talkers [120]. The youngest reported age across all the accessibility datasets was 16 months, in a dataset sourced from autistic children [181], though not many (33.3%) datasets reporting age in the Autism group included those under the age of 18. The groups that lack data from children and youth are Vision, Hearing, and Mobility. We suspect that this is reflective of the most common purpose for collecting data such as image and video from this age group, which is to better assess and diagnose; disabilities related to one’s vision, hearing, and mobility have long-established methods and instruments that might not require such datasets.

Younger and middle-aged adults. When looking at younger adults (over 18), we find that, surprisingly, many (9) datasets with mean age in the Autism group tend to include people between the ages of 18 and 44, with an overall mean of mean age 24.0 (std=13.8). This is in striking contrast with the broader research on autism, where the majority (94%) tends to focus on infants, toddlers, children, and adolescents [73] due to a focus on early diagnosis and intervention [117, 127]. Datasets including younger adults in this group were often collected in the context of assistive technologies (e.g., evaluating text readability and comprehensibility via gaze fixations [45, 183, 184]). Looking further at datasets skewed towards younger and middle-aged adults, the age range of the Hearing and Vision groups was limited, even though visual and hearing impairments could be associated with older age [14, 95]. The datasets in the Hearing and Vision groups that reported age have an overall mean of mean age 28.3 (std=4.2) and 48.7 (std=3.6), respectively. This can be partially explained by how these datasets were collected. For example, the majority (66.7%) of datasets in the Vision group did not include any age information; they were collected from thousands of users via real-world applications (e.g., [57, 78]), where user demographics may not be available or may be omitted due to privacy concerns. Similarly, in the Hearing group the majority of datasets do not include age information; they tend to collect sign language from online sources (e.g., [92, 151]).

Diverse ages. We observe that the Language group has the largest age variability. Among others, they include data sourced from children with epilepsy (e.g., [160]), adolescents with language impairment (e.g., [176]), and older adults with aphasia (e.g., [3, 35]). Often datasets in this group come from clinical settings such as the FluencyBank found in TalkBank [101], a shared database established in 2002 for studying human communication. Perhaps this collaborative effort among a wide range of disciplines could explain the variability of datasets spanning across different communities over the years. Datasets in Speech also capture different age groups. Some can be found in TalkBank, including spoken phrases of older adults with Alzheimer’s disease [105] as well as children [182] and adults [187] who stutter.

4.2 Gender

A total of 5598 people within the communities of focus contributed data to the 103 datasets whose information on gender was included. Again, we include information at a dataset level even though the sample size across datasets varies highly from 1 to 818 (mean=59.6, std=106.6). Data on gender for the control groups are not included in the analysis.

4.2.1 What Is Reported. Gender metadata was commonly reported with the number of data contributors in the form of writing (e.g., “10 blind participants (5 female) ranging in age from 18 to 63 years old” [9]) or table (e.g., a M/F column [6]). Of datasets reporting such metadata, we observed that a binary classification was used (female/male, women/men, girls/boys), with only one dataset in our collection reporting data on the “other” category [49]. However, it is difficult to draw conclusions from this alone, as few datasets reported their method of gendering contributors. Without this, we cannot distinguish between self-identification (e.g., as part of a demographics questionnaire) or an external inference influenced by implicit assumptions (e.g., by the study designers or validators). Furthermore, if participants were asked to self-identify, they may have been limited to choosing from binary options.

4.2.2 Why Is It Reported. Similar to age being asked in standard demographic questions [68], datasets often included gender information as part of the data distribution, without specifically describing the goal of collecting such information.

Nonetheless, we can attempt to extrapolate the reasoning for some datasets, especially when they contain particular data formats. The highest presence of gender information was in datasets that collected audio (66%) compared to video (27%) or image (32%). Perhaps this is reflective of an assumption of the influence of gender among those working with speech data. Datasets that capture motion, e.g., gait of Parkinson’s disease patients [170], also attempt (about 50% of them) to account for physical measurement differences represented in data by using gender as a proxy.

In order to keep the study design as “unbiased” as possible, some datasets reported that gender (and/or age) was “balanced” in the test group (e.g., “roughly balanced for gender of the 249 participants, 52% (n=129) were women” [141]), but efforts to balance distribution between target and control groups were much more common (e.g., [170], [125]).

4.2.3 Representation Across Communities of Focus. Gender demographics vary across the world, with most countries having a female³ share of the population between 49% and 51% [137]. However, overall, accessibility datasets that include gender information tend to be imbalanced, with men and boys (60.1%) more represented on average⁴ than women and girls (39.9%). This is also evident in Figure 4a, which illustrates with violin plots the sampling distribution of gender representation in datasets across communities of focus, where the vertical dash lines indicate the quartiles and each side of the distribution shows kernel density estimations for ‘women/girls’ and ‘men/boys’. This illustration also highlights how the gap is more prominent in some communities than others. Specifically, we see a clear imbalance in the representation of data contributors in the Autism and Developmental groups; on average, 33.1% (std=8.1) and 27.9% (std=9.8) are women and girls, respectively. Such highly skewed representation has been actively discussed in the evaluation and diagnosis of autistic children, given that boys constituted 81% of the sample of children [55]. One widely cited male-to-female diagnosis ratio is approximately 4:1 [51]. However, when the ASD participants are controlled for cognitive impairments, this number changes [85, 98, 103, 106, 138]. About 50-55% of autistic children are estimated to be intellectually disabled (ID) [98]. Among ID autistic children, the male-to-female ratio is significantly smaller, at 2:1 [67]. In autistic children labeled as “high functioning”, the existing literature points to a higher male-to-female ratio, about 6:1. Researchers have theorized that an explanation for this relationship could be the tendency of (so-called) “high-functioning” autistic females to “mask” or “camouflage” core autistic traits [90, 133]. A growing body of evidence suggests that current diagnostic criteria for ASD may fail to account for these phenomena and the subtleties in behavior, leading to misdiagnosis and late diagnosis for minority gender groups (e.g., women, girls, non-binary) [87].

While many communities of focus portray gender disparity in their represented samples, it is not seen in the Vision group, with an average of 50.2% (std=3.2) women per dataset. According to 2018 U.S. disability statistics [186], 45.3% of visually disabled people were male, and 54.7% were female. The slight skew towards women has been identified by researchers in this community as possibly attributable to differences in life expectancy by gender in addition to increased risk of visual impairments with age (e.g., macular degeneration) [59], which women are noted to be at higher risk of than men [156].

Figure 4: Sampling distribution of gender representation across accessibility datasets. The representation gap is more prominent in some communities than others.

4.3 Race & Ethnicity

Race is a complex and sensitive demographic variable [52, 145]. Only 9 (5%) accessibility datasets reported metadata on contributors associated with racial or ethnic groups, typically captured by demographic surveys (e.g., [19]). Modern racial classification systems construct race using both observable physical features (e.g., skin color) and nonobservable characteristics such as culture and language [27]. Thus, ‘other’ related demographic information we found could perhaps be utilized to draw some connections and inferences about race, including the place of birth [23], native language [72], or dialect [188]. However, in past studies they have led to issues of forced classification and error [11, 123]. Therefore, in this section we don’t make that connection. We report only on datasets with explicit racial and ethnic information.

4.3.1 What Is Reported. The categories we found delineating racial composition were mostly ‘White’ and ‘Black’ [144], with variations reporting them as ‘White-Caucasian’ or ‘Caucasian’ and ‘African-American’ [160, 182, 191]. For other racial groups, data were ambiguously grouped together (e.g., “62% Caucasian, 30% African-American and 10% other” [160]) or can be extrapolated by subtracting what was reported as the proportion of the ‘white’ category only [190]. The use of these terms also highlights the limitations of the taxonomical racial categories; ‘Caucasian’, for example, is rather discussed as outdated and disproved [119].

Similar to age and gender, race was reported separately for target and control groups (e.g., [190]). Notably, one speech dataset sourced from stuttering children aimed at a race-matched (as well as age- and gender-matched) cohort of children [132]—here, both stuttering and non-stuttering groups had 2 African American children and 1 child of mixed racial ancestry. This was also the only dataset in the collection reporting about mixed race, although we saw an attempt

³ When referring to data sourced from external collections, we follow the terminology used in their reports.
⁴ With both gender-related and sex-related categories used in our collection of datasets, we report data for ‘women/girls’ or ‘men/boys’ combined with data for e.g., ‘female’ or ‘male’.
Data Representativeness in Accessibility Datasets: A Meta-Analysis ASSETS ’22, October 23–26, 2022, Athens, Greece
to collect data on race, including ‘Mixed’, from a demographic questionnaire in a study on Parkinson’s disease [13].

4.3.2 Why Is It Reported. Datasets whose data on race was collected and/or reported are often related to medical research associated with studies on specific disorders. Specifically, they include speech samples collected from people with aphasia [144], Parkinson’s disease [190], Alzheimer’s disease [6], and epilepsy [160] to study early detection of impairments underlying cognitive disturbance. In medical research domains, there are controversies around collecting data on race, raising both benefits and risks given disparities in health outcomes established for racial minorities [50, 62]. Concerns also lie in the taxonomy of the categories used, which has prompted efforts to standardize and improve methods of obtaining and reporting data on race [8, 50]. Recent guidelines [50] suggest including an explanation of who identified participant race & ethnicity and the reasons for collecting the data. We did not find disclosure of the source of the classifications among the datasets included (e.g., self-report, observation), nor a justification of why they were collected.

4.3.3 Representation Across Communities of Focus. It was hard to distinguish the data between race and ethnicity or other sociocultural information, especially when the data spans multiple concepts and forms of classification (e.g., “129 of Caucasian, 14 of African American, 2 of Hispanic, and 2 of Asian origin” [182]). For example, in the US, guidelines that inform data collection for the census note that the concept of race is separate from the concept of Hispanic origin [173].

For the few datasets that reported data contributors’ race and ethnicity, the norms of how to report were highly inconsistent. Thus, with high variability and a small sample, we could not leverage standardized methods to analyze racial group composition among the communities of focus. The categories we saw (often in Cognitive and Language) were associated with ‘white’ or ‘non-white’, portraying one group as primary over another. Mixed race was rarely indicated, which is problematic given changes in racial categories (e.g., in the US census) reflecting racial mixture [20].

5 DISCUSSION
Our overarching goal lies in understanding the current state of representativeness of marginalized groups in AI datasets (along the axes of age, gender, and race & ethnicity) with a specific focus on disabled data contributors. This is relevant to the greater discourse around AI, ethics, and fairness, as marginalized communities tend to be under-represented in data [47], perpetuating cycles of exclusion as technology advances, even for technologies that are meant to promote inclusion, such as assistive technology. We contribute to this important ongoing discussion through our analysis of 190 accessibility datasets. Specifically, we examine representation gaps and trends that can potentially lead down the road to further harm for the people who stand to be adversely affected by emerging, potentially ubiquitous technology. In this section, we recap and discuss the challenges and opportunities for representation while considering directions the accessibility field could take to carefully include marginalized communities in AI-infused systems.

5.1 Addressing Challenges and Seizing Opportunities for Representation
Our analysis revealed unique challenges in ensuring representation of intersecting demographics in accessibility datasets. Some representation gaps are attributable to societal and cultural norms and biases that operate intersectionally. For example, the communities lacking older adult representation are Autism, Developmental, and Learning. This reflects not only a broader research gap on these groups [66, 73, 130, 139] but also discrimination at the intersection of disability and age; e.g., many autistic older adults live without an accurate diagnosis [102]. Similarly, looking at the intersection of disability and gender, we observe a gap for the Autism, Developmental, and Learning groups, where men and boys were often over-represented. These cases can have pernicious implications characterized not only by the communities of focus but also by long-established research frameworks that propagate existing societal marginalization, highlighting the importance of making gender-specific changes (e.g., to diagnostic criteria for autism [37, 87]).

In annotating accessibility datasets, we also surfaced how socially constructed identity categories such as race and gender are reproduced. Similar to Scheuerman’s meta-analysis of gender in face datasets [142], by analyzing information such as reasons for reporting/data collection and labels used for metadata categories, we contribute a sociological meta-examination through which the research and data collection process itself can be analyzed for bias. For example, we found that the notion of a gender or sex binary was not explicitly challenged in our collection; only one dataset reported data on the “other” gender category. This may have downstream effects in shaping machine learning model design and subsequent problems/contexts—for example, in binary gender classification, which may harm nonbinary communities through technology-enabled misgendering [61].

We also found that there is very little reporting of how identity labels were associated with data contributors, whether through self-identification or external assumption (e.g., via preformed binary categories). We recommend greater transparency in disclosing these aspects of the data collection process, and for gender in particular, to include nonbinary, self-describe, and prefer-not-to-disclose options, as recommended in the related literature [158].

At the same time, we acknowledge the implementation challenges that may need to be addressed to support transparency—e.g., how to produce a set of questions which do not elicit information leading to unintentional misuse or unwanted societal biases for data contributors. We emphasize that careful reflection on this process is needed on the part of researchers who are collecting and reporting contributor data, including implications of use (e.g., surveillance) and any potential harms enacted by power structures through the systems we build. Aligning with recent research [110], we recommend an examination and contextualization of data representativeness grounded in political, economic, and socio-cultural lenses, integrating insights from scholars in fields such as critical disability studies [28], trans/gender studies [157], and histories of social movements [136] into an analysis of power relations. As an example, one could draw from recent work by disability studies scholars examining the context the data is collected in (i.e., for AI systems vs. for visibility and activism) and how representation impacts are also context-dependent [93].
5.2 Developing Participatory Approaches to Data Stewardship
The challenge of partitioning the pool of accessibility datasets into sub-communities was very real in our analysis, as the groupings that we opted for may not necessarily reflect the identities of individual data contributors. Recent work exploring challenges for collecting disability data suggests that the voices of contributors be reflected and provides best practices for asking about disability status [10]. Perhaps, to mitigate harms experienced by those from marginalized communities who are misclassified, we can extend this approach to other categories such as race and gender. Specifically, we urge researchers to come up with approaches for more meaningful engagement of data contributors in the data stewarding process. Echoing Shneiderman’s motto [153], we recommend “researchers in the loop, disabled contributors in the group”.

One way we could go about this is to employ participatory approaches to the data collection lifecycle in which users have the opportunity to enact their values in how their data is collected, maintained, shared, and interpreted [33, 99]. Of course, this would require careful consideration of the many moving pieces in the Fairness, Accountability, Transparency, and Ethics (FATE) landscape, both in terms of the parties involved as well as exchange and access mechanisms; Bragg et al. [15] provide a wonderful starting point for this discussion in the context of the Deaf community. For example, to avoid inadvertently extractive approaches, and aligning with recent literature, we recommend meaningfully compensating participants for their work as data contributors [155]. In this vein, we also recommend developing long-term relationships with data contributors and their communities (where possible) to facilitate sustainable and mutually beneficial collaboration, especially when designing and evaluating AI-infused systems that use contributor data [155, 163]. Disability community-led initiatives can help concentrate research efforts on those most likely to have a positive impact; the idea generation phase may be particularly fruitful when rooted in first-person lived experience (e.g., as provided in [129]).

5.3 Addressing Epistemological Implications in Future Work
We encountered epistemological limitations at various stages in the annotation and analysis process. One such limitation is the extent to which strong claims can be made about overall representativeness, due to the lack of reporting and global statistics for disability, age, gender, and race. In addition, our findings are intrinsically linked to existing sociocultural contexts and hierarchies. Our analysis of accessibility datasets showcases these epistemological limitations. By acknowledging these limitations, we hope to spark conversations on the inclusion of marginalized communities in AI-infused systems and its myriad challenges. In future efforts, we recommend the following for broader research implications:

Exploration of disabled people’s concerns around representation. Increasing representativeness may not always be beneficial; it may perpetuate injustice as extensions of existing systems of oppression and power. As explored in the previous section, it is vital to include first-person disabled perspectives on representativeness and inclusion, as well as data collection and sharing practices. Future work remains in exploring contributor concerns such as privacy [60, 77] and surveillance [7], especially for multiple marginalized contributors.

Analyzing other sociocultural factors. A more in-depth analysis of the sociocultural contexts in which datasets were produced, not just what was reported, could lead to interesting insights. A quick inspection of our datasets revealed that when data involves children, specifically in studies of developmental disability, we sometimes find family information, such as socioeconomic status [132] or parental education [58, 160]. Future work could explore representation along axes of level of education, language, nationality, and socioeconomic status of the data contributors, as well as intersections between them. It would also be interesting to explore the influence of dataset origin (i.e., from the HCI vs. medical research community) on demographic representation, as they may opt for different models of disability.

Accounting for dataset impact. Our analysis of the implications of representation is complicated by the fact that datasets vary in research impact. Potential indicators of impact include the number of citations, the models they are used to train or benchmark, the venues in which they are published, and whether they originate from academia or industry. Future work remains in investigating and defining impact indicators and metrics, and weaving those insights into discussions of representativeness.

Beyond accessibility datasets. While insights from our analysis may not generalize beyond the research community, our findings present an opportunity for broader AI communities to strive towards more representativeness—along disability and other dimensions—by including accessibility datasets in their training data. For example, AI datasets have been critiqued for being heavily skewed towards younger adults and under-representing older adults [128]. In contrast, accessibility datasets yield a wide variability of age groups. In future research, we strive to connect our discussions of representation gaps with larger trends for broader AI datasets and investigate whether accessibility can be used as a lens to diversify representation for the broader AI community.

6 CONCLUSION
We conducted a detailed analysis of data representativeness among 190 accessibility datasets, with an emphasis on the intersections of disability with age, gender, and race & ethnicity. While we found diverse representation of age in accessibility datasets, we identified gaps in gender and race & ethnicity representation among these datasets. Our findings illustrate the implications of historical and social contexts. Although we acknowledge there are limitations when collecting these demographic variables, going forward, we propose a participatory approach when collaborating with disabled contributors and encourage transparency regarding data collection purpose and maintenance throughout the process. We hope our effort elucidates the current challenges in representation among the accessibility community while expanding the space of possibility for greater inclusion of marginalized communities in AI-infused systems more broadly. Finally, we hope that our efforts provoke conversations on data representativeness through a critical and epistemological lens.
ACKNOWLEDGMENTS
We thank Hal Daumé III for providing valuable feedback on our preliminary work. We also thank our anonymous reviewers for further strengthening this paper. This work is supported by National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR), ACL, HHS (#90REGE0008).

REFERENCES
[1] Jacob Abbott, Haley MacLeod, Novia Nurain, Gustave Ekobe, and Sameer Patil. 2019. Local Standards for Anonymization Practices in Health, Wellness, Accessibility, and Aging Research at CHI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery (ACM), 1–14. https://doi.org/10.1145/3290605.3300692
[2] Gaurav Aggarwal and Latika Singh. 2018. Evaluation of Supervised Learning Algorithms Based on Speech Features as Predictors to the Diagnosis of Mild to Moderate Intellectual Disability. 3D Research 9, 4 (2018), 55. https://doi.org/10.1007/s13319-018-0207-6
[3] Meghan Allen, Joanna McGrenere, and Barbara Purves. 2007. The Design and Field Evaluation of PhotoTalk: A Digital Image Communication Application for People with Aphasia. In Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’07). Association for Computing Machinery (ACM), 187–194. https://doi.org/10.1145/1296843.1296876
[4] Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N. Bennett, Kori Inkpen, Jaime Teevan, Ruth Kikin-Gil, and Eric Horvitz. 2019. Guidelines for Human-AI Interaction. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300233
[5] Frank M Andrews and A Regula Herzog. 1986. The quality of survey data as related to age of respondent. J. Amer. Statist. Assoc. 81, 394 (1986), 403–410.
[6] James T Becker, François Boiler, Oscar L Lopez, Judith Saxton, and Karen L McGonigle. 1994. The natural history of Alzheimer’s disease: description of study cohort and accuracy of diagnosis. Archives of Neurology 51, 6 (1994), 585–594. https://doi.org/10.1001/archneur.1994.00540180063015
[7] Cynthia L. Bennett and Os Keyes. 2020. What is the Point of Fairness? Disability, AI and the Complexity of Justice. 125, Article 5 (March 2020), 1 pages. https://doi.org/10.1145/3386296.3386301
[8] Rohit Bhalla, Brandon G Yongue, and Brian P Currie. 2012. Standardizing race, ethnicity, and preferred language data collection in hospital information systems: results and implications for healthcare delivery and policy. Journal for Healthcare Quality 34, 2 (2012), 44–52.
[9] Jeffrey P. Bigham, Anna C. Cavender, Jeremy T. Brudvik, Jacob O. Wobbrock, and Richard E Ladner. 2007. WebinSitu: A Comparative Analysis of Blind and Sighted Browsing Behavior. In Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (Assets ’07). Association for Computing Machinery (ACM), 51–58. https://doi.org/10.1145/1296843.1296854
[10] Brianna Blaser and Richard E Ladner. 2020. Why is Data on Disability so Hard to Collect and Understand?. In 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT), Vol. 1. IEEE, 1–8. https://www.washington.edu/doit/sites/default/files/atoms/files/RESPECT_2020_DisabilityData.pdf
[11] Miranda Bogen, Aaron Rieke, and Shazeda Ahmed. 2020. Awareness in Practice: Tensions in Access to Sensitive Attribute Data for Antidiscrimination. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 492–500. https://doi.org/10.1145/3351095.3372877
[12] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in Neural Information Processing Systems 29 (2016), 4349–4357.
[13] Brian M. Bot, Christine Suver, Elias Chaibub Neto, Michael Kellen, Arno Klein, Christopher Bare, Megan Doerr, Abhishek Pratap, John Wilbanks, E. Ray Dorsey, Stephen H. Friend, and Andrew D Trister. 2016. The mPower study, Parkinson disease mobile data collected using ResearchKit. Scientific Data 3 (March 2016), 160011. https://doi.org/10.1038/sdata.2016.11
[14] Michael R Bowl and Sally J Dawson. 2019. Age-related hearing loss. Cold Spring Harbor Perspectives in Medicine 9, 8 (2019), a033217.
[15] Danielle Bragg, Naomi Caselli, Julie A. Hochgesang, Matt Huenerfauth, Leah Katz-Hernandez, Oscar Koller, Raja Kushalnagar, Christian Vogler, and Richard E. Ladner. 2021. The FATE Landscape of Sign Language AI Datasets: An Interdisciplinary Perspective. 14, 2, Article 7 (July 2021), 45 pages. https://doi.org/10.1145/3436996
[16] Lundy Braun, Anne Fausto-Sterling, Duana Fullwiley, Evelynn M Hammonds, Alondra Nelson, William Quivers, Susan M Reverby, and Alexandra E Shields. 2007. Racial categories in medical practice: how useful are they? PLoS Medicine 4, 9 (2007), e271.
[17] The Editors of Encyclopaedia Britannica. 2021. critical race theory. https://www.britannica.com/topic/critical-race-theory.
[18] Joy Buolamwini and Timnit Gebru. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research, Vol. 81), Sorelle A. Friedler and Christo Wilson (Eds.). PMLR, 77–91. https://proceedings.mlr.press/v81/buolamwini18a.html
[19] United States Census Bureau. 2021. QuickFacts United States. https://www.census.gov/quickfacts/fact/table/US/PST045221. Accessed: 2022-01-03.
[20] Linda Burhansstipanov and Delight E Satter. 2000. Office of Management and Budget racial categories and implications for American Indians and Alaska Natives. American Journal of Public Health 90, 11 (2000), 1720.
[21] Lesley G Campbell, Siya Mehtani, Mary E Dozier, and Janice Rinehart. 2013. Gender-heterogeneous working groups produce higher quality science. PLoS ONE 8, 10 (2013), e79147.
[22] Romuald Carette, Mahmoud Elbattah, Federica Cilia, Gilles Dequen, Jean-Luc Guerin, and Jérôme Bosche. 2019. Learning to Predict Autism Spectrum Disorder based on the Visual Patterns of Eye-tracking Scanpaths. In Proceedings of the 12th International Conference on Health Informatics. 103–112. https://doi.org/10.5220/0007402601030112
[23] Naomi K Caselli, Zed Sevcikova Sehyr, Ariel M Cohen-Goldberg, and Karen Emmorey. 2017. ASL-LEX: A lexical database of American Sign Language. Behavior Research Methods 49, 2 (2017), 784–801.
[24] L. Elisa Celis, Amit Deshpande, Tarun Kathuria, and Nisheeth K. Vishnoi. 2016. How to be Fair and Diverse? arXiv:1610.07183 [cs.LG]
[25] Ugo Cesari, Giuseppe De Pietro, Elio Marciano, Ciro Niri, Giovanna Sannino, and Laura Verde. 2018. A new database of healthy and pathological voices. Computers & Electrical Engineering 68 (May 2018), 310–321. https://doi.org/10.1016/j.compeleceng.2018.04.008
[26] Shi Chen and Qi Zhao. 2019. Attention-based autism spectrum disorder screening with privileged modality. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1181–1190.
[27] Vivian Chou. 2017. How science and genetics are reshaping the race debate of the 21st century. Science in the News 17 (2017).
[28] Eli Clare. 2015. Exile and pride. In Exile and Pride. Duke University Press.
[29] Leigh Clark, Benjamin R. Cowan, Abi Roper, Stephen Lindsay, and Owen Sheers. 2020. Speech Diversity and Speech Interfaces: Considering an Inclusive Future through Stammering. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3405755.3406139
[30] Jantina Rochelle Clifford. 2005. An evaluation of the technical adequacy of a parent-completed inventory of developmental skills. (2005).
[31] Richard S Cooper. 2003. Race and genomics. The New England Journal of Medicine 348, 12 (2003), 1166.
[32] Imelda T Coyne. 1998. Researching children: some methodological and ethical considerations. Journal of Clinical Nursing 7, 5 (1998), 409–416.
[33] Jennifer L Davidson and Carlos Jensen. 2013. What health topics older adults want to track: a participatory design study. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility. 1–8.
[34] Terrance de Vries, Ishan Misra, Changhan Wang, and Laurens van der Maaten. 2019. Does object recognition work for everyone?. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 52–59.
[35] Roxanne DePaul. 2016. DementiaBank English PPA Corpus. https://doi.org/10.21415/T5ZH5T
[36] Mark Diaz, Isaac Johnson, Amanda Lazar, Anne Marie Piper, and Darren Gergle. 2018. Addressing Age-Related Bias in Sentiment Analysis. (2018). https://doi.org/10.1145/3173574.3173986
[37] Erin Digitale. 2022. Study finds differences between brains of girls, boys with autism. https://med.stanford.edu/news/all-news/2022/02/autism-brain-sex-differences.html/
[38] Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. 2018. Measuring and mitigating unintended bias in text classification. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society. 67–73.
[39] Jesse Dodge, Maarten Sap, Ana Marasović, William Agnew, Gabriel Ilharco, Dirk Groeneveld, Margaret Mitchell, and Matt Gardner. 2021. Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 1286–1305.
[40] Rohan Doshi, Youzheng Chen, Liyang Jiang, Xia Zhang, Fadi Biadsy, Bhuvana Ramabhadran, Fang Chu, Andrew Rosenberg, and Pedro J. Moreno. 2021. Extending Parrotron: An End-to-End, Speech Conversion and Speech Recognition Model for Atypical Speech. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6988–6992. https://doi.org/10.1109/ICASSP39728.2021.9414644
[41] Susan M Dray, Anicia N Peters, Anke M Brock, Andrea Peer, Allison Druin, Shikoh Gitau, Janaki Kumar, and Dianne Murray. 2013. Leveraging the progress of women in the HCI field to address the diversity chasm. In CHI’13 Extended Abstracts on Human Factors in Computing Systems. 2399–2406.
[42] Marina Drosou, HV Jagadish, Evaggelia Pitoura, and Julia Stoyanovich. 2017. Diversity in big data: A review. Big Data 5, 2 (2017), 73–84.
[43] Huiyu Duan, Guangtao Zhai, Xiongkuo Min, Zhaohui Che, Yi Fang, Xiaokang Yang, Jesús Gutiérrez, and Patrick Le Callet. 2019. A Dataset of Eye Movements for the Children with Autism Spectrum Disorder. In Proceedings of the 10th ACM Multimedia Systems Conference (Amherst, Massachusetts) (MMSys ’19). Association for Computing Machinery (ACM), New York, NY, USA, 255–260. https://doi.org/10.1145/3304109.3325818
[44] A Engler. 2019. For some employment algorithms, disability discrimination by default. https://www.brookings.edu/blog/techtank/2019/10/31/for-some-employment-algorithms-disability-discrimination-by-default/
[45] Sukru Eraslan, Victoria Yaneva, Yeliz Yesilada, and Simon Harper. 2019. Web users with autism: eye tracking evidence for differences. Behaviour & Information Technology 38, 7 (2019), 678–700. https://doi.org/10.1080/0144929X.2018.1551933
[46] Anne Fausto-Sterling. 2000. Sexing the body: Gender politics and the construction of sexuality. Basic Books.
[47] Sina Fazelpour and Maria De-Arteaga. 2021. Diversity in Sociotechnical Machine Learning Systems. CoRR abs/2107.09163 (2021). arXiv:2107.09163 https://arxiv.org/abs/2107.09163
[48] Leah Findlater, Steven Goodman, Yuhang Zhao, Shiri Azenkot, and Margot Hanley. 2020. Fairness Issues in AI Systems That Augment Sensory Abilities. 125, Article 8 (March 2020), 1 pages. https://doi.org/10.1145/3386296.3386304
[49] Leah Findlater and Lotus Zhang. 2020. Input Accessibility: A Large Dataset and Summary Analysis of Age, Motor Ability and Input Performance. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 17, 6 pages. https://doi.org/10.1145/3373625.3417031
[50] Annette Flanagin, Tracy Frey, Stacy L Christiansen, AMA Manual of Style Committee, et al. 2021. Updated guidance on the reporting of race and ethnicity in medical and science journals. JAMA 326, 7 (2021), 621–627.
[51] Eric Fombonne. 2009. Epidemiology of pervasive developmental disorders. Pediatric Research 65, 6 (2009), 591–598.
[52] Marvella E Ford and P Adam Kelly. 2005. Conceptualizing and categorizing race and ethnicity in health services research. Health Services Research 40, 5p2 (2005), 1658–1675.
[53] Núria Gala, Anaïs Tack, Ludivine Javourey-Drevet, Thomas François, and Johannes C Ziegler. 2020. Alector: A parallel corpus of simplified French texts with alignments of misreadings by poor and dyslexic readers. In Language Resources and Evaluation for Language Technologies (LREC).
[54] Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2021. Datasheets for datasets. Commun. ACM 64, 12 (2021), 86–92.
[55] Ellen Giarelli, Lisa D Wiggins, Catherine E Rice, Susan E Levy, Russell S Kirby, Jennifer Pinto-Martin, and David Mandell. 2010. Sex differences in the evaluation and diagnosis of autism spectrum disorders among children. Disability and Health Journal 3, 2 (2010), 107–116.
[56] Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, and Meredith Ringel Morris. 2020. Toward Fairness in AI for People with Disabilities: A Research Roadmap. SIGACCESS Accessible Computing 125, Article 2 (March 2020), 1 pages. https://doi.org/10.1145/3386296.3386298
[57] Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, and Jeffrey P Bigham. 2018. VizWiz Grand Challenge: Answering Visual Questions from Blind People. Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (Jun 2018). https://doi.org/10.1109/cvpr.2018.00380
[58] Haya Berman Hakim and Nan Bernstein Ratner. 2004. Nonword repetition abilities of children who stutter: An exploratory study. Journal of Fluency Disorders 29, 3 (2004), 179–199.
[59] Ali G Hamedani, Brian L VanderBeek, and Allison W Willis. 2019. Blindness and visual impairment in the medicare population: disparities and association with hip fracture and neuropsychiatric outcomes. Ophthalmic Epidemiology 26, 4 (2019), 279–285.
[60] Foad Hamidi, Kellie Poneres, Aaron Massey, and Amy Hurst. 2018. Who Should Have Access to My Pointing Data? Privacy Tradeoffs of Adaptive Assistive Technologies. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 203–216. https://doi.org/10.1145/3234695.3239331
[61] Foad Hamidi, Morgan Klaus Scheuerman, and Stacy M. Branham. 2018. Gen-
[63] Saad Hassan, Matt Huenerfauth, and Cecilia Ovesdotter Alm. 2021. Unpacking the Interdependent Systems of Discrimination: Ableist Bias in NLP Systems through an Intersectional Lens. CoRR abs/2110.00521 (2021). arXiv:2110.00521 https://arxiv.org/abs/2110.00521
[64] Jeffrey M Hausdorff, Apinya Lertratanakul, Merit E Cudkowicz, Amie L Peterson, David Kaliton, and Ary L Goldberger. 2000. Dynamic markers of altered gait rhythm in amyotrophic lateral sclerosis. Journal of Applied Physiology (2000).
[65] Jeffrey M Hausdorff, Susan L Mitchell, Renee Firtion, Chung-Kang Peng, Merit E Cudkowicz, Jeanne Y Wei, and Ary L Goldberger. 1997. Altered fractal dynamics of gait: reduced stride-interval correlations with aging and Huntington’s disease. Journal of Applied Physiology 82, 1 (1997), 262–269. https://doi.org/10.1152/jappl.1997.82.1.262
[66] Tamar Heller, P Stafford, LA Davis, L Sedlezky, and V Gaylord. 2010. People with intellectual and developmental disabilities growing old: An overview. Impact: Feature Issue on Aging and People with Intellectual and Developmental Disabilities 23, 1 (2010), 2–3.
[67] Martin Holtmann, Sven Bölte, and Fritz Poustka. 2007. Autism spectrum disorders: Sex differences in autistic behaviour domains and coexisting psychopathology. Developmental Medicine & Child Neurology 49, 5 (2007), 361–366.
[68] Lindsay M Howden, Julie A Meyer, et al. 2011. Age and sex composition: 2010.
[69] Matt Huenerfauth and Hernisa Kacorri. 2014. Release of experimental stimuli and questions for evaluating facial expressions in animations of American Sign Language. In Proceedings of the 6th Workshop on the Representation and Processing of Sign Languages: Beyond the Manual Channel, The 9th International Conference on Language Resources and Evaluation (LREC ’14). http://dx.doi.org/10.1007/978-3-642-39188-0_55
[70] Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl. 2020. Social Biases in NLP Models as Barriers for Persons with Disabilities. CoRR abs/2005.00813 (2020). arXiv:2005.00813 https://arxiv.org/abs/2005.00813
[71] Dimitrios Iakovakis, Stelios Hadjidimitriou, Vasileios Charisis, Sevasti Bostantjopoulou, Zoe Katsarou, Lisa Klingelhoefer, Heinz Reichmann, Sofia B Dias, José A Diniz, Dhaval Trivedi, et al. 2018. Motor impairment estimates via touchscreen typing dynamics toward Parkinson’s disease detection from data harvested in-the-wild. Frontiers in ICT 5 (2018), 28.
[72] Dimitrios Iakovakis, Stelios Hadjidimitriou, Vasileios Charisis, Sevasti Bostantzopoulou, Zoe Katsarou, and Leontios J Hadjileontiadis. 2018. Touchscreen typing-pattern analysis for detecting fine motor skills decline in early-stage Parkinson’s disease. Scientific Reports 8, 1 (2018), 1–13.
[73] Jina Jang, Johnny L Matson, Hilary L Adams, Matt J Konst, Paige E Cervantes, and Rachel L Goldin. 2014. What are the ages of persons studied in autism research: A 20-year review. Research in Autism Spectrum Disorders 8, 12 (2014), 1756–1760.
[74] Aparna Joshi and Hyuntak Roh. 2009. The role of context in work team diversity research: A meta-analytic review. Academy of Management Journal 52, 3 (2009), 599–627.
[75] Hernisa Kacorri. 2017. Teachable Machines for Accessibility. SIGACCESS Accessible Computing 119, 10–18. https://doi.org/10.1145/3167902.3167904
[76] Hernisa Kacorri, Utkarsh Dwivedi, Sravya Amancherla, Mayanka Jha, and Riya Chanduka. 2020. IncluSet: A Data Surfacing Repository for Accessibility Datasets. Association for Computing Machinery (ACM). https://doi.org/10.1145/3373625.3418026
[77] Hernisa Kacorri, Utkarsh Dwivedi, and Rie Kamikubo. 2020. Data Sharing in Wellness, Accessibility, and Aging. (2020).
[78] Hernisa Kacorri, Sergio Mascetti, Andrea Gerino, Dragan Ahmetovic, Hironobu Takagi, and Chieko Asakawa. 2016. Supporting Orientation of People with Visual Impairment: Analysis of Large Scale Usage Data. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’16). Association for Computing Machinery (ACM), 151–159. https://doi.org/10.1145/2982142.2982178
[79] Rie Kamikubo, Utkarsh Dwivedi, and Hernisa Kacorri. 2021. Sharing Practices for Datasets Related to Accessibility and Aging. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 28, 16 pages. https://doi.org/10.1145/3441852.3471208
[80] Kimmo Kärkkäinen and Jungseock Joo. 2019. FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age. CoRR abs/1908.04913 (2019). arXiv:1908.04913 http://arxiv.org/abs/1908.04913
[81] Preet Chandan Kaur, Tushar Ghorpade, and Vanita Mane. 2016. Analysis of
der Recognition or Gender Reductionism? The Social Implications of Embed- data security by using anonymization techniques. In 2016 6th International
ded Gender Recognition Systems. In Proceedings of the 2018 CHI Conference Conference-Cloud System and Big Data Engineering (Confuence). IEEE, 287–293.
on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). [82] Amit Kaushal, Russ Altman, and Curt Langlotz. 2020. Geographic distribution
Association for Computing Machinery, New York, NY, USA, 1–13. https: of US cohorts used to train deep learning algorithms. Jama 324, 12 (2020),
//doi.org/10.1145/3173574.3173582 1212–1213.
[62] Romana Hasnain-Wynia and David W Baker. 2006. Obtaining data on patient [83] Matthew Kay, Cynthia Matuszek, and Sean A. Munson. 2015. Unequal Rep-
race, ethnicity, and primary language in health care organizations: current resentation and Gender Stereotypes in Image Search Results for Occupa-
challenges and proposed solutions. Health services research 41, 4p1 (2006), tions. Association for Computing Machinery, New York, NY, USA. https:
1501–1518. //doi.org/10.1145/2702123.2702520
Data Representativeness in Accessibility Datasets: A Meta-Analysis ASSETS '22, October 23–26, 2022, Athens, Greece
[84] David Kertzer and Dominique Arel. [n.d.]. Census and identity. ([n. d.]).
[85] Melissa Kirkovski, Peter G Enticott, and Paul B Fitzgerald. 2013. A review of the role of female gender in autism spectrum disorders. Journal of autism and developmental disorders 43, 11 (2013), 2584–2603.
[86] Jochen Klucken, Jens Barth, Patrick Kugler, Johannes Schlachetzki, Thore Henze, Franz Marxreiter, Zacharias Kohl, Ralph Steidl, Joachim Hornegger, Bjoern Eskofier, et al. 2013. Unbiased and mobile gait analysis detects motor impairment in Parkinson's disease. PloS one 8, 2 (2013), e56956. https://doi.org/10.1371/journal.pone.0056956
[87] Meng-Chuan Lai, Michael V Lombardo, Bonnie Auyeung, Bhismadev Chakrabarti, and Simon Baron-Cohen. 2015. Sex/gender differences and autism: setting the scene for future research. Journal of the American Academy of Child & Adolescent Psychiatry 54, 1 (2015), 11–24.
[88] Kyungjun Lee and Hernisa Kacorri. 2019. Hands Holding Clues for Object Recognition in Teachable Machines. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery (ACM), 1–12. https://doi.org/10.1145/3290605.3300566
[89] Matthew L. Lee and Anind K Dey. 2007. Providing Good Memory Cues for People with Episodic Memory Impairment. In Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (Assets '07). Association for Computing Machinery (ACM), 131–138. https://doi.org/10.1145/1296843.1296867
[90] Fritz-Georg Lehnhardt, Christine Michaela Falter, Astrid Gawronski, Kathleen Pfeifer, Ralf Tepest, Jeremy Franklin, and Kai Vogeley. 2016. Sex-related cognitive profile in autism spectrum disorders diagnosed late in life: implications for the female autistic phenotype. Journal of Autism and Developmental Disorders 46, 1 (2016), 139–154.
[91] Daniel Leightley, Moi Hoon Yap, Jessica Coulson, Yoann Barnouin, and Jamie S McPhee. 2015. Benchmarking human motion analysis using kinect one: An open source dataset. In Proceedings of the 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA '15). IEEE, 1–7. https://doi.org/10.1109/APSIPA.2015.7415438
[92] Dongxu Li, Cristian Rodriguez, Xin Yu, and Hongdong Li. 2020. Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[93] Megan Marie Quaglia Linton. 2021. The Institutional Remains: Transinstitutionalization of Disability & Sexuality. (2021).
[94] Paul H Lipkin, Jeffrey Okamoto, Kenneth W Norwood, Richard C Adams, Timothy J Brei, Robert T Burke, Beth Ellen Davis, Sandra L Friedman, Amy J Houtrow, Susan L Hyman, et al. 2015. The Individuals with Disabilities Education Act (IDEA) for children with special educational needs. Pediatrics 136, 6 (2015), e1650–e1662.
[95] Keng Yin Loh and J Ogle. 2004. Age related visual impairment in the elderly. The Medical journal of Malaysia 59, 4 (2004), 562–8.
[96] Steve Lohr. 2018. Facial recognition is accurate, if you're a white guy. New York Times 9, 8 (2018), 283.
[97] Daria Loi and Thomas Lodato. 2020. On empathy and empiricism: addressing stereotypes about older adults in technology. Interactions 28, 1 (2020), 23–25.
[98] Rachel Loomes, Laura Hull, and William Polmear Locke Mandy. 2017. What is the male-to-female ratio in autism spectrum disorder? A systematic review and meta-analysis. Journal of the American Academy of Child & Adolescent Psychiatry 56, 6 (2017), 466–474.
[99] Deborah Lupton. 2017. Digital health now and in the future: Findings from a participatory design stakeholder workshop. Digital health 3 (2017), 2055207617740018.
[100] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. Article 371 (2021), 18 pages. https://doi.org/10.1145/3411764.3445412
[101] Brian MacWhinney, Steven Bird, Christopher Cieri, and Craig Martell. 2004. TalkBank: Building an open unified multimodal database of communicative interaction. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC '04). Evaluations and Language resources Distribution Agency, 525–528. http://www.lrec-conf.org/proceedings/lrec2004/pdf/392.pdf
[102] David S Mandell, Lindsay J Lawer, Kira Branch, Edward S Brodkin, Kristin Healey, Robert Witalec, Donielle N Johnson, and Raquel E Gur. 2012. Prevalence and correlates of autism in a state psychiatric hospital. Autism 16, 6 (2012), 557–567. https://doi.org/10.1177/1362361311412058 PMID: 21846667.
[103] William Mandy, Rebecca Chilvers, Uttom Chowdhury, Gemma Salter, Anna Seigal, and David Skuse. 2012. Sex differences in autism spectrum disorder: evidence from a large sample of children and adolescents. Journal of autism and developmental disorders 42, 7 (2012), 1304–1313.
[104] Jennifer J. Manly. 2006. Deconstructing Race and Ethnicity: Implications for Measurement of Health Outcomes. Medical Care 44, 11 (2006), S10–S16. http://www.jstor.org/stable/41219499
[105] José Luis Pérez Mantero. 2014. Interacción y predictividad: Los intercambios conversacionales con hablantes con demencia tipo alzhéimer. Revista de investigación Lingüística 17 (2014), 97–118.
[106] Maya Matheis, Johnny L Matson, Esther Hong, and Paige E Cervantes. 2019. Gender differences and similarities: Autism symptomatology and developmental functioning in young children. Journal of autism and developmental disorders 49, 3 (2019), 1219–1231.
[107] Silke Matthes, Thomas Hanke, Anja Regen, Jakob Storz, Satu Worseck, Eleni Efthimiou, Athanasia-Lida Dimou, Annelies Braffort, John Glauert, and Eva Safar. 2012. Dicta-Sign: building a multilingual sign language corpus. In Proceedings of the 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon (LREC '12). https://www.sign-lang.uni-hamburg.de/lrec/lrec/pubs/12016.pdf
[108] Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 54, 6, Article 115 (July 2021), 35 pages. https://doi.org/10.1145/3457607
[109] Michele Merler, Nalini Ratha, Rogerio S. Feris, and John R. Smith. 2019. Diversity in Faces. arXiv:1901.10436 [cs.CV]
[110] Milagros Miceli, Julian Posada, and Tianling Yang. 2021. Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power? arXiv:2109.08131 [cs.HC]
[111] Michal Novotný, Jan Rusz, Roman Čmejla, Hana Růžičková, Jiří Klempíř, and Evžen Růžička. 2016. Hypernasality associated with basal ganglia dysfunction: evidence from Parkinson's disease and Huntington's disease. PeerJ 4 (2016), e2530. https://dx.doi.org/10.7717%2Fpeerj.2530
[112] Emancipated Minors and Self-Sufficient Minors. 2017. Guidance and Procedures: Child Assent and Permission by Parents or Guardians. https://ora.research.ucla.edu/OHRPP/Documents/Policy/9/ChildAssent_ParentPerm.pdf. (2017).
[113] Margaret Mitchell, Dylan Baker, Nyalleng Moorosi, Emily Denton, Ben Hutchinson, Alex Hanna, Timnit Gebru, and Jamie Morgenstern. 2020. Diversity and Inclusion Metrics in Subset Selection. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (New York, NY, USA) (AIES '20). Association for Computing Machinery, New York, NY, USA, 117–123. https://doi.org/10.1145/3375627.3375832
[114] Ross E Mitchell, Travas A Young, Bellamie Bachelda, and Michael A Karchmer. 2006. How many people use ASL in the United States? Why estimates need updating. Sign Language Studies 6, 3 (2006), 306–335.
[115] Masooda Modak, Ketan Ghotane, V Siddhanth, Nachiket Kelkar, and Prachi G Aravind Iyer. 2019. Detection of Dyslexia using Eye Tracking Measures. International Journal of Innovative Technology and Exploring Engineering (IJITEE) 8 (2019), 1011–1014.
[116] Karyn Anne Moffatt. 2010. Addressing age-related pen-based target acquisition difficulties. Ph.D. Dissertation. University of British Columbia. http://www.sigaccess.org/2010/01/addressing-age-related-pen-based-target-acquisition-difculties/
[117] Vanessa Moore and Sally Goodson. 2003. How well does early diagnosis of autism stand the test of time? Follow-up study of children assessed for autism at age 2 and development of an early diagnostic service. Autism 7, 1 (2003), 47–63.
[118] Meredith Ringel Morris. 2020. AI and accessibility. Commun. ACM 63, 6 (2020), 35–37.
[119] Yolanda Moses. 2017. Why Do We Keep Using the Word "Caucasian"? https://www.sapiens.org/column/race/caucasian-terminology-origin/
[120] Maura Jones Moyle, Susan Ellis Weismer, Julia L Evans, and Mary J Lindstrom. 2007. Longitudinal relationships between lexical and grammatical development in typical and late-talking children. (2007).
[121] Kevin Munger, Ishita Gopal, Jonathan Nagler, and Joshua A. Tucker. 2021. Accessibility and generalizability: Are social media effects moderated by age or digital literacy? Research & Politics 8, 2 (2021), 20531680211016968. https://doi.org/10.1177/20531680211016968
[122] Karama C Neal et al. 2008. Use and misuse of 'race' in biomedical research. Journal of Health Ethics 5, 1 (2008), 8.
[123] David R Nerenz, Bernadette McFadden, Cheryl Ulmer, et al. 2009. Race, ethnicity, and language data: standardization for health care quality improvement. (2009).
[124] Antony Nicol, Chris Casey, and Stuart MacFarlane. 2002. Children are ready for speech technology - but is the technology ready for them? Interaction Design and Children, Eindhoven, The Netherlands (2002).
[125] Spiros Nikolopoulos, Kostas Georgiadis, Fotis Kalaganis, Georgios Liaros, Ioulietta Lazarou, Katerina Adam, Papazoglou-Chalikias Anastasios, Elisavet Chatzilari, P. Vangelis Oikonomou, C. Panagiotis Petrantonakis, I. Kompatsiaris, Chandan Kumar, Raphael Menges, Steffen Staab, Daniel Müller, Korok Sengupta, Sevasti Bostantjopoulou, Zoe Katsarou, Gabi Zeilig, Meir Plotnin, Amihai Gottlieb, Sofia Fountoukidou, Jaap Ham, Dimitrios Athanasiou, Agnes Mariakaki, Dario Comanducci, Eduardo Sabatini, Walter Nistico, and Markus Plank. 2017. The MAMEM Project - A dataset for multimodal human-computer interaction using biosignals and eye tracking information. https://doi.org/10.5281/zenodo.834154
[126] World Institute on Disability. 2021. AI and Accessibility. https://wid.org/2019/06/12/ai-and-accessibility/
ASSETS '22, October 23–26, 2022, Athens, Greece Kamikubo et al.
[127] Sally Ozonoff, Beth L Goodlin-Jones, and Marjorie Solomon. 2005. Evidence-based assessment of autism spectrum disorders in children and adolescents. Journal of Clinical Child and Adolescent Psychology 34, 3 (2005), 523–540.
[128] Joon Sung Park, Michael S. Bernstein, Robin N. Brewer, Ece Kamar, and Meredith Ringel Morris. 2021. Understanding the Representation and Representativeness of Age in AI Data Sets. CoRR abs/2103.09058 (2021). arXiv:2103.09058 https://arxiv.org/abs/2103.09058
[129] Joon Sung Park, Danielle Bragg, Ece Kamar, and Meredith Ringel Morris. 2021. Designing an online infrastructure for collecting AI data from people with disabilities. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. 52–63.
[130] Joseph Piven, Peter Rabins, and the Autism-in-Older-Adults Working Group. 2011. Autism Spectrum Disorders in Older Adults: Toward Defining a Research Agenda. Journal of the American Geriatrics Society 59, 11 (2011), 2151–2155. https://doi.org/10.1111/j.1532-5415.2011.03632.x
[131] Martin Prince, Martin Knapp, Maelenn Guerchet, Paul McCrone, Matthew Prina, A Comas-Herrera, Raphael Wittenberg, Bayo Adelaja, Bo Hu, Derek King, et al. 2014. Dementia UK: Overview. (2014).
[132] Nan Bernstein Ratner and Stacy Silverman. 2000. Parental perceptions of children's communicative development at stuttering onset. Journal of Speech, Language, and Hearing Research 43, 5 (2000), 1252–1263.
[133] Allison B Ratto, Lauren Kenworthy, Benjamin E Yerys, Julia Bascom, Andrea Trubanova Wieckowski, Susan W White, Gregory L Wallace, Cara Pugliese, Robert T Schultz, Thomas H Ollendick, et al. 2018. What about the girls? Sex-based differences in autistic traits and adaptive skills. Journal of autism and developmental disorders 48, 5 (2018), 1698–1711.
[134] Luz Rello, Ricardo Baeza-Yates, and Joaquim Llisterri. 2014. DysList: An Annotated Resource of Dyslexic Errors. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC '14). European Languages Resources Association (ELRA), 1289–1296. http://www.lrec-conf.org/proceedings/lrec2014/pdf/612_Paper.pdf
[135] Luz Rello and Miguel Ballesteros. 2015. Detecting Readers with Dyslexia Using Machine Learning with Eye Tracking Measures. In Proceedings of the 12th Web for All Conference (W4A '15). Association for Computing Machinery (ACM), Article 16, 8 pages. https://doi.org/10.1145/2745555.2746644
[136] Michael Rembis. 2021. Crip Camp: A Disability Revolution. Journal of American History 108, 3 (12 2021), 667–669. https://doi.org/10.1093/jahist/jaab339
[137] Hannah Ritchie and Max Roser. 2019. Gender Ratio. Our World in Data (2019). https://ourworldindata.org/gender-ratio.
[138] Tessa Taylor Rivet and Johnny L Matson. 2011. Review of gender differences in core symptomatology in autism spectrum disorders. Research in Autism Spectrum Disorders 5, 3 (2011), 957–976.
[139] Amanda Roestorf, Dermot M Bowler, Marie K Deserno, Patricia Howlin, Laura Klinger, Helen McConachie, Jeremy R Parr, Patrick Powell, Barbara FC Van Heijst, and Hilde M Geurts. 2019. "Older Adults with ASD: The Consequences of Aging." Insights from a series of special interest group meetings held at the International Society for Autism Research 2016–2017. Research in autism spectrum disorders 63 (2019), 3–12.
[140] Betul Erdogdu Sakar, M Erdem Isenkul, C Okan Sakar, Ahmet Sertbas, Fikret Gurgen, Sakir Delil, Hulya Apaydin, and Olcay Kursun. 2013. Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings. IEEE Journal of Biomedical and Health Informatics 17, 4 (2013), 828–834. https://doi.org/10.1109/JBHI.2013.2245674
[141] Adam Schembri, Jordan Fenlon, Ramas Rentelis, Sally Reynolds, and Kearsy Cormier. 2013. Building the British sign language corpus. Language Documentation & Conservation 7 (2013), 136–154.
[142] Morgan Klaus Scheuerman, Kandrea Wade, Caitlin Lustig, and Jed R Brubaker. 2020. How We've Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis. Proceedings of the ACM on Human-Computer Interaction 4, CSCW1 (2020), 1–35.
[143] Andrew Sears and Vicki Hanson. 2011. Representing Users in Accessibility Research. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI '11). Association for Computing Machinery, New York, NY, USA, 2235–2238. https://doi.org/10.1145/1978942.1979268
[144] Rajani Sebastian, Carol B Thompson, Nae-Yuh Wang, Amy Wright, Aaron Meyer, Rhonda B Friedman, Argye E Hillis, and Donna C Tippett. 2018. Patterns of decline in naming and semantic knowledge in primary progressive aphasia. Aphasiology 32, 9 (2018), 1010–1030.
[145] Maya Sen and Omar Wasow. 2016. Race as a Bundle of Sticks: Designs that Estimate Effects of Seemingly Immutable Characteristics. Annual Review of Political Science 19, 1 (2016), 499–522. https://doi.org/10.1146/annurev-polisci-032015-010015
[146] Julia Serano. 2013. Excluded: Making feminist and queer movements more inclusive. Seal Press.
[147] David Serre and Svante Pääbo. 2004. Evidence for gradients of human genetic diversity within and among continents. Genome research 14, 9 (2004), 1679–1685.
[148] Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, and D. Sculley. 2017. No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World. arXiv:1711.08536 [stat.ML]
[149] Shanya Sharma, Manan Dey, and Koustuv Sinha. 2021. Evaluating Gender Bias in Natural Language Inference. arXiv preprint arXiv:2105.05541 (2021).
[150] Linda R Shaw, Fong Chan, and Brian T McMahon. 2012. Intersectionality and disability harassment: The interactive effects of disability, race, age, and gender. Rehabilitation Counseling Bulletin 55, 2 (2012), 82–91.
[151] Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Jonathan Michaux, Diane Brentari, Greg Shakhnarovich, and Karen Livescu. 2018. American Sign Language Fingerspelling Recognition in the Wild. In 2018 IEEE Spoken Language Technology Workshop (SLT). 145–152. https://doi.org/10.1109/SLT.2018.8639639
[152] Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Diane Brentari, Greg Shakhnarovich, and Karen Livescu. 2019. Fingerspelling Recognition in the Wild With Iterative Visual Attention. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).
[153] Ben Shneiderman. 2020. Human-centered artificial intelligence: three fresh ideas. AIS Transactions on Human-Computer Interaction 12, 3 (2020), 109–124.
[154] Ajay Singh, Chia Jung Yeh, and Sheresa Boone Blanchard. 2017. Ages and stages questionnaire: a global screening scale. Boletín Médico Del Hospital Infantil de México (English Edition) 74, 1 (2017), 5–12.
[155] Mona Sloane, Emanuel Moss, Olaitan Awomolo, and Laura Forlano. 2020. Participation is not a design fix for machine learning. arXiv preprint arXiv:2007.02423 (2020).
[156] W Smith, P Mitchell, and JJ Wang. 1997. Gender, oestrogen, hormone replacement and age-related macular degeneration: Results from the Blue Mountains Eye Study. Australian and New Zealand journal of ophthalmology 25, 4 (1997), 13–15.
[157] Dean Spade. 2009. Trans Law and Politics on a Neoliberal Landscape (June 26, 2009). Temple Political & Civil Rights Law Review 18 (2009), 09–05.
[158] Katta Spiel, Oliver L. Haimson, and Danielle Lottridge. 2019. How to Do Better with Gender on Surveys: A Guide for HCI Researchers. Interactions 26, 4 (June 2019), 62–65. https://doi.org/10.1145/3338283
[159] Daniel Steel, Sina Fazelpour, Kinley Gillette, Bianca Crewe, and Michael Burgess. 2018. Multiple diversity concepts and their ethical-epistemic implications. European journal for philosophy of science 8, 3 (2018), 761–780.
[160] Amy Strekas, Nan Bernstein Ratner, Madison Berl, and William D Gaillard. 2013. Narrative abilities of children with epilepsy. International journal of language & communication disorders 48, 2 (2013), 207–219.
[161] SurveyMonkey. [n.d.]. Gathering demographic information from surveys. https://www.surveymonkey.com/mp/gathering-demographic-information-from-surveys/. Accessed: 2022-01-03.
[162] Rachael Tatman. 2017. Gender and Dialect Bias in YouTube's Automatic Captions. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing. Association for Computational Linguistics, Valencia, Spain, 53–59. https://doi.org/10.18653/v1/W17-1606
[163] Lida Theodorou, Daniela Massiceti, Luisa Zintgraf, Simone Stumpf, Cecily Morrison, Edward Cutrell, Matthew Tobias Harris, and Katja Hofmann. 2021. Disability-First Dataset Creation: Lessons from Constructing a Dataset for Teachable Object Recognition with Blind and Low Vision Data Collectors. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 27, 12 pages. https://doi.org/10.1145/3441852.3471225
[164] Krishna Thiyagarajan. 2016. Parkinson's Disease Observations: Variables Regarding Parkinson's Disease. https://www.kaggle.com/krisht/parkinsonsdisease.
[165] David Thompson. 2002. Misplaced and forgotten. Housing, Care and Support 5, 1 (2002), 19–22. https://doi.org/10.1108/14608790200200006
[166] Jutta Treviranus. 2018. Sidewalk Toronto and Why Smarter is Not Better*. https://medium.datadriveninvestor.com/sidewalk-toronto-and-why-smarter-is-not-better-b233058d01c8
[167] Jutta Treviranus. 2019. The Value of Being Different. In Proceedings of the 16th Web For All 2019 Personalization - Personalizing the Web (W4A '19). Association for Computing Machinery (ACM), Article 1, 7 pages. https://doi.org/10.1145/3315002.3332429
[168] Shari Trewin. 2018. AI Fairness for People with Disabilities: Point of View. CoRR abs/1811.10670 (2018). arXiv:1811.10670 http://arxiv.org/abs/1811.10670
[169] Shari Trewin, Sara Basson, Michael Muller, Stacy Branham, Jutta Treviranus, Daniel Gruen, Daniel Hebert, Natalia Lyckowski, and Erich Manser. 2019. Considerations for AI Fairness for People with Disabilities. AI Matters 5, 3 (Dec. 2019), 40–63. https://doi.org/10.1145/3362077.3362086
[170] Juan Camilo Vásquez-Correa, Tomas Arias-Vergara, Juan Rafael Orozco-Arroyave, Björn Eskofier, Jochen Klucken, and Elmar Nöth. 2018. Multimodal assessment of Parkinson's disease: a deep learning approach. IEEE journal of biomedical and health informatics 23, 4 (2018), 1618–1630. https://doi.org/10.1109/jbhi.2018.2866873
[171] Radu-Daniel Vatavu and Ovidiu-Ciprian Ungurean. 2019. Stroke-Gesture Input for People with Motor Impairments: Empirical Results & Research Roadmap. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery (ACM), 1–14. https://doi.org/10.1145/3290605.3300445
[172] Darshali A Vyas, Leo G Eisenstein, and David S Jones. 2020. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. 874–882 pages.
[173] Katherine K Wallman. 1998. Data on race and ethnicity: Revising the federal standard. The American Statistician 52, 1 (1998), 31–33.
[174] Katherine K Wallman, Suzann Evinger, and Susan Schechter. 2000. Measuring our nation's diversity: developing a common language for data on race/ethnicity. American Journal of Public Health 90, 11 (2000), 1704.
[175] Kellie Webster, Marta Recasens, Vera Axelrod, and Jason Baldridge. 2018. Mind the gap: A balanced corpus of gendered ambiguous pronouns. Transactions of the Association for Computational Linguistics 6 (2018), 605–617.
[176] Danielle Wetherell, Nicola Botting, and Gina Conti-Ramsden. 2007. Narrative skills in adolescents with a history of SLI in relation to non-verbal IQ scores. Child Language Teaching and Therapy 23, 1 (2007), 95–113. https://doi.org/10.1177/0265659007072322
[177] Ryen W White, P Murali Doraiswamy, and Eric Horvitz. 2018. Detecting neurodegenerative disorders from web search signals. NPJ digital medicine 1, 1 (2018), 1–4. https://doi.org/10.1038/s41746-018-0016-6
[178] Ryen W White and Eric Horvitz. 2019. Population-scale hand tremor analysis via anonymized mouse cursor signals. NPJ digital medicine 2, 1 (2019), 1–7. https://doi.org/10.1038/s41746-019-0171-4
[179] Meredith Whittaker, Meryl Alper, Cynthia L Bennett, Sara Hendren, Liz Kaziunas, Mara Mills, Meredith Ringel Morris, Joy Rankin, Emily Rogers, Marcel Salas, et al. 2019. Disability, Bias, and AI. AI Now Institute, November (2019). https://wecount.inclusivedesign.ca/uploads/Disability-bias-AI.pdf
[180] Maria K Wolters, Jonathan Kilgour, Sarah E MacPherson, Myroslava Dzikovska, and Johanna D Moore. 2015. The CADENCE corpus: a new resource for inclusive voice interface design. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 3963–3966.
[181] Dongxin Xu, Jeffrey A. Richards, Jill Gilkerson, Umit Yapanel, Sharmistha Gray, and John Hansen. 2009. Automatic Childhood Autism Detection by Vocalization Decomposition with Phone-like Units. In Proceedings of the 2nd Workshop on Child, Computer and Interaction (WOCCI '09). Association for Computing Machinery (ACM), Article 5, 7 pages. https://doi.org/10.1145/1640377.1640382
[182] Ehud Yairi and Nicoline Grinager Ambrose. 1999. Early childhood stuttering I: Persistency and recovery rates. Journal of Speech, Language, and Hearing Research 42, 5 (1999), 1097–1112.
[183] Victoria Yaneva, Irina Temnikova, and Ruslan Mitkov. 2015. Accessible Texts for Autism: An Eye-Tracking Study. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '15). Association for Computing Machinery (ACM), 49–57. https://doi.org/10.1145/2700648.2809852
[184] Victoria Yaneva, Irina Temnikova, and Ruslan Mitkov. 2016. A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC '16). European Language Resources Association (ELRA). https://aclanthology.org/L16-1077
[185] Kaiyu Yang, Klint Qinami, Li Fei-Fei, Jia Deng, and Olga Russakovsky. 2020. Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* '20). Association for Computing Machinery, New York, NY, USA, 547–558. https://doi.org/10.1145/3351095.3375709
[186] K Lisa Yang and Hock E Tan. 2019. Disability statistics: Online resource for US disability statistics. Accessed: 2022-01-12.
[187] J Scott Yaruss and Robert W Quesal. 2006. Overall Assessment of the Speaker's Experience of Stuttering (OASES): Documenting multiple outcomes in stuttering treatment. Journal of fluency disorders 31, 2 (2006), 90–115.
[188] Emre Yilmaz, MS Ganzeboom, LJ Beijer, Catia Cucchiarini, and Helmer Strik. 2016. A Dutch dysarthric speech database for individualized speech therapy research. (2016).
[189] Brit Youngmann, Liron Allerhand, Ora Paltiel, Elad Yom-Tov, and David Arkadir. 2019. A machine learning algorithm successfully screens for Parkinson's in web users. Annals of clinical and translational neurology 6, 12 (2019), 2503–2509. https://doi.org/10.1002/acn3.50945
[190] Hanbin Zhang, Chen Song, Aosen Wang, Chenhan Xu, Dongmei Li, and Wenyao Xu. 2019. PDVocal: Towards Privacy-Preserving Parkinson's Disease Detection Using Non-Speech Body Sounds. In Proceedings of the 25th Annual International Conference on Mobile Computing and Networking (MobiCom '19). Association for Computing Machinery, Article 16, 16 pages. https://doi.org/10.1145/3300061.3300125
[191] Hui Zheng, Pattiya Mahapasuthanon, Yujing Chen, Huzefa Rangwala, Anya S Evmenova, and Vivian Genaro Motti. 2021. WLA4ND: A Wearable Dataset of Learning Activities for Young Adults with Neurodiversity to Provide Support in Education. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3441852.3471220
Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People

Kelly Mack∗, University of Washington, Seattle, Washington (kmack3@uw.edu)
Emma J. McDonnell∗, University of Washington, Seattle, Washington (ejm249@uw.edu)
Leah Findlater, University of Washington, Seattle, Washington (leahkf@uw.edu)
Heather D. Evans, University of Washington, Seattle, Washington (hdevans@uw.edu)
ABSTRACT

Accessible design and technology could support the large and growing group of people with chronic illnesses. However, human-computer interaction (HCI) has largely approached people with chronic illnesses through a lens of medical tracking or treatment rather than accessibility. We describe and demonstrate a framework for designing technology in ways that center the chronically ill experience. First, we identify guiding tenets: 1) treating chronically ill people not as patients but as people with access needs and expertise, 2) recognizing the way that variable ability shapes accessibility considerations, and 3) adopting a theoretical understanding of chronic illness that attends to the body. We then illustrate these tenets through autoethnographic case studies of two chronically ill authors using technology. Finally, we discuss implications for technology design, including designing for consequence-based accessibility, considering how to engage care communities, and how HCI research can engage chronically ill participants in research.

KEYWORDS

chronic illness, accessibility, disability studies

ACM Reference Format:
Kelly Mack, Emma J. McDonnell, Leah Findlater, and Heather D. Evans. 2022. Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/

ill people have symptoms that alter their daily lives, and disability activism [65, 117] and academic theorizing [42, 72, 148] are beginning to integrate chronic illness into their approaches. However, despite an active focus on chronic illness in human-computer interaction (HCI) health research (e.g., [54, 61, 81, 89]), chronic illness remains conspicuously underrepresented in HCI accessibility work [86] (for a few exceptions, see e.g., [19, 57, 69, 85]). We identify an opportunity for HCI accessibility practitioners to understand the access needs of chronically ill people and to create technology-based solutions that are not rooted in medicalized views of chronic illness.

In this paper, we articulate an opportunity for future HCI accessibility research to work with and support chronically ill people. To do so, we present three tenets to guide researchers' approaches to chronic illness: 1) move beyond medical framings to understand people with chronic illness as having access needs and valuable expertise, 2) consider that the variability of ability that many chronically ill people experience presents unique accessibility needs, and 3) adopt a theoretical approach to chronic illness that attends to bodily and sociocultural experiences. We then apply these tenets to three autoethnographic case studies about the authors' own experiences with technology use, demonstrating how our tenets can be used to surface design considerations for chronically ill users. By placing disability studies, HCI, and our lived experience as chronically ill technology users in conversation, we introduce a new paradigm for designing accessible technology. This shift includes
3517428.3544803 viewing access as produced by both a user’s innate abilities and the
physiological consequences doing an action causes, which we call
1 INTRODUCTION a consequence-based approach to accessibility. Further, it encour-
Billions of people around the world [18, 53] are diagnosed with ages researchers to consider technology design for community use
chronic illness, broadly defned as a range of conditions and diag- and alter traditional HCI methods to better match chronically ill
noses that impact functioning and are not expected to go away or participants’ access needs. We also emphasize that approaches to
be immediately fatal [34, 49, 115, 122, 124, 149]. Many chronically technology design for chronically ill people need to be grounded in
community knowledge and can be contextualized within disability
∗ Both authors contributed equally to this research. studies and activism.
In summary, with respect to designing technology for chronically
ill people, we contribute 1) three core tenets to guide research, 2) a
This work is licensed under a Creative Commons Attribution International in-depth authethnographic exploration of how these tenets reveal
4.0 License. opportunities to understand and design technology for chronically
ASSETS ’22, October 23–26, 2022, Athens, Greece ill people, and 3) considerations for HCI accessibility practice when
© 2022 Copyright held by the owner/author(s). engaging chronically ill people.
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544803
ASSETS ’22, October 23–26, 2022, Athens, Greece Mack and McDonnell et al.

2 BACKGROUND AND RELATED WORK
To contextualize our work, we situate our definition of chronic illness, explore how HCI has approached work with chronically ill people, and introduce guiding concepts from disability studies.

2.1 What We Mean by Chronic Illness
In this paper, we draw on work from disability studies scholarship to broadly define a chronically ill person as one who has a condition that: impacts functioning, is not expected to go away or be immediately fatal, may be ameliorated through treatment and, particularly when left untreated, can be life-limiting [34, 49, 115, 122, 124, 149]. Conversations around chronic illness and disability often overlap - indeed many people identify as both chronically ill and disabled [119]. While we do not intend to take on questions of chronic illness or disability identity formation, this overlap guides us to engage with disability studies as a source of useful guiding theory and to see chronic illness as relevant to HCI accessibility work. However, we also highlight areas where chronically ill people's experiences diverge from mainstream conceptualizations of disability [119], motivating the need for a chronic illness specific approach to technology design. Billions of people globally are diagnosed with at least one chronic health condition [18, 53] (a growing group in the wake of COVID-19 [142]), leading to a wide range of experiences and varied identification with chronic illness. We, however, are primarily interested in how shared functional aspects of chronic illness could be better considered within accessibility. While our framework may resonate more strongly with people who identify as chronically ill, it may also be relevant to many others.

2.2 Chronic Illness and HCI
Chronic illness has received uneven attention across subfields of HCI - it is a significant topic in HCI health and online communities research but is scarcely engaged in accessibility contexts. Specifically, the primary foci of current HCI health work on chronic illness include: exploring the different information needs and practices of patients and providers [15, 37, 81, 112, 127, 131, 141], how people talk to their support networks [5, 14, 89, 95, 114], how to support pediatric patients, their parents, and providers in effective communication [54, 59–62, 75], and how patients gain the knowledge to manage their conditions [20, 24, 63, 120, 150]. This body of work considers chronically ill people in relation to the medical care they pursue, often primarily referring to them as 'patients', and therefore proposes technology solutions within this medical context. An additional focus of HCI work with chronically ill people is self-tracking, developing tools to track symptoms [76, 101, 116], treatments [7, 21, 158], and medically-necessary lifestyle changes [84, 136]. This work often includes medical professionals and is geared toward helping patients comply with prescribed treatment. Another avenue of HCI research with chronically ill people outside of accessibility frameworks focuses on how chronically ill people connect with and support others via social media communities [41, 43, 66, 82, 90, 92, 129, 144, 159]. This body of research considers and offers solutions for many areas where chronically ill people can be better supported when seeking medical care, but rarely centers access needs that are not immediately connected to the clinic, like accommodations for work or social life.

There is a small, but growing body of work that situates chronic illness in relation to accessibility and the broader disability community. ASSETS has not historically published much work that engages chronic illness - only a small set of papers include participants with chronic illnesses, often focusing on older adults or rehabilitative technologies (e.g., [3, 17, 25, 26, 73, 109, 157, 160]). Some ASSETS publications have considered chronically ill people as having access needs, exploring how people negotiate access at work [85], considering how to make disability activism accessible to chronically ill people [19], and centering chronic illness in theorizing around the role disability studies ought to play in future HCI accessibility research [57, 91]. Outside of ASSETS, HCI work that considers chronic illness within the contexts of disability and accessibility remains sparse. Research on disability-related activist movements has included chronically ill people's perspectives [1, 80], recent conference workshops on accessibility research have explicitly considered chronic illness [4, 140], and researchers have considered the particular access needs of chronically ill people on dating apps [121], in the workplace [47], during research studies [87], and in public places [69]. Considering the prevalence of chronic illness, this body of work is notably underdeveloped relative to other foci of accessibility research [86, 152]. Though this handful of papers examines the access needs of chronically ill people in specific contexts, no work yet theorizes about the broader design considerations needed to make technology for this group. We seek to grow this body of work by articulating a set of tenets to guide future accessibility research with chronically ill people.

2.3 Core Concepts from Disability Studies
Within the field of disability studies and activist communities, scholarship by disabled and chronically ill people provides a critical lens and crucial background to our thinking on how HCI accessibility practitioners should approach chronic illness. Our work applies disability studies topics to HCI scenarios, building on the tradition set by Mankoff et al.'s 2010 invitation to engage with models of disability [91], Bennett et al.'s integration of interdependence into HCI thinking [12], Ringland et al.'s call to see mental ill-health as a form of psychosocial disability that can be addressed outside of clinical contexts [126], Williams et al.'s framing of crip HCI [151], and Hofmann et al.'s exploration of crip time [57]. Here we provide a brief overview of core disability studies concepts that we will utilize when considering how to build non-medicalized technology for chronically ill people, namely: social and medical models of disability, interdependence, access intimacy, and crip time.

One of disability studies' central pursuits is to name and analyze the effects of two dominant frameworks for understanding disability: the social and medical models [113]. The medical model of disability characterizes deviation from physical and/or intellectual norms as undesirable defects that medical intervention can eliminate, augment, or cure, motivated by the belief that a better future is one without disability [30, 133]. The classical counterpoint to the medical model is the social model of disability, which names disability as a natural and vibrant part of human diversity and as a basis for historic and current systemic oppression. Rather than focusing on cure, proponents of the social model call for changes to external factors that produce disability (e.g., buildings without
ramps, discriminatory policies) [113, 133]. Notably, early disability studies scholarship, which articulated the social and medical models, did not center chronic illness in its analysis. Initial adherents of the social model drew a distinction between impairments (i.e., differences in functioning) and disability (i.e., context-specific and social dynamics that create barriers for participation by differently functioning individuals) to help combat what Joel Reynolds calls the "ableist conflation" of the concept of disability with "pain, suffering, hardship, disadvantage, morbidity, and mortality" [124].

More recent scholarship within critical disability studies and activist theorizing provide key concepts to analyze and understand chronic illness. While the historic push for disability rights often focused on independence [104], disability studies scholars have begun to theorize about the role of dependence in disabled people's lives [77]. Activist scholars have adopted the framing of interdependence as "the state of being dependent upon each other" [65], which emphasizes the networks of connections and care that people provide each other, rather than positioning one person as the sole recipient or provider of care. Disability justice activists have explored how interdependence reveals the ways sick and disabled people form care networks to provide support and access for each other [99, 117, 118]. A related concept, coined by Mia Mingus, is that of access intimacy, or "that elusive, hard to describe feeling when someone else 'gets' your access needs" [98]. Access intimacy is a meaningful dimension to add to considerations of accessibility, as it helps make visible the relationships and shared context that shape how access is felt and received.

Disabled scholars developed another concept key to our work, "crip time," which theorizes about the different temporalities (or experiences of the passing of time) in which sick and disabled people operate. Alison Kafer, in first formalizing crip time, imagines its power: "rather than bend disabled bodies and minds to meet the clock, crip time bends the clock to meet disabled bodies and minds" [72]. Ellen Samuels, a chronically ill disability studies scholar, calls attention to the ways that crip time simultaneously provides tools to imagine a more accessible future while highlighting that rigid, normative expectations of life paces can be sites of painful inaccessibility [128]. These concepts provide avenues for thinking critically about the ways that disabled and chronically ill people often arrange their lives differently from nondisabled people, and they shape our analysis.

2.4 Positionality
This work was deeply influenced by the authors' experience with chronic illness, interactions with medical systems, and (for some) moving through the world with a non-normatively functioning body. Three of the four authors identify as chronically ill and all identify as white, cisgender women.

3 DESIGN TENETS FOR CREATING TECHNOLOGY FOR PEOPLE WITH CHRONIC ILLNESSES
We present three tenets which outline necessary perspectives to shape technology design for people with chronic illnesses. First, technology designers must view people with chronic illnesses as having access needs and valuable expertise rather than only as patients. Second, chronic illness causes high variability in ability, which is crucial to consider when designing technology to meet chronically ill people's access needs. Third, this work must be done using a model of disability that accounts for both physical and mental experiences of impairment while also recognizing disabling socio-political factors.

3.1 Tenet 1: Beyond Patients
We must view people with chronic illnesses not merely as medical patients, but as people with valuable expertise and non-medical access needs.

Much of the existing body of HCI scholarship around chronic illness adopts a health, rather than accessibility, framework (see 2.2). Under a medical lens, chronically ill people are primarily viewed as patients with technology needs defined by medical care and symptom management. However, we call for HCI practitioners to contest the dominance of medicalization and the emphasis on patienthood when designing technology for people with chronic illnesses. Longstanding critique by feminist [31, 83] and queer [110] activists calls attention to the ways that labelling people as "patients" takes away their agency and imposes a set of assumptions around what patients ought to want, do, and need. The label "patient" also establishes a clear power hierarchy, implying a subordinate relationship to a more knowledgeable and powerful clinician [39]. Viewing people with chronic illnesses primarily as patients suggests that they can be best understood in a medical context and situates them as recipients and dependents of medical practitioners' expertise. On the other hand, approaching people with chronic illnesses with an accessibility lens views them as people with access needs and creates room to center individuals' agency and knowledge, countering epistemic violence [156].

A medical, patient-centric approach often obscures the deeply contentious relationship many people with chronic illnesses have with the medical field. While medical treatments, testing, and guidance can be critical to chronically ill people's quality of life, the medical field is often simultaneously hostile to chronically ill people [10, 51, 52]. For example, it frequently takes years to get formal diagnoses for many chronic illnesses [39], patients are routinely not believed by medical professionals [78, 94, 102], and complex medical care is often prohibitively expensive [48, 56]. These experiences are exacerbated when people with chronic illnesses are otherwise marginalized, because medical racism, sexism, anti-queerness, ableism, classism, fatphobia, and other biases harm people's ability to access care and be treated with dignity [32, 70, 111, 147]. Future HCI work must understand that while medical care and assessment is crucial for many chronically ill people, it can also be a primary site of trauma, discrimination, and disbelief. Discussion of and engagement with medical systems must be done with caution and recognize this fraught history. This motivates our focus on non-medical access needs that remain under-considered within HCI research.

At the same time, the knowledge shared outside of medical contexts makes clear that, individually and in community, chronically ill people hold vast expertise derived from both their embodied experiences and navigating the world with a chronic illness. Diagnosis-specific and general chronic illness social media communities are
abundant (e.g., [11, 41, 129]), and they provide a place to share in-depth knowledge about living with a chronic illness. While much discussion centers on how to live with and acquire care for illness (e.g., symptom and flare identification and management, possible diagnoses, how to navigate the medical system), people also share information and advice to meet non-medical access needs (e.g., developing horizontal workstations, suggesting how to disclose access needs on a date, preparing meals that don't trigger dietary restrictions) [11]. Indeed, there are myriad individual and group examples that demonstrate the sophistication of this expertise, including a recent reconsideration of graded exercise therapy as a standard of care for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) after ME/CFS advocacy groups demonstrated that it is a harmful practice [8, 143]. HCI researchers should value this individual and community-based knowledge.

To conduct HCI accessibility work on chronic illness, researchers must go beyond patient framings to view chronically ill people as having access needs and valuable expertise to shape accessible technology design work. This view challenges the assumption that medical providers should always be consulted as subject matter experts, while firmly centering chronically ill people as the relevant, necessary experts who can guide the development of accessible technology. This reframing is also necessary to re-render HCI accessibility work as relevant to chronically ill people. If researchers perceive chronically ill people primarily as patients, primarily medical technologies emerge as relevant supports. However, if we view chronically ill people as having a wide range of access needs that are not well met in their daily lives, HCI practitioners are well positioned to create non-medical tools to improve accessibility. For example, access-need-driven HCI work with chronically ill people may explore how technology could adapt to a user's varied cognitive abilities, develop research practices that better support someone with fluctuating capacities, and examine how existing accessible technologies could be customized to meet chronically ill users' needs.

3.2 Tenet 2: Variability of Ability
The experience of chronic illness is diverse and inconsistent, even for an individual from day to day; consequently, we have to view access not only in terms of capability to complete an action, but also in terms of its repercussions (e.g., consequences such as worsening symptoms).

Chronic illness often causes varying levels of ability, affecting how chronically ill people move about the world, including what technology they use. When people's bodies have vastly different abilities over time, it takes creativity and planning to go about everyday life [9, 38, 106, 117]. For example, someone with fluctuating fatigue may use mobility or other technology aids some days or times in the day, but not others [9, 106]. As with other disabilities, chronically ill people's abilities and access needs vary between people. However, what is especially critical in examining the experience of chronic illness is understanding the variability of abilities within an individual. We break down this phenomenon by first categorizing factors that determine (and vary) ability into two main categories: individual baseline fluctuations and action-determined variability. Then, we propose a view of accessibility that is key for designing technology for people with fluctuating abilities: designing for the consequences of actions rather than solely for static capabilities.

Individual Baseline Variability. Beyond interpersonal variation in experience with chronic illnesses, a single individual can experience internal fluctuations in ability. Many people's chronic illness experiences include "flares", or an overall exacerbation of symptoms for an extended period of time [16, 88, 138, 154]. These flares, as well as shorter periods of fluctuation (e.g., a bad symptom day), can be triggered by unknown or hard-to-control circumstances [13, 51, 97, 154]. For example, environmental factors (e.g., a heat wave, pollen, pollution/smog, season changes) or other physical experiences (catching the flu, menstruation) can trigger an overall higher level of disruptive symptoms and, consequently, a lower level of capability to perform daily tasks [40, 96, 97, 154]. These baseline fluctuations can occur rapidly, and therefore technology that is designed for people with chronic illnesses must be usable at a variety of ability levels to meet the user's current access needs.

Action-determined variability. A chronically ill person's ability levels frequently change after performing actions. While, arguably, every person enters a different state after performing an action (e.g., after a run, a person might feel more tired), this difference in abilities or state can be extreme for people with chronic illnesses (e.g., post-exertion malaise [22]). For example, a person without a chronic illness might take a shower and detect no noticeable difference in state. On the other hand, the challenges of showering with a chronic illness are thoroughly discussed (and even meme'd) among chronic illness communities because showers often result in extreme fatigue, overall malaise, or other symptoms [2, 50, 155]. Since actions may trigger lasting, negative symptoms, this can result in a cumulative effect that leaves chronically ill people with disruptive symptoms and a low capacity to perform tasks by the end of a day. Therefore, technology design needs to consider not just the abilities a person begins with, but the abilities they might have after performing an action, with or without technological support.

The effects of variability: designing for consequences. Because significant fluctuations of ability pervade many chronically ill people's lives, they often have to map out their days based on the expected consequences of each action they plan to take [100]. A common metaphor used within the chronic illness community for this form of variable consequence management is the "Spoon Theory" [29, 100]. This metaphor for understanding chronic illness, coined by Christine Miserandino, represents capacity or energy with "spoons" and explains that people have to carefully plan what they spend their spoons on in a day, since spoons are often in short and inconsistent supply; because of baseline variations, like a flare, the number of spoons a person can spend in a day may be different on Monday than on Tuesday [100]. Due to the variety in abilities and symptoms an individual can experience, the number of spoons an action takes cannot be perfectly estimated. Even for the same person, the impact of the same action can vary drastically from 10 AM to 10 PM, though the consequences they incur for that action may be perpetually higher than they are for non-chronically ill people. In deciding how to allocate spoons, chronically ill people perform a complex cost-benefit analysis, informed by the time they've spent living with a condition, to predict the likely costs (e.g., symptoms)
of performing a task compared to the benefits they will receive. Therefore, we argue that to understand accessibility in the context of chronic illness, we must account for the consequences an action causes. Under this approach, we frame the accessibility of a task not solely in terms of an individual's capability to perform it, but rather in terms of the ability to perform it and remain in an "acceptable" state afterwards.

3.3 Tenet 3: Include the Body
Research with people with chronic illnesses must be done using a model of disability that accounts for both the physiological and sociopolitical barriers they face.

Numerous disability studies scholars have explored the ways in which those living in non-normative bodyminds may experience limitations from both physiological impairments and socially constructed dynamics of exclusion [28, 34, 115, 119, 122, 124, 149]. In her foundational essay, Susan Wendell argues that for the "unhealthy disabled"—or people who are chronically ill and experience frequent pain, fatigue, or other forms of discomfort—a social model view of disability focused on curing ableism disregards a core part of their disability experience. She highlights the fact that many people "experience physical or psychological burdens that no amount of social justice can eliminate," and calls for an approach to disability that does not seek to avoid the realities of physiological impairment [148]. Motivated by Wendell, we argue that researchers must move beyond the currently discussed social and medical models of disability toward an approach that attends to both embodied and sociopolitical aspects of chronic illness.

Disability scholars have critiqued the social model [52, 132, 133] and developed new ways of thinking about disability that center the interplay between individual experiences of impairment and broader society and disability politics. For example, the political/relational model proposed by Alison Kafer "neither opposes nor valorizes" medical care, but makes space for "the possibility of simultaneously desiring to be cured of chronic pain and to be identified and allied with disabled people" [72]. This model makes space to see chronically ill people as political subjects while not needing to cast aside the bodily realities of impairment that have historically been ignored under social model politics.

Other scholars have explored ways that living with differences in functioning can generate deep, visceral forms of knowledge available only to others who share the same experience. Tobin Siebers explains that embodied knowledge arises when "situated knowledge adheres in embodiment. The disposition of the body determines perspectives, but it also spices these perspectives with phenomenological knowledge–lifeworld experience–that affects the interpretation of perspective" [134]. In essence, knowledge does not solely come from a social location, but from the particular, physical experiences of living in a body. Siebers calls for disability theory that engages embodied expertise, not only as an object of analysis but as a conceptual tool that can strengthen design practices and enrich analytical capacity [135].

We combine these ideas to articulate a theoretical approach to understanding the experience of people with chronic illnesses. To adopt a more nuanced and comprehensive approach to chronic illness, we believe researchers should center the embodied experiences that often characterize chronic illness. This framing balances the tensions between seeking care for unwanted symptoms and valuing disability as part of human expression. We hope this epistemological shift away from a purely social model approach makes space for HCI accessibility work that can account for people's bodily realities without defaulting to a medicalized approach.

4 CASE STUDY APPLICATIONS OF DESIGNING FOR PEOPLE WITH CHRONIC ILLNESSES
In the following section, we present three autoethnographic case studies from two chronically ill authors to show how different types of technology can be used to negotiate access in different social and work scenarios. We then demonstrate how our tenets can make sense of these experiences and, following that analysis, highlight potential directions and considerations for technology development.

4.1 Background and Methods
4.1.1 Collaborative Autoethnography. Collaborative autoethnography is a methodological approach using autobiographical data as the subject of ethnographic analysis conducted by a group of researchers [23]. Autoethnography is a well-established method among disability studies scholars who have both experienced and studied the social and structural dynamics of chronic illness, disability, or long-term pain [45, 148, 161, 162] and continues to be employed by similarly situated researchers today [103, 107, 125]. Conducting autoethnography in collaboration produces rich analyses from multiple insiders' perspectives on complex health processes [23, 105, 108, 137, 145]. Inspired by this history of chronically ill people's engagement with autoethnography, we present three case studies derived from our own autoethnographic reflection. We focus on our own experiences and interactions as two chronically ill people to avoid being extractive of broader communities throughout our cases [44].

4.1.2 Background. The first authors, McDonnell and Mack, recognized rich examples in their lives around utilizing technology to navigate their own fluctuating access needs associated with chronic illness symptoms. They therefore chose to employ collaborative autoethnography to critically analyze their everyday experiences and reflect on them as case studies.

McDonnell and Mack started their doctoral studies at the University of Washington in the same year. While they met as colleagues collaborating on accessibility research, the experiences they recount here are primarily shaped by a deep friendship that developed over the course of that collaboration. Mack was diagnosed in 2016 with a chronic illness that results in fluctuating symptoms that include motion sickness in the form of dizziness and nausea that can be triggered by physical movements as well as visual stimuli. She has overall malaise that varies, sometimes feeling perfectly fine, other times feeling ill upon waking up. Although McDonnell has navigated significant dietary restrictions since she was young, her identification with chronic illness shifted in 2019 after what she thought was post-surgical recovery became unexplained symptoms
that took years to diagnose. McDonnell's symptoms include unpredictable malaise, an inconsistent ability to be upright and active, brain fog, heat intolerance, and fatigue.

To develop their cases, McDonnell and Mack first independently generated a set of scenarios where technology was either inaccessible and/or allowed them to meet their access needs, identifying individual examples and reflecting on instances where they used technology to support each other. They then met and discussed areas of overlap between their scenarios, ultimately selecting three cases for their variance, rich engagement with technology, and interaction between the two authors. Throughout this process they referenced their shared messaging history to provide more details about interactions. The full set of authors reviewed these reflections, probing for more details and explanations when needed. Collectively the authors engaged in iterative discussions and analyses, producing the results presented here.

4.2 Case 1: TikTok Sharing and Consumption

Our first case examines Mack and McDonnell collaboratively creating access to social media content that is inaccessible to Mack. Their experience demonstrates a community-based solution to a social (nonmedical) access issue and highlights consequence-based accessibility.

4.2.1 The Scenario. McDonnell is both deeply hooked on the social media platform TikTok and fond of sharing videos that she finds amusing. However, Mack cannot watch all TikToks because shaky camera movement makes her sick. Therefore, when McDonnell wants to share a TikTok with Mack, she pauses to assess how much motion is in it before copying the link over to Facebook Messenger and writing up a motion description of the video [123]. Although they have since discussed how to best craft motion descriptions, McDonnell began providing descriptions without prompting when she began sharing TikToks, paralleling how she shares image descriptions of visual memes with a blind friend. This description explains how much motion is in the video and may also include consumption guidelines (e.g., wait until someone says "get my coffee" to look at the video) to make semi-accessible videos watchable, or context on why Mack might watch it (e.g., it is one of her interests, or McDonnell thinks it's hilarious, required viewing). Example motion descriptions McDonnell has shared with Mack are:

"This seems very up your alley though has a lot of motion. The camera is steady when it's still but moves side to side to track the dancers (in fits and starts though, like it moves, stays, they move out of frame and it then follows, not continuous tracking), and they're dancing at a reasonably close zoom so all the spins constitute motion on the screen. I will audio describe and pause for you tomorrow if you want"

"Steady cam tho with a lot of jump cuts- you can look at a still at the start and then away for the rest of the video and get 95% of it"

When Mack gets that message, she reads the motion description and decides if it is something she wants and can afford to watch at that moment. For example, when she receives the first description, Mack decides that since she usually feels best in the morning, she can risk watching this video after she wakes, particularly because she knows she'll enjoy it. It does make her feel slightly ill. Mack then opens the second video and finds that McDonnell was more cautious than she needed to be, and she was able to watch the TikTok video without triggering symptoms.

4.2.2 Applying the Tenets. Whether or not Mack watches a specific video at a given time cannot be determined by either party alone: McDonnell and Mack both provide key information to inform these decisions. This process is deeply interdependent and social in nature.

Because her symptoms fluctuate hourly (Tenet 2), Mack's decision to watch a TikTok at any given time needs to consider several interwoven factors. She essentially performs a risk assessment where her current symptoms and the described amount of motion determine the risk, the level of novelty of the content determines the potential benefit, and missing events or feeling very ill later in the day determines the potential cost. Mack can make this calculation reasonably accurately, thanks to her expertise derived from years of lived experience around what will and won't trigger her symptoms.

What videos to send and what to describe in these videos is highly situated in understanding how elements of the social environment impact Mack's physical abilities and symptoms (Tenet 3). To make motion descriptions effective, Mack and McDonnell rely on building a shared understanding of Mack's symptoms and how to categorize motion. They work together to define a shared vocabulary to consistently describe elements of the social environment (e.g., what is "unsteady camerawork") so that Mack can best predict her physical response to the stimuli. This process involved demonstrative examples and considerable trial and error; to this day, there are still elements of guessing at what kinds of visual stimuli would be accessible and how to convey risk.

Mack's deep, embodied knowledge and McDonnell's efforts to learn her access needs are crucial to this process (Tenet 1). Mack gained an understanding of how her symptoms change in response to different stimuli over years of self-reflection and trial and error, sometimes accidentally triggering negative symptoms. This self-knowledge is crucial to craft social accommodations. As McDonnell is her only friend who regularly sends motion-described TikToks, it is the only way Mack has to safely access this content – there is no existing external mechanism to learn about the motion stimuli in a TikTok in advance. However, McDonnell is not acting as a visual/motion interpreter for Mack – rather, it is an act of friendship that deeply considers access. In practice, McDonnell develops the knowledge of what to include in motion descriptions from being a curious friend who watches a lot of content with Mack, and also from Mack being very open with sharing her physiological reactions. They rely on access intimacy and interdependence to translate Mack's expertise into a social workflow that meets her access needs.

4.2.3 Potential for Technology Support: Contextual Awareness and Customization. This example highlights opportunities and considerations around how technology could better support people with fluctuating symptoms that can be triggered by external stimuli like motion. First, we found that context (i.e., Mack's current state and her future plans) was critical in determining what content she
Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People ASSETS ’22, October 23–26, 2022, Athens, Greece
would consume and when. Therefore, gathering users' contextual information through smart device sensors (e.g., microphones, pulse sensors) or other information logged in personal devices (e.g., calendar events)¹ could be a promising application for future work [27, 74, 130]. However, Mack's example demonstrates the sophistication of expertise needed to identify a user's current state and predict the impact of content consumption, suggesting that solutions may need to include a human-in-the-loop to ensure that sensed contextual information is adequately interpreted. Finally, Mack's experience of chronic illness is also unique from others', and the heuristics she has for what videos may be accessible are particular to her life and body, indicating that personalization would be key if designing accessible technology for this scenario.

Further, this case raises questions of how one might develop machine learning models to increase access when target users can't label training data and face significant consequences when using inaccurate models. While recent developments in customizable machine learning models (e.g., few-shot learning [146]) may seem well-suited to the questions of describing motion stimuli or identifying accessible videos, this example challenges several core machine learning practices. Even few-shot learning requires that users provide a training set and then give feedback to iteratively improve models. For Mack to independently curate a dataset of inaccessible TikToks or types of motion, she would likely have to slowly submit examples of videos that trigger her symptoms throughout daily life or undertake video labeling sessions that are all but guaranteed to make her sick. An alternative model of data labeling could explore a communal approach, where others (in this example, McDonnell) could curate a training set for Mack. This is not a panacea, as it takes time, transparency, and trust to train proxy data labelers and adds uncertainty to training data – though McDonnell can often make reasonable calls about what is clearly accessible to Mack, she does not live with Mack's symptoms. Additionally, the process of data labeling may also be a socially untenable ask to make of others. Further, assessing model performance and providing feedback to improve a model poses significant risk, as the tolerable error rate is very low. Finally, since Mack's decision to watch or not watch a video depends on many interwoven, nuanced factors, this case raises interesting questions for machine learning around how to collect detailed feedback from a user without burdening them.

4.3 Case 2: Hacking Text-to-Speech Technology

We now examine a case around McDonnell and Mack's use of text-to-speech (TTS) technology to improve access during their graduate studies. This common tool had the flexibility required to support two different sets of access needs in performing the same task: allowing McDonnell and Mack to continue reading while symptomatic.

4.3.1 The Scenario. Mack's dizziness varies day to day. She finds that 10 minutes of uninterrupted reading consistently makes her dizzy. Consequently, she started using text-to-speech (TTS) technologies to read without triggering dizziness. Sometimes, Mack uses a screen reader as a TTS engine, since it has very fast reading speeds, is easily turned on and off, and doesn't require internet access. However, Mack is not a "traditional" screen reader user: she uses her eyes to identify the paragraph of interest, highlights the text with the mouse, and then activates the screen reader. Other times she uses an online TTS tool, NaturalReader², designed for sighted users. Some of the visual interactions are useful to her, such as clicking where in a document to start reading, but she has to turn off others due to her motion sensitivity, such as highlighting each word as it is read. While this tool limits listening speed, it works well on PDFs, which are often not fully screen reader accessible.

Context often determines which tool Mack uses. For example, one day, while attending a meeting, her peer sent an abstract for her to read. Since an abstract is short, she chose to read this with her eyes, and it only made her slightly dizzy. However, she was then asked to review an interview protocol draft. This document spanned multiple pages, and given her existing symptoms Mack chose to consume the content with a screen reader. Since it was a group meeting with multiple people, she pulled out earbuds and put one earbud in, leaving the other ear uncovered to ensure she could still hear her colleagues while listening to the protocol. She felt comfortable using the earbuds without judgment or explanation since the meeting attendees knew about her chronic illness.

Meanwhile, McDonnell does not experience any uptick in symptoms directly related to reading. However, her fluctuating fatigue, malaise, and brain fog can make both the physical effort of sitting upright enough to read a PDF on her computer and the cognitive effort of staying focused on a document prohibitively difficult. When McDonnell mentioned that she was struggling to balance work and fatigue, Mack recommended NaturalReader. Initially McDonnell used this tool sporadically, but eventually it became her default reading method. For someone who is not a skilled screen reader user and can consistently navigate interfaces visually, a TTS tool alone serves as a significant access tool.

Through more consistent use, McDonnell has discovered that she uses the tool differently when she needs to get reading done while feeling so physically unwell that she can't be upright than when symptoms are impacting her ability to sustain focus. For example, McDonnell was taking a graduate seminar, which included dense readings, during an academic term when she was experiencing frequent symptom flares. If she was feeling unable to work from her desk, she would pivot to uploading the week's reading to NaturalReader, putting in her headphones, pressing play, and lying down on the floor or couch to listen. This allowed her to continue working and increased the likelihood that she'd be able to complete other work later in the day. If she was instead trying to complete seminar readings while dealing with brain fog, she would load the reading into NaturalReader and then simultaneously listen to the audio output while using the tool's sentence-highlighting feature. Multimodal output and a consistent pace allowed her to get through a heavy reading load while brain fog made staying focused on reading difficult.

4.3.2 Applying the Tenets. This case demonstrates ways that Mack and McDonnell address the access needs that arise from their chronic illnesses in nonmedical contexts (Tenet 1). Seeing McDonnell and Mack as having access needs de-medicalizes their issues and allows widely available, nonmedical tools to be a part of the

¹ Note that a heavily sensing-based solution cannot be built without careful consideration for user privacy.
² https://www.naturalreaders.com/online/
solution. Further, McDonnell learning about NaturalReader from Mack exemplifies the common practice of communities creating and sharing valuable expertise with each other about navigating through all areas of life with a chronic illness [11, 92, 117]. Additionally, McDonnell's case provides insight into how the individual expertise of chronically ill people evolves: she learned about her own access needs and how to manage them through months of feeling sick, trying new workflows (i.e., using NaturalReader when she was sick, lying on the floor), and recognizing where else they could be useful in her life (i.e., using NaturalReader as a focusing mechanism). Both experiences exemplify the creative workflows and rich insights that can be generated by disabled or chronically ill people hacking technology [55, 139].

Mack and McDonnell use TTS in response to the reality of their varied abilities (Tenet 2), despite neither of them being the "typical" target users of TTS systems (e.g., people who are blind, people with dyslexia or other common print-related disabilities). For Mack, TTS works to prevent and manage symptoms. For McDonnell, TTS is a more accessible option than visual reading when she is symptomatic, though she often reads visually without consequence when she is non-symptomatic. In fact, both authors' experiences highlight an interesting perspective on achieving access for chronically ill people: they both can physically read with their eyes, but find reading with TTS to be more accessible. Therefore, for McDonnell and Mack, access is often about utilizing modalities that lead to less friction during or after the activity, more so than working around an inability to perform an activity.

On top of the physical variability that determines technology use, Mack and McDonnell's social and environmental contexts are also key. By default, Mack prefers to read any text longer than a brief email via TTS, as it greatly reduces the risk of long-lasting symptoms. However, social context occasionally causes her to be more willing to risk reading with her eyes than to take on the social stigma of using headphones during a meeting. When working with established colleagues who understand her chronic illness, Mack's use of TTS and headphones is unremarkable, but when meeting new collaborators she risks seeming unprofessional or having to disclose full details of her disability to do so. Additionally, McDonnell's use of TTS is highly shaped by her environment – she is far less likely to work from the floor, requiring TTS, when in the office, but will readily do so in her apartment. Both internal and external context are key determiners of what technology support is most useful at a given time.

This case highlights the importance of viewing disability from both a social and physical lens (Tenet 3). For both Mack and McDonnell, TTS meets access needs that are not fundamentally social in nature – they are seeking ways to limit or live with physiological symptoms. This is different from many social model approaches to accessibility, which seek to identify and change discriminatory social and environmental factors. However, it is also not a medical model approach – TTS use is by no means a cure or treatment for underlying symptoms, nor does it seek to normalize them to a nondisabled ideal. Further, having access to TTS does, in many ways, meet Mack and McDonnell's access needs in that it allows them to continue their work where chronic illness may have otherwise prevented it. However, they do not reach some ideal state where they are no longer experiencing disability or impairment when these access needs are met – in fact, TTS is often most necessary when they are especially symptomatic. By recognizing the social factors at play while also leaving space for physiological experiences of symptoms, we can better understand the goals of these chronically ill technology users and the role HCI technologies can play in achieving them.

4.3.3 Potential for Technology Support: Broader User Bases and Contexts. The fact that McDonnell and Mack are not the "traditional" users of TTS technology raises interesting design questions around how to describe, and market to, technology users. Often in accessibility research, "people who are blind" and "people who use screen readers" are used synonymously. This case is a demonstrative example: not all screen reader users are blind or have low vision. When narrowly conceiving of who the users of accessible technologies are, this purportedly inclusive design ends up excluding people with chronic illnesses and anyone else who designers failed to imagine might have a use for an accessible technology. Categorizations around who is the "intended user" for a technology can lead to increased resistance from institutions (e.g., insurance companies), social stigma, and even denials of requests (e.g., people with fatigue who can physically walk face resistance requesting wheelchairs [9]). Chronically ill people's access needs often overlap with those that are more comprehensively understood by accessibility practitioners, but a lack of attention to chronic illness within the field means that these unique use cases are not considered in design. Future work in HCI accessibility should consider the multiplicity of ways people could meet the same access need and the multiplicity of access needs that can be met by the same technology, moving towards future tools with a wide range of customizable options.

Looking beyond the individual, the broader social and physical environments affect technology use, and therefore need to be considered in when and how to adapt technology for chronically ill people. Bennett et al. proposed a model of interdependence for viewing a disabled person's interactions with their environment and assistive technologies, where technology use is influenced by factors outside of the disabled person and their technology [12]. In this case, we see examples where social dynamics and the need to disclose and explain her disability affected Mack's choice to use screen readers. In other cases, technology supports may become less critical when a trusted ally can provide the same care.

Finally, this case, mirroring trends in online chronic illness forums [11, 90], demonstrates the crucial role that shared expertise in managing illness has in communities. McDonnell started imagining the ways TTS technologies could benefit her after watching Mack adapt screen reader technologies to her own needs. Since then, McDonnell and Mack have recommended the technology to numerous other people who find benefits in consuming content auditorily. This discovery process could be an area to engage occupational therapists, who frequently focus on creative ways to use existing tools to support people in expanding function [58], though we note that occupational therapy is often not offered or available to many chronically ill people. Future access technologies might consider 1) how they market their capabilities and customizability, and 2) how to share settings so that current users could introduce others with similar access needs to their use of a tool. This feature could
reduce the onboarding and learning cost, especially among people with less comfort using new technologies.

4.4 Case 3: Remote Work

Finally, we explore the ways that remote access allows McDonnell and Mack to more easily meet access needs that emerge throughout their days. This case highlights the importance of viewing some access barriers and remedies as social and others as based in the body (Tenet 3), and it introduces the idea of internal access conflicts.

4.4.1 The Scenario. For both Mack and McDonnell, attending meetings and classes virtually allows them to more easily and effectively manage symptoms and participate in otherwise physically inaccessible events. Both experience symptoms that can be triggered by activity, such as walking or commuting to campus. Mack finds that she cannot easily attend classes or meetings in the morning (and sometimes all day) without feeling debilitatingly sick. For McDonnell, the COVID-precipitated shift to work-from-home made it so that she no longer has to leave the house at set times. She prioritizes going on walks or completing errands after her daily obligations are met, thus lowering the cost if activity triggers symptoms. On top of remote work enabling Mack and McDonnell to arrange their days to better control symptoms, it also makes it easier for them to manage symptoms as they arise. Both find that the work of managing symptoms requires myriad resources, meaning that leaving the house may require packing beverages, snacks that meet their dietary restrictions, medication, or mobility aids. Additionally, aspects of their environment can impact symptoms, and when working in shared spaces, having control over temperature or a place to lie down is not guaranteed. Mack and McDonnell do sometimes choose to go onto campus when feeling well or to see specific people, but by default choose to work from home.

McDonnell found further benefits from the ability to disguise how sick she was feeling or her access hacks during virtual meetings. Because of video conferencing's limited view, she could discreetly make adaptations that reduced symptoms. For example, during one evening class session, McDonnell was feeling particularly unwell – her temperature was dysregulated and she was experiencing malaise from having been upright all day. She attempted to limit her symptoms by grabbing Gatorade from her fridge and opening the window next to her desk to cool down. McDonnell was also able to recline somewhat by putting her feet on her windowsill and leaning back in her chair while still appearing attentive in class with her camera on. However, as time wore on, she continued to feel worse, so she turned off her camera, grabbed her computer, and finished class while lying on the floor.

Remote work, while still Mack's overall preference, is not a perfect solution. Though remote attendance eliminates the need to walk early in the day and trigger symptoms, the shaky video feeds of her peers or professors can trigger her motion sickness. Unlike consuming TikToks, Mack often needs and is expected to pay attention to visual content in work contexts. Over months of remote work, Mack found a variety of hacks to avoid getting nauseous during video calls. For example, one day, when a meeting attendee started walking around with their laptop, causing significant motion, Mack opened a Notepad window on her computer and positioned it so that it blocked only that person's camera feed. However, later in the meeting, when a person was screen sharing graphs she had to examine, they kept scrolling the screen, which made Mack acutely nauseous. Since that experience, she often starts meetings by establishing group norms: asking people to share links to documents with her rather than screen sharing, and to keep their devices on a stable surface or turn their cameras off if they are moving.

4.4.2 Applying the Tenets. Mack and McDonnell's experiences with remote work demonstrate the need to move beyond traditional models of disability (Tenet 3). While there are some changes to the built environment that could lessen the burden of in-person work for McDonnell and Mack (e.g., access to a place to lie down as needed), this social model thinking has its limits because it is not solely the built environment that is disabling in their cases. Their access needs arise from their bodies, highlighting that it is critical to include bodily realities of impairment in theoretical approaches to chronic illness. We do not suggest that isolation by way of an inaccessible environment is justifiable for those who are currently prohibited from being able to participate in physical daily life. Instead, we consider that sometimes the most accessible or preferable option is to provide the opportunity for multiple environments, rather than one universally accessible space, following Dolmage's invitation to approach the universal design of spaces as "multiple and in-process" [36].

The variability of McDonnell and Mack's symptoms and abilities requires considerable, burdensome preparation (Tenet 2). While their homes house a variety of tools to prevent or manage symptoms (e.g., food, medication), these supplies are not usually available by default in all work environments or in transit. Consequently, they pack their bags with potentially helpful or needed supplies when they leave home to prepare for whatever symptoms might arise. While some days the extra preparation might be fully unnecessary, both find that the uncomfortable, sometimes life-threatening consequences of being unprepared outweigh the cost. This cost is not negligible, however. The process of bringing all the tools to feel prepared takes time, adds stress to their days if they forget an item, and adds physical weight to already fatiguing walks. In McDonnell and Mack's cases, the preparation required for the variability of their abilities is not insurmountable, but the ease provided by a way to remotely engage in work is often preferable.

In this case, we see the benefits of viewing people with chronic illnesses as having access, rather than solely medical, needs (Tenet 1), and the effects of this view on solutions. If we view Mack and McDonnell as patients, the most obvious tools to address the inaccessibility they face while working in person are medical treatments. While both Mack and McDonnell are actively pursuing the medical care that may make the broader world easier to navigate, understanding them as people with access needs makes visible ways they can be better supported holistically. Therefore, the affordances provided by remote work (e.g., the ability to go on and off camera or to block nauseating motion) become legible as accommodations.

4.4.3 Potential for Technology Support: Consider Internal Access Conflicts. While prior work has considered access conflicts between disabled people [35, 57, 85, 87], this case explores how technology can create internal access conflicts, a phenomenon sometimes discussed within chronic illness communities [93]. While attending class remotely alleviated early-morning symptoms for Mack, it
created a conflict by causing her motion sickness. Particularly in the case of chronic illness, individuals can have different access needs that overlap, conflict, and synergize in ways that lead to unique technology use. For example, a technology whose interface is very visual may lessen cognitive load for users but also lessen the ability to use the tool non-visually. Technology designers should consider that users may have internally conflicting access needs, and therefore pay attention to the implications of all design decisions and maximize opportunities for customization.

5 DISCUSSION

We have identified three tenets for future accessibility research with chronically ill people (Beyond Patients, Variability of Ability, Include the Body), and demonstrated the ways that they can highlight new opportunities for technology design throughout our autoethnographic cases. We now discuss additional considerations that follow from our reframing: the need to account for consequence-based accessibility, approaches to designing for community use, and methodological changes for working with chronically ill people.

5.1 Consequence-Based Accessibility

In this paper, we present a paradigm shift in how we define accessibility based on a more dynamic understanding of access. The traditional, binary approach to technology that cleaves access needs into "I can" versus "I cannot" fails to encompass the fluctuating needs of people with chronic illnesses. People with chronic illnesses can often technically perform an action that is practicably inaccessible to them, because inaccessibility can arise from the repercussions of doing that action. To demonstrate this difference: handwritten text might be pervasively inaccessible to a blind person without the support of technology or sighted companions. However, someone with a chronic illness that impacts digestion (e.g., ulcerative colitis) can technically eat all foods, but faces severe and debilitating reactions to certain foods, rendering those foods practicably inaccessible. We, therefore, present a paradigm of designing for consequence-based accessibility, which encompasses the consequences an action causes, rather than solely the innate in/ability to perform a task.

Viewing technology design through the lens of consequence-based accessibility acknowledges that many chronically ill people have the choice to incur consequences, even if those consequences cause discomfort or more access needs. Perhaps Thanksgiving dinner is worth a flare in gastrointestinal symptoms, a cute summer outfit without compression socks may be worth later unsteadiness, and running to catch a kid falling off a playground structure might take precedence over the later malaise these actions could trigger. Chronically ill people learn to live in their bodies, and perform a complex calculus to determine which consequences to avoid and which to weather, shaped by variables such as current symptoms, future plans, environments, urgency, social context, availability of

jar), other access needs arise when performing an action. When we consider the impact this framing has on technology, we see areas for innovation. First, technology can collect and provide easy access to the information that chronically ill people need to make well-informed decisions (e.g., a snapshot of what they have planned for the day, recent heart rate trends). Providing the right information at apt times poses interesting technical challenges. Second, systems can consider how to best adapt their interfaces and operations to meet their users' needs after an action (within or outside of the system) triggers symptoms. For example, symptomatic users may benefit from a lower-cognition interaction mode or shifting from visual to auditory content output. Future research avenues could focus on learning what these levels of accessible modes of operation are and when to enable them.

To operationalize fluctuating access needs, technology designers must recognize that chronically ill people constantly define and redefine what constitutes "unacceptably impaired," and therefore inaccessibility. Individuals determine what is inaccessible to them at a given time based on deeply personal and contextual factors, performing a situational "consequence calculus" to determine if an activity is worth its consequences. Having the ability to adjust their definition of "accessible" to their current context can afford chronically ill people greater agency, but also introduces internal and external doubt around the validity of people's access needs. In thinking about the technological consequences of redefining accessibility, we see that supporting user agency and contextual adaptations is key.

Finally, approaching accessibility through a consequence-based lens that centers the underrepresented experience of chronically ill people creates potential to better meet the individualized, contextual needs that many disabled people have when using accessible technology [46, 64]. Future accessibility work done using a consequence-based model could consider that, for example, many blind and low-vision people's vision changes based on the time of day, or could account for the optical and mental strain that speechreading for long periods of time has on d/Deaf or hard of hearing people, or better match the needs of people with mental health disabilities that are cyclic in nature (e.g., bipolar disorder). Further, we hope that our interrogation of what designers assume when we think about "accessibility" serves as a useful starting place for future researchers to interrogate the paradigms in which we work.

5.2 Designing for Communities

Our cases provide examples of two chronically ill people sharing access support (Case 1) and knowledge (Case 2); these themes of using care networks or other chronic illness communities to make sense of one's condition and create access hacks in day-to-day life are documented within HCI (e.g., [43, 66, 82, 90, 92, 129, 144, 159]) and among disability community activists [6, 98, 117, 118, 153].
accessible options, resources, desire, and many more. When working with a group of people who have already built,
As designers of accessible technology, we need to reconsider engaged in, and found joy within [98] a community, we propose that
what “accessibility” means to people who have the option to partake interdependence may be a more appropriate goal for technology
in an activity, but with varying costs. While chronically ill people design than independence, following Bennett et al’s framework
often also deal with more “traditional” access barriers (e.g., low fne [12]. Indeed, the act of being cared for, like receiving aid from
motor control in their fngers may make it so they can’t open a a care network, might provide emotional benefts that outweigh
Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People ASSETS ’22, October 23–26, 2022, Athens, Greece

the benefits of independence provided by purely technological solutions.

Designing for interdependence involves building for transparency with others and oftentimes giving other users the capacity to take action. For example, consider a system where trusted friends could monitor the biological levels (e.g., heart rate, blood sugar) of a person with a chronic illness and be alerted to intervene or provide more support in symptomatic times. The people given these privileges might be trusted members of care networks. However, we must also resist a naively optimistic view of care, and consider how to build systems in a way that could protect and grant agency to a chronically ill person in an abusive or unsafe care arrangement [71]. Further, though interdependence can take the form of a nondisabled person supporting a disabled person, our case studies and examples from communities (e.g., [11, 92, 117]) demonstrate support maintained fully within chronically ill spheres. Thus, any systems designed to support chronically ill people must avoid assuming a distinction between support giver and support recipient: chronically ill people are often already both.

5.3 Doing Research with Chronically Ill People: Effects on Methodology

As we propose an approach to HCI accessibility research with chronically ill people, we also reflect on how research methods may need to change to be accessible to this population. Prior work describes how to plan accessible studies for people with disabilities, including accommodations for varying fatigue or incorporating notions of crip time [33, 79, 87]. Mack et al. describe ways to allow for more flexibility, like allowing interviews to take place over multiple sessions, building in breaks, and adjusting the space to be comfortable for participants’ bodies [87]. Centering a chronically ill perspective, we add allowing access to food and drink, prioritizing remote facilitation options, explicitly providing the option to participate from nontypical locations (e.g., the floor), and considering potential sensory sensitivity triggers (e.g., motion, light, loud noises).

There are other methods which may be challenging to run with strong internal validity while prioritizing participant beneficence. Consider, for example, within-subjects controlled experiments, which rely on the assumption that an individual’s capacities are an experimental constant. How might a testing instrument account for the reality that someone may begin a study reporting a 2/10 on a pain scale but end it at an 8/10 (perhaps directly due to their participation in the study)? Further, what is the procedure if a participant with fluctuating symptoms shows up to a study without the access need for which they were recruited (e.g., someone with fluctuating brain fog has no brain fog on the study date)? This perceived “threat” to internal validity may be appeased if symptoms could be triggered consistently, though we argue that this is unreasonable to ask of participants (e.g., triggering a migraine can have hour- or day-long impacts). One solution may be to perform data collection in situ when the necessary conditions occur naturally rather than engineering a symptom increase; while this may lessen internal validity, it increases ecological validity and prioritizes participant beneficence. In general, we suggest strategies for planning studies that prioritize the access needs and comfort of participants, even if it means being more creative in the study design. Because chronically ill people’s access needs often manifest differently than HCI anticipates, researchers must pay careful attention to methodological in/accessibility when working with people with chronic illnesses.

6 LIMITATIONS AND ETHICS

Our work has limitations and necessary ethical considerations. First, autoethnographic methods are not designed for broad generalizability, and the examples we provide in this paper come from the experiences of two people with similar demographic backgrounds. We do not intend our work to serve as a survey of chronic illness experiences, but future research could explore how our tenets operate when applied to a wider range of experiences. Additionally, as we outline ways of engaging with a large, broadly defined community, our scope is wide. There are open questions around how chronic illness and other forms of bodymind difference (e.g., mental health disabilities) overlap and diverge, and we encourage future work to explore this nuance. Additionally, we are not able to speak to the wide range of ways that people identify with chronic illness and/or disability, a promising area for future work. Finally, while we believe that HCI accessibility work that includes chronically ill people could serve under-considered populations, we are also cognizant of the harm that technical intervention can cause. We encourage designers and researchers to adopt a critical eye around whether their work is needed and useful, or another disability dongle [67, 68].

7 CONCLUSION

In this work, we present three core tenets for HCI community members to consider when designing technology for people with chronic illnesses. First, we must look beyond patienthood to see chronically ill people as having access needs and expertise. Second, we highlight that variable ability requires us to consider accessibility in terms of the consequences actions cause. Finally, we provide a theoretical approach to chronic illness that highlights both bodily and socioenvironmental factors. We demonstrate the utility of these tenets through the analysis of three autoethnographic reflections on the technology use of two chronically ill authors, noting implications for technology design. We then discuss the implications of consequence-based accessibility and what researchers should consider when designing technology for and conducting research with chronically ill participants. We hope that this work spurs more work in the HCI community that focuses on the access needs of this growing population.

ACKNOWLEDGMENTS

This work was supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-2140004, by NSF 2009977 and 1836813; and by the University of Washington Center for Research and Education on Accessible Technology and Experiences. We would also like to thank Jennifer Mankoff, Cynthia Bennett, Megan Hofmann, Abigale Stangl, and Taylor Schenone for their support in performing this work.

ASSETS ’22, October 23–26, 2022, Athens, Greece Mack and McDonnell et al.

REFERENCES
[1] Brooke E. Auxier, Cody L. Buntain, Paul Jaeger, Jennifer Golbeck, and Hernisa Kacorri. 2019. #HandsOffMyADA: A Twitter Response to the ADA Education and Reform Act. In Proceedings of the 2019 CHI Conference on Human Factors
in Computing Systems (CHI ’19). Association for Computing Machinery, 1–12. https://doi.org/10.1145/3290605.3300757
[2] Shayla Ayn. 2018. The ‘Disaster’ That Is Showering With a Chronic Illness. https://themighty.com/2018/08/fibromyalgia-postural-orthostatic-tachycardia-syndrome-shower/
[3] Ron Baecker, Kate Sellen, Sarah Crosskey, Veronique Boscart, and Barbara Barbosa Neves. 2014. Technology to reduce social isolation and loneliness. In Proceedings of the 16th international ACM SIGACCESS conference on Computers & accessibility (ASSETS ’14). Association for Computing Machinery, 27–34. https://doi.org/10.1145/2661334.2661375
[4] Maryam Bandukda, Aneesha Singh, Catherine Holloway, Nadia Berthouze, Emeline Brulé, Ana Tajadura-Jiménez, Oussama Metatla, Ana Javornik, and Anja Thieme. 2021. Rethinking the Senses: A Workshop on Multisensory Embodied Experiences and Disability Interactions. Association for Computing Machinery, 1–5. http://doi.org/10.1145/3411763.3441356
[5] Andrea Barbarin, Tiffany C. Veinot, and Predrag Klasnja. 2015. Taking our Time: Chronic Illness and Time-Based Objects in Families. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW ’15). Association for Computing Machinery, 288–301. https://doi.org/10.1145/2675133.2675200
[6] Imani Barbarin. 2022. The Pandemic Tried to Break Me, but I Know My Black Disabled Life Is Worthy. https://www.cosmopolitan.com/entertainment/a39355245/imani-barbarin-black-disabled-activist-self-love/
[7] Ereny Bassilious, Aaron DeChamplain, Ian McCabe, Matt Stephan, Bill Kapralos, Farid H. Mahmud, and Adam Dubrowski. 2012. Power defense: a serious game for improving diabetes numeracy. In CHI ’12 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’12). Association for Computing Machinery, 1327–1332. https://doi.org/10.1145/2212776.2212449
[8] BBC. 2021. Chronic fatigue syndrome advice scraps exercise therapy. https://www.bbc.com/news/health-59080007
[9] Brianne Benness. 2019. My Disability Is Dynamic. https://medium.com/age-of-awareness/my-disability-is-dynamic-bc2a619fcc1
[10] Brianne Benness. 2020. Disease Begins Before Diagnosis. https://www.ted.com/talks/brianne_benness_disease_begins_before_diagnosis
[11] Brianne Benness. 2020. What does #NEISVoid mean? https://noendinsight.co/neisvoid-explained/
[12] Cynthia L. Bennett, Erin Brady, and Stacy M. Branham. 2018. Interdependence as a Frame for Assistive Technology Research and Design. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 161–173. https://doi.org/10.1145/3234695.3236348
[13] Toni Bernhard. 2019. 7 Ways to Survive a Flare When You’re Chronically Ill. https://www.psychologytoday.com/us/blog/turning-straw-gold/201901/7-ways-survive-flare-when-you-re-chronically-ill
[14] Andrew B. L. Berry, Catherine Lim, Andrea L. Hartzler, Tad Hirsch, Edward H. Wagner, Evette Ludman, and James D. Ralston. 2017. How Values Shape Collaboration Between Patients with Multiple Chronic Conditions and Spousal Caregivers. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). Association for Computing Machinery, 5257–5270. https://doi.org/10.1145/3025453.3025923
[15] Andrew B. L. Berry, Catherine Y. Lim, Tad Hirsch, Andrea L. Hartzler, Linda M. Kiel, Zoë A. Bermet, and James D. Ralston. 2019. Supporting Communication About Values Between People with Multiple Chronic Conditions and their Providers. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, 1–14. https://doi.org/10.1145/3290605.3300700
[16] Jean-Marie Berthelot, Michel De Bandt, Jacques Morel, Fatima Benatig, Arnaud Constantin, Philippe Gaudin, Xavier Le Loet, Jean-Francis Maillefert, Olivier Meyer, Thao Pham, et al. 2012. A tool to identify recent or present rheumatoid arthritis flare from both patient and physician perspectives: the ‘FLARE’ instrument. Annals of the rheumatic diseases 71, 7 (2012), 1110–1116.
[17] Amritpal Singh Bhachu, Nicolas Hine, and John Arnott. 2008. Technology devices for older adults to aid self management of chronic health conditions. In Proceedings of the 10th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’08). Association for Computing Machinery, 59–66. https://doi.org/10.1145/1414471.1414484
[18] Peter Boersma, Lindsey I. Black, and Brian Ward. 2020. Prevalence of Multiple Chronic Conditions Among US Adults, 2018. https://www.cdc.gov/pcd/issues/2020/20_0130.htm
[19] Disha Bora, Hanlin Li, Sagar Salvi, and Erin Brady. 2017. ActVirtual: Making Public Activism Accessible. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’17). Association for Computing Machinery, 307–308. https://doi.org/10.1145/3132525.3134815
[20] Eleanor R. Burgess, Madhu C. Reddy, Andrew Davenport, Paul Laboi, and Ann Blandford. 2019. “Tricky to get your head around”: Information Work of People Managing Chronic Kidney Disease in the UK. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, 1–17. https://doi.org/10.1145/3290605.3300895
[21] Gabrielle S. Cantor. 2018. Designing Technological Interventions for Patients with Discordant Chronic Comorbidities and Type-2 Diabetes. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems (CHI EA ’18). Association for Computing Machinery, 1–6. https://doi.org/10.1145/3170427.3180304
[22] CDC. 2021. Treating the Most Disruptive Symptoms First and Preventing Worsening of Symptoms. https://www.cdc.gov/me-cfs/healthcare-providers/clinical-care-patients-mecfs/treating-most-disruptive-symptoms.html#:~:text=Post%2Dexertional%20malaise%20(PEM)%20is%20the%20worsening%20of%20symptoms,by%20activity%20management%20(pacing)
[23] Heewon Chang, Faith Ngunjiri, and Kathy-Ann C Hernandez. 2016. Collaborative autoethnography. Routledge.
[24] Beenish M. Chaudhry, Christopher Schaefbauer, Ben Jelen, Katie A. Siek, and Kay Connelly. 2016. Evaluation of a Food Portion Size Estimation Interface for a Varying Literacy Population. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). Association for Computing Machinery, 5645–5657. https://doi.org/10.1145/2858036.2858554
[25] Chen Chen, Janet G Johnson, Kemeberly Charles, Alice Lee, Ella T Lifset, Michael Hogarth, Alison A Moore, Emilia Farcas, and Nadir Weibel. 2021. Understanding Barriers and Design Opportunities to Improve Healthcare and QOL for Older Adults through Voice Assistants. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21). Association for Computing Machinery, 1–16. https://doi.org/10.1145/3441852.3471218
[26] Claudia Chen, Robert Wu, Hashim Khan, Khai Truong, and Fanny Chevalier. 2021. VIDDE: Visualizations for Helping People with COPD Interpret Dyspnea During Exercise. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21). Association for Computing Machinery, 1–14. https://doi.org/10.1145/3441852.3471204
[27] Youngjun Cho, Simon J Julier, and Nadia Bianchi-Berthouze. 2019. Instant stress: detection of perceived mental stress through smartphone photoplethysmography and thermal imaging. JMIR mental health 6, 4 (2019), e10140.
[28] Eli Clare. 2001. Stolen bodies, reclaimed bodies: Disability and queerness. Public Culture 13, 3 (2001), 359–365.
[29] Cleveland Clinic. 2021. What Is the Spoon Theory Metaphor for Chronic Illness? https://health.clevelandclinic.org/spoon-theory-chronic-illness/
[30] David Cobley. 2018. Disability and international development: A guide for students and practitioners. Routledge.
[31] Boston Women’s Health Book Collective. 1973. Our Bodies, Ourselves. Simon and Schuster. Google-Books-ID: cWxqAAAAMAAJ.
[32] Tressie McMillan Cottom. 2019. I Was Pregnant and in Crisis. All the Doctors and Nurses Saw Was an Incompetent Black Woman. https://time.com/5494404/tressie-mcmillan-cottom-thick-pregnancy-competent/
[33] Kara Pernice Coyne and Jakob Nielsen. 2001. How to conduct usability evaluations for accessibility: Methodology guidelines for testing websites and intranets with users who use assistive technology. Nielsen Norman Group.
[34] Liz Crow. 1996. Including all of our lives: Renewing the social model of disability. Exploring the divide 55 (1996), 58.
[35] Maitraye Das, John Tang, Kathryn E. Ringland, and Anne Marie Piper. 2021. Towards Accessible Remote Work: Understanding Work-from-Home Practices of Neurodivergent Professionals. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 183 (Apr 2021), 30 pages. https://doi.org/10.1145/3449282
[36] Jay Dolmage. 2015. Universal design: Places to start. Disability Studies Quarterly 35, 2 (2015).
[37] Melanie Duckert and Louise Barkhuus. 2022. Protecting Personal Health Data through Privacy Awareness: A study of perceived data privacy among people with chronic or long-term illness. Proceedings of the ACM on Human-Computer Interaction 6, GROUP (Jan 2022), 11:1–11:22. https://doi.org/10.1145/3492830
[38] Adriana E. and Brianne Benness. 2020. No End In Sight: 69 – Adriana. https://noendinsight.co/2020/11/29/episode-69-adriana/
[39] Laurie Edwards. 2014. In the Kingdom of the Sick: A Social History of Chronic Illness in America. Bloomsbury Publishing USA. Google-Books-ID: bemnBQAAQBAJ.
[40] Chronic Eileen. 2020. Rheumatoid Arthritis and My Period. https://chroniceileen.com/2020/06/29/rheumatoid-arthritis-and-my-period/
[41] Jordan Eschler and Wanda Pratt. 2017. “I’m so glad I met you”: Designing Dynamic Collaborative Support for Young Adult Cancer Survivors. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1763–1774.
[42] Heather Evans. 2017. Uncovering: Making disability identity legible. Disability Studies Quarterly 37, 1 (2017).
[43] Shelly Farnham, Lili Cheng, Linda Stone, Melora Zaner-Godsey, Christopher Hibbeln, Karen Syrjala, Ann Marie Clark, and Janet Abrams. 2002. HutchWorld: clinical study of computer-mediated social support for cancer patients and their caregivers. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’02). Association for Computing Machinery, 375–382. https://doi.org/10.1145/503376.503444
[44] Colleen Flaherty. 2022. ‘Retract or Attack?’ Two white Africanists publish an article on centering the scholar’s personal experience to help “decolonize” African
studies. A call for retraction follows. https://www.insidehighered.com/news/2022/05/24/black-scholars-demand-retraction-autoethnography-article
[45] Arthur W Frank. 1993. The rhetoric of self-change: Illness experience as narrative. Sociological quarterly 34, 1 (1993), 39–52.
[46] Rachel L Franz, Jacob O Wobbrock, Yi Cheng, and Leah Findlater. 2019. Perception and Adoption of Mobile Accessibility Features by Older Adults Experiencing Ability Changes. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility. 267–278.
[47] Kausalya Ganesh and Amanda Lazar. 2021. The Work of Workplace Disclosure: Invisible Chronic Conditions and Opportunities for Design. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (Apr 2021), 73:1–73:26. https://doi.org/10.1145/3449147
[48] Kara Gavin. 2017. Study: Health Plan Deductibles Hit Patients with Chronic Illness Harder. https://labblog.uofmhealth.org/industry-dx/study-health-plan-deductibles-hit-patients-chronic-illness-harder
[49] Sara Goering. 2015. Rethinking disability: the social model of disability and chronic disease. Current reviews in musculoskeletal medicine 8, 2 (2015), 134–138.
[50] Chronically Grey. 2017. Showering and Chronic Illness. https://chronicallygrey.wordpress.com/2017/11/23/showering-and-chronic-illness/
[51] Alex Haagaard. 2019. 2 years of biohacking. https://alexhaagaard.medium.com/2-years-of-biohacking-8e99b32bb350
[52] Alex Haagaard. 2022. Complicating Disability: On the Invisibilization of Chronic Illness throughout History. https://blog.castac.org/2022/02/complicating-disability-on-the-invisibilization-of-chronic-illness-throughout-history/
[53] Cother Hajat and Emma Stein. 2018. The global burden of multiple chronic conditions: a narrative review. Preventive medicine reports 12 (2018), 284–293.
[54] Shefali Haldar, Sonali R. Mishra, Maher Khelifi, Ari H. Pollack, and Wanda Pratt. 2017. Opportunities and Design Considerations for Peer Support in a Hospital Setting. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). Association for Computing Machinery, 867–879. https://doi.org/10.1145/3025453.3026040
[55] Aimi Hamraie and Kelly Fritsch. 2019. Crip technoscience manifesto. Catalyst: Feminism, Theory, Technoscience 5, 1 (2019), 1–33.
[56] Tara O’Neill Hayes and Serena Gillian. 2020. Chronic Disease in the United States: A Worsening Health and Economic Crisis. https://www.americanactionforum.org/research/chronic-disease-in-the-united-states-a-worsening-health-and-economic-crisis/
[57] Megan Hofmann, Devva Kasnitz, Jennifer Mankoff, and Cynthia L Bennett. 2020. Living Disability Theory: Reflections on Access, Research, and Design. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’20). Association for Computing Machinery, 1–13. https://doi.org/10.1145/3373625.3416996
[58] Megan Hofmann, Kristin Williams, Toni Kaplan, Stephanie Valencia, Gabriella Hann, Scott E. Hudson, Jennifer Mankoff, and Patrick Carrington. 2019. "Occupational Therapy is Making": Clinical Rapid Prototyping and Digital Fabrication. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300544
[59] Matthew K. Hong, Udaya Lakshmi, Kimberly Do, Sampath Prahalad, Thomas Olson, Rosa I. Arriaga, and Lauren Wilcox. 2020. Using Diaries to Probe the Illness Experiences of Adolescent Patients and Parental Caregivers. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). Association for Computing Machinery, 1–16. https://doi.org/10.1145/3313831.3376426
[60] Matthew K. Hong, Udaya Lakshmi, Thomas A. Olson, and Lauren Wilcox. 2018. Visual ODLs: Co-Designing Patient-Generated Observations of Daily Living to Support Data-Driven Conversations in Pediatric Care. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3174050
[61] Matthew K. Hong, Lauren Wilcox, Daniel Machado, Thomas A. Olson, and Stephen F. Simoneaux. 2016. Care Partnerships: Toward Technology to Support Teens’ Participation in Their Health Care. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). Association for Computing Machinery, 5337–5349. https://doi.org/10.1145/2858036.2858508
[62] Juan Pablo Hourcade, Martha Driessnack, and Kelsey E. Huebner. 2012. Supporting face-to-face communication between clinicians and children with chronic headaches through a zoomable multi-touch app. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’12). Association for Computing Machinery, 2609–2618. https://doi.org/10.1145/2207676.2208651
[63] Jina Huh and Mark S. Ackerman. 2012. Collaborative help in chronic disease management: supporting individualized problems. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work (CSCW ’12). Association for Computing Machinery, 853–862. https://doi.org/10.1145/2145204.2145331
[64] Amy Hurst, Scott E Hudson, Jennifer Mankoff, and Shari Trewin. 2013. Distinguishing users by pointing performance in laboratory and real-world tasks. ACM Transactions on Accessible Computing (TACCESS) 5, 2 (2013), 1–27.
[65] Sins Invalid. 2019. Skin Tooth and Bone: The Basis of Movement is Our People, a Disability Justice Primer (2nd ed.). Sins Invalid.
[66] Nwakego Isika, Antonette Mendoza, and Rachelle Bosua. 2020. “I need to compartmentalize myself”: Appropriation of Instagram for chronic illness management. In Proceedings of the Australasian Computer Science Week Multiconference. ACM, 1–9. https://doi.org/10.1145/3373017.3373040
[67] Liz Jackson. 2019. Disability Dongle. https://twitter.com/elizejackson/status/1110629818234818570?lang=en
[68] Liz Jackson, Alex Haagaard, and Rua Williams. 2022. Disability Dongle. https://blog.castac.org/2022/04/disability-dongle/
[69] Sylvia Janicki, Matt Ziegler, and Jennifer Mankoff. 2021. Navigating Illness, Finding Place: Enhancing the Experience of Place for People Living with Chronic Illness. In ACM SIGCAS Conference on Computing and Sustainable Societies. 173–187.
[70] Heidi L Janz. 2019. Ableism: the undiagnosed malady afflicting medicine. CMAJ 191, 17 (2019), E478–E479.
[71] Criminal Justice. N.d. Caregiver Violence against People with Disabilities. http://criminal-justice.iresearchnet.com/crime/domestic-violence/caregiver-violence-against-people-with-disabilities/
[72] Alison Kafer. 2013. Feminist, queer, crip. Indiana University Press.
[73] Yamini Karanam, Andrew Miller, and Erin Brady. 2017. Needs and Challenges of Post-Acute Brain Injury Patients in Understanding Personal Recovery. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’17). Association for Computing Machinery, 381–382. https://doi.org/10.1145/3132525.3134794
[74] Ravi Karkar, Jessica Schroeder, Daniel A Epstein, Laura R Pina, Jeffrey Scofield, James Fogarty, Julie A Kientz, Sean A Munson, Roger Vilardaga, and Jasmine Zia. 2017. TummyTrials: a feasibility study of using self-experimentation to detect individualized food triggers. In Proceedings of the 2017 CHI conference on human factors in computing systems. 6850–6863.
[75] Elizabeth Kaziunas, Mark S. Ackerman, Silvia Lindtner, and Joyce M. Lee. 2017. Caring through Data: Attending to the Social and Emotional Experiences of Health Datafication. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW ’17). Association for Computing Machinery, 2260–2272. https://doi.org/10.1145/2998181.2998303
[76] Christina Kelley, Bongshin Lee, and Lauren Wilcox. 2017. Self-tracking for Mental Wellness: Understanding Expert Perspectives and Student Experiences. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). Association for Computing Machinery, 629–641. https://doi.org/10.1145/3025453.3025750
[77] Eva Feder Kittay. 2011. The Ethics of Care, Dependence, and Disability. Ratio Juris 24, 1 (2011), 49–58. https://doi.org/10.1111/j.1467-9337.2010.00473.x
[78] Fortesa Latifi. 2021. I’ve spent a lifetime trying to get doctors to believe my pain. It’s all too common for women. https://www.thelily.com/ive-spent-a-lifetime-trying-to-get-doctors-to-believe-my-pain-its-all-too-common-for-women/
[79] Jonathan Lazar, Jinjuan Heidi Feng, and Harry Hochheiser. 2017. Working with research participants with disabilities. In Research Methods in Human-Computer Interaction. Morgan Kaufmann, 493–522.
[80] Hanlin Li, Disha Bora, Sagar Salvi, and Erin Brady. 2018. Slacktivists or Activists? Identity Work in the Virtual Disability March. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3173799
[81] Catherine Y. Lim, Andrew B.L. Berry, Andrea L. Hartzler, Tad Hirsch, David S. Carrell, Zoë A. Bermet, and James D. Ralston. 2019. Facilitating Self-reflection about Values and Self-care Among Individuals with Chronic Conditions. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, 1–12. https://doi.org/10.1145/3290605.3300885
[82] Leslie S. Liu, Jina Huh, Tina Neogi, Kori Inkpen, and Wanda Pratt. 2013. Health vlogger-viewer interaction in chronic illness management. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’13). Association for Computing Machinery, 49–58. https://doi.org/10.1145/2470654.2470663
[83] Audre Lorde. 1980. The Cancer Journals. Spinsters, Ink. Google-Books-ID: HSweAQAAIAAJ.
[84] Yuhan Luo, Peiyi Liu, and Eun Kyoung Choe. 2019. Co-Designing Food Trackers with Dietitians: Identifying Design Opportunities for Food Tracker Customization. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, 1–13. https://doi.org/10.1145/3290605.3300822
[85] Kelly Mack, Maitraye Das, Dhruv Jain, Danielle Bragg, John Tang, Andrew Begel, Erin Beneteau, Josh Urban Davis, Abraham Glasser, Joon Sung Park, and Venkatesh Potluri. 2021. Mixed Abilities and Varied Experiences: a group autoethnography of a virtual summer internship. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21). Association for Computing Machinery, 1–13. https://doi.org/10.1145/3441852.3471199
[86] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by “Accessibility Research”? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing
Systems. 1–18.
[87] Kelly Mack, Emma McDonnell, Venkatesh Potluri, Maggie Xu, Jailyn Zabala, Jeffrey P. Bigham, Jennifer Mankoff, and Cynthia L Bennett. 2022. Anticipate and Adjust: Cultivating Access in Human-Centered Methods. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–18.
[88] Angus Mackay. 2019. A neuro-inflammatory model can explain the onset, symptoms and flare-ups of myalgic encephalomyelitis/chronic fatigue syndrome. Journal of Primary Health Care 11, 4 (2019), 300–307.
[89] Haley MacLeod, Grace Bastin, Leslie S. Liu, Katie Siek, and Kay Connelly. 2017. “Be Grateful You Don’t Have a Real Disease”: Understanding Rare Disease Relationships. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). Association for Computing Machinery, 1660–1673. https://doi.org/10.1145/3025453.3025796
[90] Haley MacLeod, Kim Oakes, Danika Geisler, Kay Connelly, and Katie Siek. 2015. Rare World: Towards Technology for Rare Diseases. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15). Association for Computing Machinery, 1145–1154. https://doi.org/10.1145/

eCAALYX project. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’12). Association for Computing Machinery, 41–48. https://doi.org/10.1145/2384916.2384924
[110] ACT UP Advisory Committee of the People with AIDS. 1983. The Denver Principles. https://actupny.org/documents/Denver.html
[111] Lily O’Hara and Jane Taylor. 2018. What’s wrong with the ‘war on obesity’? A narrative review of the weight-centered health paradigm and development of the 3C framework to build critical competency for a paradigm shift. Sage Open 8, 2 (2018), 2158244018772888.
[112] Aisling Ann O’Kane and Helena Mentis. 2012. Sharing medical data vs. health knowledge in chronic illness care. In CHI ’12 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’12). Association for Computing Machinery, 2417–2422. https://doi.org/10.1145/2212776.2223812
[113] Mike Oliver. 2013. The social model of disability: Thirty years on. Disability & society 28, 7 (2013), 1024–1026.
[114] Carolyn E. Pang, Carman Neustaedter, Bernhard E. Riecke, Erick Oduor, and Serena Hillman. 2013. Technology preferences and routines for sharing health in-
2702123.2702494 formation during the treatment of a chronic illness. In Proceedings of the SIGCHI
[91] Jennifer Mankof, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability studies as Conference on Human Factors in Computing Systems (CHI ’13). Association for
a source of critical inquiry for the feld of assistive technology. In Proceedings of Computing Machinery, 1759–1768. https://doi.org/10.1145/2470654.2466232
the 12th international ACM SIGACCESS conference on Computers and accessibility [115] Alyson Patsavas. 2014. Recovering a cripistemology of pain: Leaky bodies,
(ASSETS ’10). Association for Computing Machinery, 3–10. https://doi.org/10. connective tissue, and feeling discourse. Journal of Literary & Cultural Disability
1145/1878803.1878807 Studies 8, 2 (2014), 203–218.
[92] Jennifer Mankof, Kateryna Kuksenok, Sara Kiesler, Jennifer A. Rode, and Kelly [116] Adrienne Pichon, Kayla Schifer, Emma Horan, Bria Massey, Suzanne Bakken,
Waldman. 2011. Competing online viewpoints and models of chronic illness. In Lena Mamykina, and Noemie Elhadad. 2021. Divided We Stand: The Collabora-
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems tive Work of Patients and Providers in an Enigmatic Chronic Disease. Proceedings
(CHI ’11). Association for Computing Machinery, 589–598. https://doi.org/10. of the ACM on Human-Computer Interaction 4, CSCW3 (Jan 2021), 261:1–261:24.
1145/1978942.1979027 https://doi.org/10.1145/3434170
[93] Caroline McDonagh. 2022. Have you ever had your access needs directly con- [117] Leah Lakshmi Piepzna-Samarasinha. 2018. Care work: Dreaming disability justice.
fict with someone else’s access needs? https://twitter.com/CazMcDo/status/ Arsenal Pulp Press Vancouver.
1538840632076148736 [118] Leah Lakshmi Piepzna-Samarasinha. 2021. How Disabled Mutual Aid Is Diferent
[94] Stephanie McManimen, Damani McClellan, Jamie Stoothof, Kristen Gleason, Than Abled Mutual Aid. https://disabilityvisibilityproject.com/2021/10/03/how-
and Leonard A. Jason. 2019. Dismissing chronic illness: A qualitative analysis disabled-mutual-aid-is-diferent-than-abled-mutual-aid/#site-content
of negative health care experiences. Health care for women international 40, 3 [119] Ruth Pinder. 1996. Sick-but-ft or ft-but-sick? Ambiguity and identity at the
(Mar 2019), 241–258. https://doi.org/10.1080/07399332.2018.1521811 workplace. Exploring the divide (1996), 135–156.
[95] James Milewski and Hector Parra. 2011. Gathering requirements for a personal [120] Ari H. Pollack, Uba Backonja, Andrew D. Miller, Sonali R. Mishra, Maher Khelif,
health management system. In CHI ’11 Extended Abstracts on Human Factors Logan Kendall, and Wanda Pratt. 2016. Closing the Gap: Supporting Patients’
in Computing Systems (CHI EA ’11). Association for Computing Machinery, Transition to Self-Management after Hospitalization. In Proceedings of the 2016
2377–2382. https://doi.org/10.1145/1979742.1979881 CHI Conference on Human Factors in Computing Systems (CHI ’16). Association
[96] Claudia S Miller, Raymond F Palmer, Tania T Dempsey, Nicholas A Ashford, for Computing Machinery, 5324–5336. https://doi.org/10.1145/2858036.2858240
and Lawrence B Afrin. 2021. Mast cell activation may explain many cases of [121] John R. Porter, Kiley Sobel, Sarah E. Fox, Cynthia L. Bennett, and Julie A. Kientz.
chemical intolerance. Environmental Sciences Europe 33, 1 (2021), 1–15. 2017. Filtered Out: Disability Disclosure Practices in Online Dating Communities.
[97] Julia Milligan. 2022. 5 Things That Can Trigger a Multiple Sclerosis Re- Proceedings of the ACM on Human-Computer Interaction 1, CSCW (Dec 2017),
lapse. https://themighty.com/2022/01/multiple-sclerosis-relapse-triggers/ 87:1–87:13. https://doi.org/10.1145/3134722
?msclkid=a921cfa9b69111ecb40da23e69d8f9ea [122] Margaret Price. 2015. The bodymind problem and the possibilities of pain.
[98] Mia Mingus. 2011. Access Intimacy: The Missing Link. https://leavingevidence. Hypatia 30, 1 (2015), 268–284.
wordpress.com/2011/05/05/access-intimacy-the-missing-link/ [123] Lauren Race, Amber James, Andrew Hayward, Kia El-Amin, Maya Gold Pat-
[99] Mia Mingus. 2011. Changing the Framework: Disability Justice. terson, and Theresa Mershon. 2021. Designing Sensory and Social Tools for
https://leavingevidence.wordpress.com/2011/02/12/changing-the-framework- Neurodivergent Individuals in Social Media Environments. In The 23rd Interna-
disability-justice/ tional ACM SIGACCESS Conference on Computers and Accessibility. 1–5.
[100] Christine Miserandino. 2003. The Spoon Theory written by Christine [124] Joel Michael Reynolds. 2017. “I’d rather be dead than disabled”—the ableist
Miserandino. https://butyoudontlooksick.com/articles/written-by-christine/ confation and the meanings of disability. Review of Communication 17, 3 (2017),
the-spoon-theory/ 149–163.
[101] Sonali R. Mishra, Predrag Klasnja, John MacDufe Woodburn, Eric B. Hekler, [125] Rose Richards. 2008. Writing the othered self: Autoethnography and the problem
Larsson Omberg, Michael Kellen, and Lara Mangravite. 2019. Supporting Coping of objectifcation in writing about illness and disability. Qualitative health
with Parkinson’s Disease Through Self Tracking. In Proceedings of the 2019 CHI research 18, 12 (2008), 1717–1728.
Conference on Human Factors in Computing Systems (CHI ’19). Association for [126] Kathryn E Ringland, Jennifer Nicholas, Rachel Kornfeld, Emily G Lattie, David C
Computing Machinery, 1–16. https://doi.org/10.1145/3290605.3300337 Mohr, and Madhu Reddy. 2019. Understanding mental ill-health as psychosocial
[102] Liz Moore. 2019. How Can I Convince Doctors I’m an Informed Patient? https: disability: Implications for assistive technology. In The 21st International ACM
//www.healthline.com/health/doctors-listen-to-patients SIGACCESS Conference on Computers and Accessibility. 156–170.
[103] Ann Neville-Jan. 2003. Encounters in a world of pain: An autoethnography. The [127] Yasmin Salamah, Rahma Dany Asyifa, and Auzi Asfarian. 2021. Improving The
American journal of occupational therapy 57, 1 (2003), 88–98. Usability of Personal Health Record in Mobile Health Application for People
[104] Kim E Nielsen. 2012. A disability history of the United States. Vol. 2. Beacon with Autoimmune Disease. In Asian CHI Symposium 2021 (Asian CHI Symposium
Press. 2021). Association for Computing Machinery, 180–188. https://doi.org/10.1145/
[105] Joe Norris, Richard D Sawyer, and Darren Lund. 2012. Duoethnography: Dialogic 3429360.3468207
methods for social, health, and educational research. Vol. 7. Left Coast Press. [128] Ellen Samuels. 2017. Six Ways of Looking at Crip Time. Disability studies
[106] Valerie Novak and Brianne Benness. 2020. No End In Sight: 67 – Valerie. https: quarterly 37, 3 (2017).
//noendinsight.co/2020/05/16/episode-67-valerie/ [129] Shruti Sannon, Elizabeth L. Murnane, Natalya N. Bazarova, and Geri Gay. 2019.
[107] Alexandra CH Nowakowski. 2016. You Poor Thing: A Retrospective Autoethnog- “I was really, really nervous posting it”: Communicating about Invisible Chronic
raphy of Visible Chronic Illness as a Symbolic Vanishing Act. Qualitative Report Illnesses across Social Media Platforms. In Proceedings of the 2019 CHI Conference
21, 10 (2016). on Human Factors in Computing Systems (CHI ’19). Association for Computing
[108] Alexandra CH Nowakowski and JE Sumerau. 2019. Reframing health and Machinery, 1–13. https://doi.org/10.1145/3290605.3300583
illness: a collaborative autoethnography on the experience of health and illness [130] Jessica Schroeder, Chia-Fang Chung, Daniel A Epstein, Ravi Karkar, Adele
transformations in the life course. Sociology of Health & Illness 41, 4 (2019), Parsons, Natalia Murinova, James Fogarty, and Sean A Munson. 2018. Examining
723–739. self-tracking by people with migraine: goals, needs, and opportunities in a
[109] Francisco Nunes, Maureen Kerwin, and Paula Alexandra Silva. 2012. Design chronic health condition. In Proceedings of the 2018 designing interactive systems
recommendations for tv user interfaces for older adults: fndings from the conference. 135–148.
Chronically Under-Addressed: Considerations for HCI Accessibility Practice with Chronically Ill People ASSETS ’22, October 23–26, 2022, Athens, Greece

[131] Jessica Schroeder, Jane Hofswell, Chia-Fang Chung, James Fogarty, Sean Mun- [148] Susan Wendell. 2001. Unhealthy disabled: Treating chronic illnesses as disabili-
son, and Jasmine Zia. 2017. Supporting Patient-Provider Collaboration to Iden- ties. Hypatia 16, 4 (2001), 17–33.
tify Individual Triggers using Food and Symptom Journals. In Proceedings of [149] Susan Wendell. 2013. The rejected body: Feminist philosophical refections on
the 2017 ACM Conference on Computer Supported Cooperative Work and So- disability. Routledge.
cial Computing (CSCW ’17). Association for Computing Machinery, 1726–1739. [150] Samantha A. Whitman, Kathleen H. Pine, Bjorg Thorsteinsdottir, Paige Organick-
https://doi.org/10.1145/2998181.2998276 Lee, Anjali Thota, Nataly R. Espinoza Suarez, Erik W. Johnston, and Kasey R.
[132] Tom Shakespeare. 2013. Disability rights and wrongs revisited. Routledge. Boehmer. 2021. Bodily Experiences of Illness and Treatment as Information
[133] Tom Shakespeare et al. 2006. The social model of disability. The disability studies Work: The Case of Chronic Kidney Disease. Proceedings of the ACM on Human-
reader 2 (2006), 197–204. Computer Interaction 5, CSCW2 (Oct 2021), 383:1–383:28. https://doi.org/10.
[134] Tobin Siebers. 2008. Disability theory. University of Michigan Press. 1145/3479527
[135] Tobin Siebers. 2019. Returning the social to the social model. The matter of [151] Rua M. Williams, Kathryn Ringland, Amelia Gibson, Mahender Mandala, Arne
disability: Materiality, biopolitics, crip afect (2019), 39–47. Maibaum, and Tiago Guerreiro. 2021. Articulations toward a Crip HCI. Interac-
[136] Katie A. Siek, Kay H. Connelly, and Yvonne Rogers. 2006. Pride and prejudice: tions 28, 3 (apr 2021), 28–37. https://doi.org/10.1145/3458453
learning how chronically ill people think about food. In Proceedings of the SIGCHI [152] Maria Wolters. 2019. Accessibility and Stigma: Designing for Users with Invis-
Conference on Human Factors in Computing Systems (CHI ’06). Association for ible Disabilities. https://aaate2019.eu/ 15th International Conference of the
Computing Machinery, 947–950. https://doi.org/10.1145/1124772.1124912 Association for the Advancement of Assistive Technology in Europe, AAATE
[137] Alicia Smith-Tran and Tifany Tien Hang. 2021. Professor–Student Interaction 2019 ; Conference date: 28-08-2019 Through 30-08-2019.
in the Midst of Illness: A Collaborative Autoethnography. Humanity & Society [153] Alice Wong. 2020. I’m disabled and need a ventilator to live. Am I expendable
(2021), 0160597621991547. during this pandemic? https://www.vox.com/frst-person/2020/4/4/21204261/
[138] National MS Society. 2022. Managing Relapses. https://www. coronavirus-covid-19-disabled-people-disabilities-triage
nationalmssociety.org/Treating-MS/Managing-Relapses?msclkid= [154] Paige Wyant. 2018. 14 ’Triggers’ That Can Cause a Fibromyalgia
d4196474b69111ec945819a6ee5e9ca1 Flare. https://themighty.com/2018/06/fbromyalgia-triggers-fare-causes/
[139] Ben Spatz. 2015. What a body can do. Routledge. ?msclkid=d419aaa5b69111ec98057bf006611ed3
[140] Katta Spiel, Kathrin Gerling, Cynthia L. Bennett, Emeline Brulé, Rua M. Williams, [155] Paige Wyant. 2019. If Your Illness Makes Showering a Struggle, These 16 Memes Are
Jennifer Rode, and Jennifer Mankof. 2020. Nothing About Us Without Us: for You. https://themighty.com/2019/03/showering-chronic-illness-depression-
Investigating the Role of Critical Disability Studies in HCI. In Extended Abstracts memes-funny/
of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA ’20). [156] Anon Ymous, Katta Spiel, Os Keyes, Rua M. Williams, Judith Good, Eva Hor-
Association for Computing Machinery, 1–8. https://doi.org/10.1145/3334480. necker, and Cynthia L. Bennett. 2020. “I Am Just Terrifed of My Future” Epis-
3375150 temic Violence in Disability Related Technology Research. In Extended Abstracts
[141] Si Sun, Xiaomu Zhou, Joshua C. Denny, Trent S. Rosenbloom, and Hua Xu. 2013. of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu,
Messaging to your doctors: understanding patient-provider communications HI, USA) (CHI EA ’20). Association for Computing Machinery, New York, NY,
via a portal system. In Proceedings of the SIGCHI Conference on Human Factors in USA, 1–16. https://doi.org/10.1145/3334480.3381828
Computing Systems (CHI ’13). Association for Computing Machinery, 1739–1748. [157] Daihua X. Yu, Bambang Parmanto, Brad E. Dicianno, Valerie J. Watzlaf, and
https://doi.org/10.1145/2470654.2466230 Katherine D. Seelman. 2014. Accessible mHealth for patients with dexterity
[142] Maxime Taquet, Quentin Dercon, Sierra Luciano, John R Geddes, Masud Husain, impairments. In Proceedings of the 16th international ACM SIGACCESS conference
and Paul J Harrison. 2021. Incidence, co-occurrence, and evolution of long- on Computers & accessibility (ASSETS ’14). Association for Computing Machinery,
COVID features: A 6-month retrospective cohort study of 273,618 survivors of 235–236. https://doi.org/10.1145/2661334.2661402
COVID-19. PLoS medicine 18, 9 (2021), e1003773. [158] Tae-Jung Yun and Rosa I. Arriaga. 2013. A text message a day keeps the pul-
[143] Ingrid Torjesen. 2020. NICE backtracks on graded exercise therapy and CBT in monologist away. In Proceedings of the SIGCHI Conference on Human Factors in
draft revision to CFS guidance. Computing Systems (CHI ’13). Association for Computing Machinery, 1769–1778.
[144] Tatiana A. Vlahovic, Yi-Chia Wang, Robert E. Kraut, and John M. Levine. 2014. https://doi.org/10.1145/2470654.2466233
Support matching and satisfaction in an online breast cancer support community. [159] Xiaomu Zhou, Si Sun, and Jiang Yang. 2014. Sweet Home: understanding diabetes
In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems management via a chinese online community. In Proceedings of the SIGCHI
(CHI ’14). Association for Computing Machinery, 1625–1634. https://doi.org/ Conference on Human Factors in Computing Systems (CHI ’14). Association for
10.1145/2556288.2557108 Computing Machinery, 3997–4006. https://doi.org/10.1145/2556288.2557344
[145] Eleanor Rose Walker, Sebastian Charles Keith Shaw, and John Leeds Anderson. [160] Wei Zhu, Boyd Anderson, Shenggao Zhu, and Ye Wang. 2016. A Computer
2020. Dyspraxia in medical education: A collaborative autoethnography. The Vision-Based System for Stride Length Estimation using a Mobile Phone Camera.
Qualitative Report 25, 11 (2020), 4072–4093. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers
[146] Yaqing Wang, Quanming Yao, James T Kwok, and Lionel M Ni. 2020. Gener- and Accessibility (ASSETS ’16). Association for Computing Machinery, 121–130.
alizing from a few examples: A survey on few-shot learning. ACM computing https://doi.org/10.1145/2982142.2982156
surveys (csur) 53, 3 (2020), 1–34. [161] Irving Kenneth Zola. 1972. Medicine as an institution of social control. The
[147] Harriet A Washington. 2006. Medical apartheid: The dark history of medical sociological review 20, 4 (1972), 487–504.
experimentation on Black Americans from colonial times to the present. Doubleday [162] Irving Kenneth Zola. 1982. Missing pieces: A chronicle of living with a disability.
Books. Temple University Press.
Should I Say “Disabled People” or “People with Disabilities”?
Language Preferences of Disabled People Between Identity- and
Person-First Language
Ather Sharif
asharif@cs.washington.edu
Paul G. Allen School of Computer Science & Engineering | DUB Group, University of Washington
Seattle, Washington, USA

Aedan L. McCall
aedanmc@cs.washington.edu
Paul G. Allen School of Computer Science & Engineering, University of Washington
Seattle, Washington, USA

Kianna R. Bolante
kbolan@cs.washington.edu
Paul G. Allen School of Computer Science & Engineering, University of Washington
Seattle, Washington, USA
Figure 1: Language preferences of disabled people. (a) shows the overall preferences for United States, United Kingdom, and Canada combined (N=519); (b) shows the preferences of disabled people from the United States (N=366); (c) shows the preferences of disabled people from the United Kingdom (N=112); and (d) shows the preferences of disabled people from Canada (N=14). IFL stands for "Identity-First Language," PFL stands for "Person-First Language," and NP stands for "No Preference."
ABSTRACT

The usage of identity- (e.g., "disabled people") versus person-first language (e.g., "people with disabilities") to refer to disabled people has been an active and ongoing discussion. However, it remains unclear which semantic language should be used, especially for different disability categories within the overall demographics of disabled people. To gather and examine the language preferences of disabled people, we surveyed 519 disabled people from 23 countries. Our results show that 49% of disabled people preferred identity-first language whereas 33% preferred person-first language and 18% had no preference. Additionally, we explore the intra-sectionality and intersectionality of disability categories, gender identifications, age groups, and countries on language preferences, finding that language preferences vary within and across each of these factors. Our qualitative assessment of the survey responses shows that disabled people may have multiple or no preferences. To make our survey data publicly available, we created an interactive and accessible live web platform, enabling users to perform intersectional exploration of language preferences. In a secondary investigation, using part-of-speech (POS) tagging, we analyzed the abstracts of 11,536 publications at ACM ASSETS (N=1,564) and ACM CHI (N=9,972), assessing their adoption of identity- and person-first language. We present the results from our analysis and offer recommendations for authors and researchers in choosing the appropriate language to refer to disabled people.

CCS CONCEPTS

• Social and professional topics → People with disabilities; • General and reference → Surveys and overviews; Empirical studies; • Human-centered computing → Empirical studies in accessibility; Web-based interaction; • Computing methodologies → Information extraction.

KEYWORDS

identity-first, person-first, language, disability, preferences, survey

This work is licensed under a Creative Commons Attribution International 4.0 License.

ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544813
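The abstract above describes a secondary analysis that uses part-of-speech (POS) tagging to measure identity- versus person-first language in publication abstracts. As a rough illustration only, and not the authors' actual pipeline, a minimal pattern-based counter might look like the following Python sketch; the word lists, patterns, and the `count_language_styles` helper are all hypothetical assumptions:

```python
import re
from collections import Counter

# Hypothetical word lists -- the real analysis uses POS tags and a fuller
# vocabulary; these short lists exist only to make the sketch runnable.
DISABILITY_NOUNS = r"(?:disabilit(?:y|ies)|autism|blindness|deafness)"
IDENTITY_ADJS = r"(?:disabled|autistic|blind|deaf)"
PERSON_NOUNS = r"(?:person|people|individuals?|adults?|children|users?)"

# Identity-first: the disability term directly modifies the person noun,
# as in "disabled people" or "autistic adult."
IFL_PATTERN = re.compile(rf"\b{IDENTITY_ADJS}\s+{PERSON_NOUNS}\b", re.IGNORECASE)

# Person-first: the person noun comes first, linked by "with," as in
# "people with disabilities" or "person with a disability."
PFL_PATTERN = re.compile(
    rf"\b{PERSON_NOUNS}\s+with\s+(?:a\s+)?{DISABILITY_NOUNS}\b", re.IGNORECASE
)

def count_language_styles(abstract: str) -> Counter:
    """Count identity-first and person-first phrases in one abstract."""
    return Counter({
        "IFL": len(IFL_PATTERN.findall(abstract)),
        "PFL": len(PFL_PATTERN.findall(abstract)),
    })

counts = count_language_styles(
    "We interviewed disabled people and people with disabilities; "
    "one autistic adult also joined."
)
```

Aggregating such counts per venue and year would yield totals comparable in spirit to the percentages reported above, though a POS tagger generalizes far better than fixed word lists.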
ACM Reference Format:
Ather Sharif, Aedan L. McCall, and Kianna R. Bolante. 2022. Should I Say "Disabled People" or "People with Disabilities"? Language Preferences of Disabled People Between Identity- and Person-First Language. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3517428.3544813

1 INTRODUCTION

Words have power. They reflect attitudes that speakers want to exchange [23]. They also shed light on the sensitivity to matters involving social justice and cultural awareness, especially for underrepresented and marginalized groups [13], such as disabled people. Several terms, such as "retarded," are now considered outdated as they assert negative connotations on disabled people [26, 73] and some pejorative terms, such as "crippled" and "gimp," have been reclaimed by the disability community [1, 59]. Similarly, the debate between identity- (using identity first; e.g., "disabled people") and person-first language (using people first; e.g., "people with disabilities") has been an active and ongoing discussion [2, 17, 18, 27].

The American Psychological Association (APA), the American Medical Association, the American Psychiatric Association, the American Speech-Language-Hearing Association, and the Associated Press advocate for person-first language [2, 17, 27]. However, a recent inquiry from Vivanti [70] regarding language usage for autistic people shows that well-intentioned scholars are still unsure of the "right" language to use. Furthermore, Vivanti's inquiry is particularly relevant to our exploration as the autistic community has long advocated for identity-first language for themselves, contrary to the general recommendation from the APA, which explicitly asks the writers to "put the person first" [3, 27]. Hence, the preferences of the disability categories within the disability community should direct the language used to refer to that particular group [9, 26, 43, 51]. Additionally, as societies evolve, these preferences may become obsolete over time. (For example, the National Federation of the Blind (NFB) [44] has, in recent years, started advocating for "blind and low-vision" in place of "visually impaired.")

To understand the preferences of disabled people in the usage of language that refers to them, we designed and distributed a survey globally. Specifically, the survey recorded the respondents' preference between identity-first language, person-first language, and no preference. Additionally, we asked the respondents for the reasons behind their preferences. We recorded the timestamps of the survey responses to track the language preferences over time. As of the date of writing this paper, our survey had responses from 519 disabled people, representing nine disability categories, six age groups, and 23 countries. Our findings show that, overall, disabled people prefer identity-first language (49.0%) compared to person-first language (33.0%). We also explored the intra-sectionality and intersectionality of disability categories, gender identity, age group, and country on language preferences. We found that language preferences vary both within and across these factors. For example, people with mobility disabilities prefer person-first language (46.2%) over identity-first language (39.4%). Our qualitative assessment of the responses shows that disabled people may have multiple or no language preferences. To make our survey data available to the public, we created an interactive and accessible web platform displaying live results from the survey. Furthermore, the web platform enables users to filter data by any combination of disability categories, gender identities, age groups, countries, and years to support granular information extraction.

As a secondary exploration, to shed light on the language adoption at academic venues, we analyzed the abstracts from 11,536 publications at the ACM SIGACCESS Conference on Computers and Accessibility (ASSETS; N=1,564) and the ACM Conference on Human Factors in Computing Systems (CHI; N=9,972) from 2000 to 2021 (20 years, excluding 2001 and 2003). Our results show that, combined and separately, both conferences employed a higher usage of person-first language (54.4% combined; 52.6% for ASSETS and 57.1% for CHI) compared to identity-first language (45.6% combined; 47.4% for ASSETS and 42.9% for CHI). Altogether, our findings indicate that person-first language is used more frequently despite disabled people showing a higher preference for identity-first language.

The main contributions of our work are as follows:

(1) Empirical results from a survey of 519 disabled people from 23 countries, showing their preferences between identity- and person-first language. Overall, 49% of disabled people preferred identity-first language, 33% of them were in favor of person-first language, and 18% had no preference.

(2) Empirical results from analyzing the abstracts of 11,536 academic publications published at ACM ASSETS (N=1,564) and ACM CHI (N=9,972) from the past 20 years, showing the total count of identity- and person-first terminologies used per year. Overall, the publications used person-first language 9.7% and 24.9% more than identity-first language at ASSETS and CHI, respectively.

(3) Accessible web platform, showing live survey results. The web platform enables users to filter their language preferences by any combination of disability categories, age groups, gender identities, countries, and years. We present the design, functionality, and implementation of our system. Additionally, we publish our web platform, making it publicly available at https://disabilityterminology.athersharif.com/.

2 BACKGROUND AND RELATED WORK

We review literature from Disability Studies on the discussion of using identity- versus person-first language to refer to disabled people. Identity-first language (IFL) places the identity first, acknowledging the disability of a person as their defining characteristic (e.g., "disabled person"). In contrast, person-first language (PFL) emphasizes the person first and then their disability (e.g., "person with a disability"). We also review prior work on disability language preference surveys and analysis of accessibility-related academic publications.

2.1 Identity- Versus Person-First Language in Disability Studies

Disability Studies is an interdisciplinary field that explores the political, intellectual, and cultural dimensions of disability in society [21, 29, 33]. Several scholars and researchers have contributed to the discussion of using identity- versus person-first language to refer to people with disabilities. As of the date of this writing, Semantic Scholar [65], a search engine for academic publications,
shows 556 results for a search query containing the terms "person-first," "identity-first," and "disability" (all the words appearing at least once anywhere in the publication text). We only explored the publications relevant to our work.

Identity-first language (IFL), which emphasizes the disability identity of the person, can help reclaim once pejorative terms (such as "crippled") used for disabled people [1, 59] and can be instrumental in driving social change [26, 39]. For example, most recently, Netflix [45], a popular streaming service, released a documentary titled "Crip Camp" [36, 54], which is an indispensable film that sheds light on the history of the disability rights movement [24, 25], highlighting the inequities disabled people face in society. The National Federation for the Blind [44], which is a national advocacy organization representing blind and low-vision (BLV) people, elected to use IFL to refer to BLV individuals in its 1993 resolution [46], stating that person-first language "implies shame instead of true equality" [16, 17]. Similarly, the autistic community is a strong proponent of using IFL for autistic people [8, 10, 34, 35, 50], expressing that IFL encourages society to acknowledge and celebrate them as autistic individuals [35, 50, 52]. Additionally, IFL can help increase public visibility into the stigma disabled people experience as an underrepresented minority group and assist in reducing that stigma to build a more inclusive society [8].

On the other hand, person-first language (PFL), which recognizes the person before their diagnostic label, is the most widely-used language style [2, 17, 27, 49, 66]. One of the first scholarly works to advocate for PFL was by Wright [75], who suggested that the person should be the primary focus in language choices to eradicate the dehumanizing language used to describe disabled people throughout the twentieth century [2]. Since then, people and organizations have widely used PFL with well-intended goals of attenuating the stigma associated with disabilities [27, 66]. (However, several scholars argue that, although well-intended in its original proposition, PFL may have overcorrected to accentuate this stigma, particularly in scholarly writing [27, 31, 66].) PFL's wide adoption includes its usage in the Americans with Disabilities Act (ADA) [67, 74] and recommendations of use in academic writing by numerous style guides, including the American Psychology Association (APA), American Medical Association, American Psychiatric Association, American Speech-Language Hearing Association, and Associated Press [2, 17, 27].

While PFL is employed widely by several organizations to avoid daily discourse [17], disability rights advocates and activists promote IFL [2, 40]. However, several scholars and researchers have identified the need to utilize both appropriately, claiming that a singular linguistic model is non-representative of the entire disability community [15, 34, 43, 51, 61]. Therefore, disability groups should direct the language that describes their respective communities [9, 26, 43, 51]. We are sensitive to this discussion and acknowledge that each individual may have a unique language

2.2 Disability Language Preference Surveys

Several researchers have administered surveys with disabled people to draw inferences about their experiences and language preferences [6, 22, 37, 38, 48, 60]. Lister et al. [38] conducted surveys with 723 disabled students to investigate their language preferences for communication in higher educational institutions, finding that students were uncomfortable with terminologies addressing them as "disabled" and that language preferences diverged across contexts and demographics of students. They recommended exploring differential and inclusive approaches to find the appropriate language rather than focusing on a single model. Levy et al. [37] surveyed 63 disabled people to investigate the respectfulness of the terminologies used in ASSETS publications to refer to disabled people, reporting that their respondents found the terms "disabled people" and "differently-abled" the most respectful and disrespectful, respectively. In a similar exploration, Fernald [22] explored the differences between American disability terminologies and the terminologies used in other English-speaking countries by surveying 26 disability-related professional and advocacy organizations, discovering disparities in the preference of disability language between different countries.

Additionally, Bickford et al. [6] explored the preferences of 100 blind and low-vision people on disability terminology, albeit they used interviews to record their participants' preferences. They found that 37% of the individuals interviewed did not have a preference, and among those who did, 76% favored identity-first terminologies.

Our work draws inspiration from Lister et al.'s [38] recommendation of utilizing diverse approaches to determine the appropriate language to refer to disabled people. Additionally, we follow Bickford et al. [6] in exploring a polychotomous classification of preferences (including "no preference" as an option), as opposed to using the dichotomy of choices between IFL and PFL. However, in contrast to these surveys, our survey is the first scholarly work to perform all of the following in combination: (1) Identify the language preferences (IFL, PFL, no preference, or multiple preferences) of disabled people representing at least one of several disability categories; (2) explore intersectionality and intra-sectionality of disability category, gender identity, age, country, and year in determining language preferences; (3) examine the temporal evolution of language preferences; and (4) display live results from the survey through an interactive website that enables users to filter their query by any combination of the factors mentioned above.

2.3 Analyzing Accessibility-Related Academic Publications

Researchers have analyzed text from academic publications to investigate matters in accessibility research [5, 12, 37, 42, 62]. Most recently, Levy et al. [37] conducted a qualitative literature review
preference, which may or may not align with the consensus among of 106 papers published at ASSETS from 2018- to 2020 (3 years) to
the members of their respective disability category. We also note understand the terminology used to refer to disabled people. They
that at least one of the authors of this work identifes as a disabled found that authors used PFL terms more than IFL terms. However,
person. In our work, we seek to gather insights into the language as they stated, their exploration was only preliminary and contained
preferences of disabled people, paying close attention to avoid any a small, non-representative sample size. Mack et al. [42] analyzed
of our personal biases and beliefs on the matter. 835 technical papers published at ASSETS and CHI, refecting on
the growth and history of the feld of accessibility. Their results
ASSETS ’22, October 23–26, 2022, Athens, Greece Ather Sharif, Aedan L. McCall, and Kianna R. Bolante

Table 1: Overview of demographics for our N = 491 participants (after exclusion) per country, further classified by disability category, gender identity, and age group. N is the total number of participants and % is the percentage compared to the total number of participants. For the "Overall" column, the % is shown as "-," naturally assuming it to be 100.

Overall United States United Kingdom Canada


N % N % N % N %
Overall 491 - 365 74.3% 112 22.8% 14 2.9%
By Disability Category (DSB)
Mobility 104 - 59 56.7% 42 40.4% 3 2.9%
Visual 248 - 235 94.8% 7 2.8% 6 2.4%
Cognitive 112 - 72 64.3% 34 30.4% 6 5.4%
Learning 36 - 24 66.7% 11 30.6% 1 2.8%
Neurological 53 - 30 56.6% 22 41.5% 1 1.9%
Auditory 50 - 39 78.0% 8 16.0% 3 6.0%
Chronic Illness 149 - 57 38.3% 90 60.4% 2 1.3%
Mental Health Related 122 - 72 59.0% 44 36.1% 6 4.9%
Other 4 - 3 75.0% 1 25.0% 0 0.0%
By Gender Identity (GND)
Woman 305 - 211 69.2% 89 29.2% 5 1.6%
Man 137 - 121 88.3% 10 7.3% 6 4.4%
Non-binary 49 - 31 63.3% 14 28.6% 4 8.2%
Prefer not to disclose 7 - 6 85.7% 1 14.3% 0 0.0%
By Age Group
18-25 61 - 52 85.2% 6 9.8% 3 4.9%
26-35 125 - 96 76.8% 25 20.0% 4 3.2%
36-50 150 - 106 70.7% 40 26.7% 4 2.7%
51-60 88 - 58 65.9% 30 34.1% 0 0.0%
61-70 49 - 38 77.6% 9 18.4% 2 4.1%
71 and older 18 - 15 83.3% 2 11.1% 1 5.6%
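As a sanity check on the N/% bookkeeping used in Table 1, the per-country shares of the participant total can be sketched as follows; the counts are taken from the table's Overall row:

```python
# Reproduce the "Overall" row of Table 1: per-country counts and their
# percentage share of the N = 491 participants, rounded to one decimal.
counts = {"United States": 365, "United Kingdom": 112, "Canada": 14}
total = sum(counts.values())
shares = {country: round(100 * n / total, 1) for country, n in counts.items()}
print(total, shares)
# 491 {'United States': 74.3, 'United Kingdom': 22.8, 'Canada': 2.9}
```

The rounded shares match the 74.3%, 22.8%, and 2.9% reported in the table.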

show that accessibility research focuses disproportionately on the blind and low-vision (BLV) community. Although their methodology and sample size are plausible, their work did not explore the language used to refer to disabled people.

Our work examines the language used to refer to disabled people by analyzing the abstracts from 11,536 publications (including poster papers and extended abstracts) published at ASSETS and CHI from 2000 to 2021 (20 years, excluding 2001 and 2003¹).

¹ASSETS skipped publications in 2001 and 2003.

3 PREFERENCES SURVEY

To gather and understand the language preferences between identity- and person-first language, we surveyed 519 disabled people globally. We present our methodology to conduct the survey and results from analyzing the survey responses.

3.1 Method

We administered an online survey to assess the language preferences of disabled people using a mixed-methods approach. Specifically, we investigated the difference in preferences based on disability category, gender identity, and country of residence using quantitative methods. Additionally, we evaluated the reasons behind their choice of preferences using qualitative methods.

3.1.1 Procedure. Participants took part in our survey online, without supervision. The survey comprised three steps. In the first step, the survey showed the purpose of our study, eligibility criteria, definitions and examples of identity- and person-first language, and a data anonymity clause. We collected demographic information from our participants in step two, including their gender identity, pronouns, age, country, disability category, diagnosis, and age of diagnosis. We selected the disability categories by contacting several
Identity- Versus Person-First Language ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 2: Overview of language preferences from N = 895 observations, further classified by disability category, gender identity, country, and age group. T is the total number of observations in each row, N is the total number of participants, and % is the percentage compared to the total number of participants. For the "Overall" column, the % is shown as "-," naturally assuming it to be 100.

Overall United States


IFL PFL NP IFL PFL NP
T N % N % N % T N % N % N %
Overall 895 435 48.6 295 33.0 165 18.4 598 290 48.5 172 28.8 136 22.7
By Disability Category (DSB)
Mobility 104 41 39.4 48 46.2 15 14.4 59 24 40.7 24 40.7 11 18.6
Visual 248 99 39.9 77 31.0 72 29.0 235 94 40.0 70 29.8 71 30.2
Cognitive 112 80 71.4 18 16.1 14 12.5 72 49 68.1 11 15.3 12 16.7
Learning 36 22 61.1 9 25.0 5 13.9 24 11 45.8 9 37.5 4 16.7
Neurological 53 22 41.5 25 47.2 6 11.3 30 12 40.0 14 46.7 4 13.3
Auditory 50 29 58.0 11 22.0 10 20.0 39 21 53.8 8 20.5 10 25.6
Chronic Illness 149 65 43.6 69 46.3 15 10.1 57 33 57.9 16 28.1 8 14.0
Mental Health Related 122 70 57.4 34 27.9 18 14.8 72 41 56.9 17 23.6 14 19.4
Other 4 1 25.0 3 75.0 0 0.0 3 1 33.3 2 66.7 0 0.0
By Gender Identity (GND)
Woman 305 114 37.4 136 44.6 55 18.0 211 85 40.3 78 37.0 48 22.7
Man 137 54 39.4 45 32.8 38 27.7 121 47 38.8 38 31.4 36 29.8
Non-binary 49 40 81.6 3 6.1 6 12.2 31 24 77.4 3 9.7 4 12.9
Prefer not to disclose 7 3 42.9 2 28.6 2 28.6 6 3 50.0 1 16.7 2 33.3
By Country
United States 365 157 43.0 119 32.6 89 24.4 - - - - - - -
United Kingdom 112 45 40.2 60 53.6 7 6.3 - - - - - - -
Canada 14 6 42.9 6 42.9 2 14.3 - - - - - - -
By Age Group
18-25 61 37 60.7 13 21.3 11 18.0 52 32 61.5 10 19.2 10 19.2
26-35 125 65 52.0 38 30.4 22 17.6 96 51 53.1 24 25.0 21 21.9
36-50 150 60 40.0 63 42.0 27 18.0 106 44 41.5 39 36.8 23 21.7
51-60 88 28 31.8 41 46.6 19 21.6 58 17 29.3 23 39.7 18 31.0
61-70 49 15 30.6 20 40.8 14 28.6 38 10 26.3 16 42.1 12 31.6
71 and older 18 3 16.7 10 55.6 5 27.8 15 3 20.0 7 46.7 5 33.3

disability-related advocacy organizations (for transparency, at least one of the authors is disabled and is a member of some of these organizations). Participants were allowed to select multiple disability categories. To appropriately ask our participants which gender identities they relate to, we followed guidelines from [63]. Similar to the disability categories, we enabled our participants to choose multiple options from choices including "women," "men," "non-binary," "prefer not to disclose," and "prefer to self-describe." Our survey displayed an additional text field if participants preferred to self-describe their gender identit(y/ies).

In the final step, we asked the participants their preference between identity- and person-first language, providing them with the

Figure 2: Percentage of total count for language preferences by (a) disability categories, (b) countries, (c) age groups, and (d) gender identities. (a), (c), and (d) are further classified by regions "overall" (United States, United Kingdom, and Canada) and "United States." IFL stands for "Identity-First Language," PFL stands for "Person-First Language," and NP stands for "No Preference."

additional choice of "no preference." Finally, we asked the participants to state the reason for their preference in detail. To obtain further contextual insights, we inquired if the participants were familiar with the terms "identity-first" and "person-first" before taking this survey and the language style they encounter the most in their everyday lives. (We provide our survey responses [collected as of the time of this writing, before March 16th, 2022] in the supplementary materials.)

3.1.2 Participants. Our survey respondents ("participants") voluntarily took part in our online survey, advertised through word-of-mouth, snowball sampling, social media channels (Facebook and Twitter), and email distribution lists for disabled people. We contacted several local, national, and global disability-related organizations (e.g., The National Federation of the Blind [44]) via contact forms and email addresses mentioned on their websites for advertising the survey. Altogether, 519 participants (M=42.7 years, SD=14.9) from 23 countries responded to our survey. We excluded the responses from countries that had a total count of fewer than 10 responses.

After exclusion, our participant pool comprised 491 participants (M=42.8 years, SD=14.9) from three countries: (1) United States (N=365); (2) United Kingdom (N=112); and (3) Canada (N=14). Three hundred and five participants identified as women, 137 as men, and 45 as non-binary. Eleven participants described their gender identity themselves, and seven did not disclose their gender

Table 3: Summary of statistical results from N=895 overall (CTY="United States," "United Kingdom," and "Canada") and N=598 specific (CTY="United States") observations. "DSB" is the disability category, "GND" is the gender identity, and "CTY" is the country. Cramer's V is a measure of effect size [20].

Overall United States


N χ² p Cramer's V N χ² p Cramer's V
DSB 895 33.95 < .05 .20 598 21.47 .161 .19
GND 895 61.26 < .001 .26 598 13.19 < .05 .15
CTY 895 16.99 < .05 .14 - - - -
Age 895 26.17 < .001 .17 598 37.50 < .001 .25
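The χ² statistics and Cramer's V effect sizes reported in Table 3 come from contingency-table analysis of preference counts. A minimal pure-Python sketch of that computation is below; the two-group table it runs on is illustrative toy data, not the paper's N=895 observations, and this is the plain Pearson χ² rather than the paper's regression-based models:

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramer's V: chi-square effect size, ranging from 0 to 1."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square(table) / (n * k))

# Hypothetical IFL / PFL / no-preference counts for two groups:
table = [[40, 25, 15],   # group A
         [20, 35, 15]]   # group B
print(round(chi_square(table), 2), round(cramers_v(table), 2))
# 7.7 0.23
```

By the interpretation given in Section 3.2.1, a V of .23 would be a medium effect.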

identity. Table 1 shows a demographic breakdown of participants across the United States, United Kingdom, and Canada.

As we noted in Section 2.1 of this paper, at least one of the authors identifies as a disabled person. To avoid our personal biases and beliefs, we did not partake in our survey.

3.2 Quantitative Evaluation

We used the following factors and levels:

• Disability Category (DSB), within-Ss.: {Mobility, Visual, Cognitive, Learning, Neurological, Auditory, Chronic Illness, Mental-Health Related}
• Gender Identity (GND), within-Ss.: {Woman, Man, Non-Binary}
• Country (CTY), within-Ss.: {United States, United Kingdom, Canada}

Our dependent variables were Language Preference (PRF) and Language Commonly Encountered (LCE). To analyze PRF, we used a polychotomous representation (0 for "identity-first language," 1 for "person-first language," and 2 for "no preference") and a multinomial logistic regression model [4, 69] with the above factors and a covariate to control for Age. Our statistical model was: PRF ← DSB + GND + CTY + Age. We did not include interactions between our factors, as our research exploration centered around investigating the main effects of these factors with PRF.

We analyzed LCE by classifying it dichotomously (0 for "identity-first language" and 1 for "person-first language"), using a mixed logistic regression model [28] with the above factors and a covariate to control for Age. Our statistical model was the same as for PRF, and we only explored the main effects of these factors with LCE. Additionally, we performed a separate analysis for CTY=United States, as a large majority (74.3%) of our survey respondents were from the United States. We used the same above-stated model, naturally removing CTY from the list of terms. We present our quantitative results for Language Preference (PRF) and Language Commonly Encountered (LCE).

3.2.1 Language Preference (PRF). Disability Category (DSB) had a significant main effect on PRF (χ²(16, N=895)=33.95, p<.05, Cramer's V=.20), indicating that PRF differs significantly between the nine disability categories. (Cramer's V is a measure of effect size for χ² tests, ranging from 0 to 1. Values greater than .6 demonstrate a large effect, values between .2 and .6 a medium effect, and values less than .2 a small effect.) Disability categories including Visual, Cognitive, Learning, and Auditory preferred IFL (39.9%, 71.4%, 61.1%, and 58.0%, respectively), whereas categories Mobility, Neurological, and Chronic Illness preferred PFL (46.2%, 47.2%, and 46.3%, respectively). Figure 2 and Table 2 show the PRF percentages across all independent variables used in our analysis. For participants from the United States, DSB did not have a statistically significant effect on PRF (p≈.161).

We also found a significant main effect of Gender Identity (GND) on PRF (χ²(6, N=895)=61.26, p<.001, Cramer's V=.26). This result indicates that PRF differs significantly between different gender identities. People who identified as Non-binary and as Man preferred IFL (81.6% and 39.4%, respectively), and people who identified as Woman preferred PFL (44.6%). For participants from the United States, GND also had a significant main effect on PRF (χ²(6, N=598)=13.19, p<.05, Cramer's V=.15), with all gender categories preferring IFL over PFL.

The factor Country (CTY) also had a significant main effect on PRF (χ²(4, N=895)=16.99, p<.05, Cramer's V=.14). Specifically, people in the United States preferred IFL (43.0%), people in the United Kingdom preferred PFL (53.6%), and people in Canada demonstrated an equal preference.

We investigated the effects of Age on PRF. Age had a significant effect on PRF overall (χ²(2, N=895)=26.17, p<.001, Cramer's V=.17) and for participants from the United States (χ²(16, N=598)=37.50, p<.001, Cramer's V=.25), indicating that PRF differed significantly across the ages of our participants. Participants 35 or older preferred PFL (54.0%), whereas participants under 35 preferred IFL (43.9%). For participants from the US, the trend was similar, but with the split at 50 years: participants 50 years or older preferred PFL (41.4%), whereas those under 50 preferred IFL (50.0%). Table 3 shows the statistical results from all of our analyses.

Bickford et al. [6] did not find significant main effects of Gender and Age on the language preferences of blind and low-vision individuals (N=100). As they conducted their study 18 years ago, we performed a second analysis, examining the current effects of GND and Age on PRF for our visually-disabled participants. We found that Age had a significant main effect on PRF both overall (χ²(2, N=250)=13.52, p<.05, Cramer's V=.23) and for participants from the US (χ²(16, N=237)=10.76, p<.05, Cramer's V=.21). GND, however, did not have a significant main effect on PRF. Hence, our results only partially

agreed with those from Bickford et al.'s analyses, showing that language preferences significantly varied among our visually-disabled participants.

3.2.2 Language Commonly Encountered (LCE). The factor Disability Category (DSB) had a significant main effect on LCE (χ²(8, N=895)=21.98, p<.05, Cramer's V=.16). This result indicates that LCE differs significantly between the nine disability categories. Specifically, disability categories including Visual, Cognitive, Neurological, Auditory, Chronic Illness, and Mental-Health Related commonly encounter IFL (62.9%, 58.0%, 60.4%, 56.0%, 55.0%, and 57.4%, respectively), whereas category Mobility encounters PFL more commonly (56.7%). Category Learning had an equal percentage of language encounter (50%).

Similarly, Country (CTY) also had a significant main effect on LCE (χ²(2, N=895)=16.23, p<.001, Cramer's V=.14), showing that LCE differs significantly between the three countries used in our analysis. Our participants from the United States and United Kingdom more commonly encountered IFL (56.2% and 56.3%, respectively), whereas participants from Canada encountered PFL more commonly (71.4%). The factors GND and Age did not have a statistically significant effect on LCE. Additionally, none of the factors were statistically significant for participants from the United States.

3.3 Qualitative Evaluation

To qualitatively assess the language preferences of disabled people, we analyzed their free-form survey responses. Specifically, we examined the reasons for their choice of preference. We used standard semantic thematic analysis processes [11, 47] to analyze the responses. Our final analysis revealed three themes: (1) one size does not fit all; (2) not everyone has a preference; and (3) people can have multiple preferences. We discuss these below, in turn.

3.3.1 One Size Does Not Fit All. Our first theme shows that the language preferences of disabled people can vary between different disability categories. For example, as shown in Table 2, the blind and low-vision community prefers IFL, whereas people with mobility disabilities prefer PFL. Therefore, using one-size-fits-all language might not be appropriate for all disabled people. P26, who is autistic, and P20, who is blind, had emphatic preferences for IFL:

    Person-first language implies that being disabled is a bad thing, which we should continue to stigmatize. I'm a disabled, autistic person, not a person with autism or a person with a disability---you can't separate those experiences out from the rest of me. (P26)

    I'm not ashamed of my disability. I am who I am, there's no point in denying that. "Person with blindness" just... sounds wrong to me. I feel like person-first language tries to hide our disabilities. It communicates the message that, even though you're disabled, you're still a person, treating disability as something wrong and something to be ashamed of, which is not the way I feel about it. (P20)

Similarly, P220, who identifies as a person with a mobility disability, and P120, who is a person with a chronic illness, expressed clear preferences for PFL:

    It's not a deep loathing, but I just don't like it when "disabled" is literally the first way a person learns of me, as in, "A disabled woman I work with." There, I'm disabled before literally anything else, and it's really not one of the most interesting things about me. (P220)

    It is important to me that a person is said first. We are humans. We have feelings and deserve to be recognized as a person before a disabled person. (P120)

Overall, in line with our quantitative results, our first theme shows that language preferences vary between disability categories.

3.3.2 Not Everyone Has a Preference. Our findings show that 18.4% and 22.7% of survey respondents overall and in the United States, respectively, had no preference between identity- and person-first language. For example, P308, P419, and P200 shared their opinions:

    All I care about is that people know I'm blind. If someone wants to say I'm an individual who is blind, or that I'm a blind person, it doesn't make a difference to me. (P308)

    I do not have a preference because I prefer others to feel comfortable and am secure enough in who I am as a person to not become overly offended or upset by the ways people go about communication. Far too often this issue of person vs identity-first language becomes unnecessarily heated and seems to cause more anxiety than is needed, healthy, and helpful. (P419)

    The order of the person and the disability doesn't change the end result. Whether the person or the disability comes first the disability is still present and, in my opinion, does not really modify any sort of context. (P200)

Our second theme shows that disabled people may not have a preference as long as their disability is "seen" and acknowledged.

3.3.3 People Can Have Multiple Preferences. Our third theme reveals that disabled people can have multiple language preferences. We found that disabled people with multiple disabilities may have different preferences for each disability category. For example, P447, who is autistic and has PTSD, had different preferences for each disability category:

    I use both. I often use identity-first language when relaying the fact that I am autistic, and will use person-first language to explain that I have PTSD. I do this as it seems to be the preferential consensus of self-advocates that I know, and it helps people feel comfortable if I use the language they prefer. I don't

Figure 3: The user interface of the accessible web platform showing filtering options for intersectional explorations using multi-select dropdowns for age group, country, disability category, gender identity, and month/year. Tabs for these factors are also shown for intra-sectional exploration.

    feel very strongly about either one over the other personally. I think individuals have the right to choose to use whichever is most comfortable for them. (P447)

Similarly, P393, who is autistic and has anxiety and ADHD, shared her opinions:

    Depends. With Autism I prefer identity first. Mostly because I do not see my autism as a condition. I don't want it treated like a disease. With my anxiety I prefer person first. The same is true of ADHD. (P393)

Additionally, disabled people may have multiple preferences depending on the context. P329 explained their preference differences based on professional and informal environments:

    I code switch between the two depending on my audience. When speaking informally, or to groups of people with disabilities, I use blind person, but for professional settings I use person with a disability. (P329)

In our third theme, we found that disabled people may have multiple preferences. Our findings show that their language preferences could vary for each disability category they represent or based on the context or environment. To make our survey data publicly available, we created an accessible web platform, which we present in the section below.

4 ACCESSIBLE WEB PLATFORM

To provide transparency and comprehensive means of obtaining up-to-date language preferences of disabled people, we developed an accessible web platform that shows the live results from our survey (updated immediately after a participant fills out the survey). We present our web platform's design considerations, functionalities, and implementation details.

4.1 Design Considerations

In developing the accessible web platform, our goal was to create a straightforward interface that allows users to explore and extract information effectively and granularly. Our user interface had two sections: (1) Sidebar Navigation; and (2) Content Area.

4.1.1 Sidebar Navigation. The sidebar navigation contained five options (as shown in Figure 3): (1) Home; (2) Identity-First Language; (3) Person-First Language; (4) Share Your Preference; and (5) Contact. The Home page displays the survey results and is the entry point for the website. The Identity-First Language and Person-First Language pages show definitions and examples for identity- and person-first language, respectively. Clicking on Share Your Preference navigates the user to the survey, whereas the Contact page displays the names and email addresses for the members of the research team.

4.1.2 Content Area. We displayed the contents for each page in the content area of our website (as shown in Figure 3). For the Home page, the contents involved tabs for each independent variable and a visualization displaying the distribution of language preferences between IFL (identity-first language), PFL (person-first language),

and NP (no preference). All the other pages contained appropriate text organized using headings and paragraphs.

4.2 Intersectional & Intra-sectional Exploration

We recorded the disability category, gender identity, age, and country of the survey respondents to enable users to examine the language preferences of disabled people both separately for each independent variable (intra-sectionality) and in combination (intersectionality). We implemented "tabs" (or subsections) to support users in exploring the intersectionality and intra-sectionality of these factors. The Overall tab allows users to examine the intersectionality of these factors, whereas the other subsections enable users to perform intra-sectional exploration. Additionally, we added the By Year tab to track the evolution of language preferences over time. As we started collecting the survey results only a few months back, the data under this subsection, at present, may not be significant. However, we hope that tracking the survey data over time will reveal patterns and results worth exploring.

In the Overall tab (Figure 3), users can filter the results using the multi-select dropdowns in our web interface to perform a granular intersectional exploration as per their needs and curiosity. For example, users can select "United States" and "United Kingdom" from the countries dropdown and "2021" and "2022" to display the language preferences collected from participants from these two countries in the past two years. For simplicity, we chose a pie chart to display the results, showing the total count and percentage for each type of language preference. The By Disability (see Appendix A, Figure 5), By Age Group (see Appendix B, Figure 6), By Gender (see Appendix C, Figure 7), and By Country (see Appendix D, Figure 8) subsections display the results using a bar chart, whereas By Year (see Appendix E, Figure 9) shows the results using a smoothed line graph. We created our graphs using the D3 visualization library [7].

4.3 Accessibility

We used VoxLens [58], an open-source JavaScript plug-in that improves the accessibility of online data visualizations using a multimodal approach. However, VoxLens is currently only applicable to visualizations created using two-dimensional single-series data. Therefore, we only used it with our pie charts. For the other visualizations containing multi-series data, we followed the recommendations from prior work [41, 56, 57] to generate the alternative text ("alt-text") dynamically. Specifically, we used Accessible Rich Internet Applications (ARIA) attributes [71] to add alt-text to our graphs. Additionally, we added a tabular representation of data only visible to screen readers, similar to the accessibility measures employed by Google Charts [14, 56]. For example, the alt-text for our bar chart showing the language preferences by disability category was:

    Bar chart showing counts for identity-first, person-first, and no preference per disability. People with mobility disabilities prefer person-first language, visually disabled people prefer identity-first language, cognitively disabled people prefer identity-first language, learning disabled people prefer identity-first language, people with neurological disabilities prefer person-first language, deaf and hard of hearing people prefer identity-first language, people with chronic illnesses prefer person-first language, mental-health-related disabled prefer identity-first language. The data table is presented below.

We used appropriate colors for Color Vision Deficiency (CVD) in our data visualizations. We also checked the contrast ratio to be at least 3:1, using the WebAIM Contrast Checker tool [72]. Finally, we tested our platform for accessibility with and without several screen readers. However, we did not conduct studies with screen-reader users to test the accessibility of our platform. We plan on continually improving our web platform's accessibility and usability by conducting formative studies with diverse groups of users.

4.4 Implementation Details

In developing our accessible web platform, we generated 119,402 lines of developed code through 19 commits, excluding comments. We used the React [32] framework to build our platform. Therefore, JavaScript was naturally our choice of programming language. Additionally, we used ECMAScript [30], employing modern JavaScript features. We implemented our survey using Google Forms, auto-collecting the responses in a Google Sheets document. We used Metis [55], a React plug-in that allows the usage of Google Sheets as a database, to display live results on our web platform. Currently, our data is not downloadable; we have started the development work to support exporting the data as CSV and JSON files.

5 LANGUAGE USED IN PUBLICATIONS AT ASSETS AND CHI

In a secondary exploration, we analyzed 11,536 abstracts from publications at ASSETS and CHI to assess the language usage in publications at these academic venues. In this section, we present our methodology, analysis, and results.

5.1 Method

Our goal was to investigate the adoption of identity- and person-first language in publications at the ACM SIGACCESS Conference on Computers and Accessibility (ASSETS) and the ACM Conference on Human Factors in Computing Systems (CHI). Therefore, we analyzed 11,536 abstracts from papers at these conferences.

5.1.1 Data Set. Similar to prior work [37, 42], we queried the ACM Digital Library for all papers published since the year 2000 (inclusive) at ASSETS and CHI, collecting their abstracts. Unlike prior work [37, 42], our data set was not solely limited to technical papers and comprised all the publications, including extended abstracts and poster papers, resulting in a total of 1,564 (ASSETS) + 9,972 (CHI) = 11,536 abstracts.

Our goal was to determine the identity- and person-first language from the collected abstracts using an automated Natural Language Processing (NLP) approach. Therefore, we utilized spaCy [19, 68], a widely-used NLP library, to generate Part-of-Speech (POS) tags. (POS tags are the grammatical tags that identify the part of speech of words in text based on both their definitions and contexts.) First, we compiled a list of all nouns appearing in the 11,536 abstracts,

Figure 4: Percentage of total count for IFL terms versus PFL terms per year, published at ACM ASSETS and ACM CHI since the year 2000 (excluding 2001 and 2003). The total count for terms is shown in parentheses.

resulting in 466 unique nouns. Then, we manually filtered the list, retaining only the most commonly appearing nouns that identify a person ("person-identifiers"). Our final list included 17 person-identifiers: people, individual, user, student, person, participant, adult, children, researcher, subject, practitioner, learner, developer, designer, population, activist, and faculty. In addition to collecting the person-identifiers, based on prior work [37, 42], we composed a list of six IFL terms and their equivalent PFL terms that people commonly use to refer to disabled people. The IFL/PFL terms are as follows: disabled/disabilit(ies), impaired/impairment, wheelchair/wheelchair, blind/blindness, deaf/deafness, and autistic/autism.

Then, for each sentence in each abstract, we searched for IFL and PFL terms following the criteria below:

(1) IFL: One of the six IFL terms followed by one of the 17 person-identifiers (e.g., disabled user).
(2) PFL: One of the 17 person-identifiers, followed by an adposition (e.g., with) and one of the six PFL terms (e.g., student with autism).

To increase the efficiency of our approach, we removed the determiners (e.g., a or an) from the sentences. We designed our algorithm to include plurals (e.g., disability and disabilities) in search queries. Additionally, our algorithm included compound nouns (e.g., people with physical impairments) when searching for PFL terms. In addition to extracting the total count of the terms, we recorded all the sentences containing those terms. We manually tested every sentence to check for false positives, making algorithmic adjustments wherever necessary. The supplementary materials comprise
ASSETS ’22, October 23–26, 2022, Athens, Greece Ather Sharif, Aedan L. McCall, and Kianna R. Bolante

our collected data, including all recorded sentences containing the language preferences within disability categories for participants
IFL and PFL terms for ASSETS and CHI for each publication year. within and outside the United States.
5.1.2 Analysis. In addition to calculating the total count for IFL
and PFL terms in publications at ASSETS and CHI, we explored the
diference in adoption of these terms between the two venues. We 6.2 Intersectionality Matters
used Conference (CNF) as our independent variable with the follow- Similarly, we found that language preferences varied within and
ing levels: {ASSETS, CHI}. Our dependent variable was Language across gender identities, age groups, and countries. For example,
Preference (PRF). We calculated PRF as the ratio between the total women 36 years or older had a dominant preference for person-frst
count for IFL terms and the total for PFL terms in a given publi- language (53% vs. 29%; 18% had no preference). In contrast, men
cation year. To analyze PRF , we used Independent-samples t-test in the same age group had no prominent diference in language
[53, 64] to determine signifcance. As ASSETS skipped publications preferences (37% for identity-frst, 35% for person-frst, and 28%
in 2001 and 2003, we excluded the data for CHI from these years in for no preference). Likewise, men and women in the United States
our analysis. had almost identical preferences between identity- and person-frst
language (38% and 33% for men; 39% and 39% for women, respec-
5.2 Results tively). However, men and women in the United Kingdom had
We investigated the efects of Conference (CNF) on Language Pref- prominent and opposite preferences for identity- (60% for men and
erences (PRF) but did not fnd a signifcant main efect (p ≈.497). 30% for women) and person-frst language (30% for men and 63%
For each conference separately and combined, PFL counts were for women). For non-binary participants, the preference was signif-
higher (54.4% combined; 52.6% for ASSETS and 57.1% for CHI) than icantly higher for identity-frst language in all of our explorations.
IFL counts (45.6% combined; 47.4% for ASSETS and 42.9% for CHI). These fndings indicate that intersectionality can play a pivotal role
We also found this trend persistent for publications in the last fve in determining the language preferences of disabled people.
years (2017-2021). Figure 4 shows the percentage of total count for Additionally, our results only partially agreed with those from
IFL and PFL language used across ASSETS and CHI since 2000. Bickford et al.’s [6] analysis, in which they studied the intersection-
ality of gender and age on the language preferences of blind and
6 DISCUSSION low-vision people. Their results showed no diference in the pref-
To provide insights into the language preferences for and by dis- erences for both gender and age. In contrast, our results showed a
abled people, we surveyed 519 disabled people from 23 countries statistically signifcant efect for age but not for gender for visually-
representing at least one of nine disability categories. Our results disabled survey respondents. We attribute this contrast to the evo-
show that 49% of disabled people preferred identity-frst language, lution of preferences over time, as their exploration dates back to
33% favored person-frst language, and 18% did not have a pref- 2004, about 18 years ago. This fnding accentuates that language
erence. Additionally, we explored the intra-sectionality and inter- preferences can drastically evolve, therefore, presenting a necessity
sectionality of disability categories, gender identities, age groups, for up-to-date data on the language preferences of disabled people.
and countries on language preferences of our survey respondents, To keep our data on language preferences up-to-date, we intend to
fnding that language preferences vary within and across these redistribute the survey every quarter. We also built the functionality
factors. We also investigated language usage at ASSETS and CHI on our accessible web platform to track the preferences over time
by analyzing 11,536 publication abstracts, fnding that PFL’s usage to understand and explore the evolution of language preferences.
was 16.1% more than identity-frst language at these venues.

6.1 Diversity Within Disability 6.3 Language Adoption in Academia


In the discussions involving the usage of identity- or person-frst Our results show that abstracts of published papers at ACM ASSETS
language to refer to disabled people, fndings and recommenda- and ACM CHI, since 2000, have used person-frst language 16.1%
tions are usually generalized for anyone with a disability, irrespec- more than identity-frst language. We did not fnd a statistically
tive of the disability categories they represent. Our survey results signifcant diference in language adoption between the two venues.
showed that although 48.6% of disabled people preferred identity- Although our results align with fndings from prior work [37, 42],
frst language overall, compared to 33.0% who favored person-frst it is worth noting that our exploration was holistic—we did not in-
language, these preferences varied across disability categories. For vestigate the relationship between the language used and disability
example, people with mobility disabilities, neurological disorders, categories, gender identities, age groups, or countries. We also only
or chronic illnesses preferred person-frst language to identity-frst used the abstracts for our analysis, similar to prior work [37, 42], as
language (as shown in Table 2), highlighting the diversity within the ACM’s laws prohibit the extraction of full texts from publications.
disabled community. However, interestingly, Disability Category However, our survey results and the analysis of the language used
(DSB) did not have a signifcant main efect on language prefer- in publications at ASSETS and CHI, taken together, indicate that al-
ences for participants from the United States, with only people with though disabled people prefer identity-frst language, the language
neurological disorders having a higher preference for person-frst used to refer to them is more commonly person-frst. Future work
language. Our work did not explore this disparity. We invite schol- can study the nuances in language adoption more comprehensively
ars and researchers to utilize our publicly-available survey data to in publications at scholarly venues, including conferences other
investigate the underlying factors contributing to the diference in than ACM ASSETS and ACM CHI, to identify adoption patterns.
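To make the matching criteria of Section 5.1.1 concrete, here is a minimal sketch in plain Python. The person-identifiers, the IFL/PFL term lists, and the determiner removal come from the paper; the regex tokenization, the crude singularization, and the small adposition set are simplifying assumptions standing in for the authors' spaCy POS-tag pipeline, so treat this as an approximation rather than their actual code.

```python
import re

# Term lists from Section 5.1.1.
PERSON_IDENTIFIERS = {
    "people", "individual", "user", "student", "person", "participant",
    "adult", "children", "researcher", "subject", "practitioner",
    "learner", "developer", "designer", "population", "activist", "faculty",
}
IFL_TERMS = {"disabled", "impaired", "wheelchair", "blind", "deaf", "autistic"}
PFL_TERMS = {"disability", "impairment", "wheelchair", "blindness", "deafness", "autism"}
ADPOSITIONS = {"with", "without", "having"}   # assumed; the paper's example is "with"
DETERMINERS = {"a", "an", "the"}              # removed before matching (Sec. 5.1.1)

def normalize(token: str) -> str:
    """Crude singularization so plurals match (e.g., 'disabilities' -> 'disability')."""
    if token.endswith("ies"):
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def classify(sentence: str) -> list[str]:
    """Label each IFL/PFL occurrence found in one sentence."""
    tokens = [normalize(t) for t in re.findall(r"[a-z]+", sentence.lower())
              if t not in DETERMINERS]
    labels = []
    for i, tok in enumerate(tokens):
        # (1) IFL: an IFL term directly followed by a person-identifier.
        if tok in IFL_TERMS and i + 1 < len(tokens) and tokens[i + 1] in PERSON_IDENTIFIERS:
            labels.append("IFL")
        # (2) PFL: a person-identifier, then an adposition, then a PFL term,
        #     allowing one intervening modifier for compound nouns
        #     (e.g., "people with physical impairments").
        if tok in PERSON_IDENTIFIERS and i + 1 < len(tokens) and tokens[i + 1] in ADPOSITIONS:
            if any(t in PFL_TERMS for t in tokens[i + 2:i + 4]):
                labels.append("PFL")
    return labels
```

Running `classify` over every sentence of every abstract and tallying the labels per venue and year would yield per-year IFL/PFL counts analogous to those the paper reports.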
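The per-year preference ratio (PRF) and the venue comparison from Sections 5.1.2 and 5.2 can be sketched the same way. The yearly counts below are illustrative placeholders, not the paper's data, and the pooled-variance statistic is the textbook independent-samples t-test the paper cites [53, 64]:

```python
from statistics import mean, stdev

def preference_ratio(ifl_count: int, pfl_count: int) -> float:
    """PRF for one publication year: total IFL terms over total PFL terms."""
    return ifl_count / pfl_count

def students_t(a: list[float], b: list[float]) -> float:
    """Classic independent-samples (pooled-variance) t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

# Illustrative per-year (IFL, PFL) counts -- NOT the paper's data.
assets = [preference_ratio(i, p) for i, p in [(40, 45), (52, 58), (61, 70)]]
chi    = [preference_ratio(i, p) for i, p in [(30, 41), (44, 56), (50, 72)]]

t = students_t(assets, chi)
# Significance would then be read from the t distribution with
# len(assets) + len(chi) - 2 degrees of freedom (the paper reports p ~= .497).
```

In practice a library routine such as SciPy's `ttest_ind` would return the p-value directly; the hand-rolled version above only needs the standard library.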

6.4 Recommendations

Based on our findings, we offer three recommendations for authors and researchers when choosing between identity- and person-first language to refer to disabled people:

• First and foremost, we recommend authors and researchers respectfully ask individual users for their language preference (e.g., during pre-study or demographic questionnaires) whenever and wherever possible. To convey to readers that the language used reflects the individual's preference, authors and researchers can clearly state their processes at the beginning of the article or in-line when referring to the individual (e.g., "P0, who preferred identity-first language..."). While such additions can increase the word count of the text, they contribute toward inclusivity and cultural awareness around disability-related matters.

• Second, we recommend referring to the intended demographic group using their self-identified language preferences, employing intersectionality using their disability category, gender identity, age group, and country. When referring to disabled people as a group, the overall language preference (e.g., identity-first language, at present) may be the most appropriate. Authors and researchers can use our web platform to stay up-to-date with the language preferences of disabled people.

• Finally, we recommend that authors and researchers keep themselves up-to-date with the latest language preferences of disabled people, considering that preferences may change over time. We intend to distribute our survey every quarter and keep our data publicly available through our live website to assist authors, researchers, and interested individuals in staying up-to-date with the latest preferences of disabled people.

7 LIMITATIONS & FUTURE WORK

We did not collect our participants' race and ethnicity in our survey; including these factors in intersectional and intra-sectional explorations may produce noteworthy results. We designed our survey using Google Forms, enabling future work to include these factors in our survey and analyze their effects on the language preferences of disabled people. Our analysis of the language used in publications at ASSETS and CHI was holistic. While our results reaffirm the findings from prior work, future work can employ more rigorous and nuanced hybrid (human + AI) methodologies to examine language adoption in academic publications.

Additionally, utilizing our findings, future work could build systems (e.g., browser extensions) that automatically update the language used on web pages to match the individualized preferences of users and study their effects on users. Users could specify these preferences using a centralized configuration system. Similarly, future work could replicate "language checker" plug-ins that proofread the text for authors and researchers based on our findings, assisting them by ensuring appropriate language usage and providing suggestions wherever applicable.

8 CONCLUSION

We surveyed disabled people globally to collect their language preferences between identity- and person-first language and recorded the reasons behind their preferences. Additionally, we explored the intersectionality and intra-sectionality of their preferences with their disability category, gender identification, age group, and country. Our results show that although disabled people prefer identity-first language overall, their preferences vary across disability categories, gender identities, age groups, and countries of residence. We made our data publicly available through an interactive and accessible web platform that enables users to granularly extract information by filtering language preferences using any combination of disability category, gender identification, age group, and country. We also investigated language usage in papers at ACM ASSETS and ACM CHI, finding a higher usage of person-first language than identity-first language at both conferences.

Our findings, taken together, indicate that although disabled people prefer identity-first language, person-first language is more commonly used. We provided recommendations for authors and researchers in choosing the appropriate language. By releasing our survey data on language preferences through an accessible web platform, we hope our work will guide people in using appropriate language to refer to disabled people and in investigating the intersectional differences in language preferences.

ACKNOWLEDGMENTS

We extend our gratitude to the organizations, groups, and individuals who helped us distribute our survey. In particular, we would like to thank the National Federation of the Blind, Accelerated Cure Project, Association of Late Deafened Adults, and Fibromyalgia Awareness UK for their immense support and assistance. We thank the anonymous reviewers for their helpful comments and suggestions. Finally, we thank and remember our recently-departed team member Zoey for her feline support, without which the purrusal of this work would not have been as effective. May she live forever and in peace in cat heaven.

REFERENCES
[1] Erin E Andrews. 2019. Disability as Diversity: Developing Cultural Competence. Oxford University Press, New York, NY, USA.
[2] Erin E Andrews, Anjali J Forber-Pratt, Linda R Mona, Emily M Lund, Carrie R Pilarski, and Rochelle Balter. 2019. #SaytheWord: A disability culture commentary on the erasure of "disability". Rehabilitation Psychology 64, 2 (2019), 111.
[3] American Psychological Association. 2019. Publication Manual of the American Psychological Association (2020). American Psychological Association, Washington, DC, USA.
[4] Colin B Begg and Robert Gray. 1984. Calculation of polychotomous logistic regression parameters using individualized regressions. Biometrika 71, 1 (1984), 11–18.
[5] Alexy Bhowmick and Shyamanta M Hazarika. 2017. An insight into assistive technology for the visually impaired and blind people: state-of-the-art and future trends. Journal on Multimodal User Interfaces 11, 2 (2017), 149–172.
[6] James O Bickford. 2004. Preferences of individuals with visual impairments for the use of person-first language. RE:view 36, 3 (2004), 120.
[7] Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011. D3: Data-driven documents. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2301–2309.
[8] Monique Botha and David M Frost. 2020. Extending the minority stress model to understand mental health problems experienced by the autistic population. Society and Mental Health 10, 1 (2020), 20–34.
[9] Monique Botha, Jacqueline Hanlon, and Gemma Louise Williams. 2021. Does language matter? Identity-first versus person-first language use in autism research: A response to Vivanti. 9 pages.

[10] Kristen Bottema-Beutel, Steven K Kapp, Jessica Nina Lester, Noah J Sasson, and Brittany N Hand. 2021. Avoiding ableist language: Suggestions for autism researchers. Autism in Adulthood 3, 1 (2021), 18–29.
[11] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[12] Emeline Brulé, Brianna J. Tomlinson, Oussama Metatla, Christophe Jouffrais, and Marcos Serrano. 2020. Review of Quantitative Empirical Evaluations of Technology for People with Visual Impairments. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376749
[13] Taylor Cox. 1994. A comment on the language of diversity. Organization 1, 1 (1994), 51–58.
[14] Google Developers. 2014. Charts. https://developers.google.com/chart/
[15] Thomas P Dirth and Nyla R Branscombe. 2018. The social identity approach to disability: Bridging disability studies and psychological science. Psychological Bulletin 144, 12 (2018), 1300.
[16] Ellary A Draper. 2018. Navigating the labels: Appropriate terminology for students with disabilities. General Music Today 32, 1 (2018), 30–32.
[17] Dana S Dunn and Erin E Andrews. 2015. Person-first and identity-first language: Developing psychologists' cultural competence using disability language. American Psychologist 70, 3 (2015), 255.
[18] Patrick Dwyer. 2022. Stigma, Incommensurability, or Both? Pathology-First, Person-First, and Identity-First Language and the Challenges of Discourse in Divided Autism Communities. Journal of Developmental and Behavioral Pediatrics: JDBP 43, 2 (2022), 111–113.
[19] Explosion. n.d. spaCy · Industrial-strength Natural Language Processing in Python. https://spacy.io/. (Accessed on 03/18/2022).
[20] Christopher J. Ferguson. 2016. An effect size primer: A guide for clinicians and researchers. In Methodological Issues and Strategies in Clinical Research, A.E. Kazdin (Ed.). American Psychological Association, Washington, DC, USA, 301–310.
[21] Philip M Ferguson and Emily Nusbaum. 2012. Disability studies: What is it and what difference does it make? Research and Practice for Persons with Severe Disabilities 37, 2 (2012), 70–80.
[22] Charles D Fernald. 1995. When in London...: Differences in Disability Language Preferences Among English-Speaking Countries. Mental Retardation 33, 2 (1995), 99.
[23] Phillip Ferrigon and Kevin Tucker. 2019. Person-First Language vs. Identity-First Language: An examination of the gains and drawbacks of Disability Language in society. In Journal of Teaching Disability Studies. CUNY Academic Commons, New York, NY, USA, Online.
[24] Vic Finkelstein. 2007. The 'social model of disability' and the disability movement. The Disability Studies Archive UK 1 (2007), 15.
[25] Doris Zames Fleischer and Frieda Zames. 2012. The Disability Rights Movement: From Charity to Confrontation. Temple University Press, Philadelphia, PA.
[26] Patrick Flink. 2021. Person-first & identity-first language: Supporting students with disabilities on campus. Community College Journal of Research and Practice 45, 2 (2021), 79–85.
[27] Morton Ann Gernsbacher. 2017. Editorial perspective: The use of person-first language in scholarly writing may accentuate stigma. Journal of Child Psychology and Psychiatry 58, 7 (2017), 859–861.
[28] Arthur Gilmour, Robert D. Anderson, and Alexander L. Rae. 1985. The analysis of binomial data by a generalized linear mixed model. Biometrika 72, 3 (1985), 593–599.
[29] Dan Goodley. 2016. Disability Studies: An Interdisciplinary Introduction. Sage, Los Angeles, CA.
[30] Jordan Harband, Shu-yu Guo, Michael Ficarra, and Kevin Gibbons. 1999. Standard ECMA-262.
[31] Holly Hoffman, Marie Hengesbach, and Shana Trotter. 2020. Perspectives on Person-First Language: A Focus on College Students. Journal of Postsecondary Education and Disability 33, 1 (2020), 39–48.
[32] Facebook Inc. n.d. React – A JavaScript library for building user interfaces. https://reactjs.org/. (Accessed on 08/08/2021).
[33] David Johnstone. 2012. An Introduction to Disability Studies. Routledge, Abingdon, Oxfordshire, UK.
[34] Lorcan Kenny, Caroline Hattersley, Bonnie Molins, Carole Buckley, Carol Povey, and Elizabeth Pellicano. 2016. Which terms should be used to describe autism? Perspectives from the UK autism community. Autism 20, 4 (2016), 442–462.
[35] Emily Ladau. 2014. What should you call me? I get to decide: Why I'll never identify with person-first language. In Criptiques. May Day Publishing, Seattle, WA, USA, 47–56.
[36] James LeBrecht and Nicole Newnham. n.d. Crip Camp | A Disability Revolution. https://cripcamp.com/. (Accessed on 04/05/2022).
[37] Lior Levy, Qisheng Li, Ather Sharif, and Katharina Reinecke. 2021. Respectful Language as Perceived by People with Disabilities. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 83, 4 pages. https://doi.org/10.1145/3441852.3476534
[38] Kate Lister, Tim Coughlan, and Nathaniel Owen. 2020. ‘Disability’ or ‘Additional study needs’? Identifying students’ language preferences in disability-related communications. European Journal of Special Needs Education 35, 5 (2020), 620–635.
[39] Rebecca-Eli M Long and Albert Stabler. 2021. “This is NOT okay:” Building a creative collective against academic ableism. In Journal of Curriculum and Pedagogy. Taylor & Francis, Oxfordshire, UK, 1–27.
[40] Paul K Longmore. 1985. A note on language and the social identity of disabled people. American Behavioral Scientist 28, 3 (1985), 419–423.
[41] Alan Lundgard, Crystal Lee, and Arvind Satyanarayan. 2019. Sociotechnical considerations for accessible visualization design. In 2019 IEEE Visualization Conference (VIS). IEEE, New York, NY, USA, 16–20.
[42] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by “Accessibility Research”? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 371, 18 pages. https://doi.org/10.1145/3411764.3445412
[43] Brett Ranon Nachman, Ryan A Miller, and Edlyn Vallejo Peña. 2020. “Whose Liability Is It Anyway?” Cultivating an Inclusive College Climate for Autistic LGBTQ Students. Journal of Cases in Educational Leadership 23, 2 (2020), 98–111.
[44] National Federation of the Blind. n.d. Homepage | National Federation of the Blind. https://nfb.org/. (Accessed on 03/12/2022).
[45] Netflix, Inc. n.d. Netflix - Watch TV Shows Online, Watch Movies Online. https://www.netflix.com/. (Accessed on 04/05/2022).
[46] National Federation of the Blind. 1993. Convention Resolutions ’93. https://nfb.org/sites/nfb.org/files/images/nfb/publications/convent/resol93.htm. (Accessed on 04/05/2022).
[47] Michael Quinn Patton. 1990. Qualitative Evaluation and Research Methods. SAGE Publications, Inc., Thousand Oaks, CA, USA.
[48] Victoria Pearson, Katharine Lister, Elaine McPherson, Anne-Marie Gallen, Gareth Davies, Chetz Colwell, Kate Bradshaw, N St J Braithwaite, and Trevor Collins. 2019. Embedding and Sustaining Inclusive Practice to Support Disabled Students in Online and Blended Learning. Journal of Interactive Media in Education 2019, 1 (June 2019), 4.
[49] Danielle Peers, Nancy Spencer-Cavaliere, and Lindsay Eales. 2014. Say what you mean: Rethinking disability language in Adapted Physical Activity Quarterly. Adapted Physical Activity Quarterly 31, 3 (2014), 265–282.
[50] Beth Pickard, Grace Thompson, Maren Metell, Efrat Roginsky, and Cochavit Elefant. 2020. It’s Not What’s Done, But Why It’s Done. Voices: A World Forum for Music Therapy 20, 3 (2020), 19.
[51] Jesse Rathgeber. 2019. Troubling Disability: Experiences of Disability in, through, and around Music. Arizona State University, Tempe, AZ.
[52] Lidia Ripamonti. 2016. Disability, diversity, and autism: Philosophical perspectives on health. The New Bioethics 22, 1 (2016), 56–70.
[53] Amanda Ross and Victor L Willson. 2017. Independent samples T-test. In Basic and Advanced Statistical Tests. SensePublishers, Rotterdam, 13–16.
[54] Marrok Sedgwick. 2021. Review of Crip Camp co-directed by James LeBrecht and Nicole Newnham. Disability Studies Quarterly 41, 1 (2021), Online.
[55] Ather Sharif. n.d. athersharif/metis: React HOC to use Google Sheets as a Makeshift Database. https://github.com/athersharif/metis. (Accessed on 03/30/2022).
[56] Ather Sharif, Sanjana Shivani Chintalapati, Jacob O. Wobbrock, and Katharina Reinecke. 2021. Understanding Screen-Reader Users’ Experiences with Online Data Visualizations. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 14, 16 pages. https://doi.org/10.1145/3441852.3471202
[57] Ather Sharif and Babak Forouraghi. 2018. evoGraphs — A jQuery plugin to create web accessible graphs. In 2018 15th IEEE Annual Consumer Communications & Networking Conference (CCNC). IEEE, Las Vegas, NV, USA, 1–4. https://doi.org/10.1109/CCNC.2018.8319239
[58] Ather Sharif, Olivia H. Wang, Alida T. Muongchan, Katharina Reinecke, and Jacob O. Wobbrock. 2022. VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-In. In CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 478, 19 pages. https://doi.org/10.1145/3491102.3517431
[59] Mark Sherry. 2004. Overlaps and contradictions between queer theory and disability studies. Disability & Society 19, 7 (2004), 769–783.
[60] Carmit-Noa Shpigelman and Carol J Gill. 2014. How do adults with intellectual disabilities use Facebook? Disability & Society 29, 10 (2014), 1601–1616.
[61] John Sommers-Flanagan and Rita Sommers-Flanagan. 2012. Clinical Interviewing: 2012–2013 Update. John Wiley & Sons, Hoboken, NJ, USA.
[62] Katta Spiel, Christopher Frauenberger, Os Keyes, and Geraldine Fitzpatrick. 2019. Agency of autistic children in technology research—A critical literature review. ACM Transactions on Computer-Human Interaction (TOCHI) 26, 6 (2019), 1–40.

[63] Katta Spiel, Oliver L Haimson, and Danielle Lottridge. 2019. How to do better with gender on surveys: A guide for HCI researchers. Interactions 26, 4 (2019), 62–65.
[64] Student. 1908. The probable error of a mean. Biometrika 6, 1 (1908), 1–25.
[65] The Allen Institute for Artificial Intelligence. n.d. Semantic Scholar | About Us. https://www.semanticscholar.org/. (Accessed on 04/05/2022).
[66] Tanya Titchkosky. 2001. Disability: A rose by any other name? “People-First” language in Canadian society. Canadian Review of Sociology/Revue canadienne de sociologie 38, 2 (2001), 125–140.
[67] U.S. Department of Justice. n.d. Americans with Disabilities Act of 1990, AS AMENDED with ADA Amendments Act of 2008. https://www.ada.gov/pubs/adastatute08.htm. (Accessed on 04/05/2022).
[68] Yuli Vasiliev. 2020. Natural Language Processing with Python and SpaCy: A Practical Introduction. No Starch Press, San Francisco, CA.
[69] W. N. Venables and B. D. Ripley. 2013. Modern Applied Statistics with S-Plus. Springer, New York, NY.
[70] Giacomo Vivanti. 2020. Ask the editor: What is the most appropriate way to talk about individuals with a diagnosis of autism? Journal of Autism and Developmental Disorders 50, 2 (2020), 691–693.
[71] W3C. n.d. WAI-ARIA Overview | Web Accessibility Initiative (WAI) | W3C. https://www.w3.org/WAI/standards-guidelines/aria/. (Accessed on 04/11/2021).
[72] WebAIM (Web Accessibility in Mind). n.d. WebAIM: Contrast Checker. https://webaim.org/resources/contrastchecker/. (Accessed on 03/30/2022).
[73] Elizabeth A West, Darlene E Perner, Linda Laz, Nikki L Murdick, and Barbara C Gartin. 2015. People-First and Competence-Oriented Language. International Journal of Whole Schooling 11, 2 (2015), 16–28.
[74] Americans with Disabilities Act (ADA). 1990. Americans with Disabilities Act of 1990, Pub. L. No. 101–336, 104 Stat. 328.
[75] Beatrice A Wright. 1960. Physical Disability: A Psychological Approach. Harper & Row Publishers, New York, NY, USA.

A LANGUAGE PREFERENCES BY DISABILITY

Figure 5: Screen capture from our accessible web platform showing a bar chart that displays the language preferences of disabled people between identity-first language (IFL), person-first language (PFL), and no preference (NP) by disability categories.

B LANGUAGE PREFERENCES BY AGE GROUP

Figure 6: Screen capture from our accessible web platform showing a bar chart that displays the language preferences of disabled people between identity-first language (IFL), person-first language (PFL), and no preference (NP) by age group.

C LANGUAGE PREFERENCES BY GENDER IDENTITY

Figure 7: Screen capture from our accessible web platform showing a bar chart that displays the language preferences of disabled people between identity-first language (IFL), person-first language (PFL), and no preference (NP) by gender identity.

D LANGUAGE PREFERENCES BY COUNTRY

Figure 8: Screen capture from our accessible web platform showing a bar chart that displays the language preferences of disabled people between identity-first language (IFL), person-first language (PFL), and no preference (NP) by country.

E LANGUAGE PREFERENCES OVER TIME

Figure 9: Screen capture from our accessible web platform showing a line chart that displays the language preferences of disabled people between identity-first language (IFL), person-first language (PFL), and no preference (NP) over time. (As we started collecting survey results only a few months ago, the data, at present, may not be sufficient to draw any conclusions. However, we hope that tracking the survey data over time will reveal patterns and results worth exploring.)
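The alt-text that accompanies charts like these is generated dynamically from the underlying data, as described for the platform's graphs earlier in the paper. Below is a minimal sketch of such a generator; the function name, the counts, and the exact wording template are illustrative assumptions modeled on the bar-chart alt-text quoted in the paper, not the platform's actual implementation. The resulting string would then be attached to the chart via an ARIA attribute such as aria-label.

```python
def bar_chart_alt_text(counts: dict[str, dict[str, int]]) -> str:
    """Build an alt-text summary for a preferences-by-disability bar chart.

    `counts` maps a group name to counts for "IFL", "PFL", and "NP"
    (no preference); the summary names each group's dominant
    preference, mirroring the template quoted in the paper.
    """
    label = {"IFL": "identity-first language",
             "PFL": "person-first language",
             "NP": "no preference"}
    parts = []
    for group, prefs in counts.items():
        top = max(prefs, key=prefs.get)  # dominant preference for this group
        parts.append(f"{group} prefer {label[top]}")
    return ("Bar chart showing counts for identity-first, person-first, "
            "and no preference per disability. " + ", ".join(parts) + ".")

# Illustrative counts, not the survey's data.
alt = bar_chart_alt_text({
    "People with mobility disabilities": {"IFL": 20, "PFL": 31, "NP": 8},
    "visually disabled people": {"IFL": 42, "PFL": 17, "NP": 9},
})
```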
Assistive or Artistic Technologies? Exploring the Connections
between Art, Disability and Wheelchair Use
Giulia Barbareschi
Keio Graduate School of Media Design, Yokohama, Japan
barbareschi@kmd.keio.ac.jp

Masa Inakage
Keio Graduate School of Media Design, Yokohama, Japan
inakage@kmd.keio.ac.jp

ABSTRACT

Art has deep connections with both disability and HCI research. From disabled bodies becoming avatars of novel forms of expression, to artistic work being created as an act of resistance, art has been a powerful tool to subvert ableist narratives. Artistic practices have also helped to inspire, innovate and push the boundaries of HCI, giving rise to new technologies and interaction possibilities. Our paper presents the exploration of the experiences and practices of 17 artists who used wheelchairs for mobility. Through the thematic analysis of interviews, we conceptualize three themes: (1) Personal journeys through art and disability; (2) Social encounters through art; (3) Skills and technology in art making. From these themes, we reflect on how art can help HCI researchers to capture the complexity of the experiences of disability and assistive technology use, and how collaboration with disabled artists could help to rethink the design of disruptive artistic technologies.

CCS CONCEPTS

• Human Centered Computing; • Accessibility; • Accessibility theory, concepts and paradigms;

KEYWORDS

Disability, Art, Wheelchairs, Creativity

ACM Reference Format:
Giulia Barbareschi and Masa Inakage. 2022. Assistive or Artistic Technologies? Exploring the Connections between Art, Disability and Wheelchair Use. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544799

1 INTRODUCTION

The relationship between art and disability is powerful and complex [56]. Depending on the artist and the socio-historical context, art has helped the voice of people with disabilities to be heard or silenced,

researchers have a “second life” as artists that enables them to create deeper connections with stakeholders, and promotes a more instinctive form of knowledge. In a similar fashion, Andersen et al. 2018 [2] have explored the role of “disruptive improvisations”, material and procedural explorations used to question and creatively problematize mainstream and habitual thinking patterns, to foster novel insights. More generally, the majority of speculative and provocative design futuring practices used by researchers in HCI encompass methods with strong roots in different forms of art [46].

Despite these overlaps, within the context of assistive technology and accessibility, researchers have only begun to investigate how the use of assistive and accessible technologies contributes to the creative expressions of disabled artists. Similarly, it is unclear how disability art could disrupt the design of assistive and accessible technologies by leveraging provocative narratives that reshape the perception of disability. A number of accessibility researchers have investigated the barriers that creative professionals and artists with disabilities encounter in their work [1, 4, 21, 69]. Others have leveraged novel technologies to create radically different interaction modalities that make artistic creation processes more accessible to people with different capabilities [17, 21, 39]. These lines of enquiry are essential to increase the accessibility of the art world, enabling disabled artists to overcome existing barriers and allowing for more intuitive and adaptable ways of expressing themselves regardless of one's capabilities. However, they fall short of highlighting what is unique and disruptive about the existing practices of disabled artists, what role their assistive technologies play in their artistic expressions, and how these learnings could serve to push the boundaries of current research around assistive and accessible technologies.

In this paper we propose a different approach to re-think accessibility and assistive technology research in HCI through the lens of art. Through a series of semi-structured interviews with 17 artists who use wheelchairs for mobility, we examine their practice across the domains of visual art, music, dance, crafting and
disabled bodies to be seen as aesthetically beautiful or ugly, and
performance focusing on the infuences that their experiences of
Crip narratives to be stereotyped or reclaimed [34, 48, 49, 70, 76].
disabilities and wheelchair use have on their art. Specifcally, we
Art is also deeply interconnected with HCI research [41, 43, 73,
framed our research around three research questions:
81]. The pictorial by Sturdee et al 2021 [81] shows how many HCI
• How do embodied and social experiences of disability afect
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed the practices of wheelchair-using artists?
for proft or commercial advantage and that copies bear this notice and the full citation • What role do wheelchairs and other assistive technology
on the frst page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
play in their practice of art?
to post on servers or to redistribute to lists, requires prior specifc permission and/or a • How do the practice of wheelchair-using artists be leveraged
fee. Request permissions from permissions@acm.org. to foster new paradigms for accessibility research?
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00 Through refexive thematic analysis we conceptualize three dif-
https://doi.org/10.1145/3517428.3544799 ferent themes:
• Personal journeys through art and disability: art, disability, and wheelchair use are embodied experiences that continually interact and affect each other, shaping internal meaning and external manifestations. The result is a deeper personal connection with one’s body and a richer artistic practice.
• Social encounters through art: social aspects of artistic practices range from enriching professional collaborations and interactions with supportive communities, to navigating access barriers and facing ableist discrimination. Resulting acts of resistance include advocacy, mentoring, and outreach within and beyond the art world.
• Skills and technology in art making: different dynamics of movement that emerge in connection to disability shape the process through which artists create and perform. The use of technology, the development of specific competencies, and the integration of wheelchair experiences are woven together to shape unique creative strategies.

Based on these findings we elaborate on how interacting with disabled artists can help HCI researchers to develop a more nuanced understanding of the many intersectional factors shaping the experience of disability, access, and assistive technology use [18, 35, 74, 84, 86, 88]. Finally, we invite researchers to move beyond dichotomies of impairment-based and ability-based technology and engage with artists towards the creation of artistic technologies that enhance the unique competencies, interaction modalities and creative possibilities that emerge as a result of disability and assistive technology use [13, 14, 42, 67, 79].

In summary, our contributions in this paper include: (1) a first exploration of how disabled artists integrate their personal experiences of disability and wheelchair use in various artistic practices including visual art, dance, music, crafting and performance; (2) a series of reflections on how the work of disabled artists can foster a deeper understanding of embodied and social experiences of disability, and assistive technology use; (3) insights on leveraging work with disabled artists to challenge current assumptions around the role of accessibility and assistive technology, and re-imagine technology that enhances the uniqueness of disabled users.

2 RELATED WORK

In the following sections we provide a brief review of research examining the practices of disabled artists, focusing on both first-hand accounts of creative processes and the broader social role of art in questioning disabled narratives. We also provide a summary of HCI literature that specifically looked at the use of assistive and accessible technology in the context of art. Although these short reviews are not exhaustive, they serve to provide a contextual understanding of our study involving wheelchair-using artists and help to frame the reflections and implications presented in the paper.

It is important to highlight that throughout the paper we interchangeably use identity-first and person-first language. This is to recognize the different preferences of individuals and disability communities and that, while certain artists strongly identify as disabled and see their work as being part of artistic movements specific to disability, others do not, preferring to position their work away from Disability Culture [9, 25, 82].

2.1 Disability and art practice between social and creative narrative

Solvang 2018 argues that the intersections between art and disability can be examined using different frames: art therapy, outsider art, disability art, and disability aesthetics [78]. These different discourses are not mutually exclusive and often coexist, but each of them carries a number of specific implications.

When disabled people’s artistic activities are seen through the lens of therapy, the focus is on understanding the value of the practice to the individual, or specific group of disabled people engaged in the creative process [22, 78]. In a variety of contexts, individuals with disabilities have been shown to benefit from engaging in projects that seek to leverage art to improve one’s physical and psychological wellbeing, or to increase self-esteem and promote community empowerment [40]. However, the art therapy framework is connected to a predominantly medicalized view of disability, belittling the artistic value of the work produced by disabled people and favoring ableist discrimination within the mainstream art community [77, 78]. In contrast, the framing of outsider art seeks to highlight how the perceived “dysfunctionality of artist” (primarily in connection to their mental and social status as outsiders) can produce bolder, more intuitive and powerful art expressions that are unique to these outcast groups [47, 77]. However, outsider art has been criticized due to its tendencies to reinforce the othering of disabled artists and to promote exploitation by curators and art dealers who have a vested interest in maintaining the status of artists as outsiders [47, 62].

Disability Art is defined by Solvang 2018 as encompassing the work of disabled artists that is specifically informed by their experience of disability, and is often rooted in the social dynamics of identity, disability culture, and the struggle for disability justice and equality [55, 77]. The close connections of disability art with the disability rights and disability justice movements have led to the creation of powerful works of art that seek to reframe the disability discourse and question ableist narratives [25, 30, 37, 72, 78]. Yet, the specificity of disability art has also had the counter-effect of sidelining the work of disabled artists to socio-political activism, resulting in challenges to receive adequate recognition from the mainstream artistic world [77, 82, 87]. Finally, the discourse related to disability aesthetics can help to reclaim the visibility of disability in mainstream art, in particular visual and performance arts, through the depiction of disabled bodies as both beautiful and inspiring [48, 78]. Disability aesthetics does not necessarily carry a social meaning of inclusion, but leverages the uniqueness of disabled bodies as a means to re-imagine artistic expressions in bold and innovative ways [48, 76, 78].

Much of the literature on art and disability is focused on broader societal discourse. However, more recently, researchers have displayed a growing interest towards the experiences of disabled artists and have attempted to unpick the complex relationship between their overlapping identities [5, 22, 38, 82, 85]. For example, Bang and Kim 2015 [5] conducted questionnaires and interviews with 12 Korean disabled artists, including writers, visual artists, musicians,
and performance artists to understand their motivations, creative processes, and the barriers experienced throughout their careers. Results show how their journeys to become professionals were riddled with difficulties ranging from personal struggles with pain or functional impairments, to barriers accessing formal art education, and prejudice and discrimination that negatively influenced their confidence as individuals and professional artists [5]. Considerations around disability stigma drove certain artists to highlight social components of their work, situating their artistic endeavors in the Disability Art movement, whereas it led others to purposefully choose to not disclose their disability status, in order to not be labelled as a disabled artist [5].

On the other hand, the interviews with 47 young artists conducted by Sulewski et al. 2012 show a more nuanced picture [82]. Identities as artists were shaped by a variety of factors: the presence of role models in the family, the support and encouragement of teachers, or the inspiration that emerged when interacting with the work of famous artists. Similarly, their disability identity was shaped by personal experiences facing social and environmental challenges or navigating their own impairments, as well as their pride in the broader disability culture [82]. The intersection of these two identities was often dynamic and complex, with art being used as a way to claim and reinforce their own identities as disabled, or promote disability pride. Some participants explained how art represented a therapeutic outlet that allowed them to better elaborate personal experiences of emotional and physical suffering [82]. In some cases, young artists felt that these two identities had little overlap, and simply saw themselves as artists who happened to have a disability [82].

In their recent review on the social role of disabled artists and their work, Wolbring & Jamal Al-Deen 2021 attempted to capture an additional dimension of the art-disability relationship by specifically looking at the intersections with technology and science [85]. Their findings showed that there is anecdotal evidence of artists leveraging different technologies to create and perform, enhancing their artistic expressions and providing better ways to connect with the audience [85]. However, the authors highlighted how, despite the proliferation of artistic work showcasing synergies and tensions between art, disability and technology use, there is still little academic research examining how the use of technology helps to shape the work of disabled artists and how, in turn, disabled artists could help to shape the development of disruptive new technologies [85].

Although studies focusing specifically on the experiences of artist wheelchair users are rare, some poignant examples are provided by both Linton 2021 [50] and García-Santesmases Fernández & Arenas Conejo 2017 [27]. The former highlights how, despite the widespread existence of numerous physical barriers in the world of theatre, the greatest obstacles faced by wheeling actors are the misguided negative attitudes of producers and fellow actors. Regardless of barriers, the author presents several examples of extremely successful performances of wheeling actors who were cast for roles that were not necessarily written to include a wheelchair, without the need for any change in the original script [50]. The latter study describes examples of how the creative performance of a disabled actor using a wheelchair to immobilise a non-disabled actor has been leveraged to subvert expectations around mobility, able and disabled bodies [27].

2.2 HCI literature on assistive and accessible technology use in the art

Within the domain of assistive technology and accessibility research in HCI, there are several examples of studies focusing on artistic practice. Many of these studies are specifically focused on the domain of visual arts and mainly investigate the access barriers encountered by artists with disabilities and the role of technology in addressing, or reinforcing, them [4, 19, 20, 60]. For example, both Perera et al. 2007 and Creed 2018 engaged with physically impaired artists to understand how they leveraged digital and non-digital technologies to produce their work, and what difficulties emerged through these interactions [20, 60]. Findings showed how artists often used different technologies to address limitations connected to their own impairments, but existing interaction modalities often forced them to perform cumbersome repetitive tasks and unnatural movements that could generate fatigue and injuries [20, 60]. Technology limitations also limited their independence, which could create unnecessary limitations about the time and modality of work depending on the availability of help [20, 60].

In a similar fashion, the recent study by Payne et al. 2020 investigated the practices of visually impaired composers, music producers, and songwriters to understand the accessibility of currently available technologies [59]. Results highlighted how, despite their facilitating role, technologies for composition and production still featured significant access barriers that required participants to use custom-made scripts, and develop DIY solutions and workarounds to successfully accomplish their goals. These limitations created artificial ceilings that hampered creativity, complicated collaborations between musicians with and without sight, and restricted one’s operational choices [59].

Rather than investigating the barriers faced by disabled artists, a number of researchers have focused on developing technological solutions that could make artistic practices more accessible, often leveraging embodied interaction modalities [3, 4, 17, 19, 32, 39, 45, 66]. Some of these designs were developed to address difficulties linked to specific tasks, such as the HapticEQ by Karp & Pardo 2017 that allowed visually impaired music producers to create and modify equalization curves in an intuitive manner [45]. Others, such as the AirSticks by Ilsar & Kenning 2020, aimed to be more flexible and adaptable, enabling users with different disabilities to engage in a variety of different musical improvisations according to their own taste and preferred interaction modalities [39].

Most of the accessible technologies for art developed by HCI researchers are targeted towards novices with an interest in art, rather than professional artists [19, 39, 58, 66]. For example, Ragone 2020 developed OSMoSIS, an interactive system that leverages motion tracking technologies to allow autistic children to create different sounds through body movements [66]. However, technologies such as the voice-controlled system for creative object positioning developed by Aziz et al. 2021 have been created and tested with the needs of both novice and expert graphic and interface designers with physical impairments in mind [4].
Examples of creative interactions between disabled individuals and technology have also been explored by HCI scholars focusing on crafting practices. The experience report by Hawthorn & Ashbrook 2017 [33] illustrates how the creative act of developing one’s own prosthetic hand turns the body itself into a living prototype and the artist into a self-making cyborg. Even when the differences generated by one’s unique creative interactions might be impossible to perceive by others, the visually impaired weavers who took part in the study conducted by Das et al. 2020 [23] explained how individual making practices leave a personal signature that will always be recognizable by the artist.

Although there are exceptions to this, much of the existing research around the use and creation of accessible and assistive technology for artistic expressions is primarily centered around visual art, making, and music. This is interesting, especially if we consider the strong embodied connections of many aspects of the experience of disability and assistive technology use, often described as an extension of the body, which might be more prominent in dance and performance art [11, 24, 54, 68, 79]. Moreover, existing studies largely focus on the functional aspects of the interaction between the artist and the technology, often focusing on the negative connotations of access barriers, without looking at how the use of technology by disabled artists might create unique artistic expressions and generate social implications around the perception of disability more generally [31, 35, 36, 61].

3 METHODS

3.1 Participants

In total, 17 artists who used wheelchairs for mobility were recruited for the study. The choice of focusing our research on artists that used wheelchairs for mobility was driven by a combination of factors. Firstly, we assumed that using a wheelchair for mobility would have affected several aspects of artistic creation, regardless of the particular field of the artist, which we believed could enable individuals to experiment and explore with different creative strategies. Secondly, wheelchair users are an extremely diverse group that encompasses individuals who can have extremely different capabilities, experience different limitations, and have different preferences. While this is true for most “sub-groups” of disabled people, we felt that the additional diversity created by the wide variety of wheelchair types available across different countries could further contribute to the development of unique creative paradigms. Finally, the choice was also motivated by the positionality of the first author, who has previously worked as a physiotherapist and has extensive experience conducting HCI research with wheelchair users. Increased understanding of how wheelchair-using bodies move enabled us to explore in more depth participants’ artistic practices.

Artists were recruited through a variety of strategies ranging from social media posts, direct approach through personal websites, introduction from the personal and professional network of the authors, approach through national and international disability art organizations, and word of mouth. Recruitment criteria were self-identification as an artist, age above 18, and use of a wheelchair in daily life. No restrictions were introduced concerning the field of art practiced, or the professional status of the artist. Similarly, no limitation on the type of wheelchair used, modality, or length of use was introduced in the inclusion criteria. Seven participants identified as having multiple disabilities (P1, P2, P5, P6, P11, P14, P15), and nine stated they used multiple assistive technologies, beyond the wheelchair, in their everyday lives (P1, P2, P5, P6, P8, P9, P11, P14, P15). A summary of participants’ characteristics is provided in Table 1.

Potential participants were approached by the first author and informed about the purpose of the research. If interested in taking part in the study, the research team provided them with the consent form and invited them to participate in an interview. According to the preferences and access needs of participants, the interview could take place synchronously (through phone or video call using MS Teams or Zoom), or asynchronously (in a written format). All participants but one, P14, chose to take part in a synchronous interview. Another participant, P11, who communicates nonverbally, chose to participate in the interview alongside his father, who was also his artistic collaborator.

3.2 Materials & Procedure

Most participants preferred to be interviewed in English, but P11 asked for the interview to be conducted in Italian, and P9 in Japanese. The first author who conducted the interviews is fluent in Italian and the second author is fluent in Japanese, which made it possible to easily conduct the interview in the language requested by the artists. Ahead of the interview participants were provided with a consent form for the study in their preferred language, and written or verbal consent was obtained before the start of the interview. The duration of synchronous interviews varied between 40 and 96 minutes. For the asynchronous interview, P14 was emailed the questions by the first author, and provided the answers in written format. Clarifications were requested as appropriate to ensure that the experience of the artist could be adequately captured. The interview guide was structured across four different areas, with the goal to capture the complexity of the personal and creative experiences of artists: details of the artistic practice, use of a wheelchair in everyday life, integration of the wheelchair and other technologies in the artistic practice, and public perceptions of wheelchair use and art. A full copy of the interview guide is provided as supplemental material. Participants were also asked to share any audio, visual, or written material that could help illustrate their personal experiences as disabled artists and the particular characteristics of their artistic practice. This material was often part of the personal portfolio of the artists and, in accordance with participants’ wishes, it will not be made available in connection to this publication to protect their privacy.

3.3 Analysis

Synchronous interviews were audio recorded using a portable recorder and transcribed verbatim by the first author. The Japanese and Italian interviews were first transcribed in their original language and then translated into English. The corpus of data included interview transcripts (or written responses to the interview questions in the case of P14), supplemented by digital photographs and videos provided by participants, and field notes taken by the first author. The data was analyzed thematically using an inductive approach with
Table 1: Summary of Participants

Participant | Art field | Country | Primary wheelchair | Wheelchair experience (y) | Daily wheelchair use | Disability onset | Artist before wheelchair use (Y/N)
1 | Digital Visual Art | UK | Electric and power assisted wheels | 42 | Part-time | Acquired | No
2 | Sculpture and installation | UK | Electric and attendant propelled manual | 21 | Full-time | Acquired | Yes
3 | Baking and Cake Design | Canada | Manual self-propelled | 14 | Full-time | Acquired | No
4 | Dance | US | Manual self-propelled | 28 | Full-time | Acquired | No
5 | Dance and Performance Art | Australia | Manual, both self and attendant propelled | 5 | Part-time | Acquired | Yes
6 | Dance and Music | UK | Manual self-propelled | 3 | Full-time | Acquired | Yes
7 | Dance and Music | US | Manual self-propelled | 7 | Full-time | Acquired | Yes
8 | Dance | UK | Manual self-propelled | 9 | Part-time | Acquired | Yes
9 | Dance and Circus Art | Japan | Manual self-propelled | 30 | Full-time | Congenital | No
10 | Music | Canada | Electric | 46 | Full-time | Acquired | Yes
11 | Painting | Italy | Electric | 30 | Full-time | Congenital | No
12 | Music | Canada | Manual self-propelled | 33 | Full-time | Acquired | Yes
13 | Painting | US | Manual self-propelled | 35 | Full-time | Acquired | No
14 | Combined Arts | UK | Manual and electric | 23 | Part-time | Acquired | Yes
15 | Dance | US | Manual self-propelled | 34 | Full-time | Acquired | No
16 | DJ-ing and Music | India | Electric and Manual | 27 | Full-time | Congenital | No
17 | DJ-ing | India | Manual | 5 | Full-time | Acquired | No

the initial coding carried out by the first author and discussed between both authors to define themes [16]. During the conceptualization of themes, the researchers consulted with the artists on multiple occasions, to ensure that the interpretation of the data (by a non-disabled researcher with no experience as a professional artist and a second non-disabled researcher with experience as an interactive artist) captured the meaning of their own experiences. Once the analysis was completed, the researchers provided a summary of results to the artists to verify the representativeness of the themes. Although participants did not receive additional compensation for their feedback, most artists stated that the possibility to contribute to the analysis had been empowering, as it gave them more agency over the outcomes of the study. Suggested changes were discussed between the researchers and artists to develop a joint interpretation of the themes that could reflect the complexity of the experience of different artists.

4 RESULTS

As explained in the methods section, the following themes were conceptualized through an iterative collaborative process between the researchers and the artists. The goal was to provide a comprehensive and shared understanding of the complex relationships between creative expressions and experience of disability, paying attention to the specific role of assistive and accessible technologies.

4.1 Personal journeys through art and disability

Although the individual journeys that led each participant to the decision to pursue an artistic practice and claim their identities as wheelchair users and/or people with disabilities were deeply rooted in their own personal experiences, we identified several common elements. Identities as artists and disabled people and/or wheelchair users were generally formed separately. However, these identities were also deeply interconnected. For many, their combined identities as wheelchair users and disabled individuals significantly
affected the artistic practice in ways that were not necessarily conscious or immediately visible. Ultimately, the shared belief was that disability and wheelchair use led an artist to deeper, albeit often challenging, life experiences that created a more complex connection to the body and a richer creative practice.

4.1.1 Becoming an artist. Most participants spoke of their decisions to become an artist as something that occurred at a very young age. This decision was sometimes inspired by a member of their social circle, who was an artist. Other participants shared stories about how interacting with works of famous artists captured their imagination to the extent that it motivated them to pursue an artistic practice themselves (“When I was very, very young, I think about five years old I watched a ballet performance in [name of city in the UK]. And I went to see Swan Lake. And apparently, I couldn’t take my eyes off the stage. So I think that’s kind of kind of where it started.” – P8 Dancer). Although the initial interest towards art was strong for all participants, the decision to become a professional artist often matured over the years. Moreover, the interest towards their chosen field of art was often accompanied by a multifaceted passion for other creative endeavors which were pursued for leisure.

Interestingly, both P1 and P5 described their artistic journey as a personal calling rather than a conscious decision. The strength of their imagination, or the ways in which their personal neurocognitive patterns were articulated, made art the only logical way to express themselves. In a way, both participants felt that they had been born artists rather than having become one.

“I think fundamentally, there’s something about the way that my brain is wired, the way my neurodiversity exists, the way I sensorially exist in the world, that meant that art felt like a really natural form of expression for me, and natural way of connecting and communicating and it’s also where I felt the most myself in the most alive” – P5 Dancer and Performance Artist

Although the majority of participants had an interest towards art from a relatively young age, others did not. Both P9 and P15 had never thought about pursuing a career as a dancer until they started to take classes for leisure. On the other hand, P11 and his father began painting as a playful family bonding activity that became more serious as they gained more recognition and felt that the nature of their work was becoming more deliberate and focused. Finally, P14 stated that they did not initially recognize themselves as an artist, but that some of their previous research and documentary work had been labeled as such by others.

4.1.2 Becoming Disabled. The large majority of participants (see Table 1) had acquired their disability throughout their life as a result of an accident, or an illness. Regardless of the individual circumstances, the onset of an acquired disability was described as a traumatic event that completely changed one’s life trajectory.

“What is the cause of this, why I’m not able to move my legs? I just started my life.” – P17 DJ

While some of the participants had a traumatic accident or a sudden illness that led them to become wheelchair users, others started to use a wheelchair as a result of a progressive condition. The transition from walking, with or without other kinds of mobility aids, to using a wheelchair was often resisted, largely due to self-stigma. However, as they began to use a wheelchair, participants felt that they had found a tool of freedom, something that could enable them to claim back not only their mobility, but also their life aspirations.

“My chair enables me to live. My chair is... So, I fundamentally I’m a dancer. That’s my core. That’s my essence, it’s who I am. So my chair enables me to live my purpose as a dancer” – P6 Dancer and musician

Interestingly, while most participants identified themselves both as wheelchair users and disabled, others did not. P14 for example described themselves as a disabled person (using the social model of disability), and only as a wheelchair-using artist in the context of dance performance. In contrast, P12 described himself as someone who had a spinal cord injury and used a wheelchair but did not identify as disabled. For some participants, claiming their identity as disabled and wheelchair users had been complicated by internalized notions of ableism (“I had all these notions about disability at first, and it was super scary. I didn’t want to be a part of that world” – P4 Dancer). However, for others it was made difficult by prejudices within the disabled community and the difficulties that emerged as a result of other intersectional identities.

“I once went to a really big event, where there was sort of celebrating disability, but I felt really misplaced, because I’d never chained myself to a bus or anything like that. But there were like, a hierarchy of people that had. And if you weren’t one of those people, you hadn’t really tried hard enough. And I thought, well, that’s wrong. Because, you know, I came from this background where opportunities were really difficult” – P1 Digital Artist

In contrast to other participants, P9, P11 and P16, who acquired their disability at birth, spoke of their life journeys through disability as more linear. This included their transition to using a wheelchair, which was seen as a natural process, as much as learning to walk could be for a child without a motor impairment.

4.1.3 Interconnections between art and disability. Although personal journeys that led participants to claim their identities as artists and disabled people/wheelchair users were seen as being generally separated, they were also deeply intertwined. After the onset of acquired disabilities, the connections with previously existing artistic practices could be used with an almost therapeutic goal. P4 for example spoke about art being integrated in her occupational
This sudden transition was often associated with shock and grief therapy to regain strength after a spinal cord injury. On the other
as one struggled to understand how to resume their life. hand, P7 leveraged her own artistic practice as a way to process
“When I got into this injury, I will not run from it. But for trauma and grief:
six to eight months, I was into too much of a depressive “In the beginning of my injury, I was processing a lot of
state. I was only 23 years old and just lying on the bed grief, a lot of trauma. I also was not very mobile. So a
and trying to understand what exactly had happened. lot of the ways that I passed the time and was able to
Assistive or Artistic Technologies? Exploring the Connections between Art, Disability and Wheelchair Use ASSETS ’22, October 23–26, 2022, Athens, Greece

sort of just emotionally processing thing was through art." – P7 Dancer and Musician

Some of the participants who had an artistic practice before they acquired their disability recalled having a complete rejection of art for a certain period of time. This creative hiatus was often sustained by the fear that their functional impairments would make it impossible to resume practicing art in the same way they did before ("I tried getting my drums back going, but I didn't imagine that you could play with less than three limbs" – P12 Musician). Resuming their artistic practice was often challenging, with individuals facing barriers that were both personal and projected by others. Ultimately, strong personal motivations, significant resilience, and substantial trial and error enabled artists to find ways to return to their practice.

The experience of acquiring or being born with a disability had a significant impact on one's life, and artists found a range of creative ways to integrate this experience in the way they danced, made music, or created visual art. For example, P9, who was born with spina bifida, explained:

"When the doctor took out a piece of vertebra after a surgery, he told me that the fat spread after taking out the vertebra looked like a blooming flower. Inspired by this, I came up with an action like making flowers bloom with my hands behind my back. There are two versions of this work one is to crawl on the ground with only hands, and the other is to perform on the wheelchair with exactly the same choreography." – P9 Dancer and Circus Artist

The experience of disability and wheelchair use could affect both the modality in which one created art and the meaning underlying one's artistic expression. Themes around disability and assistive technologies would find ways to enter artists' work in both subtle and deliberate ways. This could be done in provocative ways with the aim of challenging stereotypes, and at other times simply to promote diverse representation.

"I'm just more aware of the fact that like inclusivity is important to me in terms of disability. So, you know, if I'm doing something that has numerous people in it, I'll make sure to include a silhouette of somebody in a chair as well or similar" – P3 Baker and Cake Designer

Regardless of the specific ways in which artists felt that their disability and their use of wheelchairs affected their artistic practice, all of them described this influence in positive terms. For some, acquiring a disability led them to commit to their career as artists. Others spoke of the ways in which mobility impairments that limited their possibilities drove them to explore in more depth the movement and techniques which were accessible to them. In contrast, others became more instinctive in their practice, which led them to create art that could connect with their audience in more meaningful ways.

"Does my wheelchair influence my painting style? Yes, absolutely. Am I a better painter in the wheelchair, than if I was walking? Absolutely" – P13 Painter

4.2 Social encounters through art

Participants described how their artistic practices were strongly influenced by social aspects. Some participants had active collaborations where they practiced, performed or experimented with other artists. Social connections occurred both within the disabled and the art community and beyond. Depending on the context, these connections could have either positive and empowering connotations or be riddled with ableist discrimination.

4.2.1 Connections and collaborations. While some artists worked primarily alone, many collaborated with others on a regular basis. For example, P10 and P12 both played the drums in multiple bands. As part of his training as a DJ and electronic music composer, P16 had frequent interactions with a mentor who supported his efforts in music production. P4, P6, and P15 were regular members of inclusive dance troupes, whereas P5, P9, and P14 had performed with other artists and dancers as part of specific projects. The deepest example of collaboration was probably the one between P11 and his father. Together they described themselves as a symbiotic pair of artists. P11 is a painter who creates his work by driving his wheelchair, to which brushes dipped in different colors of paint have been attached. His father had worked in the textile industry for a number of years, creating and mixing colors used on different items of clothing. Besides helping his son switch between different brushes during the painting process, attaching them to and removing them from the modified wheelchair as required, P11's father was also the one who applied the resin that locked the colors in place as they worked through the layers of their joint artistic creations. While P11 is the hand, or the wheels, drawing the lines which make up the paintings, he often leverages his father's professional expertise in choosing colors:

"When we work on paintings I ask him to choose which colours he wants to use next, because our taste can be different. But he often just looks at me and points at the colours. He makes me understand that choosing colours is my everyday job, so I should be the one to do it" – Father and artistic collaborator of P11, Painter

Collaborations established by participants could involve artists with or without disabilities. For example, P6 was a member of an inclusive dance troupe, but also did extensive work with other non-disabled dancers. P6 explained how establishing artistic partnerships between dancers who had significantly different practices and bodies could be challenging and often required significant "translation effort" when devising choreographies. However, successful collaborations could lead to transformative discoveries that enriched everyone's understanding of movement.

Many collaborations were largely professional, but in several cases artistic collaborations could also become meaningful friendships. In particular, long-standing collaborations, such as the ones amongst dancers in a troupe or musicians in a band, could become an integral part of one's social support structure. Often, personal and professional aspects of these relationships could positively reinforce each other, ultimately increasing creative drive.

"So our dance company can feel like a little bit like a family, so it is one reason I stuck with dancing. It also opened opportunities to engage in other things like
ASSETS ’22, October 23–26, 2022, Athens, Greece Giulia Barbareschi and Masa Inakage

acting. The more I feel confident with them the more I am willing to try things which are out of my comfort zone" – P15 Dancer

4.2.2 Barriers and prejudice. Although many of the social interactions described by artists had positive connotations, others did not. All participants reported having faced significant barriers throughout their careers, largely in connection with prejudice and ableism. Challenges and discrimination were particularly common at the beginning of one's artistic journey, when trying to obtain training and support to develop skills. Being able to get the proverbial "foot in the door" required significant resilience from the artist. Moreover, even after being able to access training, many participants were expected to prove themselves to show that they deserved to be given access to opportunities that would be easily granted to someone who did not have a disability.

"When I started to apply for pastry schools, the first school I contacted, was an accessible building, but said they weren't willing to accommodate somebody in a wheelchair. So just like the inconvenience, I guess, of having me that they weren't up for. The second place, I tried to take some classes with said that I'd have to pay for two spots, because I'd take up double the amount of room of an able bodied person" – P3 Baker and Cake Designer

Ableism could manifest in different ways. One of the most common ones was mistrust of one's ability. For example, P11's father recalled being asked on numerous occasions if he "touched-up" his son's paintings as they looked "too beautiful to be painted with a wheelchair". Patronizing comments and inspiration porn were also common, with artists being complimented for their "courage" rather than their skills. The continuous questioning of one's ability could significantly hamper confidence both personally and professionally. Stereotypical beliefs were seen as problematic not only for the individual, but for disabled artists and their outputs more in general, with disability art often being declassed as a social effort rather than art in its own right.

"If I ever, like say that I'm a dancer and someone doesn't know me, or has never seen anything I've done they're like, Oh, okay. . .. Like, in their brain, they're like, how did that person dance? Or they think I'm just going to do it for fun. I think that perception is like super limited." – P4 Dancer

As they became more established, artists would often face less ableist prejudice, but many participants stated that it would never entirely disappear. Although most artists had developed their own strategies and could sometimes rely on the support of close allies to challenge others' misguided ableist beliefs, the need to constantly maintain confident and confrontational personalities was described as exhausting, especially when one was already feeling "knackered because of medication and stuff" (P1). Some of the artists disclosed their disability on their personal websites and art portfolios, whereas others chose not to, as they did not want their work to be necessarily associated with the theme of disability. The audience's unawareness of the artist's disability status could allow for a more open-minded interpretation of the work and, in certain circumstances, enabled participants to solicit unfiltered comments from their audience:

"No one suspects the person in the wheelchair to be an artist. So whenever I went an art gallery that features my work into their exhibitions, I just go around, go up to people and just sort of listening to what they say, and ask them what they think. Because they would never imagine is mine, I can actually get honest feedback" – P2 Sculptor and installation artist

4.2.3 Advocacy, mentoring, and supporting the community. As social barriers and ableist prejudices represented some of the most significant frustrations and difficulties encountered by participants in their personal and artistic lives, acts of resistance aimed at debunking stereotypes and promoting inclusion were seen as extremely important. For some artists, such as P1, P2, P5, P9, and P14, the advocacy intent could sometimes manifest directly in artistic work inspired by outrage and protest, combined with the drive to specifically shock or surprise their audience:

"I don't always get inspired by disability - ideas just come - but sometimes my work is driven by outrage. That's a powerful energising force when you have chronic pain and fatigue." – P14 Combined art artist

Moreover, artists such as P1, P2, P5, P6, P7, P8, P10 and P14 were regularly engaged in various efforts to increase the accessibility of the art world to different marginalised groups within the disability community. For example, P1 and P2 had been engaged in a number of collaborative art projects involving young people with profound learning disabilities. P8, instead, worked with different professional ballet organizations to promote the inclusion of people with disabilities. P5, P6, and P7 had studied extensively to develop ways to make dancing instruction more inclusive towards people with different bodies and capabilities and were engaged in training others to make their dance classes more accessible. Together with another wheelchair musician, and through the support of a number of organizations in Canada, P10 had set up and maintained an accessible music recording studio which has been used by both amateur and professional musicians with disabilities for over 30 years.

"When we set up the music studio, I don't know that we were setting out to prove something but I think we wanted to do something about it. Because we were kind of looking at this difficulty with music practice and thinking that there must be lots of people with disabilities who want to get into music and the arts and, you know, and they weren't just able to do it. I mean it was self-serving, but it was also for a larger community." – P10 Musician

Finally, both P4 and P15 engaged in broader community outreach programmes targeted towards children alongside other members of their dance troupes. As part of these programmes and events they would perform, deliver fun inclusive dance classes, and enable children to ask questions about disability, with the goal of challenging stereotypical beliefs and promoting empowering images. These programmes were described as important for disabled and
non-disabled children alike, as they would help to tackle both public and self-stigma.

4.3 Skills and technology in art making

Both individual impairments and the use of a wheelchair for mobility had a significant impact on participants' movement. Although some functional limitations could create challenges in one's artistic practice, artists experimented with different technologies and their own movement strategies to find new and exciting creative possibilities.

4.3.1 A continuous innovative process. Mobility impairments and other embodied aspects of disability and wheelchair use could be linked to significant challenges in one's artistic practice. Dancers pointed out how jumps and leaps could be difficult to perform on a wheelchair, musicians explained how limited movement or postural issues could hinder their ability to play an instrument, and visual artists with reduced trunk and upper limb mobility would struggle to prepare a canvas or set up an installation. However, participants explained that these limitations only prevented one from making art in the "conventional way". As persons with disabilities who had become proficient in navigating challenges through creativity in their everyday life, participants employed the same innovative and experimental approach in their artistic practices.

Innovating often began by questioning the established ways to create art, looking beyond "how" art is conventionally made with a certain approach towards investigating the meaning one wishes to transmit and developing a way to achieve that goal. This was particularly relevant in the context of dance, where choreographies were often created using a particular vocabulary of movement based on the features and abilities of standing and walking bodies. Dancers and performance artists such as P4, P5, P6, P7, P8, P9, P14, and P15 illustrated how they didn't simply rely on mimicking the movements of a standing dancer while sitting in a wheelchair, but looked at the purpose of each movement and created a completely different vocabulary of movement that would leverage the unique expressive qualities of a wheelchair-using body.

"In the Lecoq pedagogy, there's a lot of animal work and looking at how that creates certain movement within the body. So when looking at this we found for example that the kind of shapes that are made by the arm when using a wheelchair, mirrors the shapes that are made by a bird's wingtip when it's flying. So using a wheelchair is actually quite analogous to the way birds fly in one sense" – P5 Dancer and performance artist

These experimentations around movement and its meaning were further enriched by exploring combined dynamics of movements involving both multiple wheelchair users who had different capabilities, and other disabled dancers with non-normative bodies [29], to disrupt the existing expressive barriers and codified ways of being creative.

"In [name of the dance company], we often partnered with each other. So a lot of lifting each other a lot of supporting each other leaning. So when you're doing that with someone with an entirely different body than yours, what are the really creative things you can do? So it opened up a lot of like, creative opportunities that I guess I didn't consider before I was disabled." – P7 Dancer and Musician

Creative movement experimentation was not solely restricted to dance. For example, P2's body produced involuntary movements through painful spasms that affected both her arms and legs. In the past, she had leveraged these movements to create art by letting the spasms draw the initial lines and progressively adding smoother lines as the spasms slowly subsided.

"I've literally just held my pencil on the on a piece of paper and just let my spasms draw. And it felt quite upsetting. So it's like. . . 'what is it doing? I can't control this' but then it was like. . . I looked at it and like it so I continued to work on it" – P2 Sculptor and installation artist

4.3.2 Combining personal and technological mediations. Although in certain circumstances participants explained that they were directly able to take advantage of how their disabled bodies moved to create unique artistic expressions, in other cases this was achieved through the development of personalised skills and techniques or through the use of technology.

To make up for his inability to play the kick drum using his foot, in 1990 P12 developed an electric pedal with a solenoid that could be activated using a mouth switch. The solution proved wildly inappropriate due to a significant activation delay and the inconvenience of featuring exposed copper wire that caused him to "mildly electrocute" his face during numerous tests. As the technology proved so unsuccessful, he opted to focus on modifying his technique and adapting the set-up of his drum kit to accommodate his new style, which enabled him to play the drums without having to rely on additional equipment that could interfere with his experience as a musician.

"The biggest difference is that instead of playing with four limbs, I just play with two hands. So instead of having your left hand, let's say playing the snare drum and your foot playing the kick drum, I played both with my hands. I just had to change my set up" – P12 Musician

In contrast, P1 found that his biggest creative advantage was fully realised as he "embraced the pixel" and started exploring the boundaries of what he could create by integrating different forms of technologies in his artwork.

"I was trained to be a painter, but once I left that behind and decided to really look at art and technology, I became able to do anything. I could make a piece of artwork that's as big as a block of flats, or I could mass produce little toys that are kind of crazy little versions of different wheelchairs" – P1 Digital artist

As a DJ, P16 made extensive use of technology. However, through the experience gained navigating inaccessible set-ups, where he had to deal with consoles placed at an excessive height or DJ-ing stations that did not enable him to have a view of the dancing crowd, he developed better memory and listening skills. The former would be essential for him to be able to remember the location of various controls in the interface that he might be unable to see. The latter,
described as a key skill for any DJ, would enable him to improvise and adjust more quickly to the reactions of the crowd:

"As a DJ you have to maintain the crowd. Keep their energy up, you don't let the crowd go out. By listening I can know what we beat or the BPM of the song going on is, but also paying attention and not letting of the crowd. All these things are really important and now I can keep them in mind when I and doing my set" – P16 DJ

4.3.3 Integrating wheelchairs from life to art. Through their experiences in everyday life, participants developed intimate bonds with their own wheelchairs. The wheelchair was described by many as an extension of their body, but also as a trusted partner that enabled participants to achieve their goals in both art and life. This embodied connection was largely instinctive. As a result, participants did not actually think about their wheelchair during their daily life, until external barriers forced them to do so.

"The wheelchair is like my legs. I can control it without thinking about it. But if there is a staircase in front of me, the wheelchair suddenly becomes a trouble because I cannot use it for climbing the stairs. I can move freely at normal times, only when a barrier is in front of me I suddenly realize the existence of my wheelchair." – P9 Dancer and circus artist

Within the context of art, the connection between the person and the wheelchair became both more playful and more deliberate. Artistic practices allowed individuals to test and expand on the existing boundaries of how they normally used their wheelchair. At the same time, using their wheelchairs within their art practice enabled them to fully leverage its distinctive features, making their performances unique and exciting. However, current wheelchair design also limited these possibilities, and many artists felt that more should be done to allow them to truly explore the boundaries of their creativity.

"I love to explore how different wheelchair configurations affect my dancing, it's almost like having a new body to dance with. But I need to get out of my chair, get tools and. . . It takes time. I can't do that in a show in between routines. I want to be able to make those types of modifications very, very simply and quickly" – P15 Dancer

Although participants felt that the way in which their wheelchair affected their movement and body image was visible, the ways in which their movements and their skills changed the movement of the chair were often overlooked. Similarly, artists explained how the fact that their own bodies and the wheelchair existed both in connection with each other and as separate entities was rarely understood by most people without lived experience. Dancers such as P4, P6, P8 and P9 tried to make this relationship visible by integrating choreographies that saw them dancing with the wheelchair, away from it, and using the wheelchair as a dance prop on which movement was enacted. Finally, P1, P5, P14 and P17 suggested that creating ways in which the wheelchair itself could become a work of art, or an object with a specific sense of presence on the stage, could help to challenge people's idea of what a wheelchair could be.

"But if you could make a wheelchair that in the middle of a set starts to project light or something synced to the music which you are playing, it's going to be a different kind of feeling for the person who is playing and for the for the people who are seeing. Because nobody thinks it is possible" – P17 DJ

5 DISCUSSION

In this section we discuss how the learnings from this study highlight tensions and synergies between disability art and HCI research around accessible and assistive technologies. Throughout these reflections we look at different ways in which collaborating with disabled artists could help HCI researchers to overcome the limitations that lead to oversimplified interpretations of disability and assistive technology use. Moreover, we explore some of the possibilities that could arise if the design of assistive technologies was re-imagined through the lens of art.

5.1 Embracing complexity and challenging oversimplification

Over the last decade, thanks to the work of disabled scholars and allies within the community, the field of accessibility and assistive technology research has begun to change [10, 35, 53, 67, 80]. From greater integration of disability studies, to increased recognition of the hidden work of people with disabilities, and acknowledgement of the embodied and biographical relationships between disabled users and their assistive technologies, we have developed a more complex understanding of the role of assistive technologies within the experience of disability [6, 8, 10, 12, 35, 75]. However, accessibility and assistive technology research is still prone to using rigid dichotomies such as disability vs ability, personal vs social dimensions, access vs exclusion, and use vs nonuse of assistive technologies [7, 35, 64, 74, 75].

The experiences of disabled artists have shown unique features that bypass these dichotomies and allow us to explore complex narratives in more powerful ways, observing how interactions between multiple identities, embodied aspects, social encounters, and assistive technology explorations are reciprocally connected to each other. Journeys that led participants to become artists could be sparked by external events, embedded into one's way of experiencing the world, or brought to light by the recognition of others. In a similar fashion, one's experience of disability could be rooted in the occurrence of a particular traumatic event, be part of one's identity since birth, or situated in a social context affected by the perceptions of wheelchair-using bodies.

When looking at the interconnections between these two experiences, art positively influenced the experience of disability by providing a meaningful way to process physical pain and emotional grief, or express outrage and frustration. However, art was never described as being solely a form of therapy, or a tool for individual reflection, but maintained its focus on technical quality and expressive power [40]. Similarly, disability could deepen one's artistic practice by providing a more complex and multifaceted lived experience, directly inspiring an artistic performance or a
piece of artwork, or promoting a more inclusive and meaningful artistic practice. Yet, the work of disabled artists often transcended the label of "Disability art", in the same way as one's identity as a disabled person constituted an important part of their self-identity but was not the only defining characteristic of an individual [82]. These tensions show how multiple strong identities don't just coexist, but continually influence each other, enriching experiences and creating dynamics that can converge and diverge at the same time.

Embodied and internalized experiences as artists, disabled individuals and wheelchair users were also shaped by the social encounters of participants, which could incorporate both positive and negative connotations. Even when challenging, empowering collaborations could see the artists either receiving or providing support to others, co-creating artistic expressions, or promoting more inclusive ways of working. At the same time, interacting with others within and beyond the artistic and disabled communities meant having to consistently face ableist and discriminatory attitudes. Both the power of collaboration and allyship, and the negative impact of ableism and prejudice, have been explored in the context of disability and assistive technology use [10, 28, 35, 51]. However, we argue that art allows us to explore these dynamics in a more profound way. First of all, the work of disabled artists can become a politically situated act of resistance towards predominant ableist narratives, incorporating both the individual experience of the artist and broader social considerations [18, 85]. While disabled artists did not claim that they could act as representatives for the whole disabled community, they felt that art gave them unique tools to challenge existing narratives, inviting their audiences to reflect on different representations of concepts of disability and assistive technologies.

Secondly, art allowed for the opportunity to move beyond the limitations of an inaccessible environment that often equates disability with disadvantage [65]. Within their practices, many artists did not focus on what they could not do as a result of disability or assistive technology use. Nor did they simply look at what they could do despite their disability or assistive technologies. Instead, most artists emphasized the uniqueness of their creative expressions by looking at what they could do because of their disability and assistive technology use, which was often inaccessible to them before they acquired a disability or began to use their assistive

5.2 The value of exploring artistic technologies

Engaging with disabled artists enabled us to investigate how wheelchairs could be used in unique and unconstrained ways as part of artistic expressions, and to observe how artistic expressions could become bolder and more creative as a result of the integration of wheelchairs. The embodied experience of wheelchair use in everyday life was largely instinctive, with participants naturally integrating the image of the wheelchair in their own bodies without necessarily having to reflect on it [8, 11]. On the other hand, the use of wheelchairs as part of one's artistic work involved thoughtful exploration and a much deeper understanding of the unique features associated with the combined movement of body and wheelchair. For example, the translational work of wheelchair dancers on challenging the notion of the existence of ideal ways to dance required significant work on deconstructing the expressive qualities of movement. Moreover, P5's work on developing a new vocabulary of movement for non-standing dancers was grounded in the reciprocal analysis of how the movement of the dancer affected the movement of the wheelchair and vice versa, promoting an idea of bi-directional partnership rather than top-down control of user over device.

Interestingly, the connections between the artist and their wheelchair seemed to go beyond direct interactions and continued to exist even when the artist and the wheelchair were not physically in contact with each other. P1, for example, highlighted how it was often possible to identify the wheelchair of a particular person, even if the user was not actually sitting on it. Previous work by Profita et al. (2016, 2018) analyzed how cochlear implant users engage in purposeful efforts to personalize their devices in order to reflect their preferences and match their personality [63, 64]. Furthermore, the work on biographical prototypes by Bennett et al. (2019) showed how meaningful connections between disabled users and the devices that they use are built through sustained interactions that take place over long periods of time [12].

Exploring these dynamics through art can offer some unique opportunities to reimagine how the connections between a disabled user and their assistive technology are reciprocal and feature both intentional and non-deliberate aspects. While substantial work in recent years has investigated how the use of assistive technologies affected the way in which people with disabilities perceive themselves and are perceived by others, the implications surround-
technologies. Removing these artifcially created barriers means ing how the actions of users afect the perception of assistive and
creating better opportunities to challenge ability-based hierarchies accessible technologies is much less understood [7, 26, 75].
and highlight the work of disabled individuals both in connection When examining the use of wheelchairs of other assistive and
with and beyond the use of assistive and accessible technologies accessible technologies in everyday life, as HCI researchers we
[10, 12, 15]. often focus on the identifcation of frictions and access barriers
When considering these refections is important to notice that that can limit one’s ability to use their devices to achieve specifc
the current research has been carried out specifcally with artists goals [52]. However, examining the practices of disabled artist
who used wheelchairs for mobility. Several artists stated that they and the ways in which they manage to integrate assistive and
experienced multiple limitations and that they used a variety of accessible technologies into their art can shift our point of view from
assistive and accessible technologies, within art practice and ev- closing gaps and solving problems to leverage strengths and take
eryday life, there was a certain degree of similarities between their advantage of unique features. Painting with a wheelchair created
experiences as a result of our methodological choice. Although we more instinctive lines than the one’s drawn by hand, getting used
argue that many of the implications presented are generalizable, to DJ-ing without being able to see the crowd made one more
further exploration with artists who use diferent types of assistive responsive, and dancing with a wheelchair created opportunities
technologies should be carried out to embrace the complexity of for more complex choreographies.
experiences even further.
ASSETS ’22, October 23–26, 2022, Athens, Greece Giulia Barbareschi and Masa Inakage

The studies by Bragg et al. (2018, 2021) and the competency framework by Reyes-Cruz et al. (2020) have highlighted how, as a result of their disability and assistive technology use, people with disabilities have developed unique sets of skills that should be leveraged to build on one's unique strengths [13, 14, 67]. In the examples shared by our participants we could identify ways to further expand this by considering what is uniquely possible through the use of a wheelchair or other assistive devices, and how these features could be augmented and expanded to facilitate novel creative expressions in art and everyday life.

Wheeling dancers experimenting with how moving the centre of mass and the position of the wheels affects their style represent a poignant example of self-driven body reconfiguration. This ability is primarily available to disabled-cyborg bodies, but currently held hostage by technology designers [84]. What would happen if, rather than enabling wheeling dancers to merely change the position of the axle, we were instead to create amorphous mobility technologies that can be completely adapted to one's body and preferences?

Explorations rooted in disabled people's skills, art and technology can also extend beyond wheelchairs. For example, studies have shown that Deaf and Hard of Hearing (DHH) individuals display an increased ability to interpret haptic and vibro-tactile stimuli [83]. Although this ability has been leveraged to produce music chairs and vests that allow DHH individuals to feel music [17, 44, 71], how could it be harnessed to enable DHH artists to create revolutionary musical experiences for all? Could a haptic disco become an activity that leads us to redefine what music is, and stimulate new debates about the accessibility of tactile versus auditory communication? Would this lead to the development of "feeling aids" that would replace conventional hearing aids? Another example can be seen when looking at the practice of the blind artist Victor Tan Wee Tar, who weaves wire to create sculptures, similarly to how the blind weavers interviewed by Das et al. (2020) created unique patterns on cloth [23]. The wire sculptures by Tan Wee Tar are somehow akin to the wireframe 3D prints by Mueller et al. (2014) [57], but could weaving be used as an input to digital fabrication as well? Could woven patterns based on braille cells be read as speech or music, opening the door to new creative endeavors? Ultimately, exploring the deliberate and daring practices of disabled artists opens the door to reimagining the way we leverage technology in the context of art and accessibility.

6 CONCLUSIONS

Disability shares a powerful connection with the world of art. Similarly, the interactions of art with HCI have given rise to new methods, questioned assumptions and fostered novel insights. However, within the context of accessibility and assistive technology research, it has rarely been explored how the work of disabled artists can help to overcome the limitations of existing research approaches and spur disruptive thinking about ways to explore the unique creative possibilities that emerge in connection to disability and assistive technology use. In this paper we present the results from collaboratively analyzed interviews that examined the practices of seventeen artists who used wheelchairs for mobility, to explore the reciprocal influences between art, disability and assistive technology use. The insights shared by participants showed how their different identities as disabled individuals, artists, and wheelchair users were somehow separate, yet deeply interconnected. Moreover, their experiences as disabled people and artists had been shaped by embodiment as much as by social interactions. Finally, both their creative and everyday lives were affected by a complex balance between their own capabilities and the limitations and advantages of the technologies they used, both assistive and non-assistive.

ACKNOWLEDGMENTS

We wish to sincerely thank all the artists who have shared their knowledge, wisdom and creative journeys with us. This research would have never seen the light without their support. We are also grateful for the funding from the Japan Society for the Promotion of Science, which supports the first author (JSPS International Research Fellow PE 207228).

REFERENCES

[1] Fabiha Ahmed, Dennis Kuzminer, Michael Zachor, Lisa Ye, Rachel Josepho, William Christopher Payne, and Amy Hurst. 2021. Sound Cells: Rendering Visual and Braille Music in the Browser. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21), Association for Computing Machinery, New York, NY, USA, 1–4. DOI:https://doi.org/10.1145/3441852.3476555
[2] Kristina Andersen, Laura Devendorf, James Pierce, Ron Wakkary, and Daniela K. Rosner. 2018. Disruptive improvisations. In CHI 2018 Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems (April 2018), 1–8. DOI:https://doi.org/10.1145/3170427.3170630
[3] Ryoichi Ando, Isao Uebayashi, Hayato Sato, Hayato Ohbayashi, Shota Katagiri, Shuhei Hayakawa, and Kouta Minamizawa. 2021. Research on the transcendence of bodily differences, using sport and human augmentation medium. In Augmented Humans Conference 2021 (AHs'21), Association for Computing Machinery, New York, NY, USA, 31–39. DOI:https://doi.org/10.1145/3458709.3458981
[4] Farkhandah Aziz, Chris Creed, Maite Frutos-Pascual, and Ian Williams. 2021. Inclusive Voice Interaction Techniques for Creative Object Positioning. In Proceedings of the 2021 International Conference on Multimodal Interaction (ICMI '21), Association for Computing Machinery, New York, NY, USA, 461–469. DOI:https://doi.org/10.1145/3462244.3479937
[5] Gui Hee Bang and Kyung Mee Kim. 2015. Korean disabled artists' experiences of creativity and the environmental barriers they face. Disability & Society 30, 4 (April 2015), 543–555. DOI:https://doi.org/10.1080/09687599.2015.1030065
[6] Giulia Barbareschi, Catherine Holloway, Katherine Arnold, Grace Magomere, Wycliffe Ambeyi Wetende, Gabriel Ngare, and Joyce Olenja. 2020. The Social Network: How People with Visual Impairment use Mobile Phones in Kibera, Kenya. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20), Association for Computing Machinery, Honolulu, HI, USA, 1–15. DOI:https://doi.org/10.1145/3313831.3376658
[7] Giulia Barbareschi, Norah Shitawa Kopi, Ben Oldfrey, and Catherine Holloway. 2021. What difference does tech make? Conceptualizations of Disability and Assistive Technology among Kenyan Youth: Conceptualizations of Disability and AT. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21), Association for Computing Machinery, New York, NY, USA, 1–13. DOI:https://doi.org/10.1145/3441852.3471226
[8] Giulia Barbareschi, Sibylle Daymond, Jake Honeywill, Dominic Noble, Nancy N. Mbugua, Ian Harris, Victoria Austin, and Catherine Holloway. 2020. Value beyond function: analyzing the perception of wheelchair innovations in Kenya. In Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA.
[9] Colin Barnes. 2003. Effecting change: Disability, culture and art. 31.
[10] Cynthia L. Bennett, Erin Brady, and Stacy M. Branham. 2018. Interdependence as a Frame for Assistive Technology Research and Design. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18), Association for Computing Machinery, Galway, Ireland, 161–173. DOI:https://doi.org/10.1145/3234695.3236348
[11] Cynthia L. Bennett, Keting Cen, Katherine M. Steele, and Daniela K. Rosner. 2016. An Intimate Laboratory?: Prostheses As a Tool for Experimenting with Identity and Normalcy. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16), ACM, New York, NY, USA, 1745–1756. DOI:https://doi.org/10.1145/2858036.2858564
Assistive or Artistic Technologies? Exploring the Connections between Art, Disability and Wheelchair Use ASSETS '22, October 23–26, 2022, Athens, Greece

[12] Cynthia L. Bennett, Burren Peil, and Daniela K. Rosner. 2019. Biographical Prototypes: Reimagining Recognition and Disability in Design. In Proceedings of the 2019 on Designing Interactive Systems Conference (DIS '19), Association for Computing Machinery, San Diego, CA, USA, 35–47. DOI:https://doi.org/10.1145/3322276.3322376
[13] Danielle Bragg, Cynthia Bennett, Katharina Reinecke, and Richard Ladner. 2018. A Large Inclusive Study of Human Listening Rates. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18), Association for Computing Machinery, New York, NY, USA, 1–12. DOI:https://doi.org/10.1145/3173574.3174018
[14] Danielle Bragg, Katharina Reinecke, and Richard E. Ladner. 2021. Expanding a Large Inclusive Study of Human Listening Rates. ACM Trans. Access. Comput. 14, 3 (July 2021), 12:1–12:26. DOI:https://doi.org/10.1145/3461700
[15] Stacy M. Branham and Shaun K. Kane. 2015. Collaborative Accessibility: How Blind and Sighted Companions Co-Create Accessible Home Spaces. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15), Association for Computing Machinery, Seoul, Republic of Korea, 2373–2382. DOI:https://doi.org/10.1145/2702123.2702511
[16] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (January 2006), 77–101. DOI:https://doi.org/10.1191/1478088706qp063oa
[17] Doga Cavdir and Ge Wang. 2020. Felt sound: A shared musical experience for the deaf and hard of hearing.
[18] Eliza Chandler, Nadine Changfoot, Carla Rice, Andrea LaMarre, and Roxanne Mykitiuk. 2018. Cultivating disability arts in Ontario. Review of Education, Pedagogy, and Cultural Studies 40, 3 (May 2018), 249–264. DOI:https://doi.org/10.1080/10714413.2018.1472482
[19] Chris Creed. 2016. Assistive tools for disability arts: collaborative experiences in working with disabled artists and stakeholders. Journal of Assistive Technologies 10, 2 (January 2016), 121–129. DOI:https://doi.org/10.1108/JAT-12-2015-0034
[20] Chris Creed. 2018. Assistive technology for disabled visual artists: exploring the impact of digital technologies on artistic practice. Disability & Society 33, 7 (August 2018), 1103–1119. DOI:https://doi.org/10.1080/09687599.2018.1469400
[21] Chris Creed, Maite Frutos-Pascual, and Ian Williams. 2020. Multimodal Gaze Interaction for Creative Design. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–13. Retrieved April 12, 2022 from https://doi.org/10.1145/3313831.3376196
[22] Simon Darcy, Hazel Maxwell, Simone Grabowski, and Jenny Onyx. 2022. Artistic Impact: From Casual and Serious Leisure to Professional Career Development in Disability Arts. Leisure Sciences 44, 4 (May 2022), 514–533. DOI:https://doi.org/10.1080/01490400.2019.1613461
[23] Maitraye Das, Katya Borgos-Rodriguez, and Anne Marie Piper. 2020. Weaving by Touch: A Case Analysis of Accessible Making. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–15. Retrieved July 5, 2022 from https://doi.org/10.1145/3313831.3376477
[24] Hazel Dixon. 2021. Immersive performance and inclusion through a lens of the social model of disability. interactions 28, 3 (April 2021), 70–72. DOI:https://doi.org/10.1145/3460111
[25] Jennifer Eisenhauer. 2007. Just Looking and Staring Back: Challenging Ableism through Disability Performance Art. Studies in Art Education 49, 1 (October 2007), 7–22. DOI:https://doi.org/10.1080/00393541.2007.11518721
[26] Heather A. Faucett, Kate E. Ringland, Amanda L. L. Cullen, and Gillian R. Hayes. 2017. (In)Visibility in Disability and Assistive Technology. ACM Trans. Access. Comput. 10, 4 (October 2017), 14:1–14:17. DOI:https://doi.org/10.1145/3132040
[27] Andrea García-Santesmases Fernández and Miriam Arenas Conejo. 2017. Playing crip: the politics of disabled artists' performances in Spain. Research in Drama Education: The Journal of Applied Theatre and Performance 22, 3 (July 2017), 345–351. DOI:https://doi.org/10.1080/13569783.2017.1327804
[28] Giulia Barbareschi, Mark T. Carew, Norah Kopi, and Catherine Holloway. 2021. "When they see a wheelchair, they've not even seen me" - Factors shaping the experience of disability stigma and discrimination in Kenya. International Journal of Environmental Research and Public Health (in press 2021).
[29] Dan Goodley and Katherine Runswick-Cole. 2013. The body as disability and possability: theorizing the 'leaking, lacking and excessive' bodies of disabled children. Scandinavian Journal of Disability Research 15, 1 (March 2013), 1–19. DOI:https://doi.org/10.1080/15017419.2011.640410
[30] Aimi Hamraie. 2022. (Ir)resistible Stairs: Public Health, Desiring Practices, and Material-Symbolic Ableism. Journal of Architectural Education 76, 1 (January 2022), 49–59. DOI:https://doi.org/10.1080/10464883.2022.2017691
[31] Aimi Hamraie and Kelly Fritsch. 2019. Crip technoscience manifesto. Catalyst: Feminism, Theory, Technoscience 5, 1 (2019), 1–33.
[32] Kaito Hatakeyama, MHD Yamen Saraiji, and Kouta Minamizawa. 2019. MusiArm: Extending Prosthesis to Musical Expression. In Proceedings of the 10th Augmented Human International Conference 2019 (AH2019), Association for Computing Machinery, New York, NY, USA, 1–8. DOI:https://doi.org/10.1145/3311823.3311873
[33] Peregrine Hawthorn and Daniel Ashbrook. 2017. Cyborg Pride: Self-Design in e-NABLE. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17), Association for Computing Machinery, New York, NY, USA, 422–426. DOI:https://doi.org/10.1145/3132525.3134780
[34] Anna Catherine Hickey-Moody. 2009. Unimaginable bodies: Intellectual disability, performance and becomings. In Unimaginable Bodies. Brill.
[35] Megan Hofmann, Devva Kasnitz, Jennifer Mankoff, and Cynthia L. Bennett. 2020. Living Disability Theory: Reflections on Access, Research, and Design. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA, 1–13. DOI:https://doi.org/10.1145/3373625.3416996
[36] Catherine Holloway and Giulia Barbareschi. 2021. Disability Interactions: Creating Inclusive Innovations. Synthesis Lectures on Human-Centered Informatics 14, 6 (December 2021), i–198. DOI:https://doi.org/10.2200/S01141ED1V01Y202111HCI053
[37] Rhiann Holloway. 2021. Artists as Defenders: Disability Art as Means to Mobilise Human Rights. (2021).
[38] Erin J. Hoppe. 2017. Perspectives of Young Artists with Disabilities: Negotiating Identity. In Handbook of Arts Education and Special Education. Routledge.
[39] Alon Ilsar and Gail Kenning. 2020. Inclusive improvisation through sound and movement mapping: from DMI to ADMI. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA, 1–8. DOI:https://doi.org/10.1145/3373625.3416988
[40] Jens Ineland and Lennart Sauer. 2007. Institutional Environments and Sub-Cultural Belonging: Theatre and Intellectual Disabilities. Scandinavian Journal of Disability Research 9, 1 (January 2007), 46–57. DOI:https://doi.org/10.1080/15017410601029770
[41] Rachel Jacobs, Steve Benford, and Ewa Luger. 2015. Behind The Scenes at HCI's Turn to the Arts. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA '15), Association for Computing Machinery, New York, NY, USA, 567–578. DOI:https://doi.org/10.1145/2702613.2732513
[42] Mohit Jain, Nirmalendu Diwakar, and Manohar Swaminathan. 2021. Smartphone Usage by Expert Blind Users. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21), Association for Computing Machinery, New York, NY, USA, 1–15. DOI:https://doi.org/10.1145/3411764.3445074
[43] Myounghoon Jeon, Rebecca Fiebrink, Ernest A. Edmonds, and Damith Herath. 2019. From rituals to magic: Interactive art and HCI of the past, present, and future. International Journal of Human-Computer Studies 131 (November 2019), 108–119. DOI:https://doi.org/10.1016/j.ijhcs.2019.06.005
[44] Maria Karam, Carmen Branje, Gabe Nespoli, Norma Thompson, Frank A. Russo, and Deborah I. Fels. 2010. The emoti-chair: an interactive tactile music exhibit. In CHI '10 Extended Abstracts on Human Factors in Computing Systems (CHI EA '10), Association for Computing Machinery, New York, NY, USA, 3069–3074. DOI:https://doi.org/10.1145/1753846.1753919
[45] Aaron Karp and Bryan Pardo. 2017. HaptEQ: A Collaborative Tool For Visually Impaired Audio Producers. In Proceedings of the 12th International Audio Mostly Conference on Augmented and Participatory Sound and Music Experiences (AM '17), Association for Computing Machinery, New York, NY, USA, 1–4. DOI:https://doi.org/10.1145/3123514.3123531
[46] Sandjar Kozubaev, Chris Elsden, Noura Howell, Marie Louise Juul Søndergaard, Nick Merrill, Britta Schulte, and Richmond Y. Wong. 2020. Expanding Modes of Reflection in Design Futuring. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–15. Retrieved April 12, 2022 from https://doi.org/10.1145/3313831.3376526
[47] Petra Kuppers. 2016. Diversity: Disability. Art Journal 75, 1 (January 2016), 93–97. DOI:https://doi.org/10.1080/00043249.2016.1171549
[48] Mike Levin. 2010. The art of disability: An interview with Tobin Siebers. Disability Studies Quarterly 30, 2 (2010).
[49] Susan Levy and Hannah Young. 2020. Arts, Disability and Crip Theory: Temporal Re-Imagining in Social Care for People with Profound and Multiple Learning Disabilities. Scandinavian Journal of Disability Research 22, 1 (March 2020), 68–79. DOI:https://doi.org/10.16993/sjdr.620
[50] Regan Linton. 2021. Acting Training and Instruction for Wheelchair-Using Artists. Routledge. DOI:https://doi.org/10.4324/9781003125808-2
[51] Kevin M. Storer and Stacy M. Branham. 2021. Deinstitutionalizing Independence: Discourses of Disability and Housing in Accessible Computing. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21), Association for Computing Machinery, New York, NY, USA, 1–14. DOI:https://doi.org/10.1145/3441852.3471213
[52] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–18.
[53] Jennifer Mankoff, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability Studies As a Source of Critical Inquiry for the Field of Assistive Technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '10), ACM, New York, NY, USA, 3–10. DOI:https://doi.org/10.1145/1878803.1878807
[54] Neil Marcus, Devva Kasnitz, and Pamela Block. 2016. If Disability Is a Dance, Who Is the Choreographer? A Conversation About Life Occupations, Art, Movement. In Occupying Disability: Critical Approaches to Community, Justice, and Decolonizing Disability, Pamela Block, Devva Kasnitz, Akemi Nishida and Nick Pollard (eds.). Springer Netherlands, Dordrecht, 347–358. DOI:https://doi.org/10.1007/978-94-017-9984-3_24
[55] Paddy Masefield. 2006. Strength: Broadsides from disability on the arts. Trentham Books Limited.
[56] Ann Millett-Gallant and Elizabeth Howie. 2017. Disability and art history. Routledge, London and New York.
[57] Stefanie Mueller, Sangha Im, Serafima Gurevich, Alexander Teibrich, Lisa Pfisterer, François Guimbretière, and Patrick Baudisch. 2014. WirePrint: 3D printed previews for fast prototyping. In Proceedings of the 27th annual ACM symposium on User interface software and technology (UIST '14), Association for Computing Machinery, New York, NY, USA, 273–280. DOI:https://doi.org/10.1145/2642918.2647359
[58] Timothy Neate, Abi Roper, Stephanie Wilson, Jane Marshall, and Madeline Cruice. 2020. CreaTable Content and Tangible Interaction in Aphasia. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20), Association for Computing Machinery, Honolulu, HI, USA, 1–14. DOI:https://doi.org/10.1145/3313831.3376490
[59] William Christopher Payne, Alex Yixuan Xu, Fabiha Ahmed, Lisa Ye, and Amy Hurst. 2020. How Blind and Visually Impaired Composers, Producers, and Songwriters Leverage and Adapt Music Technology. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA, 1–12. DOI:https://doi.org/10.1145/3373625.3417002
[60] Dharani Priyahansika Perera, Jim R. T. Eales, and Kathy Blashki. 2007. The drive to create: an investigation of tools to support disabled artists. In Proceedings of the 6th ACM SIGCHI conference on Creativity & cognition (C&C '07), Association for Computing Machinery, New York, NY, USA, 147–152. DOI:https://doi.org/10.1145/1254960.1254981
[61] Helen Polson. 2013. "The Dance is in Your Body and Not in Your Crutches": Technique, Technology, and Agency in Disability Movement Performance. (2013).
[62] Jesse Prinz. 2017. Against outsider art. Journal of Social Philosophy 48, 3 (2017), 250–272.
[63] Halley P. Profita, Abigale Stangl, Laura Matuszewska, Sigrunn Sky, and Shaun K. Kane. 2016. Nothing to Hide: Aesthetic Customization of Hearing Aids and Cochlear Implants in an Online Community. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '16), Association for Computing Machinery, Reno, Nevada, USA, 219–227. DOI:https://doi.org/10.1145/2982142.2982159
[64] Halley P. Profita, Abigale Stangl, Laura Matuszewska, Sigrunn Sky, Raja Kushalnagar, and Shaun K. Kane. 2018. "Wear It Loud": How and Why Hearing Aid and Cochlear Implant Users Customize Their Devices. ACM Trans. Access. Comput. 11, 3 (September 2018), 13:1–13:32. DOI:https://doi.org/10.1145/3214382
[65] Graham Pullin. 2009. Design Meets Disability. MIT Press.
[66] Grazia Ragone. 2020. Designing Embodied Musical Interaction for Children with Autism. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA, 1–4. DOI:https://doi.org/10.1145/3373625.3417077
[67] Gisela Reyes-Cruz, Joel E. Fischer, and Stuart Reeves. 2020. Reframing Disability as Competency: Unpacking Everyday Technology Practices of People with Visual Impairments. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20), Association for Computing Machinery, Honolulu, HI, USA, 1–13. DOI:https://doi.org/10.1145/3313831.3376767
[68] Kathryn E. Ringland. 2019. A Place to Play: The (Dis)Abled Embodied Experience for Autistic Children in Online Spaces. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19), Association for Computing Machinery, New York, NY, USA, 1–14. DOI:https://doi.org/10.1145/3290605.3300518
[69] Abir Saha and Anne Marie Piper. 2020. Understanding Audio Production Practices of People with Vision Impairments. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), Association for Computing Machinery, New York, NY, USA, 1–13. DOI:https://doi.org/10.1145/3373625.3416993
[70] Richard Sandell, Jocelyn Dodd, and Rosemarie Garland-Thomson. 2010. Re-presenting disability: Activism and agency in the museum. London (2010).
[71] Anastasia Schmitz, Catherine Holloway, and Youngjun Cho. 2020. Hearing through Vibrations: Perception of Musical Emotions by Profoundly Deaf People. DOI:https://doi.org/10.48550/arXiv.2012.13265
[72] Caitlin Ostrow Seidler. 2011. Fighting Disability Stereotypes with Comics. Art Education 64, 6 (November 2011), 20–23.
[73] Phoebe Sengers and Chris Csikszentmihályi. 2003. HCI and the arts: a conflicted convergence? In CHI '03 Extended Abstracts on Human Factors in Computing Systems (CHI EA '03), Association for Computing Machinery, New York, NY, USA, 876–877. DOI:https://doi.org/10.1145/765891.766044
[74] Kristen Shinohara and Jacob O. Wobbrock. 2011. In the Shadow of Misperception: Assistive Technology Use and Social Interactions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '11), ACM, New York, NY, USA, 705–714. DOI:https://doi.org/10.1145/1978942.1979044
[75] Kristen Shinohara and Jacob O. Wobbrock. 2016. Self-conscious or self-confident? A diary study conceptualizing the social accessibility of assistive technology. ACM Transactions on Accessible Computing (TACCESS) 8, 2 (2016), 1–31.
[76] Tobin Siebers. 2013. Disability and the theory of complex embodiment—for identity politics in a new register. The disability studies reader 4 (2013), 278–297.
[77] Per Koren Solvang. 2012. From identity politics to dismodernism? Changes in the social meaning of disability art. Alter 6, 3 (July 2012), 178–187. DOI:https://doi.org/10.1016/j.alter.2012.05.002
[78] Per Koren Solvang. 2018. Between art therapy and disability aesthetics: a sociological approach for understanding the intersection between art practice and disability discourse. Disability & Society 33, 2 (February 2018), 238–253. DOI:https://doi.org/10.1080/09687599.2017.1392929
[79] Katta Spiel. 2021. The Bodies of TEI – Investigating Norms and Assumptions in the Design of Embodied Interaction. In Proceedings of the Fifteenth International Conference on Tangible, Embedded, and Embodied Interaction (TEI '21), Association for Computing Machinery, New York, NY, USA, 1–19. DOI:https://doi.org/10.1145/3430524.3440651
[80] Katta Spiel, Kathrin Gerling, Cynthia L. Bennett, Emeline Brulé, Rua M. Williams, Jennifer Rode, and Jennifer Mankoff. 2020. Nothing About Us Without Us: Investigating the Role of Critical Disability Studies in HCI. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA '20), Association for Computing Machinery, New York, NY, USA, 1–8. DOI:https://doi.org/10.1145/3334480.3375150
[81] Miriam Sturdee, Makayla Lewis, Angelika Strohmayer, Katta Spiel, Nantia Koulidou, Sarah Fdili Alaoui, and Josh Urban Davis. 2021. A Plurality of Practices: Artistic Narratives in HCI Research. In Creativity and Cognition (C&C '21), Association for Computing Machinery, New York, NY, USA, 1. DOI:https://doi.org/10.1145/3450741.3466771
[82] Jennifer Sullivan Sulewski, Heike Boeltzig, and Rooshey Hasnain. 2012. Art and Disability: Intersecting Identities Among Young Artists with Disabilities. DSQ 32, 1 (January 2012). DOI:https://doi.org/10.18061/dsq.v32i1.3034
[83] Pauline Tranchant, Martha M. Shiell, Marcello Giordano, Alexis Nadeau, Isabelle Peretz, and Robert J. Zatorre. 2017. Feeling the Beat: Bouncing Synchronization to Vibrotactile Music in Hearing and Early Deaf People. Frontiers in Neuroscience 11 (2017). Retrieved July 6, 2022 from https://www.frontiersin.org/articles/10.3389/fnins.2017.00507
[84] Rua M. Williams and Juan E. Gilbert. 2019. Cyborg Perspectives on Computing Research Reform. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA '19), Association for Computing Machinery, New York, NY, USA, 1–11. DOI:https://doi.org/10.1145/3290607.3310421
[85] Gregor Wolbring and Fatima Jamal Al-Deen. 2021. Social Role Narrative of Disabled Artists and Both Their Work in General and in Relation to Science and Technology. Societies 11, 3 (September 2021), 102. DOI:https://doi.org/10.3390/soc11030102
[86] Anon Ymous, Katta Spiel, Os Keyes, Rua M. Williams, Judith Good, Eva Hornecker, and Cynthia L. Bennett. 2020. "I am just terrified of my future" — Epistemic Violence in Disability Related Technology Research. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA '20), Association for Computing Machinery, New York, NY, USA, 1–16. DOI:https://doi.org/10.1145/3334480.3381828
[87] Jung Hyoung Yoon, Caroline Ellison, and Peggy Essl. 2021. Shifting the perspective from 'incapable' to 'capable' for artists with cognitive disability; case studies in Australia and South Korea. Disability & Society 36, 3 (January 2021), 443–467. DOI:https://doi.org/10.1080/09687599.2020.1751079
[88] Najma Al Zidjaly. 2011. Managing Social Exclusion through Technology: An Example of Art as Mediated Action. Disability Studies Quarterly 31, 4 (2011).
“Just like meeting in person” - Examination of interdependencies
in dementia-friendly virtual activities.
Elaine Czech Paul Marshall Oussama Metatla
elaine.czech@bristol.ac.uk p.marshall@bristol.ac.uk o.metatla@bristol.ac.uk
University of Bristol University of Bristol University of Bristol
Bristol, UK Bristol, UK Bristol, UK
ABSTRACT
Many dementia-friendly social programs were adapted to online delivery due to the COVID pandemic. Hasty adaptations make it unclear how to design these programs to capture the benefits of online delivery and face-to-face interactions. To understand the complexities of program delivery, we interviewed program coordinators and held focus groups with people living with dementia (PLWD) and their informal carers. We applied an interdependence framework to examine how the relationships between individuals affect program benefits. We found that interdependencies within an organization related to finances and networking are key and that organizational and individual interdependencies converge during program delivery. Our findings suggest these two interdependencies could influence one another more effectively if technology, like video conferencing, were designed to account for it. We discuss how an expanded notion of interdependency for the design of technology helps expand inclusivity in accessible social programs.

Figure 1: [Left] Interdependence frame [7]. [Right] Our findings suggest that the environment consists of organizational interdependencies. The intervention (program's method of delivery or, for Bennett et al., assistive technology) bridges the interdependency types.

CCS CONCEPTS
• Social and professional topics → Seniors; • Human-centered computing → Accessibility technologies; Accessibility theory, concepts and paradigms.

KEYWORDS
interdependency, dementia, socialization, community

ACM Reference Format:
Elaine Czech, Paul Marshall, and Oussama Metatla. 2022. "Just like meeting in person" - Examination of interdependencies in dementia-friendly virtual activities. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3517428.3544815

1 INTRODUCTION
Due to the COVID-19 pandemic, social organizations which offered face-to-face programs for people living with dementia (PLWD) and their informal carers considered alternative ways of delivering their services¹. Many of these organizations chose to adapt their programs to use technologies like videoconferencing. However, factors such as an organization's finances, service users' willingness to engage with technology, and staff members' technology familiarity impacted program adaptation. These factors caused some programs to become unfamiliar or unreliable to people living with dementia, leading to decreased participation and increased social isolation [16, 57, 78, 80]. Since stability is key for these programs, flexibility to sudden or drastic changes is limited.

Nevertheless, flexibility is essential for social programs prescribed to help a person maintain their overall wellbeing, as having these resources suddenly unavailable can be detrimental [32]. Social prescribing is a non-medical intervention that links "clinical practice with activities and support services within [a] community" [26]. It can help to reduce social isolation and depression for PLWD [2]. However, quick adaptations not tailored to those living with dementia can make accessing these programs impossible, eliminating any benefits.

Although the term "social prescription" is relatively new to HCI [90], there has been extensive research into strengthening social connections for PLWD [30, 36, 40, 59, 83, 87]. For example, solutions have created opportunities for sharing social experiences within families [36, 59, 87] and activities for care home residents at varying stages of dementia [30, 40]. However, there is little

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544815

¹ For this research, "social organizations" are those which provide a cultural, historical, or social service to people with dementia, such as art and history museums, public zoos, and dementia-specific social services. "Programs" refers to activities hosted by these organizations such as art creation, tours of their collections, and planned social gatherings.
ASSETS ’22, October 23–26, 2022, Athens, Greece Czech, et al.

research on how technology can assist those living with mild cognitive impairment or early stage dementia to pursue wider social engagements.

With other groups, there has been a shift towards exploring social accessibility and how environmental factors affect the design of technology [34, 66, 75]. Academic literature outside of HCI advocates for developing "dementia-friendly" communities to improve social connectedness and inclusivity [67, 79]. In turn, HCI and dementia research has also started to broaden its understanding of community engagement and PLWD [19, 24, 48]. This research has highlighted that the degree of societal inclusion influences how PLWD engage with the world. Therefore, there is potential in understanding how organizations that provide social prescriptive services might improve their social inclusion efforts through technology. However, when organizations assume a one-to-one adaptation of an in-person program to any readily available online platform will have similar positive effects, this potentially limits social interaction for PLWD [38]. Such adaptations do not consider that readily available platforms are designed for a hyper-cognitive society [80].

In parallel with this shift in perspective towards broader societal interactions, within accessibility and assistive technology research there has been a change from an independence framework to an interdependence framework [7]. Interdependence repositions research to focus on being in the world rather than just micro-interactions with immediate social circles. However, this shift as it applies to the engagement of PLWD with social programs has not yet been explored [21, 24, 25]. In this paper, we consider how technology can improve dementia-friendly programs. We believe that designing these programs to support interdependency will, in the future, allow social programs to be more inclusive.

To achieve our aim, we consider interdependencies within dementia-friendly programs and how technology can strengthen them. We report findings from semi-structured interviews with 15 program coordinators who manage and design dementia-friendly programming. We also report findings from focus groups conducted with six participants living with dementia and three of their informal carers. We interviewed key stakeholders deliberately as we believe that realizing dementia-inclusivity means understanding and meeting the needs of those directly impacted. Our findings suggest that an expanded application of Bennett et al.'s [7] interdependency framework is needed to outline future social-dementia technology research (Fig. 1). Bennett et al. discussed that interdependence could lead to individuals working as a collective. However, our analysis reveals that individuals working as part of a collective within an organization creates a different layer of interdependence not yet considered. We outline how organizational interdependencies can affect individual interdependencies. Technology used in program delivery creates an opportunity that allows these two networks of interdependencies to converge and then influence each other. Our paper makes three primary contributions to the ASSETS community: 1) identifying and highlighting the complexities related to designing social programs for PLWD; 2) grounding and expanding the interdependency framework in this context; and 3) providing a design checklist to assist coordinators and the ASSETS community with developing technology for facilitating social programs.

2 RELATED WORKS

2.1 Accessing Social Spaces and Dementia
Social participation by all is considered vital for a community to be considered age-friendly [65]. Social connection is essential for community members' overall health and wellbeing [65, 81]. Social prescription, which encourages integration into one's community, is becoming more common. When used in conjunction with a medical prescription, it allows a person's health to be managed holistically [2]. Since dementia is progressive and there is no current cure, social prescribing can help slow progression and reduce medical costs [26].

Often it involves art appreciation or making/crafting through an institution such as a museum or care home, although Kelson et al. [44] found that enjoyment of public art was also beneficial for PLWD. Their research suggests that social prescribing does not need to occur within an institutional setting and can be informal. However, Thomson et al. [81] found that museum-based interventions allowed participants to interact with a range of people not typically encountered. Compared to informal interactions, interventions occurring within a museum increased the likelihood of short-term or long-term bonding between participants.

While HCI has explored leveraging technology to make museums more accessible and interactive [13, 39, 82], there is little research on how to support the social aspects of visiting a museum for PLWD. Ryding [72] designed a two-person interactive app for museums and found that this social dynamic allowed participants to create a more bespoke experience and explore the museum more playfully. In contrast, Clarke et al.'s [17] research on interactive exhibition design explored distributed control, finding that enforced cooperation allowed participants to have more enriched social interactions by getting them to focus on others.

Beyond museums, there is not enough research on the application of technology to improve the social accessibility of public spaces, not only for PLWD but for older adults in general [69]. Hodge et al. [37] examined how PLWD and their loved ones created new memories on public outings. However, they focused on enriching the relationships between the person living with dementia and their loved ones and not the interactions between participants and the public. Therefore, more research is needed to examine how technology can improve opportunities for social connectivity in various public settings for PLWD.

2.2 Dementia-Specific Interventions
Overall, dementia and HCI research has helped us understand what types of interactions and designs are appropriate for those affected by dementia [29, 30, 49, 56, 85, 87], advocating for approaches that allow the person living with dementia to have agency and express their personhood [10, 47, 49]. Still, since the focus of these solutions is dementia, the findings and solutions often do not appeal to those not living with dementia.

In the past few years, researchers in the dementia and HCI field have recognized this limitation and have begun to focus on encompassing a broader frame of reference. Although still within the care home setting, Dixon and Lazar [24] looked at the effects of social perspective on dementia care. Their research analyzed the various influences that determine the care practitioner's care
“Just like meeting in person” - Examination of interdependencies in dementia-friendly virtual activities. ASSETS ’22, October 23–26, 2022, Athens, Greece

style when interacting with a person with dementia. An example is carer wages: since carer positions are often underpaid, carers may save time with pragmatic approaches that attribute behaviors to dementia instead of looking for meaning in the person's behavior [24]. Similarly, Dai and Moffatt's [19] research looked at how social perspective comes into play in a community setting by examining how technology could support a reminiscence activity in a public library for carers and their loved ones with dementia. Social interaction in a public space provided them with "a sense of normality." This normalcy extends to the community as it can help destigmatize stereotypical notions of living with dementia [19]. Progressing the boundaries of contextual influences further, Dixon, Piper, and Lazar [25] outlined everyday self-management strategies used by PLWD. Influenced by critical realism theory and Bennett et al.'s interdependence framework [7], they suggest that societal pressures affect how PLWD navigate interpersonal relationships through the use of technology to manage their dementia [25]. Overall, these papers explore how social perspectives influence technology use, how people view those living with dementia, and how PLWD view themselves.

Despite progress in the field of HCI and dementia, the COVID pandemic forced many social interactions to be conducted at a distance. As a result of the pandemic, it became important to understand how readily available technologies could be adapted to provide support to those living with dementia and their carers [16, 60, 80]. For instance, Mynatt et al. [60] researched how an in-person empowerment program for people with mild cognitive impairment was adapted to asynchronous online delivery. While they added synchronous video calls over Zoom, they found that participants still required more facilitation to have meaningful interactions. However, facilitators found that the online format made scaffolding activities difficult. As of writing, HCI research on the impact of the pandemic on online social interactions, not only for those living with dementia but also for older adults in general [76], is limited. This research builds upon the findings of Mynatt et al. [60] by considering the contextual influences that affect the development of meaningful interactions for PLWD online.

2.3 Interdependence to Promote Inclusivity
Models for creating societal inclusion have tended to focus on helping those with disabilities achieve independence. However, emphasizing self-sufficiency ignores that all people form interdependencies by contributing to and relying on community support and assistance [7]. The prevalence of social connections suggests that strengthening interdependencies between people of varying abilities can improve societal inclusion. According to White et al. [88], supporting interdependence increases social capital, helping a society function more efficiently.

Therefore, the interdependence framework proposed by Bennett et al. [7] encourages "tangible accommodations", such as assistive technologies or a wheelchair ramp, to be assessed and designed within their context of use. They define context to include people who might not interact with the accommodation in addition to the overall environment. However, they focus on how the accommodation affects interpersonal relationships. One example their paper provides is how two visually impaired people navigated out of a store through environmental cues and relying on each other. This example highlights how anyone can use an accommodation, even if it was not intentionally designed for them.

Das et al. [21] further examine the social and structural aspects of collaboration that affect task completion. They highlight how ableism and power dynamics in the workplace make completing tasks inaccessible to those with visual impairments, preventing access to jobs and promotions. While not explicitly stated, their findings suggest that completing a task is interdependent with the workplace's structure. Thus, understanding structural influences can shed light on the more significant issues that need to be addressed before implementing a technology solution. A limitation of this research is that it did not include sighted collaborators, who might have shed light on how interdependency then affects colleagues' interpersonal relationships. Moreover, design research by Baldwin et al. [3] applied the interdependence framework to community design and was able to increase public attentiveness to inclusivity and gain community interest in their work. Increasing public awareness of dementia is crucial for destigmatizing it and building community inclusivity. As these researchers have applied it, the interdependence frame can holistically examine an experience, making appropriate solutions clear.

Overall, there is a lack of research in HCI on social programs associated with social prescription services for PLWD. The public nature of these programs means an intermix of abilities influenced by societal structures will occur. Therefore, there is a need to first identify and characterize the interdependencies involved within these programs to then be able to improve the social support participants receive. There is an opportunity for technology to improve the quality of the programs and lead to greater inclusivity. In this paper, we use a critical realist approach to the interdependence framework to examine how high-level or contextual factors affect individuals [7, 31].

3 STUDY OF DEMENTIA-FRIENDLY SOCIAL PROGRAMS
To identify and characterize the interdependencies that assist with program delivery, we interviewed program coordinators and held two focus groups: one with PLWD and one with carers. We applied a grounded theory [15], critical realist [12, 63] approach to gain a thorough understanding of social programs. Since programs are typically designed and then experienced, we aimed to mirror this in our data collection and analysis procedures. First, we collected and examined data from the coordinators to identify the key structural influences and understand the development-to-delivery process of programs. Once the structure and development process was understood, we could then identify the primary interdependencies. We applied the resulting interdependency structure deductively to examine the perspectives of PLWD and their carers to determine whether social interdependencies were supported.

3.1 Participants
The participants were divided into two groups: program coordinators, and PLWD and their carers. An information sheet and consent form were given to those interested and signed before the study began.
Six participants who self-identified as living with dementia (ages 55-90, mean 72, 4 male) and three of their carers (ages 56-65, mean 62, all female) were recruited through an Alzheimer's Social Club (where the first author volunteers, Tab. 1). Participation in the Social Club requires an early-to-mid-stage dementia diagnosis; therefore, we did not require participants to share their specific diagnoses. They were asked to participate in a group interview: one for only carers and one for those living with dementia. These were conducted separately so that the voices of those living with dementia would not potentially be dismissed or overshadowed [20]. The participants were interviewed as a group since they were comfortable and felt safe discussing various topics with each other and encouraged each other to share. The group interview was also chosen so that it would mimic the feel of a Social Club gathering. Participants living with dementia were not tech-averse and had a basic understanding of technological devices (such as smartphones or voice-controlled home devices), although all, with the exception of Evan, needed some form of carer assistance with using Zoom, ranging from reminders to use it to being entirely dependent on the carer to get them online.

Table 1: List of pseudonyms for participants living with dementia and their informal carers.

Person Living with Dementia | Carer (relation)
Adam | Alice, daughter
Beth |
Chris | Christine, spouse
Daniel | Dana, spouse
Evan |

Additionally, program coordinators from various social organizations were recruited to be interviewed (Tab. 2). Coordinators were recruited from the US and UK due to the similar social services available for PLWD in both countries. Coordinators in the US were recruited via word of mouth, while those in the UK were found via groups listed on the Dementia Engagement and Empowerment Project UK [43] and were recruited via email invite. Fifteen semi-structured interviews were conducted with seven program coordinators from the UK and eight from the US. While each coordinator's job role varied slightly, most oversaw managing and designing accessibility programming.

3.2 Procedures
We obtained approval to conduct this research from our university Ethics Committee. Before the start of each individual or group interview, we collected verbal consent. All individual and group interviews were performed remotely via Zoom, with all participants joining individually from their homes. Interviews were one-to-one (except for Michael and Nancy, who interviewed together) with the first author and lasted 30-60 minutes. The first group interview was conducted with six participants living with dementia and was scheduled for 1 hour but ended up lasting 2 hours since the participants wanted to keep socializing, while the second group interview, with the three informal carers, lasted 1.5 hours. Detailed notes and audio recordings were taken for the group and individual interviews. These notes and transcriptions of the audio files were used for analysis. Pseudonyms were used, and identifiable information was removed from the transcriptions. We followed a semi-structured format to allow participants to talk freely about their experiences with social programs. For the interviews, coordinators were asked about their programming in general, how programming has changed and adapted, and how they envision programming to be in the future². For the group with participants living with dementia, they were asked how they would want to change the programs, who they would want to invite to the programs, and how dementia has affected their lives. For the group of carers, we asked how they got involved with programming, how the pandemic is affecting their loved one's dementia, how dementia has affected their lives, and how programming could be improved.

Figure 2: Diagram of our process with grounded theory critical realist analysis. [The diagram shows: Program Coordinator Interviews → Grounded Theory (1. Open Coding: influenced by person-centered care model; 2. Axial Coding: institution features, programming creation and design, and working with others; theme development: Reliance and Support) → Deeper analysis with Critical Realism → Abduction (develop hypotheses for the research question: What allows programs to develop and be successful? Determined interdependencies as the most plausible hypothesis and that two networks are at play: Contextual, occurring before program delivery, and Social, occurring during or after delivery) → Retroduction (review data with a specific question: What are the causal mechanisms behind these interdependency networks? Once causal mechanisms are determined, review other data to confirm these) → People living with dementia Focus Group and Carer Focus Group (confirm causal mechanisms are correct; determine where and when these mechanisms either support or deter interdependency).]

3.3 Data Analysis
We took a critical realist³ approach to our data (Fig. 2) [14, 28, 31, 63]. To begin, the first author approached the data with a constructivist grounded theory approach [14] to focus on the empirical and actual [12]; this provided a concrete framework to determine demi-regularities [28, 63]. The first author began by open coding the data from the program coordinator interviews [15]. Then, following

² For example: How do you design your programs to be accessible? How has your programming changed? What is the future of programming?
³ Critical realist ontology stratifies reality into three layers: the empirical, the actual, and the real. Analysis thus moves from what we experience, to events that occur whether we experience them or not, to finally the causal mechanisms of an experience [28]. Causal mechanisms are the "inherent properties in an object or structure that act as causal forces to produce events" [28]. The key steps of critical realism are identification of demi-regularities, abduction, and retroduction [28, 63]. Since the world is not always orderly, critical realists believe that there are only tendencies or demi-regularities to occurrences [28]. Through abduction, possible theoretical explanations are considered for these demi-regularities. Hypotheses are framed for each possible explanation and empirically checked by examining the data until the most plausible explanation is selected to be pursued [14, 63]. Finally, through retroduction, a process like abduction but in which the researcher looks at the data with a specific question [28], the most plausible explanation is explored.
Table 2: List of pseudonyms for coordinators.

Coordinator's Name | Gender | Organization Type | Location
Barb | female | Dementia Social Service | Midwestern US
Elizabeth | female | History Museum | Midwestern US
Francis | female | Dementia Social Service | Eastern US
Gretchen | female | History Museum | Western US
Isabella | female | Dementia Social Service | Southeast England
Joan | female | Art Museum | Southwest England
Kevin | male | History Museum | Northern England
Laura | female | History Museum | Northern England
Michael | male | Dramatic Arts Organization | Southwest England
Nancy | female | Dramatic Arts Organization | Southwest England
Polly | female | History Museum | Midwestern US
Quentin | male | Art Museum | Midwestern US
Rachel | female | Older Adults Social Services | Southwest England
Theresa | female | Botanical Garden | Eastern US
Ursula | female | Zoo | Midwestern US

constructivist grounded theory methods, we reflected on our perspectives and recognized our approach to the data was influenced by the person-centered care model perspective [15]. Throughout our analysis, the first author wrote analytic memos and engaged in a process of constant comparison of data to data. These codes were then reviewed collectively by all the authors. The codes made clear the hierarchy that went into program development, and all the authors again validated these emerging concepts. The codes were categorized into institution features, programming creation and design, and working with others through axial coding.

In an initial analysis, the themes of Reliance and Support were identified at all levels of program development. Organizations and program coordinators relied on and were supported by other organizations for funding, as sources of knowledge and comparison, and as a means of improving community outreach. Participants relied on and were supported by their carers, other participants, and program coordinators to supply consistent programming. Organizations and participants were brought together through the delivery of the program. Informed by our earlier interviews, we adjusted our interview questions to better understand the complexity of the interdependencies, asking coordinators how their organizations engaged with their community and what challenges they face with delivering programs online.

However, with critical realism, researchers do not commit to their initial theory as it is recognized that deeper analysis is needed to develop a "more accurate explanation of reality" [28]. Next, the first author engaged with the process of abduction and examined interdependency as one plausible explanation for the causes behind a program's development and success [63]. Finally, we retroductively analyzed the codes to identify the causal mechanisms behind the interdependencies [7, 12, 31, 63]. We distinguished codes related to things that occurred before a program was delivered as part of the organization. In contrast, the codes that occurred during or after a program was delivered were tied to social interactions. We then deductively analyzed our focus group data to determine when and how interdependencies are or are not supporting those living with dementia and their carers. This data allowed us to better understand the current strengths and weaknesses of the social interdependencies which occur because of a program.

4 FINDINGS
We determined two main networks of interdependency which influence the various aspects of social programming. The first includes the interdependencies which occur on a contextual level before the program among the organizations involved with it (such as inter-/intra-organization networking and finances/resources). The other main network of interdependencies occurs during and after the program on the social level. These are the interactions among participants and determine the amount of social fulfillment provided by a program (such as interactions between staff, PLWD, carers, etc.). Then, during program delivery, these two networks have an opportunity to converge.

4.1 Organizational Interdependencies
Organizational interdependencies occur between organizations, providing the context that determines the nature of a program and whether a program exists. Based on the interviews with the program coordinators, we derived that two key mechanisms, finances and networking, influenced the reliability and supportive strength of the interdependency between organizations.

4.1.1 Financial Interdependency. While accessible and inclusive programming was important to all the organizations we interviewed, what each organization provided depended on their financial stability. Stable finances often meant an organization could hire enough staff to accommodate community needs. For Polly, COVID meant her team was downsized: "I am the manager in our department and then I have two part time staff members [...] before COVID, we had quite a few more people involved [...] the downsizing has definitely hurt all of us." Fewer staff mean that organizations have less human capacity to design and provide programs.

Loss of funding could also mean, for larger organizations like the zoo, reshuffling of staff. During the COVID lockdowns, Ursula had
to once again take over running the dementia programming since the program lead was laid off: "[I had taken] a step back from the [dementia programs ...] And one of my colleagues took over, and then when COVID happened, he unfortunately got laid off. And so I was able to step back into the program, and kind of take over and get back acquainted with our individuals." Having to get reacquainted with a group can mean that program quality suffers since participants have built up a rapport and expectation of the program based on the previous coordinator.

Since the 2020 pandemic caused loss of funding for many organizations, affordable solutions had to be creatively configured. In general, the coordinators interviewed found it cheaper or more straightforward to adapt programs online. The cost efficiency meant that Rachel's organization has kept a significant portion of its programs online or via telephone. "We don't have to pay for the venue [... or] getting people to locations." Before the COVID lockdowns in the UK, Rachel's organization did not focus on using technology. However, technology has become more critical now that the organization sees financial and even logistical benefits.

Online solutions increased participation accessibility previously prevented by physical barriers. Barb's organization could save costs by not needing to set up multiple locations since, through online programming, they could, for instance, reach one participant living more than 400 miles away. Without online programming, Barb's organization would not provide an opportunity for that person to socialize with others living with dementia. "I think virtually there's that impact that we can reach more people, right? Especially, you know we have [participants on the other side of the State attending ...] I think doing it virtually it just really shows how much more could be done." By providing virtual programs, Barb realized how limited the reach of in-person programs has been since people living too far away could never attend them.

Finances in another sense affected programming since coordinators also had to be aware of the financial limitations of their participants. For Ursula, "[accessibility] was a conversation [our organization] had a lot, at the time, we weren't thinking about seniors, we were thinking about kids. So these kids who don't have internet, who don't have computers, how are we going to access them?" Despite the digital divide in the US, all of the organizations we interviewed in the States had opted to do online programming. However, in the UK, organizations we talked with seemed more likely to opt for low-tech solutions (such as mailing items or phone communication) since this would not limit their program's availability to older participants. Yet, beyond this distinction, coordinators in the two countries shared similar experiences. Finances then decide if there will be a program, what type of program it will be, who it will be for, and how it will be delivered. Finances can either help to strengthen or weaken an interdependency between a program and an organization. Thus, while finances determine how interdependencies at the organizational level come about, networking determines the types of interdependencies.

4.1.2 Networking Interdependency. Networking was another key mechanism for determining whether an inter-organizational interdependency occurs. The benefits to an organization in establishing professional partnerships include opening funding possibilities, exchanging ideas, expanding community outreach, and developing new programs. For well-known and larger organizations, like Quentin's art museum or Ursula's zoo, the organization's prominence made finding partners easy: "Our recurring partners have been a result of somebody at [one of our main partner organization's] saying, you know you really need to talk to so and so 10 miles up the road or [this other organization's] got a program there that would really benefit [from working with your organization]" (Quentin). Organizations still establishing themselves in their communities joined or formed a group of local organizations. Around 2015, Elizabeth, seeing a need for accessibility services in museums, pioneered an "Alliance of Cultural Accessibility" so that organizations in her State could have relevant conversations around being and becoming more accessible. Similarly, in the UK, Kevin's organization became "a member of a local collaboration of community groups [... which] enabled us to engage with groups, develop new partnerships, and [has] supported us in our ongoing program."

Because of its critical role in the functioning of social organizations, and even its potential to affect finances, the ebb and flow of networking functioned consistently despite the COVID-19 pandemic. In Polly's case, the pandemic seemed to help with her ability to network effectively. "I have a lot of organizations that have helped us out in like again, the last five, six years that I have been in this role, and I haven't been able to be as reciprocal in the past. And thankfully, the pandemic has provided me a little bit of flexibility to do that [...] I've got probably about seven or eight partnerships, right now (laughs)." By reciprocating with her partnering organizations, Polly can strengthen her organization's inter-organizational interdependencies. Strengthening existing partnerships also can improve an organization's reputation.

However, the effort and reward of reciprocity had to remain equal for partnerships to form or remain. Rachel has found that "you have to be giving something extra that the activity coordinator can't give for it to be worth their while, [if it is] something that they probably can do themselves in the home then there's not much of a draw for them." Since social organization programming often revolves around providing space for socializing, virtual or telephone activities could sometimes prove difficult to recruit for. Without something appealing, partnerships are unable to form or be successful. Alternatively, in the case of Michael and Nancy, they dissolved some of their care home partnerships because staff turnover made coordination difficult.

Often, interdependencies through networking occur because of the organization's reputation or due to staff members changing fields (i.e., an educator moving into museum education). These traits help organizations expand their community outreach. For instance, Francis' organization expanded community outreach by relying on her established relationships from her previous job. Francis was formerly a museum educator and was hired to develop the cultural programming for a dementia organization: "In my role, my experience as a long-time museum educator allowed me to establish rapport with almost everyone, I've done education programs with and invite them to [join the dementia organization's] programs and services." Organizations will hire specific staff to absorb their connections and make themselves more qualified to develop programs that will interest those connections. At the organizational level, the
“Just like meeting in person” - Examination of interdependencies in dementia-friendly virtual activities. ASSETS ’22, October 23–26, 2022, Athens, Greece
reliance and support, or lack thereof, determines what programs are developed and for whom they are developed.

4.2 Social Interdependencies

Person-centered strategies within dementia-specific programs led to social interdependence between the participants and program coordinators. We derived four strategies which develop social interdependence and lead to feelings of social inclusion: creating a socially safe atmosphere, supporting social needs, co-leading programs, and building lasting connections.

4.2.1 Socially Safe Atmosphere to Support Creation of Interdependencies. The atmosphere is crucial for making a program into something socially fulfilling and must be negotiated by the staff and participants. The difficulty of socializing that PLWD face was reported by Dana, wife and carer of Daniel: "When [my husband] is with other people that don't have any issues, he truly clams up and takes himself away from the group [... with the Social Club though] that's usually the time when I hear him laugh." In social situations outside of the Club, Daniel is uncomfortable; however, being around those with a similar diagnosis allows him to relax and have fun. Presentation of self with dementia might influence this impression [33]. Participants of the Social Club would use humor to provide empathy and serve as a distraction when someone had some difficulty due to their cognitive impairment. In this exchange Adam had asked Beth how she had found out about their group:

Beth: I don't remember, psssh. I really don't.
Adam: (Laughs) That's why you're in the group! (collective laughter)

Laughter was observed to help relieve tension in the group and ease the social anxiety associated with a dementia diagnosis. This relaxed and playful atmosphere allowed participants to have a positive experience while engaging in socialization. Dementia coordinators, like Barb, know their programs create "a safe space [for PLWD] to still socially interact." So, unsurprisingly, we found that often those working at dementia services were the ones who reached out to the cultural organizations we interviewed to set up activities. By partnering with other organizations, the dementia service providers could bring that socially safe atmosphere with them to other spaces.

Coordinators also tried to be mindful of how they structured their programs by trying to avoid singling out a participant. Laura, for instance, avoids using the terms "reminiscence" or "remember": "I don't want to sit down with someone and be like, remember this, remember this [... so] we call the boxes [with museum items] inspiration boxes rather than reminiscence boxes [... for example] an object that me and you don't remember, we can't reminisce about it, but we can be inspired by it." Coordinators at the various organizations would then focus on the use of senses in the present. At the botanical garden, Theresa encourages participants to focus on their senses: "[You can] talk to them about a color, a sense. Something that they ate, something that they you know can recall, or something that they're feeling at the moment [...] It's about enjoying the moment, being in nature, connecting with plants, no judgment, nothing to learn, nothing to know, just enjoying." To focus on the present (and not past or future) coordinators need to work interdependently with participants to co-discover what is going on or being inspired by the moment.

However, over virtual platforms coordinators and participants found it difficult to maintain the same level of social atmosphere. Christine noted that she did not think the virtual presentation was as beneficial as in-person had been for her husband: "Since COVID, Chris just sits back. You know he doesn't participate." Despite Christine feeling that attending the programs overall is positive, decreased engagement has her concerned about the current virtual format.

Hastily adapting their programs online allowed coordinators to continue to nurture the bonds already developed by participants who had frequently engaged with their programs. Being able to extend this engagement has also allowed Ursula to speculate about how they can add to their programs: "I like the combination of both. I think there's a lot of value in being able to connect with people first [virtually] before they come to the zoo." Virtually connecting then not only allows for maintaining connections but potentially for building them in advance of coming into a new space. Therefore, how participants interact and how coordinators design their program makes a space socially safe for PLWD.

4.2.2 Supporting Unique Social Needs through Interdependence. Coordinators further try to reduce anxiety and social pressures for those living with dementia and their carers. As dementia can affect social interactions [11, 19, 49, 87], this ends up affecting a person's closest relations the most (i.e., their spouse/informal caregiver). Carers are then relied on not only for daily living needs but also for social needs. "[Daniel was becoming] more and more isolated and he, you know, basically any friends, acquaintances whatever they've kind of fallen by the wayside, sooo it was me entertaining [him by myself] and I (laughs) I needed help with that." For that reason, Dana had her spouse attend various programs for PLWD. Social organizations recognize this reliance and thus design their programs for the person with dementia and their carer. Coordinators then work with carers within the program setting to support the person living with dementia to engage socially. By sharing this responsibility, the coordinator and the carer can avoid the pressure and other emotions that can come with being the sole source of another person's social fulfillment.

This reliance between the informal carer and the coordinator became more apparent to coordinators when the informal carer was not present. Quentin's museum, for instance, designed their in-person art appreciation and creation program with this pair in mind. However, with the sudden switch to virtual, he was surprised to find that often just the person with dementia attended. Because the person with dementia was alone, Quentin found it challenging to engage with the participants and felt the result of the program was not as socially fulfilling. "We see in the in-person version a lot of appreciation at the end. [... The participants] get to know each other through the years and form an extra level of support [...] we see from time to time, a level of appreciation, sometimes within [the] pairs."

Additionally, working in-person allowed coordinators to provide individual support to participants, helping them remain engaged. However, when programs moved online, participants like Adam had to get online on their own, making attending programs more inconvenient for his daughter Alice. Alice not only did not receive
the social benefits of attending the programs with her father, but she also had to do more work on her own to help coordinate Adam's ability to engage with a program. Alice would have to organize all the links and be ready to do over-the-phone troubleshooting. "There's a lot of accommodation, a lot of things that I do [...] I make the spreadsheet for him [... and] I FaceTime him, so I watch every step he does." Therefore, for the informal carers who did not live with the person living with dementia, supporting their social needs became more difficult with the hasty switch to online.

Aware of the technical limitations that not only those with dementia face, but older adults in general deal with, organizations in the UK tended to avoid virtual programs. As Francis noted, "The US is all Zoom. Assuming everyone has access [... which is] especially difficult for older adults who are not familiar or comfortable with technology." By not offering programs that meet the technological limitations of the participants, programs are unable to meet their social needs. As Rachel found with her participants in the UK, "it's the conversation, regardless of how it's delivered [...] they're happy just to chat on the phone." In contrast, those we interviewed who are living with dementia felt that video conferencing was important for their experience. Daniel felt that virtual was "just like meeting in person, we see each other, and we can see if we're happy or sad or, then we can talk about experiences, [Zoom] is a very way, good way [sic] to handle it." Therefore, negotiating the engagement effort interdependently so that it would meet the needs of participants and coordinators leads to the greatest sense of social fulfillment for all involved.

4.2.3 Co-leading Programs to Build Interdependence. When designing programs for people with dementia, organizations try to cater to their participants' interests while keeping the topics or themes inclusive. These design strategies tended to revolve around a tangible item that was used as inspiration to allow the participants to tell others about themselves and control the direction of the activity. Designing programs in this way also allows participants to teach organizations about items in their collections. This opportunity creates an interdependency based on knowledge exchange and provides a sense of social fulfillment by teaching the person a sense of self-worth. Laura's museum developed their "Inspiration Box" program with a local dementia services organization. During this program, informal carers and their loved ones with dementia would gather at the museum and use the items in the box as social inspiration. Laura then provided us with an instance in which she gained a new appreciation for an object because the participants shared their knowledge of how to use the object:

"When I first came across [the scrub brush] and put it in the box I was looking at, I was like, 'eww', like it just looks a bit scruffy, but it ended up in the box. And do you know what that is? My star object [...] I get all these older people who then show us, like, get down on their hands and knees and start scrubbing (laughter)." -Laura

Stepping into the role of teacher and demonstrating how they would use the scrub brush allowed participants to have agency over the program material. It also provided the organization staff with information about the scrub brush and its relation to their local culture. Moreover, the opportunity to find the trigger that would allow the person living with dementia to communicate their life story provides both staff and the participants with social benefits.

For PLWD, social situations can become difficult to follow. Therefore, unless clear opportunities are presented, they can become withdrawn [47]. Applying this approach in other contexts, with objects not necessarily part of a museum collection, could mean that staff do not need to design their program structure extensively. Instead, they can rely on their interdependency with their participants to enrich program content. For instance, during the focus group for members of the Social Club, the participants offered each other advice. Adam told a story about his difficulties remembering names and numbers, a situation with which Francine empathized. Adam then told Francine one of his tricks: "[On Zoom] the banner, if you just take a picture of the screen, you get everybody's picture, with a name on it." He then shared with the group his printed-out image, and the other members affirmed it was a good idea. Sharing advice and coping strategies provides members with an opportunity to build up empathy and express agency.

Co-leading did not always involve teaching, as often participants in the Social Club shared items or anecdotes based on what others shared. During the focus group, Francine changed the discussion topic when she shared some of her stuffed animals with the group. The topic change caused Beth, despite forgetting some words, to talk about her stuffed animal: "It's just a little ball like this (makes a circle with her hands) and it's, it looks like a dog". By switching to a topic that interested Beth, Francine provided Beth an opportunity to share. Then, Beth's use of body language helped her illustrate what her words could not communicate to the group. By directing the conversation, participants can then express their personhood by sharing items or stories that are important to them.

4.2.4 Using the Program to Develop Lasting Interdependencies. Participants acting as teachers for more personal topics helped create a more lasting connection between staff and participants, for instance, when significant but commonly experienced life changes occurred in either a staff member's or participant's life. For Polly, participants remember her being pregnant when they last met in person. When they see her now in their virtual sessions, even though she has had the baby, they ask for updates:

"They're like, 'oh my gosh Polly did you have your baby yet?' And they always ask about my son and it's just uber cute. And every time I tell a story, with things that I'm learning and motherhood, they're like 'oh yeah we remember that.'" -Polly

The mutual experience shared between Polly and the participants created an opportunity to exchange knowledge about raising children. Yet, instead of adding to the organization's knowledge, the information provides personal gain and a tool for connectedness. Furthermore, the process of raising a child is ongoing, encouraging a lasting connection.

Lasting connections also manifested in other ways. During her programs, Theresa would often have participants pot plants to take home:

"[Participants can] look at that plant, the next day or the next week and say 'Oh, we went to the botanic garden. Look at this plant is still doing great.' Well, that's another thing! I have some people who come on the [Zoom] call who say, "remember, we potted these last year?" and I'm like "holy Christmas you still have that?" -Theresa
The physical presence of the plant serves as, and supports, a lasting memory of the interaction between the staff and the participants. Keeping the plant keeps the participant's connection to Theresa and her organization intact. Seeing the plant again in the participant's home via her online program also reaffirms Theresa's work. The strategies supporting social interdependencies seemed to add to the amount of fulfillment achieved at the social level.

4.3 Opportunity for Convergence

The opportunity for convergence is the intervention that allows organizational interdependencies to intermingle with social interdependencies. This convergence allows the networks to influence one another. Programs can be designed to encourage participants to be interdependently supportive (organizational influencing social), or participants who feel supported by the program invite more people to join, increasing the organization's outreach (social level influencing the organizational level). Overall, the method of delivery can affect the organizational level by determining the program design and the social level by determining how people will interact.

While many social organizations are primed to be pillars of their communities due to their location or the prominence of their building, paradoxically, they struggle to break through to the social level. Despite obstacles, museums such as Kevin's and Quentin's have used their prime physical space to hold public exhibitions of community members' artworks alongside famous artists' artworks:

"It's just a sweet moment at the museum for building awareness [...] I think that the casual museum visitor really enjoys seeing the community represented in that way because their expectation [of coming] into our big marble building is to see the Van Gogh or the Monet ... so to see beautiful clay works made by veterans or cognitively challenged adults or people in recovery that is a really nice meaningful thing for the casual visitor too" -Quentin

For Quentin, such displays of the community members' art were a way to bring the community together within their space and elevate the community's image of itself, thus making the organization's space more accessible by allowing it to be an extension of the community. This displays how the social interdependencies, such as building that safe atmosphere, can influence organizational interdependencies and help expand an organization's network. As Gwen put it, "the goal [of the programs]... is to encourage this audience of people to see the museum as a social setting and a safe space for them to enjoy their time with [others]." By opening up their space to the community, the organizational interdependencies, those that have gone into the development and design of the organization's physical space, influence the social interdependencies by revealing that the space is safe, supportive, and wants to create lasting connections with the community.

In some instances, developing a specific program was unnecessary as the building's atmosphere lent itself to informal gatherings. Barb's two-hour social groups for only those living with dementia spawned an informal social group of carers since the organization's physical space afforded them privacy and opportunity for social engagement. Such informal programs proved to be as beneficial for informal carers as the organization's more formal programs:

Christine: In the Atrium we had those two hours that we felt that we could share. Or, I have a conversation with somebody else besides [Chris].
Dana: Yea it was a very relaxed atmosphere [...] It was like sharing with a friend.

Dana and Christine expressed that attending informal social gatherings afforded them similar benefits to formal programs and created a safe atmosphere to develop lasting connections. Thus, how the two networks converge is not rigidly bound to the organization designing a program but relies more on the organization and the social context by which informal engagements can happen.

With the shift to online, carers felt that such informal activities were not supported. This lack of support occurred since coordinators were pressured to adapt online hastily and could not consider all ramifications. Thus, when one of the Social Club members passed away, Christine and Chris were affected. "I felt alienated cuz I didn't know what happened to [the member...] it was such a shock [...] I felt like I couldn't even send [the spouse of the deceased member] a [card] I felt sad, you know." While organizations are trying to respect their members' privacy by not sharing personal information, they were the gatekeepers of this social connection and inhibited Christine from informally reaching out to share her condolences. Therefore, accommodating participants online still needs more consideration.

However, with in-person programs, when bringing participants to the organization's space was not feasible, social organizations would host programs locally within a targeted community. For example, Joan's organization, pre-pandemic, held drawing workshops at various community centers, and during the pandemic, Rachel's organization continued to manage and devise programming for different community gardens. Rachel's organization also supported the shift to online by providing teleconferencing social spaces for those living with dementia. By using these adapted methods of delivery, organizations can extend their social network through their programs. "The community garden is so nice. It's not ghettoizing old people it's having everybody there". As Rachel points out, when organizations make changes to the method of delivery, this can allow the organization to diversify its participant reach. The larger the spectrum of participants an organization can reach, the more significantly it could increase the depth of the social interdependencies they influence. Therefore, deliberately designing the opportunities for convergence to better support the organizational and social levels in expanding their influence is crucial for moving programming into a realm of developing inclusive programs, not just those for specific accessibility needs, such as dementia-specific programming.

5 DISCUSSION

Unlike in-person/onsite programming, virtual/offsite programming for PLWD has, for many organizations, only been around since the start of the COVID pandemic. Thus, the advantages that such programming can provide require further exploration. Our findings examined the program development and execution of a small subset of organizations, leading us to expand Bennett et al.'s [7] interdependency framework to include the relationships between organizations in addition to the social interactions happening between individuals. Additionally, when organizational and social
interdependencies converge, they influence how the other will react and adapt (Fig. 3). As this influence increases, the outreach of these interdependencies expands, which could allow for more inclusivity. Organizations design programs to be socially inclusive of those living with dementia (organizational influencing social) and a diverse range of participants feel invited to join in (social influencing organizational). Thus, we found that changing the opportunity for convergence affects cross-influence between the two interdependencies. Improving interdependencies potentially can increase the social prescriptive benefits of such programs for PLWD.

Consequently, the design and application of technology affects the amount of cross-influence of the interdependencies. Most blended programming currently involves thinking of virtual as a replacement or equivalent to in-person. We argue that future programming for PLWD would benefit from complementary virtual and in-person activities since this is a promising method for building and sustaining interdependencies. We also believe that understanding how virtual spaces function is crucial to design virtual programs that are more appealing to PLWD beyond when circumstances necessitate their use. In Tables 3 and 4, we provide a checklist for coordinators and technologists to assist them with the design of a program's structure or technologies that will support social programs.

Table 3: Checklist for designing blended programs.

Action to be Done: Highlight location strengths
Questions to Keep in Mind: Is the program in-person/onsite or virtual/online?
Accomplishing this Action: If in-person/onsite: create greater social engagement, make it a sensory experience, and raise awareness about the organization by bringing people in. If virtual/online: personalize, customize, and localize content.

Action to be Done: Complementary Program
Questions to Keep in Mind: Does the program incorporate continuous use of in-person and virtual? Is there an overarching program narrative?
Accomplishing this Action: Participants should not be constrained by mobility, distance, or a lack of access to technology; the program offering and accessibility should be reliable. Celebrate repetition by building it into the program's narrative.

Table 4: Checklist for designing virtual spaces for PLWD.

Action to be Done: Participants share their individual sensory experience
Accomplishing this Action: Participants engage with the space/objects around them and their senses.

Action to be Done: Objects and/or technology connect different locations
Accomplishing this Action: Objects are transported between locations, creating a tangible memory, or asynchronous technologies potentially create a link.

Action to be Done: Social agendas appropriately match the locations
Accomplishing this Action: Focus on empowering PLWD during virtual programs and support bonding and inclusion of informal carers when in-person.

5.1 Designing for Blended Programming for People Living with Dementia

Programs Should be Designed to Highlight the Strengths of Where They Will Occur: Blended programming often means that the in-person content is as close as possible to the online content. This model for blended programming is frequently applied to educational settings as it is successful at conveying information to students [54]. However, students in fully virtual courses gain few social benefits [86]. Thus, current, readily available technologies cannot allow either type of site to imitate the other, nor is it effective to try such imitation [38]. Since the primary purpose of the programs examined in this paper is providing social engagement to PLWD, the application of blended programming in this way is likely to be inappropriate.

• The strengths of in-person programs are that they encourage greater social engagement, sensory experience, and awareness of the organization within the community. In-person programs thus allow for social interdependencies to be influenced by organizational interdependencies. For example, not having the ability to have informal meetings with other carers in the organization's atrium made participants feel alienated and socially disconnected when another participant passed away.

• In comparison, virtual programs require an organization to personalize, customize, and localize content for participants. These adjustments mean social interdependencies have more influence over the organizational interdependencies as they cause organizations to cater to their participants' needs. For instance, when organizations hire certain staff or approach specific organizations to form partnerships, their motivation comes from the need to serve their community.

While Boyd et al. [8] suggested that individuals' roles when jointly using assistive technology need flexibility, our work suggests that technology should also have role flexibility. Preparing technology to be flexible would allow program coordinators to quickly adapt a participant to, for instance, a suitable asynchronous activity when there are connection issues, or to switch from in-person to virtual should another pandemic occur.
“Just like meeting in person” - Examination of interdependencies in dementia-friendly virtual activities. ASSETS ’22, October 23–26, 2022, Athens, Greece
Figure 3: [Left] We represent organizational interdependencies on top of our diagram since they determine whether a program even exists. The opportunity for interdependency networks to converge is the method by which a program is delivered. [Middle] When an organization's programs cater to a community's needs, the influence shifts so that the organizational is more influenced by the social. [Right] Ideally, the amount of influence should be balanced, helping programs be more socially inclusive.
Programs Should Consist of Complementary Components: The complementary design of virtual and in-person programs will allow more flexibility for engagement, since some participants will only be able to attend in-person or virtually. Keeping the programs complementary then permits those who can attend both in-person and virtually to feel encouraged to attend without potentially getting bored with repetitive content [58].

• Overlap allows participants to gain access to the content without being constrained by mobility, distance, or a lack of access to technology. As Francis pointed out, virtual participation will not be accessible to everyone, but online adaptations, for those that could use them, mediated more social interactions by giving participants the ability to use and express with body language [40, 44, 49, 55, 62, 76]. Body language allows a participant to demonstrate a forgotten word or phrase to communicate and remain socially active. Other participants can also show support and interdependency when they assist with word or phrase recall.

• A complementary program builds opportunities for repetition, celebrating the individual's life story instead of focusing on repetition as a stigmatized symptom of dementia [84]. Furthermore, having some repetition in the content creates familiarity, which could inspire recollection of new or different stories from participants [22, 40, 47, 70, 84]. Much like when the one coordinator was pregnant, this encouraged participants to ask for updates and share or recollect stories related to whatever stage of pregnancy she was in.

• Complementary programs should build upon a program's overarching narrative. Always offering both types of programs will allow for stability of program offering and accessibility. Participants that attend both programs gain from seeing familiar faces and build upon those social connections while also gaining new knowledge – important for cognitive stimulation and slowing the progression of dementia [1, 6, 19, 20, 42, 51, 81, 89]. Technology could further assist with reinforcing social ties between virtual and in-person attendees through the design of an asynchronous platform that congregates key points brought up in the programs.

5.2 Design of Virtual Spaces for Programs for People Living with Dementia

Virtual spaces are an organization's online space or platform for delivering programming. Virtual spaces allow the organization to enter the participant's home, which allows the participant to control the environment [25, 60] while remaining in a location that is familiar and presumably considered a safe space for them. An instant benefit of virtual programs is that concerns about preparing the in-person atmosphere to be primed as a safe space are unnecessary. The organizations we interviewed used video conferencing for their virtual spaces.

Participants Should Share the Sensory Experience of their Individual Environments: A major issue with virtual spaces is the lack of sensory engagement. In-person, coordinators could easily create and share sensory experiences by encouraging participants to describe what they were sensing or by engaging in tangible activities such as art-making. As other HCI researchers have shown, jointly sharing multi-sensory and tangible interactions can add to the quality of engagement for those living with dementia [9, 30, 41, 49, 70, 73, 81]. This joint engagement is lost in virtual spaces since each person's individual space is different.

• Coordinators can embrace location differences by asking participants to share and describe personal items to create unique sensory experiences. Additionally, by having to describe their items, participants take on the role of program leader since no one else can smell or physically touch the item they are sharing.

Objects and/or Technology Can Be Used to Create Links Between Different Program Locations: Typically, when participants enter an organization's space, they add value to objects by sharing their stories with staff. Although this was not the coordinator's intention, participants in the virtual program shared that the plants they had grown at the organization were still alive and well in their homes. This sharing served to re-link the plant back to the organization. In contrast, a benefit of the virtual context is that staff can instead add value to participants' objects by linking them to their organization.

• Bonding an interaction through objects, when applied to a blended program, can be built over time, potentially developing associations across virtual and in-person contexts. Such associations through different contexts and sensory
stimuli can help with memory recall and communication [18, 27, 74, 77]. As one coordinator alluded to, an initial association could be created in the virtual space, as it allows organizations, such as museums, to share rarely viewed objects. Coordinators can weave a narrative between items in museum storage and items participants have in their home or recall using. A further association can occur during the on-site program, as a coordinator can connect the items shared virtually with those that are available in the organization's space. For example, after having participants play with the organization's scrubbing brush, in the next virtual program participants could share and compare the brushes they use.

Having a program flow in this way blends the organization's space with the participant's space. This blending of spaces by researchers is usually considered in terms of asynchronous interactions, such as collaborative work [46], or synchronous activities, such as family meals [4]. We propose developing this link between spaces over time to help form a lasting bond between individual staff and participants. This bond could help strengthen organizational interdependencies by encouraging the organization to understand the participants' lives and needs. It could also improve social interdependency by allowing empathetic bonds to be created through developing a collective narrative throughout the blended program [18, 47, 49, 85, 87].

The Program's Social Agenda Should Match with Where the Program will Occur: Finally, for the carers, programs provided virtually allow them to have individual respite since they know their loved ones will be engaged and enjoying themselves. While coordinators see this as discouraging participation, the carer not being present can allow the person with dementia to have greater ownership over the group. By requiring carers to attend virtual programs, coordinators could negatively affect the carers' wellbeing and take away some of the participants' sense of agency [27, 30, 35, 45, 47, 53]. Additionally, since organizations act as gatekeepers, some carers' agency is also lost by not being provided an informal virtual space to gather.

• As a blended program, coordinators can ensure that they provide opportunities that allow participants to rely on each other more during virtual sessions. Then, when in-person, they can instead encourage bonding and inclusion of the carer. Designing the program this way can encourage and enforce social interdependencies between participants, as each space changes who is relied on for social satisfaction. Blended programs then allow for fluctuations in the relationship dynamics between the carer and person living with dementia [45]. Moreover, blended programs can change how organizational interdependencies develop and support programs based on what is best for the participants. Virtual spaces can be more effective when designed explicitly as a part of a blended program and not as a separate component.

5.3 Reflections on Critical Realism, Interdependencies, and Dementia Programming

There are numerous efforts within HCI to promote empowerment of those living with dementia through technology [16, 18–20, 25, 29, 49, 56, 85]. However, as Storer and Branham unpacked with their examination of accessible computing [52], we believe that unless approached differently, the power dynamics and structures underlying activities such as dementia programming will continue to be present. Barros Pena et al. [5], for instance, used a life course perspective to "challenge reductive understandings of technology design." Our research illustrates another method researchers can apply to help change the existing narrative. Instead of using critical realism to point out the existing power dynamics, we used it to find interdependencies, since interdependency as a concept promotes mutual access [7]. We found that within organizational structures, which are normally entrenched with power imbalances, there are interdependencies at play that help create accessible programs and applications of technology. We found that reinterpreting the context to focus on interdependencies allowed us to consider how that affects the design of technology; for instance, how organizational shifts, like the hiring or firing of staff, can affect what technology is used and how it is used with participants.

By looking at social programming and not focusing on a specific location, our research also expanded the types of contextual settings typically explored in dementia and HCI research. Surprisingly, despite recruiting coordinators from several types of organizations, there were only minor differences in program presentation for those that offered virtual presentations. The zoo, botanical garden, art museums, and history museums centered their programs around a PowerPoint presentation. Deviations did occur, such as the zoo showing videos of the animals or the history museums using online maps to show current locations. The major differences occurred based on geographical location, with the organizations we interviewed in the US relying more on virtual programs, while those in the UK tended to try less technologically advanced options first. Since the first author was already acquainted with volunteering virtually, we chose to pursue examining that space. Virtual dementia programming allowed us to think about how different coordinators would approach a group of participants that already have some familiarity with each other. For this reason, we felt that the group interview of the carers and those living with dementia was useful, since it mimicked the atmosphere of a supportive social environment. This setup allowed us to see how a group of somewhat typical attendees interacted with each other virtually when not attending a program. By interviewing this way, we could also approach the one group interview with a different context. The interview session was then able to last well beyond what was expected, since participants were enjoying their time together and, even when given the option, did not want to leave. Thus, we found value in recontextualizing how we conducted our interview with PLWD.

6 LIMITATIONS AND FUTURE WORKS

For this study, the programs we looked at targeted middle to upper-class Caucasian PLWD. Additionally, of the coordinators who offered to be interviewed, 13 of 15 were female and Caucasian, with no visible disabilities. Also, we only interviewed coordinators in the US and UK. Therefore, this research does not consider the variety of interdependencies that arise for those who do not fit this description or those locations. This means we did not explore the depth of systemic racism that has led to the underdiagnosis of dementia in
people of color, nor how it affects who interacts with social organizations [23, 68]. However, we still feel that our research provides a baseline for the types of interdependencies that can occur in public situations for those living with dementia and how technology can support them. In the future, we hope to apply these baseline ideas to work with more diverse community groups.

Our hope with this work is to move towards inclusivity; however, we realize that further studies on achieving that are necessary. Even if our findings applied to programming, this would not automatically make them dementia inclusive. The inability of our findings to be applied to any program (those not designed to be dementia-friendly) was made more apparent to us when an informal carer mentioned how her spouse withdraws from even close friends and family. Therefore, study into mixed-ability interactions and interdependencies is required. We are currently researching technological interventions that can be used during program delivery, which we hope will make it possible for organizations to design programs that support social interdependencies and influence organizational interdependencies.

Finally, while we did observe several of the common virtual communication issues (i.e., latency, eye contact, and issues with internet connections) [50, 71], in the context of PLWD these were unique due to the cost associated with conversation repair [64]. Similar issues were explored in research conducted by Neate et al. [61] with people living with aphasia. For instance, latency issues could cause participants to talk over each other, causing confusion and leading various participants to lose track of the conversation. However, in this study we did not look at this issue in close detail; thus, further exploration into conversation repair is needed.

7 CONCLUSION

As a result of the COVID pandemic, many social organizations were forced to hastily adapt their dementia-friendly programs to virtual platforms. This shift allowed these organizations to see various benefits of online delivery, and most of them plan to continue virtual programming to some extent in the future. However, the quick development of virtual programs means there is still a lack of understanding of how to design programs effectively for PLWD. To understand this, we chose to examine the interdependencies within dementia-friendly social programs. Through our examination, we recognized that organizational interdependencies play a key role in influencing the social interdependencies which occur between individuals. We also found that cross-influence between organizational and social interdependencies happens when there is an opportunity for convergence. This convergence is linked to the method of program delivery, and thus technology could act as an intervention to create such an opportunity. However, programs developed to be completely in-person or online can create alienating effects for some participants, both with and without dementia, which negatively affects social interdependencies. Therefore, programs designed to be delivered in a hybrid fashion with overlapping and complementary content are a feasible way of developing dementia-inclusive programs. Such a blended approach also encourages the building and reinforcing of interdependencies over time. Strengthening interdependencies could help expand a social program's outreach, allowing programs to be more inclusive.

REFERENCES
[1] Elisa Aguirre, Robert T. Woods, Aimee Spector, and Martin Orrell. 2013. Cognitive stimulation for dementia: A systematic review of the evidence of effectiveness from randomised controlled trials. Ageing Research Reviews 12, 1 (2013), 253–262. https://doi.org/10.1016/j.arr.2012.07.001
[2] Keith Baker and Adele Irving. 2015. Co-producing Approaches to the Management of Dementia through Social Prescribing. Social Policy and Administration 50, 3 (16 March 2015), 379–397. https://doi.org/10.1111/spol.12127
[3] Mark S. Baldwin, Sen H. Hirano, Jennifer Mankoff, and Gillian R. Hayes. 2019. Design in the Public Square: Supporting Assistive Technology Design Through Public Mixed-Ability Cooperation. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 155 (Nov. 2019), 22 pages. https://doi.org/10.1145/3359257
[4] Pollie Barden, Rob Comber, David Green, Daniel Jackson, Cassim Ladha, Tom Bartindale, Nick Bryan-Kinns, Tony Stockman, and Patrick Olivier. 2012. Telematic Dinner Party: Designing for Togetherness through Play and Performance. In Proceedings of the Designing Interactive Systems Conference (Newcastle Upon Tyne, United Kingdom) (DIS ’12). Association for Computing Machinery, New York, NY, USA, 38–47. https://doi.org/10.1145/2317956.2317964
[5] Belén Barros Pena, Rachel E Clarke, Lars Erik Holmquist, and John Vines. 2021. Circumspect Users: Older Adults as Critical Adopters and Resistors of Technology. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 84, 14 pages. https://doi.org/10.1145/3411764.3445128
[6] Mark R. Beauchamp, Serena McCutcheon, and Oliver Harper. 2007. Older Adults’ Preferences for Exercising Alone Versus in Groups: Considering Contextual Congruence. Annals of Behavioral Medicine 33, 2 (2007), 200–206.
[7] Cynthia L. Bennett, Erin Brady, and Stacy M. Branham. 2018. Interdependence as a Frame for Assistive Technology Research and Design. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 161–173. https://doi.org/10.1145/3234695.3236348
[8] LouAnne E. Boyd, Kyle Rector, Halley Profita, Abigale J. Stangl, Annuska Zolyomi, Shaun K. Kane, and Gillian R. Hayes. 2017. Understanding the Role Fluidity of Stakeholders During Assistive Technology Research "In the Wild". In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 6147–6158. https://doi.org/10.1145/3025453.3025493
[9] Margot Brereton, Alessandro Soro, Kate Vaisutis, and Paul Roe. 2015. The Messaging Kettle: Prototyping Connection over a Distance between Adult Children and Older Parents. In CHI ’15: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. Seoul, Republic of Korea, 713–716. https://doi.org/10.1145/2702123.2702462
[10] Dawn Brooker. 2005. Dementia Care Mapping: A Review of the Research Literature. The Gerontologist 45 (10 2005), 11–18. https://doi.org/10.1093/geront/45.suppl_1.11
[11] Jessica Budgett, Anna Brown, Stephanie Daley, Thomas E Page, Sube Banerjee, Gill Livingston, and Andrew Sommerlad. 2019. The social functioning in dementia scale (SF-DEM): Exploratory factor analysis and psychometric properties in mild, moderate, and severe dementia. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring 11 (2019), 45–52. https://doi.org/10.1016/j.dadm.2018.11.001
[12] Sarah Bunt. 2016. Critical realism and grounded theory: Analysing the adoption outcomes for disabled children using the retroduction framework. Qualitative Social Work 17 (09 2016). https://doi.org/10.1177/1473325016664572
[13] Francesco Cafaro, Leilah Lyons, Joshua Radinsky, and Jessica Roberts. 2010. RFID Localization for Tangible and Embodied Multi-User Interaction with Museum Exhibits. In Proceedings of the 12th ACM International Conference Adjunct Papers on Ubiquitous Computing - Adjunct (Copenhagen, Denmark) (UbiComp ’10 Adjunct). Association for Computing Machinery, New York, NY, USA, 397–398. https://doi.org/10.1145/1864431.1864455
[14] Kathy Charmaz. 2017. Constructionism and the Grounded Theory Method. In Handbook of Constructionist Research, James A. Holstein and Jaber F. Gubrium (Eds.). The Guilford Press, New York, Chapter 20, 397–412. https://doi.org/10.4135/9781526402196.n2
[15] K. Charmaz. 2006. Constructing Grounded Theory: A Practical Guide through Qualitative Analysis. SAGE Publications. https://books.google.co.uk/books?id=2ThdBAAAQBAJ
[16] Gary Cheung and Kathryn Peri. 2020. Challenges to dementia care during COVID-19: Innovations in remote delivery of group Cognitive Stimulation Therapy. Aging and Mental Health 25, 6 (2020), 1–3. https://doi.org/10.1080/13607863.2020.1789945
[17] Loraine Clarke, Eva Hornecker, and Ian Ruthven. 2021. Fighting Fires and Powering Steam Locomotives: Distribution of Control and Its Role in Social Interaction at Tangible Interactive Museum Exhibits. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 344, 17 pages. https://doi.org/10.1145/3411764.3445534
[18] Elaine Czech, Mina Shibasaki, Keitaro Tsuchiya, Roshan L. Peiris, and Kouta Minamizawa. 2020. Discovering Narratives: Multi-Sensory Approach Towards Designing with People with Dementia. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA ’20). Association for Computing Machinery, New York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3375209
[19] Jiamin Dai and Karyn Moffatt. 2020. Making Space for Social Sharing: Insights from a Community-Based Social Group for People with Dementia. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376133
[20] Jiamin Dai and Karyn Moffatt. 2021. Surfacing the Voices of People with Dementia: Strategies for Effective Inclusion of Proxy Stakeholders in Qualitative Research. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445756
[21] Maitraye Das, Darren Gergle, and Anne Marie Piper. 2019. "It Doesn’t Win You Friends": Understanding Accessibility in Collaborative Writing for People with Vision Impairments. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 191 (Nov. 2019), 26 pages. https://doi.org/10.1145/3359293
[22] Sabeth Diks, Timothy Hendrik Coen Muyrers, Guangyu Chen, Tzu-Jou Huang, Myrte Thoolen, and Rens Brankaert. 2021. CoasterChat: Exploring Digital Communication between People with Early Stage Dementia and Family Members Embedded in a Daily Routine. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411763.3451635
[23] Peggye Dilworth-Anderson and Brent E. Gibson. 2002. The cultural influence of values, norms, meanings, and perceptions in understanding dementia in ethnic minorities. Alzheimer Disease & Associated Disorders 16 (2002). https://doi.org/10.1097/00002093-200200002-00005
[24] Emma Dixon and Amanda Lazar. 2020. Approach Matters: Linking Practitioner Approaches to Technology Design for People with Dementia. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3313831.3376432
[25] Emma Dixon, Anne Marie Piper, and Amanda Lazar. 2021. “Taking Care of Myself as Long as I Can”: How People with Dementia Configure Self-Management Systems. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445225
[26] Chris Drinkwater, Josephine Wildman, and Suzanne Moffatt. 2019. Social prescribing. BMJ 364 (2019). https://doi.org/10.1136/bmj.l1285
[27] James Edmeads and Oussama Metatla. 2019. Designing for Reminiscence with People with Dementia. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI EA ’19). Association for Computing Machinery, New York, NY, USA, 1–6. https://doi.org/10.1145/3290607.3313059
[28] Amber Fletcher. 2016. Applying critical realism in qualitative research: Methodology meets method. International Journal of Social Research Methodology 19 (02 2016), 1–14. https://doi.org/10.1080/13645579.2016.1144401
[29] Sarah Foley, John McCarthy, and Nadia Pantidi. 2019. The Struggle for Recognition in Advanced Dementia: Implications for Experience-Centered Design. ACM Trans. Comput.-Hum. Interact. 26, 6, Article 40 (Nov. 2019), 29 pages. https://doi.org/10.1145/3359594
[30] Sarah Foley, Daniel Welsh, Nadia Pantidi, Kellie Morrissey, Tom Nappey, and John McCarthy. 2019. Printer Pals: Experience-Centered Design to Support Agency for People with Dementia. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300634
[31] Christopher Frauenberger. 2015. Disability and Technology: A Critical Realist Perspective. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Machinery, New York, NY, USA, 89–96. https://doi.org/10.1145/2700648.2809851
[32] Sophie N. Gaber, Louise Nygård, Anna Brorsson, Anders Kottorp, Georgina Charlesworth, Sarah Wallcook, and Camilla Malinowsky. 2020. Social participation in relation to technology use and social deprivation: A mixed methods study among older people with and without dementia. International Journal of Environmental Research and Public Health 17, 11 (2020), 4022. https://doi.org/10.3390/ijerph17114022
[33] Erving Goffman. 1959. The presentation of self in everyday life. Bantam Doubleday Dell Publishing Group.
[34] Jenni Greig, Sabih-Ur Rehman, Anwaar Ul-Haq, Greg Dresser, and Oliver K. Burmeister. 2019. Transforming Ageing in Community: Addressing Global Ageing Vulnerabilities through Smart Communities. In Proceedings of the 9th International Conference on Communities & Technologies - Transforming Communities (Vienna, Austria) (C&T ’19). Association for Computing Machinery, New York, NY, USA, 228–238. https://doi.org/10.1145/3328320.3328380
[35] Connie Guan, Anya Bouzida, Ramzy M. Oncy-avila, Sanika Moharana, and Laurel D. Riek. 2021. Taking an (Embodied) Cue From Community Health: Designing Dementia Caregiver Support Technology to Advance Health Equity. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445559
[36] James Hodge, Madeline Balaam, Sandra Hastings, and Kellie Morrissey. 2018. Exploring the Design of Tailored Virtual Reality Experiences for People with Dementia. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3174088
[37] James Hodge, Kyle Montague, Sandra Hastings, and Kellie Morrissey. 2019. Exploring Media Capture of Meaningful Experiences to Support Families Living with Dementia. Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3290605.3300653
[38] Jim Hollan and Scott Stornetta. 1992. Beyond Being There. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Monterey, California, USA) (CHI ’92). Association for Computing Machinery, New York, NY, USA, 119–125. https://doi.org/10.1145/142750.142769
[39] Leona Holloway, Kim Marriott, Matthew Butler, and Alan Borning. 2019. Making Sense of Art: Access for Gallery Visitors with Vision Impairments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, New York, NY, USA, 12 pages. https://doi.org/10.1145/3290605.3300250
[40] Maarten Houben, Rens Brankaert, Saskia Bakker, Gail Kenning, Inge Bongers, and Berry Eggen. 2020. The Role of Everyday Sounds in Advanced Dementia Care. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376577
[41] Agnes Houston and Julie Christie. 2018. Talking Sense: Living with sensory changes and dementia (1st ed.). HammondCare, Sydney, Australia. 1–63 pages.
[42] J. D. Huntley, R. L. Gould, K. Liu, M. Smith, and R. J. Howard. 2015. Do cognitive interventions improve general cognition in dementia? A meta-analysis and meta-regression. BMJ Open 5, 4 (2015). https://doi.org/10.1136/bmjopen-2014-005247
[43] Innovations in Dementia. 2021. DEEP: https://www.dementiavoices.org.uk/. Last accessed September 2021.
[44] Elizabeth Kelson, Alison Phinney, and Glen Lowry. 2017. Social citizenship, public art and dementia: Walking the urban waterfront with Paul’s Club. Cogent Arts & Humanities 4 (08 2017). https://doi.org/10.1080/23311983.2017.1354527
[45] Debbie Kinsey, Iain Lang, Noreen Orr, Rob Anderson, and Daisy Parker. 2021. The impact of including carers in museum programmes for people with dementia: a realist review. Arts & Health 13, 1 (2021), 1–19. https://doi.org/10.1080/17533015.2019.1700536 PMID: 33538657.
[46] Aniket Kittur, Bryant Lee, and Robert E. Kraut. 2009. Coordination in Collective Intelligence: The Role of Team Structure and Task Interdependence. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 1495–1504. https://doi.org/10.1145/1518701.1518928
[47] Tom Kitwood. 1993. Towards a Theory of Dementia Care: The Interpersonal Process. Ageing and Society 13, 1 (1993), 51–67. https://doi.org/10.1017/S0144686X00000647
[48] Amanda Lazar and Emma E. Dixon. 2019. Safe Enough to Share: Setting the Dementia Agenda Online. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 85 (Nov. 2019), 23 pages. https://doi.org/10.1145/3359187
[49] Amanda Lazar, Caroline Edasis, and Anne Marie Piper. 2017. A Critical Lens on Dementia and Design in HCI. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 2175–2188. https://doi.org/10.1145/3025453.3025522
[50] Minha Lee, Wonyoung Park, Sunok Lee, and Sangsu Lee. 2022. Distracting Moments in Videoconferencing: A Look Back at the Pandemic Period. In CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 141, 21 pages. https://doi.org/10.1145/3491102.3517545
[51] Else Lykkeslet, Eva Gjengedal, Torill Skrondal, and May-britt Storjord. 2014. Sensory stimulation: A way of creating mutual relations in dementia care. International Journal of Qualitative Studies on Health and Well-being 1 (2014), 1–11.
[52] Kevin M. Storer and Stacy M. Branham. 2021. Deinstitutionalizing Independence: Discourses of Disability and Housing in Accessible Computing. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3441852.3471213
[53] Galina Madjaroff and Helena M Mentis. 2017. Narratives of Older Adults with Mild Cognitive Impairment and Their Caregivers. In ASSETS. Baltimore, MD, USA, 140–149.
[54] Malissa Maria Mahmud, Marisha Barth Ubrani, and Wong Shiau Foong. 2020. A Meta-Analysis of Blended Learning Trends. In Proceedings of the 2020 11th International Conference on E-Education, E-Business, E-Management, and E-Learning (Osaka, Japan) (IC4E 2020). Association for Computing Machinery, New York, NY, USA, 30–36. https://doi.org/10.1145/3377571.3379439
[55] Jane McKeown, Amanda Clarke, Christine Ingleton, Tony Ryan, and Julie Repper. 2010. The use of life story work with people with dementia to enhance person-centred care. International Journal of Older People Nursing 5, 2 (2010), 148–158. https://doi.org/10.1111/j.1748-3743.2010.00219.x
[56] Kellie Morrissey, John McCarthy, and Nadia Pantidi. 2017. The Value of Experience-Centred Design Approaches in Dementia Research Contexts. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New

NY, USA, 1–16. https://doi.org/10.1145/3313831.3376361
[75] Kristen Shinohara, Nayeri Jacobo, Wanda Pratt, and Jacob O. Wobbrock. 2020. Design for Social Accessibility Method Cards: Engaging Users and Reflecting on Social Scenarios for Accessible Design. ACM Trans. Access. Comput. 12, 4, Article 17 (Jan. 2020), 33 pages. https://doi.org/10.1145/3369903
York, NY, USA, 1326–1338. https://doi.org/10.1145/3025453.3025527 [76] Frances Sin, Sophie Berger, Ig-Jae Kim, and Dongwook Yoon. 2021. Digital Social
[57] Thomas Morton, Dawn Brooker, G. Wong, Teresa Atkinson, and Shirley Evans. Interaction in Older Adults During the COVID-19 Pandemic. Proc. ACM Hum.-
2021. Keeping community groups and activities going - Sustainable Community Comput. Interact. 5, CSCW2, Article 380 (oct 2021), 20 pages. https://doi.org/10.
Interventions for people afected by Dementia: Recommendations for practice from 1145/3479524
the SCI-Dem Project. University of Worcester. https://eprints.worc.ac.uk/10225/ [77] Steven M. Smith, Arthur Glenberg, and Robert A. Bjork. 1978. Environmental
7/SCI-Dem%20booklet%20PRACTICE%20%28v4.2%29.pdf context and human memory. Memory & Cognition 6, 4 (1978), 342–353. https:
[58] Wendy Moyle, Cindy Jones, Toni Dwan, and Tanya Petrovich. 2018. E ectiveness //doi.org/10.3758/BF03197465
of a Virtual Reality Forest on People With Dementia : A Mixed Methods Pilot [78] Alzheimer’s Society. 2020. The impact of COVID-19 on People Afected by Dementia.
Study. The Gerontologist 58, 3 (2018), 478–487. Technical Report.
[59] Diego Muñoz, Stu Favilla, Sonja Pedell, Andrew Murphy, Jeanie Beh, and Tanya [79] Kate Swafer. 2014. Dementia: Stigma, Language, and Dementia-friendly. De-
Petrovich. 2021. Evaluating an App to Promote a Better Visit Through Shared mentia 13, 6 (2014), 709–716. https://doi.org/10.1177/1471301214548143
Activities for People Living with Dementia and Their Families. Association for [80] Catherine V Talbot and Pamela Briggs. 2021. The use of digital technologies
Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764. by people with mild to moderate dementia during the COVID-19 pandemic: A
3445764 positive technology perspective. https://doi.org/10.31234/osf.io/d4qv8
[60] Elizabeth D. Mynatt, Kayci Vickers, Salimah LaForce, Sarah K. Farmer, Jeremy M. [81] Linda Thomson, Bridget Lockyer, Paul Camic, and Helen Chatterjee. 2017. Efects
Johnson, Matthew Doiron, Aparna Ramesh, Walter Bradley Fain, Tamara Zubatiy, of a museum-based social prescription intervention on quantitative measures
and Amy D. Rodriguez. 2022. Pivoting an MCI Empowerment Program to Online of psychological wellbeing in older adults. Perspectives in Public Health 138 (11
Engagement. Proc. ACM Hum.-Comput. Interact. 6, GROUP, Article 32 (jan 2022), 2017), 175791391773756. https://doi.org/10.1177/1757913917737563
26 pages. https://doi.org/10.1145/3492851 [82] Milka Trajkova, A’aeshah Alhakamy, Francesco Cafaro, Rashmi Mallappa, and
[61] Timothy Neate, Vasiliki Kladouchou, Stephanie Wilson, and Shehzmani Shams. Sreekanth R. Kankara. 2020. Move Your Body: Engaging Museum Visitors with
2022. “Just Not Together”: The Experience of Videoconferencing for People Human-Data Interaction. In Proceedings of the 2020 CHI Conference on Human
with Aphasia during the Covid-19 Pandemic. In Proceedings of the 2022 CHI Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for
Conference on Human Factors in Computing Systems (New Orleans, LA, USA) Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/
(CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 3313831.3376186
606, 16 pages. https://doi.org/10.1145/3491102.3502017 [83] David Unbehaun, Konstantin Aal, Daryoush Daniel Vaziri, Peter David Tolmie,
[62] Mmachi God’sglory Obiorah, Anne Marie Marie Piper, and Michael Horn. 2021. Rainer Wieching, David Randall, and Volker Wulf. 2020. Social Technology
Designing AACs for People with Aphasia Dining in Restaurants. Association for Appropriation in Dementia: Investigating the Role of Caregivers in Engaging
Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764. People with Dementia with a Videogame-Based Training System. In Proceedings
3445280 of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu,
[63] Carolyn Oliver. 2011. Critical Realist Grounded Theory: A New Approach for HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA,
Social Work Research. British Journal of Social Work 42 (03 2011). https://doi. 1–15. https://doi.org/10.1145/3313831.3376648
org/10.1093/bjsw/bcr064 [84] Feliciano Villar, Rodrigo Serrat, and Stephany Bravo-Segal. 2019. Giving them
[64] J. B. Orange, Rosemary B. Lubinski, and D. Jefery Higginbotham. 1996. Con- a voice: Challenges to narrative agency in people with dementia. Geriatrics
versational repair by individuals with dementia of the alzheimer’s type. Journal (Switzerland) 4, 1 (2019). https://doi.org/10.3390/geriatrics4010020
of Speech, Language, and Hearing Research 39, 4 (Aug 1996), 881–895. https: [85] Jayne Wallace, Peter C. Wright, John McCarthy, David Philip Green, James
//doi.org/10.1044/jshr.3904.881 Thomas, and Patrick Olivier. 2013. A Design-Led Inquiry into Personhood in
[65] World Health Organization. 2007. Global age-friendly cities : a guide. Re- Dementia. In CHI ’13 Extended Abstracts on Human Factors in Computing Systems
trieved April 18, 2021 from http://apps.who.int/iris/bitstream/handle/10665/ (Paris, France) (CHI EA ’13). Association for Computing Machinery, New York,
43755/9789241547307_eng.pdf;sequence=1 NY, USA, 2883–2884. https://doi.org/10.1145/2468356.2479560
[66] Andrea Grimes Parker. 2013. Designing for Health Activism. Interactions 20, 2 [86] Wei Wang, Lihuan Guo, Ling He, and Yenchun Jim Wu. 2019. Efects of social-
(March 2013), 22–25. https://doi.org/10.1145/2427076.2427082 interactive engagement on the dropout ratio in online learning: insights from
[67] Shibley Rahman and Kate Swafer. 2018. Assets-based approaches and dementia- MOOC. Behaviour & Information Technology 38, 6 (2019), 621–636. https:
friendly communities. Dementia 17, 2 (2018), 131–137. https://doi.org/10.1177/ //doi.org/10.1080/0144929X.2018.1549595
1471301217751533 [87] Daniel Welsh, Kellie Morrissey, Sarah Foley, Roisin McNaney, Christos Salis,
[68] Jemma L. Regan. 2013. Redefning dementia care barriers for ethnic minorities: John McCarthy, and John Vines. 2018. Ticket to Talk: Supporting Conversation
The religion–culture distinction. Mental Health, Religion & Culture 17, 4 (2013), between Young People and People with Dementia through Digital Media. In
345–353. https://doi.org/10.1080/13674676.2013.805404 Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems
[69] Olivia K. Richards, Gabriela Marcu, and Robin N. Brewer. 2021. Hugs, Bible (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New
Study, and Speakeasies: Designing for Older Adults’ Multimodal Connectedness. York, NY, USA, 1–14. https://doi.org/10.1145/3173574.3173949
In Designing Interactive Systems Conference 2021 (Virtual Event, USA) (DIS ’21). [88] Glen White, Jamie Simpson, Chiaki Gonda, Craig Ravesloot, and Zach Coble.
Association for Computing Machinery, New York, NY, USA, 815–831. https: 2010. Moving From Independence to Interdependence: A Conceptual Model
//doi.org/10.1145/3461778.3462075 for Better Understanding Community Participation of Centers for Independent
[70] Helma Van Rijn, Joost Van Hoof, and Pieter Jan Stappers. 2010. Designing Leisure Living Consumers. Journal of Disability Policy Studies 20 (03 2010), 233–240.
Products for People With Dementia : Developing “ the Chitchatters ” Game. https://doi.org/10.1177/1044207309350561
American Journal of Alzheimer’s Disease & Other Dementias 25, 1 (2010), 74–89. [89] Bob Woods, Elisa Aguirre, Aimee E Spector, and Martin Orrell. 2012. Cog-
[71] E. Sean Rintel. 2010. Conversational Management of Network Trouble Pertur- nitive stimulation to improve cognitive functioning in people with dementia.
bations in Personal Videoconferencing. In Proceedings of the 22nd Conference of Cochrane Database of Systematic Reviews (2012). https://doi.org/10.1002/14651858.
the Computer-Human Interaction Special Interest Group of Australia on Computer- cd005562.pub2
Human Interaction (Brisbane, Australia) (OZCHI ’10). Association for Comput- [90] Özge Nilay Yalçın, Sylvain Moreno, and Steve DiPaola. 2020. Social Prescribing
ing Machinery, New York, NY, USA, 304–311. https://doi.org/10.1145/1952222. Across the Lifespan with Virtual Humans. In Proceedings of the 20th ACM Inter-
1952288 national Conference on Intelligent Virtual Agents (Virtual Event, Scotland, UK)
[72] Karin Ryding. 2020. The Silent Conversation: Designing for Introspection and (IVA ’20). Association for Computing Machinery, New York, NY, USA, Article 56,
Social Play in Art Museums. In Proceedings of the 2020 CHI Conference on Human 3 pages. https://doi.org/10.1145/3383652.3423897
Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for
Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/
3313831.3376357
[73] Corina Sas. 2018. Exploring Self-Defning Memories in Old Age and Their Digital
Cues. In Proceedings of the 2018 Designing Interactive Systems Conference (Hong
Kong, China) (DIS ’18). Association for Computing Machinery, New York, NY,
USA, 149–161. https://doi.org/10.1145/3196709.3196767
[74] Corina Sas, Nigel Davies, Sarah Clinch, Peter Shaw, Mateusz Mikusz, Madeleine
Steeds, and Lukas Nohrer. 2020. Supporting Stimulation Needs in Dementia Care
through Wall-Sized Displays. Association for Computing Machinery, New York,
Performing Qualitative Data Analysis as a Blind Researcher:
Challenges, Workarounds and Design Recommendations
O. Aishwarya
International Institute of Information Technology
Bengaluru, Karnataka, India
O.Aishwarya@iiitb.ac.in

ABSTRACT

Over the last 2 decades, the way in which disabled body minds are regarded in research has evolved considerably. From an understanding that disability research involves research or enquiry on or of disabled people, there is now a shift in thinking that regards disability research as research done by and with disabled people as well, thus challenging traditional researcher-participant relationships. This has led to a larger number of people with disabilities aspiring to enter academia. However, several barriers to the full inclusion of people with disabilities still remain. This paper talks about one such barrier: the lack of accessibility in software packages used for qualitative data analysis. Through the author's experiences in hunting for accessible ways to perform qualitative data analysis, the paper reviews existing software packages, discusses possible workarounds, and considers the feature requirements for an accessible qualitative data analysis tool.

CCS CONCEPTS

• Human-centered computing → Human computer interaction (HCI); Accessibility.

KEYWORDS

Accessibility, Qualitative Data Analysis, Human Centered Design

ACM Reference Format:
O. Aishwarya. 2022. Performing Qualitative Data Analysis as a Blind Researcher: Challenges, Workarounds and Design Recommendations. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3551356

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3551356

1 INTRODUCTION

The WHO estimates suggest that the total global number of people with disabilities has already surpassed one billion. About 15 percent of the world's population lives with some form of disability, of whom 2-4 percent experience significant difficulties in functioning. A majority of these, around 80 percent, live in the global South. Population ageing, the rapid spread of chronic disabilities, and a betterment in the methodologies used to measure disability have all given rise to a considerable increase in disability prevalence around the world [18]. Of the 285 million people in the world who are visually impaired, 90 percent are in developing countries. While 39 million people are blind, 246 million have severe or moderate visual impairments [15].

Over the last 2 decades, the way in which disabled body minds are regarded in research has evolved considerably. From an understanding that disability research involves research or enquiry on or of disabled people, there is now a shift in thinking that regards disability research as research done by and with disabled people as well, thus challenging traditional researcher-participant relationships [13]. This has been a significant paradigm shift. From regarding people with disabilities as passive subjects to be researched on, research methodology has evolved to a point where it recognises people with disabilities as people with agency and choice, capable not only of being research subjects, but also of being the ones to tell their own stories. Consequently, the number of people with disabilities who regard academia as a possible inclusive space, and research as a tool to create genuine positive societal changes, has increased. There has been an increasing interest in participatory and co-design methodologies, first-person accounts of disability, autoethnographies, etc. [1, 3, 8, 9, 11, 19]. Advocates of participatory research emphasise the right of all people to be actively involved as researchers in matters relevant to their own lives and communities [17]. The United Nations Convention on the Rights of Persons with Disabilities (UNCRPD) supports this principle, articulating that people with disabilities have the right to participate in all levels of society [16], which clearly includes disability research.

Several studies in recent years talk about the need for integrating disabled scholars into disability studies research, and give concrete ideas about different ways of doing so. Disabled researchers have pointed out some barriers to their full and complete participation in academia, such as ableism in the form of denial of accommodations, the burden of always having to educate the academic community around them, access work always being considered the responsibility of individual disabled people, infrastructural inaccessibilities, etc. [1, 2, 4–6, 10–12, 14].

This paper focuses on one such barrier holding disabled people, particularly blind and visually impaired people, back from full participation in academia: the inaccessibility of software used for qualitative data analysis. By recounting and analysing my own experiences of trying to find a tool to analyse the qualitative data I collect as an early career research scholar, I highlight the need for greater accessibility in this area, discuss some alternative methods that can be used to perform qualitative data analysis, and provide design recommendations and feature requirements for an accessible qualitative data analysis package.
2 EXPERIMENTING WITH EXISTING SOLUTIONS

Some of the popular existing software packages that enable researchers to collect, clean, and analyse the qualitative data they collect include Atlas.Ti, MaxQDA, Nvivo and RQDA, among others. However, all of them have significant accessibility issues. Some of them have interfaces that are completely silent when accessed with a screen reader, some have no accessible ways of adding or viewing codes, and yet others have completely visual ways of displaying codes and code categories.

Realizing that no ready-made solution exists to fulfill my needs, I turned to word processors and other mainstream software packages which had already proven to be accessible, such as Microsoft Word, Adobe Reader, and Microsoft Excel. Some of these tools were moderately helpful for performing qualitative data analysis, but some were completely unusable.

The first software tool I tried to use for data analysis was Microsoft Word. I saved each interview transcript as a separate Word file, and then tried to use the "Comments" feature of Word to add each code in. However, there were several problems with this method: I had no way of easily moving from one assigned code to another; highlighting text snippets was of no use because highlights are not detected by a screen reader; exporting my codes and snippets to a new document for analysis could be done only manually; and double-checking the codes I had assigned to a particular snippet became difficult, as typical screen reader behaviour while dealing with comments is to only announce "has comment", with an extra keystroke needed to read the comment itself. Thus, I determined that using Microsoft Word as a qualitative data analysis tool was not feasible, due to the nature of both the screen reader and the software tool itself.

Using Adobe Reader was equally unsuccessful. Although the highlights feature in this software program is actually accessible via the screen reader, the behaviour of the "comments" feature with a screen reader was similar in many ways to my experience with Microsoft Word. Once again, I could not navigate easily between the codes I assigned, nor export them to a new document for further analysis.

3 POSSIBLE WORKAROUNDS

In this section, I would like to discuss some possible workarounds that I have used in my process of qualitative data analysis, with specific emphasis on the features that make these solutions ideal for my needs.

Rather than transcribe audio data such as interviews into text and then struggle to find a tool to help me analyse the text data so collected, I now prefer to trim and clean up my audio recordings a little, and analyse them using an open-source audio editing tool called Audacity. My reasoning behind adopting such an unconventional method is that, as a screen reader user, I am always listening to the interview, as even the transcripts are read out to me using the screen reader. Moreover, accessible ways of coding audio qualitative data do exist. I typically listen to the audio twice to familiarise myself with the content. I then go through the audio once more, this time at twice the normal speed, pausing multiple times to select snippets and attach labels to the selected audio. Audacity allows unlimited labels on an audio file. I therefore add one label per code. Once I am done, I can export my labels to a new text file. Using keyboard shortcuts, I can attach and remove labels easily, and also move from one label to the next. I then use the text file generated to create code categories, choose likely quotations, and perform thematic analysis. Once I got past the strangeness of coding without a transcript, this method worked the best for me.

However, not all data collected during a study can be in audio format. I do occasionally come across textual data that needs to be analysed. For this, I use several workarounds.

One such way is to use Microsoft Excel. First, I collect my text data in Microsoft Word. Second, I make sure that each codeable segment, such as a sentence or a paragraph, is enclosed by blank lines. Next, I create a new blank workbook or sheet in Microsoft Excel and create my columns (which are typically ID, Data, and Notes). I then copy my data from Microsoft Word, and paste it into the column named Data. My data is now ready for coding, with one cell per text segment. I use the ID column to assign numbers, starting with 1, to each text segment, which makes sorting and filtering easier later on while performing analysis. I then read through all of my data, cell by cell. To code a particular text segment, I create a new column to the right of my data for each code I assign. Now, as I read, to mark a code as being present in my data, I put an 'X' in its column. As and when new codes occur to me, I create new columns for them to the right of the data column. I can also have primary and secondary themes/codes for my data, indicating them by using X for a primary code and Z for a secondary code. I then use the "sort by" and "filter by" options to view the different segments according to the codes assigned to them. This negates the need to export coded data for further analysis. This method is also not without drawbacks. When I have a large amount of data to code, and consequently a large number of codes as well, it becomes a little difficult to keep track of all these different codes, as each code has a column of its own. In such a case, I end up with so many columns that navigating through the spreadsheet becomes difficult. One way of handling extremely verbose or text-heavy data is to create a new sheet for each interview transcript or field notes segment.

One other method I have used is to code data using Markdown. Markdown is a lightweight markup language that can be used to add formatting elements to plain-text documents. Created by John Gruber in 2004, Markdown is now one of the world's most popular markup languages [7]. As opposed to selecting text and applying formatting through menus and toolbars, Markdown uses text-based ways of indicating formatting. As a text-based system, Markdown offers certain advantages to people who are blind and visually impaired. Since Markdown syntax typically consists of special characters such as *, #, and @ to indicate formatting elements such as bold, italics, underline, headings, etc., formatting elements become uniquely accessible to people who are blind and visually impaired. Using Markdown, screen reader users can go through documents and understand the way they are formatted, without using special keystrokes to check the formatting of any given text. While using word processors, formatting elements such as font, size, bold, italics, etc. are typically not automatically announced by a screen reader, and have to be accessed using a keystroke.
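For instance (the snippet below is an invented illustration, not data from the study), a short Markdown fragment exposes all of its structure as ordinary characters that a screen reader reads out directly, with no extra keystrokes:

```markdown
## Interview 3, field notes

The participant said she *always* listens to a recording twice,
and **relies on keyboard shortcuts** to move between labels.
```

Read linearly, the `##`, `*`, and `**` characters themselves tell a screen reader user that the first line is a second-level heading and which phrases are emphasised.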
It is the ability to add in-line footnotes to Markdown text documents that is of particular significance in the context of analysing qualitative data. Using this feature, I add my codes to my qualitative data as in-line footnotes. Each snippet is enclosed in asterisks to mark it as bold, before attaching a footnote. As I re-read through this document, I can read through the formatting using my screen reader, and understand the codes I have assigned to each segment. This is particularly helpful while dealing with a large amount of data, as it negates the need to cycle through a large number of columns in Excel. However, there is no easy way of exporting text snippets, or the codes attached to them, to a new file. This has to be done manually, as opposed to the sorting and filtering options in Excel that make it easier to process coded data.

Analysing and coding videos and images is even more challenging as a blind researcher, as these data formats are generally rather inaccessible to people who are visually disabled. I generally refrain from using video as a method of documentation. Rather than recording videos when I am in the field, I tend to textually record my observations. I prefer to make transcripts of conversations and record textual and audio field notes.

However, in certain situations, collecting data in the form of videos and images is unavoidable. Projects that involve data collected from social media platforms, for instance, require me to engage with data in the form of screenshots, pictures, and videos. In such situations, I must incorporate a preliminary step into my process of data analysis: a step where I convert my pictures and videos into descriptions of what happens in them. I can then analyse this extra data in the same way that I would analyse any other piece of textual data.

What is evident from this analysis of possible workarounds to performing qualitative data analysis using unconventional methods is that, at this point in time, a one-size-fits-all solution to analysing qualitative data in a screen-reader-accessible way does not exist. Every method I have outlined above is usable only in specific situations or with specific kinds of data. Moreover, a research project typically does involve dealing with multiple types of data, such as field notes, interview recordings, pictures, etc., making it necessary to use more than one method in analysing this data. It is in complexities like these that blind and visually impaired researchers end up spending more than the usual amount of time to produce work that is comparable in every way, including methodological rigour, to that produced by sighted researchers.

4 DESIGN RECOMMENDATIONS FOR AN ACCESSIBLE QUALITATIVE DATA ANALYSIS PACKAGE

It is evident from the methods and problems I have outlined in the previous section that developing a tool that enables blind and visually impaired researchers to accessibly analyse qualitative data is a necessity. In this section I therefore enumerate the feature requirements for an accessible qualitative data analysis tool, drawn from my experiences of using different tools and workarounds to analyse different kinds of data.

Firstly, the tool should have an easy way of assigning a particular code to a certain text snippet. To reduce the number of steps necessary to select text and then go through menus to find the necessary options, there must be a keyboard shortcut for this feature. There must, of course, also be a way of identifying a particular text snippet as coded or not coded.

Second, information such as whether a particular snippet has been coded or not, and what code has been assigned to a particular snippet, should be announced by a screen reader as the user goes through the document, without any additional keystroke. This will go a long way towards accessible ways of re-reading and re-coding documents. These attributes must also be visually indicated, for the use of researchers who have low vision, or sighted collaborators.

Third, the tool must have an easy way to move from one assigned code to another. Screen reader behaviour, such as the ability to navigate by headings using a keystroke, can be taken as an example. Just as a user can move through headings by pressing the 'H' key repeatedly on a web page, the tool should ideally allow for navigation to the next and previous codes in the document using a keystroke. This could possibly be Control+C or something similar.

Next, the tool should also have the ability to deal with and process multiple types of data, such as audio, video, text, and images. This is especially important since research projects typically do require the researcher to collect multiple types of data in the form of field notes, interviews, photographs, etc. Therefore a researcher who is blind and visually impaired must have the ability to input all these different types of data into the analysis tool, and analyse them using this tool as well.

Finally, the tool should also have a way to perform further analysis on this data. This can possibly be facilitated in two ways. The first is to have an "analysis mode" or something similar, which displays quotes one after the other with each assigned text segment displayed below. This mode can also include a feature to collapse multiple codes into one category or theme, which will facilitate easier thematic analysis. The option to create sub-themes or sub-categories would also be welcome. This information can perhaps be displayed using a treeview control, which a screen reader typically understands well. An alternative way of doing this is to provide the ability to export codes and the text snippets assigned to them into a spreadsheet-based data format such as XLS or CSV for further analysis.

5 CONCLUSION

It is evident that new and accessible ways of doing qualitative research and analysing qualitative data are necessary for the blind and visually impaired researcher community. As more and more blind researchers step into academia, it becomes necessary not only to avoid social and societal ableism, but also to create a safe and technologically accessible place within academia.

This is important not just for the sake of accessibility, but also to promote more research about disability by the disabled community themselves. If more and more researchers are to participate in research on an equal footing, and if more research using co-design and participatory design as methodologies must be conducted, it is necessary to strip away the inaccessibilities in the platforms and processes used for conducting research. Creating and finding
accessible ways of performing qualitative research is but one step forward in this direction.

ACKNOWLEDGMENTS
I am extremely grateful to all the disabled researchers who have come before me, whose footsteps into the world of academia have paved the way for mine. I would also like to extend my gratitude to my advisor, Dr. Amit Prakash, whose constant support for my experiments into conducting qualitative data analysis has made this paper possible.
Blind Users Accessing Their Training Images in Teachable Object
Recognizers
Jonggi Hong, Smith-Kettlewell Eye Research Institute, San Francisco, United States (jhong@ski.org)
Jaina Gandhi, University of Maryland, College Park, United States (jaina.gandhi@gmail.com)
Ernest Essuah Mensah, University of Maryland, College Park, United States (eessuahm@umd.edu)
Farnaz Zamiri Zeraati, University of Maryland, College Park, United States (farnaz@umd.edu)
Ebrima Haddy Jarjue, University of Maryland, College Park, United States (ebjarjue@terpmail.umd.edu)
Kyungjun Lee, University of Maryland, College Park, United States (kjlee@cs.umd.edu)
Hernisa Kacorri, University of Maryland, College Park, United States (hernisa@umd.edu)
Figure 1: A blind participant in our study training the MYCam app in their home to recognize Lays with real-time descriptors. A dual video-conferencing setup captures the participant's activities via a laptop camera and smart glasses worn by the participant.
ABSTRACT
Teachable object recognizers provide a solution for a very practical need for blind people – instance-level object recognition. They assume one can visually inspect the photos they provide for training, a critical and inaccessible step for those who are blind. In this work, we engineer data descriptors that address this challenge. They indicate in real time whether the object in the photo is cropped or too small, a hand is included, the photo is blurred, and how much photos vary from each other. Our descriptors are built into an open-source testbed iOS app, called MYCam. In a remote user study in (N = 12) blind participants' homes, we show how descriptors, even when error-prone, support experimentation and have a positive impact on the quality of the training set that can translate to model performance, though this gain is not uniform. Participants found the app simple to use, indicating that they could effectively train it and that the descriptors were useful. However, many found the training tedious, opening discussions around the need for balance between information, time, and cognitive load.

CCS CONCEPTS
• Human-centered computing → Accessibility; Ubiquitous and mobile computing; Empirical studies in HCI.

KEYWORDS
blind, visual impairment, object recognition, machine teaching, participatory machine learning

ACM Reference Format:
Jonggi Hong, Jaina Gandhi, Ernest Essuah Mensah, Farnaz Zamiri Zeraati, Ebrima Haddy Jarjue, Kyungjun Lee, and Hernisa Kacorri. 2022. Blind Users Accessing Their Training Images in Teachable Object Recognizers. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3517428.3544824

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3544824

1 INTRODUCTION
Echoing the end-user programming paradigm, the idea of having end-users consciously provide training examples in AI-infused applications has recently gained traction along with advances in neural networks. By leveraging prior work in transfer, meta, and few-shot learning (e.g., [7, 11, 70, 76, 80]), we are now able to build teachable applications, where end-users can train models of their own. These applications facilitate personalization as they promise a better fit for real-world scenarios by significantly constraining the machine learning task to a specific user and their environment. Thus, it is no surprise to see early mentions of the term "teachable" in accessibility research (e.g. [55]), where data is sparse and there is a high diversity even for a given disability [32, 33]. A more recent example (and the focus of this work) is teachable object recognizers [3, 34, 41, 42, 47, 68], where blind users train their camera-equipped devices such as mobile phones to recognize everyday objects by providing a few photos as training examples.

Why is it important to access one's training examples? In teachable applications such as teachable object recognizers, users are called to interact with the machine learning model and improve its performance by accessing and controlling their examples [83]. Personalization is often the ultimate goal. However, the interactive nature of these applications can also help people uncover basic machine learning concepts and gain familiarity with AI (e.g., [14, 17, 25, 27, 56]). Thus, they can also contribute to the larger goal of "making the process of teaching machines easy, fast and above all, universally accessible" [65]. An underlying assumption for both improving a model and uncovering concepts via experimentation is that users can inspect their data and iterate the training and testing. By doing so, they could build an intuition about what works and what doesn't and perhaps why. However, this assumption does not often hold for assistive teachable applications. Inspecting training examples typically requires similar skills to those the technology aims to fulfill [20, 31]; thus, it is often inaccessible. For example, teachable object recognizers, where users teach the model to recognize objects on their behalf, assume that they can see the training images they are providing, which is almost never the case with blind users. Sure enough, blind participants in prior studies with this technology wanted to know more about their training examples [31], with one of them stating "the most challenging and most fun is training the person".

Existing approaches for real-time 'alt text' for individual images and 'scene description' for a series of images are not suitable for this task; they do not capture fine-grained differences across otherwise similar images (e.g., see Figure 1). In this paper, we explore this challenge of accessing one's training data. Within the context of teachable object recognizers for the blind, we study the potential and limitations of real-time 'data descriptors' that can capture users' training examples with photo- and set-level attributes. Specifically, we investigate whether these descriptors could be derived from visual attributes used to code training photos from sighted (e.g., [26, 27]) and blind (e.g., [34, 41]) people. To this end, we engineer photo-level descriptors that communicate to the user in real time information about the photo they just took, such as blurriness, presence of their hand, object visibility, and framing. We also engineer set-level descriptors that communicate information one would get from glancing over a group of training photos, such as variation in object background, distance, and perspective; all factors that can affect model performance.

Through a remote user study with 12 blind participants (with the setup shown in Figure 1), we demonstrate that our data descriptors support blind users in reducing photos with cropped objects in their training sets and increasing variation. Many participants chose to iterate after inspecting their training sets and reflected by improving many photo attributes, which resulted in models that generalize better to photos from others, even though they reduced variation in background. Aligned with prior studies, we also observe challenges among participants in crafting good testing examples that could further promote experimentation. Still, their models perform better when tested on their own photos compared to both aggregated test sets from all 12 blind participants in our remote study and from 9 blind participants in an in-lab study [41]. Observations from our analysis of the photos and model performance are also confirmed by participants' subjective feedback. Responses support the potential of descriptors, with blind participants indicating that they were able to tell their meaning by looking at relative changes in values and finding them useful. However, errors in descriptors affected the reliability of the app for some. More so, some considered the training tedious, referring both to time and cognitive load (e.g., optimizing for multiple variables). Many made design recommendations that could further improve the effectiveness of the descriptors and the training process.

To the best of our knowledge, this is the first work to propose non-visual access to training data and to provide empirical results with blind participants on automatically estimating and incorporating descriptors for data inspection in teachable computer vision applications. Our analysis focuses on object recognizers, where 'learning to train' is deemed one of the main challenges among blind users [31, 34]. However, we see how the underlying methods for extracting meaningful instance- and set-level descriptors can be adopted for other teachable applications, both in assistive and informal AI learning contexts. Perhaps, they can also serve towards more accessible approaches for explainable AI interfaces, where there is an underlying assumption on people's ability to visually inspect explanations [62, 66, 69].

2 RELATED WORK
There is a rich literature exploring how computer vision can benefit people with disabilities (e.g., [10, 13, 30, 36, 46, 58, 71]). This is especially the case with assistive technologies for the blind, where computer vision is employed on smartphones (e.g., [2, 24, 38, 60, 78, 82]), smart glasses (e.g., [18, 43, 67, 81]), and smart suitcases (e.g., [23, 35]). A common challenge we share with prior work is that aiming the camera and inspecting recognition errors typically requires similar skills to those the technology aims to fulfill (i.e., sight), even though the majority of prior work employs AI-infused systems pre-trained by engineers, not fine-tuned by the end-user. Thus, it is not a surprise to find that recognition errors affect blind users' experience [51]; sometimes, to a degree where they cannot be corrected even by human clarification [61]. In fact, blind users may depend on the recognition especially when it is difficult to verify its predictions. They may overtrust the predictions even when they know they can be error-prone [41, 45], though errors are especially

non-acceptable when they can adversely affect interactions with others [1, 43]. Aligned with prior efforts aiming to support users' recovery from errors [6, 28, 52], we explore how to make training and resolving errors in teachable object recognizers more accessible to blind users. Below, we focus on prior work that closely relates to ours and contrast it to our study.

2.1 Teachable Object Recognizers
Looking at prior work on teachable object recognizers, we see diversity in research aims. Some, similar to this work, focus on the blind community. They explore the potential of teachable object recognizers as an assistive technology for blind users [34, 68], build feedback mechanisms for better camera aiming [3, 41], and collect benchmarking datasets for evaluating approaches in transfer learning and meta learning [72]. Our work is orthogonal and highly complementary to these efforts – our shared goal is to improve blind users' experience with teachable object recognizers.

We also see studies involving sighted people, both adults (e.g., [27]) and children (e.g., [17, 75]). They aim to better understand the potential of teachable machines for enabling non-experts to uncover basic machine learning concepts as well as better understand common AI misconceptions they may have. Insights from these studies are very informative for our efforts in making the 'learning to train' challenge more accessible to blind adults, and perhaps in the future to blind children that may want to participate in similar informal learning activities as in Dwivedi et al. [17].

Table 1 provides a more detailed overview from a sample of these prior studies over the past five years (2017-2021), with the number of participants being typically smaller for in-person studies with blind people and sighted children. As the performance of teachable object recognizers and users' behavior in taking photos can be affected by environmental factors such as background, light condition, and selection of objects, many studies collected inputs from participants' environments to incorporate these factors [3, 27, 34, 72]. We also opted for this approach in our study; the study was conducted in the homes of blind participants while we control for factors such as study procedure and object stimuli.

The majority of prior work on teachable object recognizers facilitates training through photos [3, 14, 27, 34, 41], except for one [72], where blind users are called to use short videos. In our study, we also used photos so that the outcomes of our study could be applicable (and comparable) to the majority of existing approaches. More so, collecting videos may increase the burden on the user, especially when they are given several instructions and tasks to do [72], as in the case of our study. In addition, video-based assistive technology can pose a greater privacy risk for blind users [5], as it is more likely to capture unwanted objects and unnecessary information in a video. Perhaps live photos [53] could be the middle ground between the two. We further reflect on the potential of this approach in the discussion section.

In their early explorations, Kacorri et al. [34] highlighted some of the main challenges that blind users may face when training a teachable object recognizer and testing its performance. They revolve around camera framing (i.e., adjusting the distance between the camera and an object and centering the object), capturing the side of the object with the most distinctive visual features (i.e., product logos), and reviewing the training photos after taking them (i.e., quality and characteristics). Lee et al. [41] and Ahmetovic et al. [3] aimed to resolve the camera framing challenge by developing real-time audio/haptic feedback that helps blind users estimate the proper distance and position of the object in the image frame [3, 41]. However, the challenge of reviewing photos for iteration has not been addressed yet.

Typically, studies included a training and testing step for exploring participants' interactions with the teachable interfaces. Very few of them [17, 27], though, allowed people to reflect and iterate, giving them access to their training data for review. We believe iteration is a critical step for understanding the potential of descriptors for making the review process more accessible. Thus, in our study, we also provide blind participants with an opportunity to reflect and the option to iterate after reviewing their images with the descriptors. After all, our goal is to examine how data descriptors that provide non-visual access to training photos, either individually or as a set, can be helpful during the iterative process of training and testing, as well as how blind users may interact with them.

3 MYCAM: A TESTBED TEACHABLE OBJECT RECOGNIZER WITH DESCRIPTORS
To explore the potential and limitations of real-time descriptors derived from visual attributes for accessing one's training data, we build MYCam. MYCam serves as a testbed for deploying descriptors in a teachable object recognizer. In the background, it sends users' photos to a server, where an image recognition model is being fine-tuned by the user. While privacy is one of the promises of teachable object recognizers [31] (i.e., by processing photos entirely on the user's mobile device), we find that the state-of-the-art on-device training is not there yet. As a screen-reader accessible iOS mobile app, MYCam enables remote studies with blind participants. This was critical for us; due to the pandemic, we had to move our study from the lab to blind participants' homes. By open sourcing both the MYCam app (available at https://iamlabumd.github.io/MYCam-Mobile/) and our proof-of-concept implementation of the descriptors (available at https://iamlabumd.github.io/MYCam-Server/), we are hoping that others can contribute to further advance this work.

3.1 Design Rationales
DR1: Prioritize Blind Users. Both the form factor and interaction modalities of MYCam are informed by prior work with blind users and teachable object recognizers as well as broader real-world object recognition applications. We opted for an iOS app since prior work in the United States, the location of our study participants, suggests that blind smartphone users overwhelmingly favor the iPhone [50], though the actual numbers may be changing these past years [49]. When users open the app, they enter the main screen (Figure 2a), which shows a camera preview. We opted for the default camera app in iOS, maximizing both compatibility with VoiceOver and user familiarity with it. The recognition mode for MYCam was modeled after existing real-world applications, such as Seeing AI [63], where users can immediately ask the app to recognize what is captured by the camera with a double-tap; the Scan button is activated by default. In this case, the app takes a photo, sends it to the personalized object recognition model in the server, and indicates the predicted label

Table 1: Characteristics of related studies on teachable object recognizers juxtaposed with ours.

Studies compared (columns): [34], [68], [41], [27], [3], [75], [17], [72], This study.
Participants: Blind Adults 8 14 9 10 52 12 | Sighted Adults 2 10 2 100 | Sighted Children 6 14
Setting: Real-world • • • • • • | Controlled • • • • • •
Input: Photo • • • • • • • | Video • •
Tasks: Train • • • • • • • • | Test • • • • • • • • • | Iterate • • • •
Access: Framing • • | Review • • • •

Figure 2: The user flow of MYCam. MYCam has three main parts: recognizing an object in the camera view (purple thread), reviewing and editing the information of the objects (red thread), and teaching an object to the model (green thread).

both via speech and visually (Figure 2b). To mitigate potential errors that can't be verified non-visually, the app says "Don't know" when uncertain (the approach for uncertainty is discussed in Implementation).

DR2: Simplify the Machine Teaching Flow. Users can add a new object to the recognition model via the Teach button on the Home screen. The app displays the (rear-facing) camera preview with the shutter button at the bottom center and a thumbnail image of the last photo in the lower-left corner (Figure 2f). Users are asked to take 30 photos with the count indicated in real time via speech and visually (Figure 2g); in Kacorri et al. [34], blind participants indicated that they would like to obtain feedback from the camera on the number of photos taken. The number of training examples (i.e., 30) is also informed by the same study [34], with blind participants spending on average 65 seconds (SD=35.2) to take 30 photos and often providing variation in their training examples. More so, k-shot learning results in the literature are often reported for k = 1, 5, or 20. Thus, 30 examples could allow for bootstrap estimates for future comparisons in this field. As discussed in Related Work, the majority of prior work in teachable object recognizers opts for photos rather than videos – we followed this approach in the hope that photos provide blind users with more control over their training examples in terms of both conscious variation incorporated and privacy concerns mitigated (e.g., presence of their hand, bystanders, or

Figure 3: The architecture of the MYCam system indicating approaches for estimating the descriptors and recognizing the
object.
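Two of the photo-level flags produced by the pipeline in Figure 3 reduce to simple geometry over the YOLOv3 bounding box (Section 3.2.1): an object is "too small" when its box covers less than 1/8 of the image, and "cropped" when the box touches the frame edge. A minimal sketch, not the authors' released code; the (x, y, w, h) pixel box format and the function names are our assumptions:

```python
def is_small(box, img_w, img_h):
    """Small object: bounding-box area under 1/8 (12.5%) of the image area."""
    x, y, w, h = box
    return (w * h) / (img_w * img_h) < 0.125

def is_cropped(box, img_w, img_h):
    """Cropped object: bounding box touches any edge of the image frame."""
    x, y, w, h = box
    return x <= 0 or y <= 0 or x + w >= img_w or y + h >= img_h

# Example: a 100x100 box centered in a 1000x1000 photo is small but not cropped.
box = (450, 450, 100, 100)
print(is_small(box, 1000, 1000), is_cropped(box, 1000, 1000))  # True False
```

In the actual system these checks run on the server, with the box supplied by the YOLOv3 detector.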

surroundings in the camera frame). In the one study where videos were used, blind users had to be explicitly trained to record videos and follow specific filming techniques [72]. Given the emphasis of this study on the descriptors, we decided to simplify the machine teaching and not require any explicit training steps for the users. After the training examples, a dialogue box with a text field shows up, prompting users to enter the name of the object (Figure 2i). More so, in this screen users can opt to add an audio description. (Both object name and description can be edited at a later time, as shown in Figures 2d and 2e.) Once this step is completed, the app notifies the user with a "Training in progress" message (Figure 2j). At this point the Scan and Teach buttons are made inactive. They are activated once training on the server is complete and the user is notified.

DR3: Enable Access to Training Data with Descriptors. Some of the main concerns of blind participants about teachable object recognizers in Kacorri et al. [34] were: "knowing whether the photos were good, knowing the area of a package where the label or distinguishing information resides, ..., and deciding on the distance between the object and camera lens." We observe that this information wanted by the participants can be provided both at a photo level and at a higher level across a set of photos. Thus, we devise two types of descriptors, shown in Table 2. These are all derived from visual attributes used to code training photos from sighted (e.g., [26, 27]) and blind (e.g., [34, 41]) people. Photo-level descriptors are binary; they indicate whether the object is too small or partially included in the frame (cropped), whether the photo is blurred, and if the user's hand is included in the frame. Set-level descriptors are indicated as a percentage. They draw from parallels to how humans recognize objects independent of size, viewpoint, and location [54].

As shown in Figure 2f, users access the photo-level descriptors after every photo that they take so that they can identify problems in the photo (e.g., object being cropped) right away; since this gets repetitive, a photo-level descriptor is communicated only when true. Users can access this detailed information also later, when reviewing their trained objects (Figure 2e). Photo-level descriptors are also provided in aggregate together with the set-level descriptors (e.g., photo blurred in 50% of the training examples for an object). Users can access these aggregates along with the set-level descriptors at the end of a training session (Figure 2h), where they are called to select either OK to proceed or Retrain to retake the photos from scratch. Both photo-level and set-level descriptors can be accessed at a later time when reviewing and editing trained objects, as shown in Figure 2d.

3.2 Implementation
We built the MYCam testbed on an Apple iPhone 8 with the object recognition models and descriptor estimators running on our server on an NVIDIA GeForce GTX 1080 Ti GPU; the two communicate through HTTP. The architecture of the system, indicating how both descriptors and recognition predictions are obtained, is illustrated in Figure 3. The estimation of the descriptors in the current implementation of MYCam is error-prone; our approaches merely serve as a proof of concept. Prior to making these approaches more robust, we wanted to examine whether blind users can leverage such descriptors in the first place for accessing their training data and experimenting with the model.

3.2.1 Descriptors. In all previous studies that informed our descriptors, researchers coded the attributes of photos manually through visual inspection of the photos from participants. Given that this is a time-consuming process, methods like Wizard of Oz are not deemed appropriate in this early exploration of descriptors for facilitating accessible non-visual experimentation. Thus, we opt for methods that attempt to automatically estimate them, even though developing techniques for more accurate estimations is beyond the focus of this paper and is briefly discussed in Section 6. Specifically, we employ state-of-the-art computer vision techniques such as world tracking in ARKit, a YOLOv3 object detection model [57], and hand segmentation models [42] to estimate the descriptors. To speed up the calculations for real-time interactions, the object detection, hand segmentation, and edge detection run on our server.

• Small object: Given a bounding box of an object in an image from YOLOv3 [57], the object is considered too small if the

Table 2: Photo-level and set-level descriptors. The descriptors are informed by prior studies with sighted and blind people who have no machine learning expertise, looking at the way they synthesize their data for training [17, 26, 27, 34, 41].

Photo-level descriptors
• Small object: The bounding box of the object is smaller than 1/8 (12.5%) of the image.
• Cropped object: The object is partially included in the image.
• Blurry photo: The photo is too blurry to recognize textures or texts.
• Hand in photo: A user's hand is visible in the image.

Set-level descriptors
• Variation in size: A set of images shows objects with different sizes.
• Variation in perspective: A set of images shows different sides of objects.
• Variation in background: A set of images shows backgrounds with different textures or items.
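The "Blurry photo" descriptor above is estimated in Section 3.2.1 as the variance of a Laplacian-filtered grayscale image, compared against a threshold of 3.0. A dependency-free sketch of that check, assuming a grayscale image given as a list of pixel rows; the helper names and the 4-neighbour Laplacian kernel are our assumptions, and the paper's implementation runs server-side on full-resolution photos:

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian response over interior pixels."""
    h, w = len(gray), len(gray[0])
    responses = []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            lap = (gray[i-1][j] + gray[i+1][j] + gray[i][j-1] + gray[i][j+1]
                   - 4 * gray[i][j])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def is_blurry(gray, threshold=3.0):
    """Blurry photo: Laplacian variance below the threshold (3.0 in the study)."""
    return laplacian_variance(gray) < threshold

flat = [[128] * 8 for _ in range(8)]      # featureless image: no edge response
sharp = [[255 if (i + j) % 2 else 0 for j in range(8)] for i in range(8)]
print(is_blurry(flat), is_blurry(sharp))  # True False
```

The intuition is that a sharp photo has strong edges, so the filtered image has high variance, while blur suppresses edges and drives the variance toward zero.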

size of its bounding box is smaller than 1/8 (12.5%) of the image.
• Cropped object: If the YOLOv3 [57] bounding box is at the edge of the image, the object is considered cropped.
• Blurry photo: The original RGB image is converted to grayscale (pixel values range: 0-255). We use Laplacian edge detection [37] to produce an image with the edges in the grayscale image. In this last image, we then calculate the variance in pixel values to quantify blurriness. If the variance is lower than a threshold, the photo is considered blurry. In this study, we set the threshold at 3.0; we found it classifies the blurriness most accurately when tested on photos collected in a prior study with blind participants [41].
• Hand in photo: The server detects the pixels from a hand via a hand segmentation model that has been previously tested with blind participants [42]. If the proportion of pixels of a hand(s) in an image is greater than a threshold, it considers the photo to show a hand. The threshold is 0.3%, which detected photos with hands most accurately when tested with the photos collected in a prior study by Lee et al. [41].
• Variation in size: When users take a photo, we detect the position of the smartphone with ARKit. As the size of the object depends on the distance between the phone and the object, we used the standard deviation of the differences between the phone positions (SD_dist) to measure the variation in size indirectly. We set the maximum value of the variation as 0.15 (SD_max), which we could observe with the photos collected by a sighted person in our research team through an internal test. The app presents the variation in size as a percentage (SD_dist/SD_max × 100).
• Variation in perspective: We detect the sides of an object using ARKit. For this, we pre-trained the 3D object detection model in ARKit with the three object stimuli in our study. The model provides an enclosing bounding box of an object with six sides in 3D space when it detects the object regardless of the object shape. The model finds the main side of the bounding box based on the object orientation. We calculate the variation in perspective based on the number of object sides shown in a training set. For consistency with other
• Variation in background: Assuming that the backgrounds captured in photos can vary as a user moves the camera to different places or changes its orientation, we used the location and orientation of the camera to measure the variation in the background. We calculate the standard deviations of differences in both orientation (using 1 - cosine similarity) and the location of the camera in the 3D coordinate system in ARKit. The greater value of the two standard deviations is selected as the variation in background. Like variation in size, we set the maximum value as 0.15 through an internal test. We present the variation in background as a percentage.

3.2.2 Object Recognition Model. The base model for object recognition is Inception V3 pre-trained on ImageNet [15]. When users train the app, it fine-tunes the last layer of the base model using transfer learning with photos taken by the users. The transfer learning works with a gradient descent algorithm with 500 iterations and a 0.01 learning rate. The training takes around 80 seconds with 90 photos of three objects. When users recognize an object with a personalized model, the time from taking a photo to notifying the recognition result is around 100 milliseconds. To make the model distinguish the objects in a user's training set and tell the difference from other objects that it has not been trained on, we employed an approach of quantifying the confidence level of the discriminability based on the entropy of confidence scores [79]. Specifically, when the entropy value is greater than 2.0 or the confidence score is lower than 0.4, the app says "Don't know" in synthesized speech instead of the label predicted by the model. We decided the thresholds of the entropy and confidence score through internal tests such that the app could differentiate the three objects for the user study in Section 4 from other items (e.g., pen, keyboard, mouse, keys) with the thresholds most accurately.

4 USER STUDY
To explore the potential and limitations of descriptors in the context of a teachable object recognizer, we conducted a remote user study with blind participants. The study took place in participants' homes to minimize safety concerns during the COVID-19 pandemic. The study was approved by the Institutional Review Board at the Uni-
descriptors, we present the variation in perspective as a versity of Maryland, College Park (IRB #1255427-1). In designing
percentage with scaling (� ∗ 15% where � is the number of this remote study, we came across many challenges, including how
sides in photos). to provide remote guidance and observe participants’ interactions
with MYCam and their objects. We quickly found that having just
the third-person camera view from the laptop was not enough.
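The blurry-photo descriptor above can be sketched as follows. This is a minimal pure-Python illustration of the variance-of-Laplacian idea, not the paper's server code: the nested-list `gray` image, the 3×3 kernel loop, and the helper names are our assumptions; the 3.0 threshold is the one reported in the study.

```python
# 3x3 Laplacian kernel commonly used for edge detection (assumed; the paper cites [37])
KERNEL = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]

def laplacian_variance(gray):
    """Variance of the Laplacian response of a grayscale image (0-255),
    given as a nested list of pixel rows."""
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):          # skip the one-pixel border
        for x in range(1, w - 1):
            responses.append(sum(KERNEL[dy][dx] * gray[y - 1 + dy][x - 1 + dx]
                                 for dy in range(3) for dx in range(3)))
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def is_blurry(gray, threshold=3.0):
    # Few or weak edges -> low Laplacian variance -> likely blurry.
    return laplacian_variance(gray) < threshold
```

For example, a uniform image has zero Laplacian variance and is flagged as blurry, while a high-contrast checkerboard is not.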
Blind Users Accessing Their Training Images in Teachable Object Recognizers ASSETS ’22, October 23–26, 2022, Athens, Greece

Thus, as shown in Figure 1, we added a first-person view with smart glasses. We iterated via several pilot tests that involved blind and sighted researchers in our team to anticipate the logistics (i.e., study equipment delivery) and communication methods (i.e., laptop and smart glasses) required for this remote study. Lessons learned from accessing blind participants' interactions via smart glasses (with this study serving as part of a larger case study) are discussed in depth in Lee et al. [40].

4.1 Participants
We recruited 12 blind participants (6 women, 6 men, 0 nonbinary) from campus email lists and local organizations. As shown in Table 3, their ages ranged from 32 to 70 (M = 54.3, SD = 15.2). Three participants reported being totally blind, five having some light perception, and four being legally blind. P1 and P2 reported having an "auditory processing disorder" and difficulty in hearing "very high sound", respectively. All participants reported using smartphones several times a day and taking a photo or recording a video at least once a month. As for their familiarity with machine learning, two participants reported being somewhat familiar, eight being slightly familiar, and two being not familiar at all—we used a 4-point scale for this question: (1) not familiar at all (have never heard of machine learning), (2) slightly familiar (have heard of it but don't know what it does), (3) somewhat familiar (have a broad understanding of what it is and what it does), (4) extremely familiar (have extensive knowledge on machine learning). While all participants had experience taking photos before, many indicated that they had challenges related to image framing (9), focusing (2), holding a camera steadily (2), and controlling the lighting (2). Many participants indicated prior experience with other camera-based assistive mobile applications such as Aira [4], Be My Eyes [8], Google Lookout [21], Microsoft Seeing AI [63], Mediate Labs Supersense [48], Super Lidar [29], and Voice Dream Scanner [16].

4.2 Procedure
Participants communicated with the experimenter remotely via dual Zoom video conferencing [84], connected both via a laptop and a pair of Vuzix Blade smart glasses [77] that we delivered prior to their study sessions (see Lee et al. [40]). At the beginning of the study, we briefly explained the concept of a teachable object recognizer. Here, we provided a minimal description of how to take photos to train or test the app to mitigate priming in photo-taking strategies for training and testing an object recognizer. The description given at the beginning of the study reads as follows:

"The idea behind the app is that you can teach it to recognize objects by giving it a few photos of them, their names, and if you wish, audio descriptions. Once you've trained the app and it has them in its memory, you can point it to an object, take a photo, and it will tell you what it is. You can always go back and manage its memory."

Then, participants were asked to perform three tasks: (1) train the app with their own photos and labels of three snacks that served as object stimuli shown in Figure 4, (2) use the app again to recognize those objects later, i.e., to test the performance of the app, and (3) review and edit the information of the already trained objects. For the first task, the order of objects for training was fully counterbalanced between participants. When participants trained the app with the first object, the experimenter provided step-by-step instructions on the MYCam user interface (e.g., the position and functionality of buttons as well as the audio feedback that indicates the steps of training). Then, participants trained the app with the second and third objects and asked the experimenter for help when necessary. When participants were testing the app for the first time, the experimenter also gave detailed instructions on the MYCam interface for testing. After that, participants were free to test their models for as long as they wished (taking any number of photos). When reviewing their trained objects in the third task, participants could access both information related to the descriptors as well as their own object labels and any recorded audio descriptions.
After reviewing a training set with the descriptors, participants decided whether they would collect the photos again or not for that object. We made retraining optional for two reasons: (1) to avoid collecting data from participants who are not motivated to experiment by retraining a model (as this could add a confounding factor in our analysis) and (2) to be able to contrast the attributes of training sets for those who decided to retrain their models and those who did not.
Throughout the study, we encouraged participants to think out loud and to ask questions at any time. After each task, participants were asked to answer questions related to their experience with the descriptors and MYCam and questions capturing usability satisfaction [44]. All questions in this study were either open-ended or on a 5-point Likert scale (i.e., strongly disagree, disagree, neutral, agree, strongly agree).

4.3 Object Stimuli
Accounting for blind people's need for recognizing objects with similar sizes, weights, and textures with fine-grained labels [34, 68], we selected three snacks, shown in Figure 4, with the same size, texture, and nearly identical weights for our user study. As prior work shows that end-users' strategies of collecting training photos are often inconsistent between objects [27], we expect that the choice of three similar objects allows us to observe blind people's teaching strategies in the context of fine-grained object recognition. With these snacks, we simulated a scenario in which a blind user interacts with the app to recognize different objects that the blind user may find difficult to distinguish using only the tactile sensation. It was engineered to be a challenging scenario for machine learning models since these objects were similarly shaped and colored, had reflective surfaces, and were deformable. Unique and personal objects without logos or texts on them (e.g., key, mug cup) can potentially be used with a teachable object recognizer and perhaps could be fit for a more realistic scenario. However, for this study, we included only commercial products to allow for comparison and replicability similar to prior studies regarding teachable object recognizers [34, 41, 68].

5 RESULTS
Participants spent on average 143.8 seconds (SD = 72.4) taking 30 photos of an object. Five out of 12 participants retrained the object recognizer after inspecting their training sets with descriptors.
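As context for the recognition results that follow, the "Don't know" rejection rule from Section 3.2.2 can be sketched as below. This is a minimal illustration rather than the app's code: the score vector, label list, and base-2 entropy are our assumptions (the paper's exact formulation follows [79]); the 2.0 and 0.4 thresholds are the ones reported.

```python
import math

ENTROPY_THRESHOLD = 2.0     # thresholds reported in Section 3.2.2
CONFIDENCE_THRESHOLD = 0.4

def predict_or_reject(scores, labels):
    """Return the top label, or "Don't know" when the confidence scores
    are too ambiguous (high entropy) or the top score is too weak."""
    entropy = -sum(p * math.log2(p) for p in scores if p > 0)
    top = max(scores)
    if entropy > ENTROPY_THRESHOLD or top < CONFIDENCE_THRESHOLD:
        return "Don't know"
    return labels[scores.index(top)]
```

For example, a confident score vector such as [0.9, 0.05, 0.05] yields the top label, while a near-uniform one is rejected because its top score falls below 0.4.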
ASSETS ’22, October 23–26, 2022, Athens, Greece Hong, et al.

Table 3: Participants’ demographics and experience with machine learning, photo taking, and camera-based assistive apps.

ID Age Gender Level of vision Onset Machine learning Photo taking Experience with assistive apps
P1 39 Woman Light perception Birth Not familiar at all Once a day Aira, Be My Eyes, Seeing AI
P2 67 Man Legally blind 55 Slightly familiar Once a month Seeing AI
P3 62 Woman Totally blind Birth Somewhat familiar Several times a month Seeing AI, Be My Eyes
P4 32 Man Legally blind 20 Slightly familiar Several times a day None
P5 66 Man Light perception 46 Slightly familiar Once a week Seeing AI, Supersense, Super Lidar
P6 61 Man Light perception 41 Somewhat familiar Several times a week Seeing AI
P7 70 Man Legally blind Birth Slightly familiar Several times a week None
P8 50 Woman Legally blind 45 Slightly familiar Several times a week Seeing AI
P9 69 Woman Totally blind 55 Not familiar at all Several times a day VD Scanner, Be My Eyes, Seeing AI
P10 66 Woman Light perception Birth Slightly familiar Several times a week None
P11 33 Woman Light perception Birth Slightly familiar Once a month Seeing AI, VD Scanner
P12 36 Man Totally blind Birth Slightly familiar Several times a day Seeing AI, VD Scanner, Lookout

Figure 4: Object stimuli in the study chosen for a challenging fine-grained classification task: Fritos, Cheetos, and Lays.

Examples of the photo-collection attempts and their annotated attributes (i.e., ground-truth attributes annotated by a researcher through visual inspection) are shown in Figure 5. Through the analysis of the participants' photos and the performance of the personalized object recognition models, we show how descriptors may relate to the participants' strategies for collecting training photos when they decided to retrain their models. We also show the impact of these changes in training photos on the performance of the models. We observe promising trends in the characteristics of photos (i.e., adding more variations and reducing problematic photos) over time and iterations. Participants' subjective feedback also indicates that our descriptors can be a promising approach for providing access to one's training data in this context.

5.1 Correlation Between Estimated Descriptors and Annotated Attributes
We report the performance of our approach in estimating descriptors as it is a critical context for interpreting the remainder of the results. More so, it can provide a glimpse at future efforts for estimating such descriptors in a real-world context. Here we measure performance by computing the correlation between the estimated descriptors and annotated attributes. Given that prior work indicates high inter-rater agreement for the annotation of these attributes [27], we had a single researcher in our team performing this task. To quantify the variation of background and perspective, the researcher grouped the photos within a set based on their similarity in terms of background and object side. We used the groups to calculate the Shannon-Wiener Diversity Index [64], a measure of variation in background and perspective. The researcher also coded the photos with a cropped object, participants' hands, and blurriness. For the attributes related to the size of the object (i.e., variation in size, small object), the researcher annotated the bounding boxes of the objects. The variation in size was considered as the standard deviation of the proportions that the bounding boxes occupy in photos. The proportions range from 0.0 (i.e., the object is not captured) to 1.0 (i.e., a bounding box covers the entire photo). A photo with a small object is defined as one having a bounding box smaller than 12.5% of the photo.
As shown in Figure 6, the correlation coefficients between estimated descriptors and annotated attributes ranged from 0.23 to 0.57, highlighting that this is a challenging task. The correlation for "small object" is not shown since only three of all photos had small objects, which were not detected by our descriptor estimator. Even though we employed naive approaches for estimating the descriptors as a proof of concept, all pairs had positive correlations. This indicates that even with partial access there can be an opportunity for reflection and experimentation, i.e., if participants considered relative changes rather than absolute values. Below, we see some empirical evidence in support of this premise.

5.2 Changes in Annotated Descriptors for Participants Who Chose to Retrain
Five participants (P1, P3, P5, P8, P10) decided to retrain with a new set of photos for an object after reviewing their initial training sets; one of them (P3) trained the same object three times, each time with a new set of photos. A participant (P10) retrained with new sets of photos for two of the three objects. No participant retrained all three objects.
As shown in Figure 7, we contrast the estimated descriptors for initial attempts to those during retraining attempts. When the participants decided to retrain, their new training sets had fewer photos with cropped objects, no hands included, almost no blurred photos, and higher variation in perspective and size on average compared to their initial photos. This is a promising trend providing some evidence of participants' attempts to respond and adhere to the descriptors, though it may have come at the cost of lower variation for background.
Specifically, the average numbers of photos with cropped objects and users' hands were fewer at 15.83 (SD = 13.41) and 0.00 (SD = 0.00) in their new training photos versus the initial at 19.50 (SD = 12.52) and 0.33 (SD = 0.81), respectively. The number of blurry photos was 0.00 (SD = 0.40) and 0.17 (SD = 0.41) in retrained and initial sets, respectively.

Figure 5: Photos from P10 and manually annotated attributes to be compared with automatically estimated descriptors.

Figure 6: Scatter plots indicating correlations between manual annotations (x-axis) and estimations (y-axis) for each descriptor.
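The Shannon-Wiener Diversity Index used in Section 5.1 to quantify variation in background and perspective can be computed from the annotator's similarity groups roughly as follows. This is a minimal sketch under our assumptions: `group_counts` is a hypothetical list of photo counts per similarity group, and we use the natural-log form of the index.

```python
import math

def shannon_wiener(group_counts):
    """H = -sum(p_i * ln p_i), where p_i is the share of photos in group i.
    H is 0 when all photos fall into one group and grows with diversity."""
    total = sum(group_counts)
    return -sum((c / total) * math.log(c / total)
                for c in group_counts if c > 0)
```

For example, a training set whose 30 photos all share one background scores 0.0, while 30 photos spread evenly over three backgrounds score ln 3 ≈ 1.10.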

The system did not detect any photos with a too-small object in either set. As for variation, the mean variation in perspective and size in retrained sets was 0.37 (SD = 0.33) and 0.12 (SD = 0.09), respectively, which is higher compared to the initial sets at 0.20 (SD = 0.32) and 0.11 (SD = 0.07). However, this trend was reversed for variation in background. This descriptor was on average lower in retrained sets at 0.19 (SD = 0.28) compared to the initial at 0.26 (SD = 0.28).

(a) The aggregate of photo-level descriptors. (b) Set-level descriptors.
Figure 7: Contrasting descriptor values in initial attempts to retraining attempts for P1, P3, P5, P8, and P10. Red dots indicate means.

(a) Cropped object. (b) Blurry photo, hand in photo, too small object.
Figure 8: The average values of annotated photo-level attributes for individual photos among 12 participants. The charts include photos of the first three training sets (1-30: first set, 31-60: second set, 61-90: third set). The lines are fitted to dots using LOWESS smoothing.

5.3 Changes in Annotated Attributes for All Participants Over Time
Many participants chose not to retrain. Perhaps the interactive nature of the descriptors created opportunities for early reflection and experimentation, not just at the end of training. To explore this, we measured trends over time at different levels of granularity; for this analysis, we use the manually annotated attributes, which serve as the ground truth, rather than the estimated descriptors.

5.3.1 Fine-grained Changes Across 90 Training Photos. With photo-level descriptors, participants could gauge potential image quality issues right away; MYCam indicates them immediately after a photo is taken. As shown in Figure 8a, we observe a dropping trend in the number of images where the object was cropped as participants progressed in the study. This is promising for a descriptor that merely provides binary feedback (i.e., whether the object is cropped or not) instead of directional guidance on how to move a camera to fully capture an object (e.g., Lee et al. [41]). The proportion of photos with cropped objects was around 0.56 at the beginning (1st photo in 1st training), decreasing to 0.37 by the last photo (30th photo in 3rd training). In contrast, the proportions of training examples with participants' hands included, objects too small, or blurry photos were nearly zero throughout the study (Figure 8b).

5.3.2 Coarse-grained Changes Across 3 Training Sets. With photo-level descriptors and set-level aggregates, participants could gauge potential issues related to their teaching strategies or image quality at the end of each training attempt; MYCam shows them immediately after 30 photos are taken. Participants may or may not choose to go back and retrain. But they may also choose to reflect when training the next object, especially since our object stimuli were engineered to be very similar. As shown in Figure 9, participants increased the variation among their training examples and reduced the number of photos with cropped objects. A one-way repeated-measures ANOVA indicated a significant effect of the order of sets on variation in background (F(2, 22) = 4.59, p = 0.022, partial η² = 0.18) and in perspective (F(2, 22) = 3.61, p = 0.044, partial η² = 0.05). We did not observe a statistically significant effect on the other attributes. However, we do observe a tendency for an increase in the number of photos that were blurry or where the participant's hand was included. Perhaps these descriptors were not deemed as that problematic or they were ranked lower in priority as teaching strategies evolved. Participants' feedback below can shed a bit more light on these observations.

5.4 Performance of Participants' Object Recognition Models
After finalizing their training for all objects, participants were called to test the performance of their models; we explicitly did not allow for intermediate train-test iterations, in an attempt to limit interference from that type of experimentation in the observed behaviors.

For the purpose of our analysis, we report model performance not only on participants' final training sets but also dive deeper and look at the photos they chose to test their models and how well their models generalize, e.g., if tested with photos taken by others.

5.4.1 Model Performance with Testing Images from Self. We found that participants used a very small number of photos (M = 3.7, SD = 3.2) to check if their models were working properly. Some (4 out of 12) included photos where the object was more than half cropped. Others (4 out of 12) captured multiple objects in the frame. Some of these observations could perhaps be explained by our study setup (e.g., participants were done with taking photos for the day or objects were in close proximity due to the study setup). However, prior work in teachable object recognizers employing different study designs also indicates that model testing and evaluation can be challenging for end users [17, 27]. These challenges are critical as perceived and actual performance may be different when the models are actually used after testing.

(a) Set-level descriptors. (b) Cropped object. (c) Blurry photo, hand in photo, small object.
Figure 9: The average annotated values of set-level attributes and the annotated number of photos with photo-level attributes for all 12 participants across three training sets (a training set per object).

(a) Object recognition accuracy on one's own testing images. (b) Accuracy against participant satisfaction with performance.
Figure 10: When testing their models, participants' experiences varied (a), which seems to be reflected in their satisfaction scores (b).

Overall, we find that when testing on one's own data the average accuracy (i.e., the number of correct predictions divided by total test images) of the models was 0.65 (SD = 0.24), with a breakdown across participants shown in Figure 10a. These results may seem surprisingly low for a 3-way classification task. However, beyond being a fine-grained classification, the task can be particularly challenging with objects of deformable shapes, same size, reflective surfaces, and similar colors that can be hard to distinguish. Among the high-performing models are those of P1 and P8, who chose to iterate on their training (they tested the models with 3 and 7 photos, respectively); though the same is not to be said for the models of P3, P5, and P10, who also iterated on their training (they tested the models with 12, 28, and 10 photos, respectively). When juxtaposing model performance with participants' subjective responses on the satisfaction with their models (Figure 10b), we find that those whose models did not perform well disagree with this statement and those whose models perform better agree. This alignment, however, did not hold for those on the edges (strongly disagree and strongly agree). Participants' feedback in the next section provides a potential explanation.

5.4.2 Model Performance with Testing Images from Others. One of the promises of a teachable object recognizer is that it works well for each individual since the training and test sets are collected by the same person and it is highly likely that they are going to exhibit similar patterns [34, 68]. This was also the case in our study. As shown in Figure 11, for 9 out of 12 participants, the accuracy of the model was higher when tested with an individual participant's test set than an aggregated test set from all participants in our study and photos from another study with blind participants [41] on the same objects.

The accuracy of the model with individual test sets was 0.65 (SD = 0.24). The accuracy was lower at 0.51 (SD = 0.14) and 0.52 (SD = 0.09) when pooling test sets across all participants in the current study and testing photos from a prior study [41], where nine blind participants trained and tested a teachable object recognizer, respectively. However, we observed that iteration can make the models generalize better. Among the five participants who did retraining, four and three participants had higher accuracy after retraining when their models were tested with the aggregated test set and the set from the prior study [41], respectively.

(a) Accuracy per participant. (b) Summary statistics of accuracy.
Figure 11: Model accuracy when tested on individual test images, aggregated test images from all 12 blind participants in this remote study, and aggregated test images from all 9 blind participants in a prior in-lab study [41].

5.5 Subjective Feedback from Participants
5.5.1 Overall Experience. To provide more context on participants' feedback for the descriptors, we illustrate in Figure 12 their responses related to the MYCam testbed. Overall, participants agreed that they could train the object recognition model effectively with MYCam and disagreed on training being difficult, though they were divided on whether it could be done quickly. This is promising. Specifically, ten participants agreed or strongly agreed that they could train their models effectively, with some pointing to the need for onboarding. P1 and P10, for example, who are not familiar at all and slightly familiar with machine learning, said "after a while, I learned that I could train it" and "It's pretty easy. You have to teach me though. But if you teach me then it's pretty easy to follow instructions and finish the process.", respectively. On the other hand, P11 and P12 were neutral. P11 mentioned that taking 30 photos is time-consuming, saying "I don't really feel like I was all that effective because it takes a while to train for each one." The errors in descriptors affected the reliability of the app, making a participant think the training process was less effective even though the two models work independently of each other. P12 said "I don't think that the app is correct, especially when I know, for example, that my hand was not in the photo...I don't have a lot of confidence in the app's accuracy."
When asked whether they could train the app quickly, five participants agreed, four disagreed, and three were neutral. Seven participants indicated that taking 30 photos is tedious. For example, P10 said, "The process is pretty straightforward. But I have to spend, like, quite long time to train the three objects." When asked about the difficulty of the training task, all but one participant, who remained neutral, disagreed or strongly disagreed that the task was difficult. P11, who was neutral, found it not difficult but tedious. Surprisingly, this sentiment of the task being tedious was not present in the initial study with teachable object recognizers [34] even though the number of training photos was identical. We suspect this difference reflects more on our implementation of the descriptors in the MYCam testbed rather than the process of training itself. In MYCam, users could not opt in or out of the photo-level descriptors during training, leading to higher training times; specifically, the time for taking photos for training an object was doubled from 65 seconds (SD = 35.2) reported in that first study [34] to 143.8 seconds on average (SD = 72.4) in our study.

5.5.2 Descriptors. As shown in Figure 13, all but one participant (P1, who was neutral) agreed or strongly agreed that the descriptors were easy to understand. P6 said "I understood what it was telling me. I didn't have questions about what I was supposed to do." Participants' responses indicate user reflection based on descriptor changes across multiple attempts, strengthening some of our observations in the previous sections. P2 elaborated "It gives you directions. The explanation (descriptors) afterwards, in the analysis, told me that my photographs were not always good. So I have to learn to take better photographs." Some participants found it difficult to understand the absolute values provided in the descriptors and were wondering whether they should have a specific value as a goal. For example, the values of descriptors were somewhat ambiguous to P1, who said, "I guess just knowing exactly what they're referring to what numbers are really preferable." P4 also mentioned the challenge in understanding the values of descriptors, but then mentioned that over repeated data collection during the study, he figured out their purpose. P4 said "I wasn't aware of any of those fields when we did the first object [...] For the second and third objects, I could take a little bit more variation in the photos or to better train the application." This is interesting feedback as the descriptors are there merely to provide access to what one could infer via a visual inspection, not per se to dictate optimal characteristics for the training set. The difference, of course, is that when a sighted person glances over their training photos they may or may not make an inference on potentially problematic photos or lack of variation (see Hong et al. [27]), but a blind person always hears the descriptors. This explicit presence of the descriptors calls for the need for more context.

Figure 12: Participants’ feedback on training with the MYCam testbed.

Figure 13: Participants’ feedback on the descriptors.

While "ideal values" are use-case dependent, during onboarding users could perhaps be provided with some rationale or examples.
Ten participants agreed or strongly agreed that the descriptors were useful. P10 (who was neutral on this) and P11 thought descriptors helped them understand how to collect training examples for the object recognizer. P10 said, "(I agree) because I know the quality of the photos, the different aspects of the photos that I take." P11 said, "It helped me understand what the camera needed in order to recognize the objects." Participants also mentioned that descriptors were useful to identify problems in their training sets. P10 elaborated "you have to get feedback or you're not going to improve [...] it helps you to understand what you're doing wrong." P2 had a similar idea: "the explanation (descriptors) afterward, in the analysis, told me that my photographs were not always good, so I have to learn to take better photographs." P12 thought they were not useful because they were error-prone. P12 said, "I don't think that the app is correct, especially when I know, for example, that my hand was not in the photo, or that the object is not cropped because the previous objects were cropped." This feedback highlights the need for improving the estimation of descriptors in future work.
Participant feedback suggests that it would have been helpful to include more explicit guidance on how to improve the training photos. For example, P7 suggested feedback similar to Lee et al. [41] and Ahmetovic et al. [3] alongside the descriptors, elaborating "Cropped, it did not help me know what to do differently. If it said, maybe move up, move down and move camera left, move the camera, right. That would have been more useful." Our current implementation of this photo-level descriptor actually can be re-purposed to provide such feedback. More so, P6 mentioned that an interface for replacing problematic photos in a training set would improve the app. He said "I would assume the training process can self-evaluate itself and it should sum that up for me and tell me what photos I should replace. [...] you need to replace those bad pictures unless you don't need them for the training." This is an intriguing approach, one that we aim to explore.

5.5.3 Model Performance. When we asked participants if they were satisfied with the performance of the object recognizer, opinions were divided; five participants agreed or strongly agreed, six participants disagreed or strongly disagreed, and one participant was neutral, as shown in Figure 14. When accounting for the performance of their model in their subjective responses (Figure 10b), we observe that participants were not satisfied if the accuracy was lower than 0.6. However, it did not all come down to model performance. Open-ended feedback indicates that satisfaction is also related to effort. P11 remained neutral even though she did not observe any recognition errors (her model had the highest accuracy) but attributed this to the training task being tedious (see Section 5.5.1). P11 said, "Because it took so much work to get that small amount of performance."
P7 and P10 agreed that they were satisfied with the performance though their model accuracy was only 0.6 and 0.4, respectively. P7, who is legally blind, expressed that the performance is good enough to supplement his vision. P10 believed that she just did not train the app properly. She said, "I think it recognized objects, but if you don't train it properly, then it's not going to recognize anything [...] the Fritos bag was the one that didn't work out, but that was probably my fault."
While the majority (9 out of 12) of participants observed recognition errors during testing, many could not explain why. Six participants were neutral or disagreed with the statement that they had a good sense of why the recognition errors occurred. Their responses were simply "I have no idea." or "I don't know." Though P7 and P10 strongly agreed and agreed, respectively, their rationale was vague. P10 said "I think it was my fault. I think it was my training. Other than that, I don't know." P9 strongly agreed and attributed the recognition errors to imperfect descriptors in training, elaborating
ASSETS ’22, October 23–26, 2022, Athens, Greece Hong, et al.

Figure 14: Participants’ feedback on the performance of their object recognition models.
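The photo-level descriptors referenced throughout this section (blur, hand presence, cropping) can be approximated with simple image statistics. As a minimal illustrative sketch in Python, and not the authors' MYCam implementation, a blur descriptor can be computed as the variance of a discrete Laplacian; the threshold below is an assumed value that would need calibration against annotated photos:

```python
import numpy as np

def blur_score(gray: np.ndarray) -> float:
    """Variance of the 4-neighbor discrete Laplacian over interior pixels.

    Sharp photos have strong edges and a high score; blurry photos
    have weak edges and a low score.
    """
    g = gray.astype(float)
    lap = (g[1:-1, 2:] + g[1:-1, :-2] + g[2:, 1:-1] + g[:-2, 1:-1]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_blurry(gray: np.ndarray, threshold: float = 100.0) -> bool:
    # The threshold is illustrative only; in practice it would be
    # tuned per camera and dataset.
    return blur_score(gray) < threshold
```

A set-level variation descriptor could then be derived by comparing such per-photo statistics across the training set; again, this is a sketch of the general idea rather than the descriptor estimation evaluated in the paper.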

“The reason is because I was teaching it, and I wasn’t 100% sure that it was 100% accurate. It makes sense that while I was teaching it, I was a little bit off, so its recognition was a little bit off. It kept telling me that the hand was in the photos.” We believe that these observations motivate the need for accessible computer vision explanations.

6 DISCUSSIONS

Our user study, exploratory in nature, shows both promising results and future research directions for supporting blind users’ interactions with teachable machines. In this section, we first reflect on lessons learned, while discussing implications for designing descriptors to access one’s training data in teachable object recognizers and broader teachable applications, either assistive or educational. We then discuss limitations in our study that may affect the generalizability of our findings, as well as future work for better estimating such descriptors and exploring their potential for explainability.

6.1 Implications

Our study provides evidence that descriptors derived from visual attributes used to code training photos in teachable object recognizers can provide blind users with a means to inspect their data, iterate, and improve their training examples. Challenges often involve onboarding, time needed for training, as well as descriptor accuracy and interpretation.

Insights from this work are complementary to prior studies exploring the feasibility of training [34, 47, 68] and camera aiming [3, 42, 72] in teachable object recognizers for the blind. More so, the underlying methods for extracting meaningful descriptors, i.e., instance- and set-level characteristics that can be coded by quickly inspecting the training data and that point to noise and variation, respectively, can be adopted for other teachable applications. This is especially critical for those assistive applications where training typically requires skills similar to those the technology aims to fulfill. For example, teachable sound detectors for Deaf/deaf and hard of hearing people [9, 20] could benefit from visualizations of the sound examples in a way that allows users to quickly inspect potential noise in a training example and variation across the training set (e.g., better start and end of a recording, multiple sound sources, variation, and other characteristics that hearing users could leverage for experimentation just by listening to the audio). Indeed, Goodman et al. [20] observed that Deaf/deaf users collected similar-sounding examples during training and thus could benefit from an interface that visualizes features of their training sets; data descriptors could fill that need.

Accessing one’s training data is also critical for making informal learning activities that typically employ teachable machines with children more inclusive. Learning objectives for AI education in K12 (e.g., [73]) highlight the use of interactive systems for exposing children to AI prior to using those that leverage block-based programming [74]. Dwivedi et al. [17] suggest that future teachable interfaces for such activities benefit from classification tasks that allow children to quickly inspect the data and uncover patterns. Thus, it is not a surprise that many learning activities for exposing children to AI leverage teachable image classification applications [14, 17, 22, 39, 75]. However, in these initial explorations, none of these applications are inclusive of blind children. Our data descriptors could help increase their accessibility, e.g., by leveraging our shared code for MYCam and the descriptors. Further, we see how researchers working on teachable object recognizers, and in broader contexts, could benefit from the following insights:

Balancing with demand on time and cognitive load. MYCam’s set-level descriptors are given at the end of training, but image-level descriptors are played every time the user takes a training photo (they can also be accessed when reviewing at a later time). Participants’ feedback indicates that, although the image-level descriptors are informative, they add to the training time and cognitive load. Indeed, comparing our study with the times reported in Kacorri et al. [34], training with descriptors (4.8 seconds per photo on average) took more than double the time without them (2.2 seconds on average). Still, this was much less when compared to another study, where blind users took photos with real-time camera aiming guidance (i.e., audio and haptic feedback for camera aiming); there, they spent on average 10.3 seconds per photo [41] but did not reflect much on the time needed to train. The difference between the two is that MYCam feedback when taking photos is passive and requires listening to a list of descriptors while simultaneously optimizing multiple variables, whereas the feedback in Lee et al. [41] is interactive and requires listening to an audio cue or sensing vibration and optimizing a single variable (including the object in the
Blind Users Accessing Their Training Images in Teachable Object Recognizers ASSETS ’22, October 23–26, 2022, Athens, Greece

frame). This challenge of maximizing information while minimizing cognitive load is not new and calls for better interactivity with the descriptors via opt-in/out mechanisms (e.g., play descriptors via press and hold), verbosity controls, or audio-haptic feedback. For example, P10 expressed that while descriptors provide hints to problems, they do not directly instruct users on how to solve them. For example, when an object is cropped in a photo, participants did not get feedback on which direction the camera should move, even though this information can be made available from the current implementation. We expect that combining the descriptors with camera guidance (e.g., [3, 41]) could be helpful.

Balancing descriptors with instructions. There is a rich literature on the value of tutorials, instructions, and in-context interactive assistance for supporting users with technology; a comprehensive review for blind users and smartphone devices can be found in Rodrigues et al. [59]. Some prior studies with blind users have shown that real-time descriptions can lead to better accuracy and confidence compared to instructions at the start of a task (e.g., [19]). While we did not compare the two, participants’ feedback indicates that data descriptors would be complementary to, and not a substitute for, tutorials and instructions. In addition to the real-time feedback, participants call for support in navigating the app and interpreting descriptor values. In our study, the experimenter provided some of this information. For example, for the set-level descriptors the experimenter said: “you can check how much variation your photos have. For example, a 10% variation in the background means that most of your photos have similar backgrounds.” However, participants mentioned that they could better understand the absolute values of descriptors after experimenting. We suspect that the level of understanding of these values would affect both the quality of the training sets and how reliable the system is perceived to be by the users.

Editing a training set based on descriptors. The current design of MYCam focuses on informing users of the attributes of their training sets rather than instructing them how to spot potential issues in their training sets or making the data collection process efficient. Participants had to diagnose problems for themselves based on descriptors and replace the entire training set with new photos if they wanted to fix something. Participants suggested adding functions to edit (e.g., delete) at a photo level, e.g., right after taking a photo that is deemed noisy or while reviewing the training set at the end, to make the iterations more efficient. For example, P8 suggested having an interface that filters out bad images based on descriptors or enables users to replace them instead of starting from scratch. This opens up interesting avenues for approaches such as active learning and data valuation.

6.2 Limitations and Future Work

There are many limitations that could impact the generalizability of our findings. Our observations come from a small sample, even though 𝑛 = 12 is most common in human-computer interaction studies [12]. Our study is remote, yet participants were recruited from a relatively small area in the US in proximity to the authors’ institution. The study was conducted in participants’ homes, yet it shares more characteristics with a controlled in-lab study than with a real-world deployment: object stimuli were predefined and small in number, the duration of the study was relatively short, MYCam was deployed on one of our devices, participants were being observed and had real-time support from the experimenter, and they were somewhat confined in terms of space.

Specifically, participants were asked to wear smart glasses and communicate with the experimenter through a laptop computer in front of them. Though these devices were necessary for communication and data analysis, they limit participants’ behavior, e.g., in walking around with the phone and taking photos in different locations and illuminations. For example, when participants wanted to vary the backgrounds in photos, they took pictures with different parts of a table. However, if they could move around outside the user study setup, perhaps they would choose completely different locations for background variation. More so, using MYCam on one of our iPhone 8 devices instead of their own mobile devices could have affected our observations. All but one participant owned an iPhone; most participants were familiar with iOS apps. However, the difference between their personal phones and our device (e.g., in terms of size and camera location) could have affected the quality of photos and the overall perception of the descriptors. We expect that the use of MYCam in a real-world scenario would have resulted in a richer set of contexts in users’ photos (typically a table in our study). Though we limited the objects to three snacks with similar textures and weights, blind people may choose to train on personal objects that may not be products in the market, with a larger number of object instances. As the performance of an object recognizer depends on the number of classes and the visual difference between the objects, these differences could have affected the performance of a personalized model and blind users’ experiences with it.

Our experimental setup was in part restricted by our implementation of the descriptors, which is meant to serve as a proof of concept and is somewhat tied to a predefined set of object stimuli (e.g., for the ARKit to work and for establishing different thresholds). Although the estimated descriptors had a positive correlation with the manually annotated attributes and enabled participants to inspect their training sets, they were error-prone. When some of the participants noticed the errors in descriptors, they deemed them, as well as the object recognition model, unreliable. This suggests that it is imperative to further advance approaches for descriptor estimation for a better user experience.

Due to the lack of datasets for benchmarking our descriptor-based approach, we had to manually create our own dataset for comparison. As in the evaluation of other AI-based systems, having benchmark datasets is useful to assess systems for generating descriptors in a more widely accepted way. One potential step in this direction would be to invite blind data contributors, who can inspect their personal training data and agree to data sharing, to contribute to such benchmark datasets, employing approaches similar to Theodorou et al. [72].

Last, in this study, images were used for the purpose of training. This approach can provide more control for the blind users over

their training sets regarding both incorporating variations and mitigating privacy risk concerns [5], as it would be less likely for a blind user to capture unnecessary information in an image. For example, blind users usually use their hands as a reference to center the object in the camera frame, but they are often not willing to include their hands in the final photo to preserve user anonymity [41]. Also, in the one study where videos were used, blind users had to be trained to follow some instructions and filming techniques [72]. On the other hand, video increases the number of collected images since it is a collection of frames. Also, the use of video increases the chance of the object being in the frame at some point [72]. Perhaps a way to get the best of the two worlds could be live photos, as they are easy to capture (like photos) and they include multiple frames over one to three seconds [53].

7 CONCLUSION

In this work, we examined the challenge of accessing one’s training examples in teachable object recognizers, where visual inspection of training photos is not accessible to blind users, with the ultimate goal of making machine teaching more inclusive. To this end, we engineered real-time descriptors that indicate to the blind user whether the photo they just took is blurry, if their hand is in it, if the object is cropped, and whether their photos overall vary in object background, distance, and perspective; all factors that can affect model performance. We built MYCam, an accessible and open-source teachable object recognizer iOS app with descriptors. We shared our findings, observations, and lessons learned from a remote study with 12 blind participants who trained MYCam in their homes to recognize three distinct but visually similar objects. Our results showed that participants who chose to iterate on their training for an object were able to provide fewer photos where the object was cropped, included no hand in their photos, and had slightly less blurry photos that overall had more variation in terms of object perspective and size but less in terms of background. Overall, participants increased the variation among their training examples and reduced the number of photos with cropped objects as they moved in training from one object to the next. Some of these changes are reflected in their model performance, which somewhat relates to their satisfaction scores. However, errors in descriptor estimates seem to affect participants’ overall perception and trust of model performance. Participants’ responses indicate that even though it was difficult to gauge the meaning of absolute values for some of the descriptors (e.g., variation), they could infer it based on relative changes. However, many found the training tedious, opening discussions around the need for balance between information, time, and cognitive load. These results, taken together, indicate that our novel data descriptors, realized in MYCam, hold potential for facilitating quick inspection of training photos among blind individuals. Going forward, we are excited to continue our endeavors towards building more inclusive participatory machine learning experiences for both blind youth and adults.

ACKNOWLEDGMENTS

We thank the anonymous reviewers for their thoughtful comments on an earlier draft of this paper. This work is supported by NSF (#1816380). Kyungjun Lee is supported by NIDILRR (#90REGE0008).

REFERENCES

[1] Ali Abdolrahmani, William Easley, Michele Williams, Stacy Branham, and Amy Hurst. 2017. Embracing Errors: Examining How Context of Use Impacts Blind Individuals’ Acceptance of Navigation Aid Errors. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 4158–4169. https://doi.org/10.1145/3025453.3025528
[2] Dragan Ahmetovic, Cristian Bernareggi, Andrea Gerino, and Sergio Mascetti. 2014. ZebraRecognizer: Efficient and Precise Localization of Pedestrian Crossings. In 2014 22nd International Conference on Pattern Recognition. 2566–2571. https://doi.org/10.1109/ICPR.2014.443
[3] Dragan Ahmetovic, Daisuke Sato, Uran Oh, Tatsuya Ishihara, Kris Kitani, and Chieko Asakawa. 2020. ReCog: Supporting Blind People in Recognizing Personal Objects. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376143
[4] Aira. 2017. Your Life, Your Schedule, Right Now. https://aira.io
[5] Taslima Akter, Bryan Dosono, Tousif Ahmed, Apu Kapadia, and Bryan Semaan. 2020. “I am uncomfortable sharing what I can’t see”: Privacy Concerns of the Visually Impaired with Camera Based Assistive Applications. In 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, 1929–1948. https://www.usenix.org/conference/usenixsecurity20/presentation/akter
[6] Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N. Bennett, Kori Inkpen, et al. 2019. Guidelines for Human-AI Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, Article 3, 13 pages. https://doi.org/10.1145/3290605.3300233
[7] Matthias Bauer, Mateo Rojas-Carulla, Jakub Bartłomiej Świątkowski, Bernhard Schölkopf, and Richard E. Turner. 2017. Discriminative k-shot learning using probabilistic models. arXiv preprint arXiv:1706.00326 (2017). https://doi.org/10.48550/ARXIV.1706.00326
[8] BeMyEyes. 2016. Lend your eyes to the blind. http://www.bemyeyes.org/
[9] Danielle Bragg, Nicholas Huynh, and Richard E. Ladner. 2016. A Personalizable Mobile Sound Detector App Design for Deaf and Hard-of-Hearing Users. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (Reno, Nevada, USA) (ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 3–13. https://doi.org/10.1145/2982142.2982171
[10] Danielle Bragg, Oscar Koller, Mary Bellard, Larwan Berke, Patrick Boudreault, Annelies Braffort, Naomi Caselli, Matt Huenerfauth, Hernisa Kacorri, Tessa Verhoef, Christian Vogler, and Meredith Ringel Morris. 2019. Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 16–31. https://doi.org/10.1145/3308561.3353774
[11] John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, and Richard Turner. 2020. TaskNorm: Rethinking Batch Normalization for Meta-Learning. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 119), Hal Daumé III and Aarti Singh (Eds.). PMLR, 1153–1164. https://proceedings.mlr.press/v119/bronskill20a.html
[12] Kelly Caine. 2016. Local Standards for Sample Size at CHI. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). Association for Computing Machinery, New York, NY, USA, 981–992. https://doi.org/10.1145/2858036.2858498
[13] Kathleen Campbell, Kimberly L. H. Carpenter, Jordan Hashemi, Steven Espinosa, Samuel Marsan, Jana Schaich Borg, Zhuoqing Chang, Qiang Qiu, Saritha Vermeer, Elizabeth Adler, Mariano Tepper, Helen L. Egger, Jeffery P. Baker, Guillermo Sapiro, and Geraldine Dawson. 2019. Computer vision analysis captures atypical attention in toddlers with autism. Autism 23, 3 (2019), 619–628. https://doi.org/10.1177/1362361318766247 PMID: 29595333.
[14] Michelle Carney, Barron Webster, Irene Alvarado, Kyle Phillips, Noura Howell, Jordan Griffith, Jonas Jongejan, Amit Pitaru, and Alexander Chen. 2020. Teachable Machine: Approachable Web-Based Tool for Exploring Machine Learning Classification. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA ’20). Association for Computing Machinery, New York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3382839
[15] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. 248–255. https://doi.org/10.1109/CVPR.2009.5206848
[16] Voice Dream. 2022. Scanner – Voice Dream. https://www.voicedream.com/scanner/
[17] Utkarsh Dwivedi, Jaina Gandhi, Raj Parikh, Merijke Coenraad, Elizabeth Bonsignore, and Hernisa Kacorri. 2021. Exploring Machine Teaching with Children. In 2021 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). 1–11. https://doi.org/10.1109/VL/HCC51201.2021.9576171

[18] Alexander Fiannaca, Ilias Apostolopoulous, and Eelke Folmer. 2014. Headlock: A Wearable Navigation Aid That Helps Blind Cane Users Traverse Large Open Spaces. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (Rochester, New York, USA) (ASSETS ’14). Association for Computing Machinery, New York, NY, USA, 19–26. https://doi.org/10.1145/2661334.2661453
[19] Nicholas A. Giudice, Benjamin A. Guenther, Toni M. Kaplan, Shane M. Anderson, Robert J. Knuesel, and Joseph F. Cioffi. 2020. Use of an Indoor Navigation System by Sighted and Blind Travelers: Performance Similarities across Visual Status and Age. ACM Trans. Access. Comput. 13, 3, Article 11 (Aug 2020), 27 pages. https://doi.org/10.1145/3407191
[20] Steven M. Goodman, Ping Liu, Dhruv Jain, Emma J. McDonnell, Jon E. Froehlich, and Leah Findlater. 2021. Toward User-Driven Sound Recognizer Personalization with People Who Are d/Deaf or Hard of Hearing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 5, 2, Article 63 (Jun 2021), 23 pages. https://doi.org/10.1145/3463501
[21] Google. 2022. Lookout - Assisted vision - Apps on Google Play. https://play.google.com/store/apps/details?id=com.google.android.apps.accessibility.reveal&hl=en_US&gl=US
[22] Google. 2022. Teachable Machine. https://teachablemachine.withgoogle.com/.
[23] João Guerreiro, Daisuke Sato, Saki Asakawa, Huixu Dong, Kris M. Kitani, and Chieko Asakawa. 2019. CaBot: Designing and Evaluating an Autonomous Navigation Robot for Blind People. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 68–82. https://doi.org/10.1145/3308561.3353771
[24] Anhong Guo, Xiang ’Anthony’ Chen, Haoran Qi, Samuel White, Suman Ghosh, Chieko Asakawa, and Jeffrey P. Bigham. 2016. VizLens: A Robust and Interactive Screen Reader for Interfaces in the Real World. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16). Association for Computing Machinery, New York, NY, USA, 651–664. https://doi.org/10.1145/2984511.2984518
[25] Tom Hitron, Yoav Orlev, Iddo Wald, Ariel Shamir, Hadas Erel, and Oren Zuckerman. 2019. Can Children Understand Machine Learning Concepts? The Effect of Uncovering Black Boxes. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3290605.3300645
[26] Jonggi Hong, Kyungjun Lee, June Xu, and Hernisa Kacorri. 2019. Exploring Machine Teaching for Object Recognition with the Crowd. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI EA ’19). Association for Computing Machinery, New York, NY, USA, 1–6. https://doi.org/10.1145/3290607.3312873
[27] Jonggi Hong, Kyungjun Lee, June Xu, and Hernisa Kacorri. 2020. Crowdsourcing the Perception of Machine Teaching. Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376428
[28] Kristina Höök. 2000. Steps to take before intelligent user interfaces become real. Interacting with Computers 12, 4 (2000), 409–426. https://doi.org/10.1016/S0953-5438(99)00006-5
[29] Virtual Collaboration Research Inc. 2022. Super Lidar - Lidar for Blind. https://apps.apple.com/us/app/super-lidar-lidar-for-blind/id1543706309
[30] Hairong Jiang, Ting Zhang, Juan P. Wachs, and Bradley S. Duerstock. 2016. Enhanced control of a wheelchair-mounted robotic manipulator using 3-D vision and multimodal interaction. Computer Vision and Image Understanding 149 (2016), 21–31. https://doi.org/10.1016/j.cviu.2016.03.015
[31] Hernisa Kacorri. 2017. Teachable Machines for Accessibility. SIGACCESS Access. Comput. 119 (Nov 2017), 10–18. https://doi.org/10.1145/3167902.3167904
[32] Hernisa Kacorri, Utkarsh Dwivedi, Sravya Amancherla, Mayanka Jha, and Riya Chanduka. 2020. IncluSet: A Data Surfacing Repository for Accessibility Datasets. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 72, 4 pages. https://doi.org/10.1145/3373625.3418026
[33] Hernisa Kacorri, Utkarsh Dwivedi, and Rie Kamikubo. 2020. Data Sharing in Wellness, Accessibility, and Aging. NeurIPS 2020 Workshop on Dataset Curation and Security (2020).
[34] Hernisa Kacorri, Kris M. Kitani, Jeffrey P. Bigham, and Chieko Asakawa. 2017. People with Visual Impairment Training Personal Object Recognizers: Feasibility and Challenges. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 5839–5849. https://doi.org/10.1145/3025453.3025899
[35] Seita Kayukawa, Keita Higuchi, João Guerreiro, Shigeo Morishima, Yoichi Sato, Kris Kitani, and Chieko Asakawa. 2019. BBeep: A Sonic Collision Avoidance System for Blind Travellers and Nearby Pedestrians. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300282
[36] Taha Khan, Dag Nyholm, Jerker Westin, and Mark Dougherty. 2014. A computer vision framework for finger-tapping evaluation in Parkinson’s disease. Artificial Intelligence in Medicine 60, 1 (2014), 27–40. https://doi.org/10.1016/j.artmed.2013.11.004
[37] Ron Kimmel and Alfred M. Bruckstein. 2003. Regularized Laplacian zero crossings as optimal edge integrators. International Journal of Computer Vision 53, 3 (2003), 225–243. https://doi.org/10.1023/A:1023030907417
[38] Masaki Kuribayashi, Seita Kayukawa, Hironobu Takagi, Chieko Asakawa, and Shigeo Morishima. 2021. LineChaser: A Smartphone-Based Navigation System for Blind People to Stand in Lines. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 33, 13 pages. https://doi.org/10.1145/3411764.3445451
[39] Google Creative Lab. 2017. Teachable Machine. https://teachablemachine.withgoogle.com/v1/.
[40] Kyungjun Lee, Jonggi Hong, Ebrima Jarjue, Ernest Essuah Mensah, and Hernisa Kacorri. 2022. From the Lab to People’s Home: Lessons from Accessing Blind Participants’ Interactions via Smart Glasses in Remote Studies. In Proceedings of the 19th International Web for All Conference (Lyon, France) (W4A ’22). Association for Computing Machinery, New York, NY, USA, Article 24, 11 pages. https://doi.org/10.1145/3493612.3520448
[41] Kyungjun Lee, Jonggi Hong, Simone Pimento, Ebrima Jarjue, and Hernisa Kacorri. 2019. Revisiting Blind Photography in the Context of Teachable Object Recognizers. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 83–95. https://doi.org/10.1145/3308561.3353799
[42] Kyungjun Lee and Hernisa Kacorri. 2019. Hands Holding Clues for Object Recognition in Teachable Machines. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300566
[43] Kyungjun Lee, Daisuke Sato, Saki Asakawa, Hernisa Kacorri, and Chieko Asakawa. 2020. Pedestrian Detection with Wearable Cameras for the Blind: A Two-Way Perspective. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376398
[44] James R. Lewis. 1995. IBM computer usability satisfaction questionnaires: psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7, 1 (1995), 57–78. https://doi.org/10.1080/10447319509526110
[45] Haley MacLeod, Cynthia L. Bennett, Meredith Ringel Morris, and Edward Cutrell. 2017. Understanding Blind People’s Experiences with Computer-Generated Captions of Social Media Images. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 5988–5999. https://doi.org/10.1145/3025453.3025814
[46] Cristina Manresa-Yee, Javier Varona, Francisco J. Perales, and Iosune Salinas. 2014. Design recommendations for camera-based head-controlled interfaces that replace the mouse for motion-impaired users. Universal Access in the Information Society 13, 4 (2014), 471–482. https://doi.org/10.1007/s10209-013-0326-z
[47] Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. https://doi.org/10.25383/CITY.14294597
[48] Mediate. 2022. Supersense - AI for Blind / Scan text, money and objects. https://www.supersense.app/
[49] Carrie Morales. 2019. What’s Better for the Blind and Low Vision? Android or iPhone? https://liveaccessible.com/2019/03/03/whats-better-for-the-blind-and-low-vision-android-or-iphone/
[50] John Morris and James Mueller. 2014. Blind and deaf consumer preferences for android and iOS smartphones. In Inclusive Designing. Springer, 69–79. https://doi.org/10.1007/978-3-319-05095-9_7
[51] Meredith Ringel Morris. 2020. AI and Accessibility. Commun. ACM 63, 6 (2020), 35–37. https://doi.org/10.1145/3356727
[52] Donald A. Norman. 1994. How might people interact with agents. Commun. ACM 37, 7 (1994), 68–71. https://doi.org/10.1145/176789.176796
[53] Lauren Olson, Chandra Kambhamettu, and Kathleen McCoy. 2021. Towards Using Live Photos to Mitigate Image Quality Issues In VQA Photography. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3441852.3476541
[54] Thomas J. Palmeri and Isabel Gauthier. 2004. Visual object understanding. Nature Reviews Neuroscience 5, 4 (2004), 291–303. https://doi.org/10.1038/nrn1364
[55] Rupal Patel and Deb Roy. 1998. Teachable interfaces for individuals with dysarthric speech and severe physical disabilities. In Proceedings of the AAAI Workshop on Integrating Artificial Intelligence and Assistive Technology. Citeseer, 40–47.
[56] Rubens Lacerda Queiroz, Fábio Ferrentini Sampaio, Cabral Lima, and Priscila Machado Vieira Lima. 2020. AI from concrete to abstract: demystifying artificial intelligence to the general public. https://doi.org/10.1007/s00146-021-01151-x arXiv:2006.04013 [cs.CY]
ASSETS ’22, October 23–26, 2022, Athens, Greece Hong, et al.

[57] Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An Incremental Improvement. [72] Lida Theodorou, Daniela Massiceti, Luisa Zintgraf, Simone Stumpf, Cecily Morri-
https://doi.org/10.48550/ARXIV.1804.02767 son, Edward Cutrell, Matthew Tobias Harris, and Katja Hofmann. 2021. Disability-
[58] Alejandro Reyes-Amaro, Yanet Fadraga-González, Oscar Luis Vera-Pérez, Eliz- First Dataset Creation: Lessons from Constructing a Dataset for Teachable Object
abeth Domínguez-Campillo, Jenny Nodarse-Ravelo, Alejandro Mesejo-Chiong, Recognition with Blind and Low Vision Data Collectors. In The 23rd International
Biel Moyà-Alcover, and Antoni Jaume-i Capó. 2012. Rehabilitation of patients ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA)
with motor disabilities using computer vision based techniques. Journal of acces- (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article
sibility and design for all 2, 1 (2012), 62–70. https://doi.org/10.17411/jacces.v2i1.87 27, 12 pages. https://doi.org/10.1145/3441852.3471225
[59] André Rodrigues, André Santos, Kyle Montague, Hugo Nicolau, and Tiago Guer- [73] David Touretzky. 2019. The AI4K12 Initiative: Developing National Guidelines
reiro. 2019. Understanding the Authoring and Playthrough of Nonvisual Smart- for Teaching AI In K-12. https://github.com/touretzkyds/ai4k12/blob/master/
phone Tutorials. In Human-Computer Interaction – INTERACT 2019, David Lamas, documents/CSTA_2019_How_To_Teach_AI_Across_K-12.pdf.
Fernando Loizides, Lennart Nacke, Helen Petrie, Marco Winckler, and Panayiotis [74] David Touretzky. 2020. The AI4K12 Initiative: Developing National Guidelines
Zaphiris (Eds.). Springer International Publishing, Cham, 42–62. for Teaching AI In K-12. https://raw.githubusercontent.com/touretzkyds/ai4k12/
[60] Manaswi Saha, Alexander J. Fiannaca, Melanie Kneisel, Edward Cutrell, and master/documents/GlobalSWEdu2020_Touretzky.pdf.
Meredith Ringel Morris. 2019. Closing the Gap: Designing for the Last-Few- [75] Henriikka Vartiainen, Matti Tedre, and Teemu Valtonen. 2020. Learning machine
Meters Wayfnding Problem for People with Visual Impairments. In The 21st learning with very young children: Who is teaching whom? International Journal
International ACM SIGACCESS Conference on Computers and Accessibility (Pitts- of Child-Computer Interaction 25 (Sept. 2020), 1–11. https://linkinghub.elsevier.
burgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, com/retrieve/pii/S2212868920300155
NY, USA, 222–235. https://doi.org/10.1145/3308561.3353776 [76] Oriol Vinyals, Charles Blundell, Timothy Lillicrap, koray kavukcuoglu, and Daan
[61] Elliot Salisbury, Ece Kamar, and Meredith Ringel Morris. 2017. Toward Scalable Wierstra. 2016. Matching Networks for One Shot Learning. In Advances in
Social Alt Text: Conversational Crowdsourcing as a Tool for Refning Vision-to- Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg,
Language Technology for the Blind. In HCOMP. I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 3630–3638. http://papers.
[62] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. 2017. Explainable nips.cc/paper/6385-matching-networks-for-one-shot-learning.pdf
Artifcial Intelligence: Understanding, Visualizing and Interpreting Deep Learning [77] Vuzix. 2021. Vuzix Blade Smart Glasses. https://www.vuzix.com/products/blade-
Models. CoRR abs/1708.08296 (2017). arXiv:1708.08296 http://arxiv.org/abs/1708. smart-glasses-upgraded
08296 [78] Yutaro Yamanaka, Seita Kayukawa, Hironobu Takagi, Yuichi Nagaoka, Yoshimune
[63] SeeingAI. 2017. An app for visually impaired people that narrates the world around Hiratsuka, and Satoshi Kurihara. 2022. One-Shot Wayfnding Method for Blind
you. https://www.microsoft.com/en-us/seeing-ai People via OCR and Arrow Analysis with a 360-Degree Smartphone Camera. In
[64] Claude Elwood Shannon. 2001. A mathematical theory of communication. ACM Mobile and Ubiquitous Systems: Computing, Networking and Services, Takahiro
SIGMOBILE mobile computing and communications review 5, 1 (2001), 3–55. Hara and Hirozumi Yamaguchi (Eds.). Springer International Publishing, Cham,
[65] Patrice Y. Simard, Saleema Amershi, David M. Chickering, Alicia Edelman Pelton, 150–168. https://doi.org/10.1007/978-3-030-94822-1_9
Soroush Ghorashi, Christopher Meek, Gonzalo Ramos, Jina Suh, Johan Verwey, [79] Guangxiao Zhang, Zhuolin Jiang, and Larry S Davis. 2012. Online semi-supervised
Mo Wang, and John Wernsing. 2017. Machine Teaching: A New Paradigm for discriminative dictionary learning for sparse representation. In Asian conference
Building Machine Learning Systems. https://doi.org/10.48550/ARXIV.1707.06742 on computer vision. Springer, 259–273. https://doi.org/10.1007/978-3-642-37331-
[66] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep Inside 2_20
Convolutional Networks: Visualising Image Classifcation Models and Saliency [80] Gongjie Zhang, Zhipeng Luo, Kaiwen Cui, and Shijian Lu. 2021. Meta-detr:
Maps. https://doi.org/10.48550/ARXIV.1312.6034 Few-shot object detection via unifed image-level meta-learning. arXiv preprint
[67] Hojun Son, Divya Krishnagiri, V. Swetha Jeganathan, and James Weiland. 2020. arXiv:2103.11731 (2021).
Crosswalk Guidance System for the Blind. In 2020 42nd Annual International [81] Yuhang Zhao, Elizabeth Kupferstein, Brenda Veronica Castro, Steven Feiner, and
Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). 3327– Shiri Azenkot. 2019. Designing AR Visualizations to Facilitate Stair Navigation
3330. https://doi.org/10.1109/EMBC44109.2020.9176623 for People with Low Vision. In Proceedings of the 32nd Annual ACM Symposium
[68] Joan Sosa-García and Francesca Odone. 2017. “Hands On” Visual Recognition on User Interface Software and Technology (New Orleans, LA, USA) (UIST ’19).
for Visually Impaired Users. ACM Trans. Access. Comput. 10, 3, Article 8 (Aug. Association for Computing Machinery, New York, NY, USA, 387–402. https:
2017), 30 pages. https://doi.org/10.1145/3060056 //doi.org/10.1145/3332165.3347906
[69] Pierre Stock and Moustapha Cisse. 2018. ConvNets and ImageNet Beyond Accu- [82] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2018. A Face
racy: Understanding Mistakes and Uncovering Biases. In The European Conference Recognition Application for People with Visual Impairments: Understanding Use
on Computer Vision (ECCV). Beyond the Lab. In Proceedings of the 2018 CHI Conference on Human Factors in
[70] Qianru Sun, Yaoyao Liu, Tat-Seng Chua, and Bernt Schiele. 2019. Meta-transfer Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing
learning for few-shot learning. In Proceedings of the IEEE Conference on Computer Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3173574.3173789
Vision and Pattern Recognition. 403–412. [83] Xiaojin Zhu, Adish Singla, Sandra Zilles, and Anna N. Raferty. 2018. An Overview
[71] Ruxandra Tapu, Bogdan Mocanu, and Titus Zaharia. 2019. DEEP-HEAR: A of Machine Teaching. CoRR abs/1801.05927 (2018). arXiv:1801.05927 http:
Multimodal Subtitle Positioning System Dedicated to Deaf and Hearing-Impaired //arxiv.org/abs/1801.05927
People. IEEE Access 7 (2019), 88150–88162. https://doi.org/10.1109/ACCESS.2019. [84] Zoom. 2022. Video Conferencing, Cloud Phones, Webinars, Chat, Virtual Events.
2925806 https://zoom.us/
Challenging and Improving Current Evaluation Methods for
Colour Identification Aids
Connor Geddes
geddesc@uoguelph.ca
School of Computer Science, University of Guelph
Guelph, Ontario, Canada

David R. Flatla
dflatla@uoguelph.ca
School of Computer Science, University of Guelph
Guelph, Ontario, Canada

ABSTRACT

Identification of and discrimination between colours is an important task in everyday life, but for the 5% of the population who have Colour Vision Deficiency (CVD), correctly identifying or discriminating between colours can be difficult or impossible. Colour Identification (or CVD) Aids have been developed to assist people with CVD; however, the methods used to evaluate them are often limited, and many use CVD simulations instead of participants with CVD. To address this, we propose two new CVD Aid evaluation tasks and show that they can assist in providing a more thorough evaluation of potential CVD aids. In addition, we evaluated the effectiveness of CVD simulations used by non-CVD people in providing results similar to those for people with CVD, and found that both the results and participant behaviour often differed. Our results indicate that greater care is needed when evaluating CVD Aids.

CCS CONCEPTS

• Human-centered computing → Accessibility.

KEYWORDS

Colour Vision Deficiency, Colour Vision Deficiency Simulation

ACM Reference Format:
Connor Geddes and David R. Flatla. 2022. Challenging and Improving Current Evaluation Methods for Colour Identification Aids. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3517428.3544818

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10…$15.00
https://doi.org/10.1145/3517428.3544818

1 INTRODUCTION

Colour Vision Deficiency (CVD) affects an individual's ability to accurately differentiate and name colours. Failure to correctly determine colours can result in frustrating or dangerous consequences. For example, taking the wrong route on a train/subway by misinterpreting a colour encoding, or missing the dangerous, naturally colour-coded medical symptom of blood in the stool, can both result from an inability to correctly distinguish colours.

Colour Identification (or CVD) Aids have been developed that look to assist people with CVD; however, determining the potential effectiveness of these aids is commonly performed via diagnostic tests like the Ishihara plate test [29] or the FM-100 hue test [15]. Other tasks, such as judging natural photos with the aid applied to them, the Colour Matching Task [16], and the Odd One Out Colour Task [17], have been developed to verify colour identification aids. However, these methods of evaluation all fall short in that they only cover one type (Comparative Judgement) of the four types of colour judgement that individuals with CVD struggle with, as defined by Cole [9]: Comparative, Connotative, Denotative, and Aesthetic.

Three recently developed colour identification tasks were designed to address more of Cole's types of colour judgement: the Selection Task, Transition Task, and Sorting Task [20]. However, these new tasks do not evaluate connotative judgement of colour (matching colour to a meaning/category). To address this, we developed a new connotative colour identification task called the Category Task. In addition, we also created another task (called the Distinction Task), which looks to identify the number of distinct colours an aid allows an individual with CVD to correctly identify.

In addition, several previous evaluations of CVD Aids have used CVD simulations with non-CVD participants (or with the authors or readers serving as the only evaluators) as stand-ins for individuals with CVD [1, 3, 7, 8, 13, 25–27, 30, 41, 43, 53, 54], a technique that has come to be referred to as 'indirect' evaluation [57]. Furthermore, most research that looks to develop colour identification aids uses CVD simulations as the core component of its evaluation and rarely includes people with CVD in the evaluation at all, and then often only tangentially (e.g., asking for preferences on which CVD Aid filter looks the best, but not determining effectiveness in realistic or abstract tasks). Of the hundreds of publications in this research area, very few studies [16, 17, 20, 24, 48] have gone through the effort of developing tasks for a comprehensive user study that fully evaluates colour identification aids while simultaneously involving those with CVD as crucial contributors towards determining the effectiveness of the aid.

To determine how effective 'indirect' evaluations are at mimicking genuine issues for those with CVD, we evaluated the use of CVD simulations with non-CVD participants on the three previously developed tasks (Selection, Transition, and Sorting), as well as our two new tasks (Distinction and Category). As it is often unclear how much information non-CVD participants are provided regarding the vision of individuals with CVD during 'indirect' evaluations, we also provided three different levels of simulation information: No Information, Minimum Information (where they were informed that they are under a CVD simulation
and colour will appear different), and Maximum Information (where they were provided details of exactly which colours individuals with CVD tend to confuse and why). Our ground-truth data was obtained from actual people with a variety of types and severities of CVD.

We employed the Machado CVD simulation method [38] due to its rising popularity in industry (both Chrome and Firefox have integrated it into their browser devtools [21, 22, 42]), its popularity in research studies [35, 58], and its ability to simulate a range of CVD severities. As people with CVD are often diagnosed into one of three categories of severity, we opted to include three simulation severity conditions in order to not unfairly compare a more severe CVD simulation to a 'mild' CVD participant. To do this, we used the 0.3 severity for mild CVD, the 0.6 severity for moderate CVD, and the 1.0 severity for strong CVD (dichromacy) in the Machado simulation.

Our findings indicate significant differences between the results for all simulation conditions and the results for CVD participants. We also observed no difference or very minor differences between the different simulation information conditions. These findings lead us to conclude that CVD simulations – even with the addition of detailed information – cannot hope to match years of genuine experience living with CVD.

This paper makes two important contributions: 1) two new tasks for evaluating CVD Aids, with empirical evidence that suggests that they should help generalize current evaluation techniques to better match the diversity of challenges experienced by people with CVD; and 2) clear empirical evidence that we (as a discipline) should NOT be evaluating CVD Aids by using non-CVD participants under CVD simulations, no matter the severity, type, or amount of contextual information provided.

2 RELATED WORK

2.1 Colour Vision Deficiency

To identify colour, we use retinal cone cells (or cones) that are sensitive to different wavelengths of visible light. Human eyes typically contain long (Red), medium (Green), and short (Blue) wavelength sensitive cone types [10], hence are described as 'trichromatic'. When one or more of the cone types are either missing, damaged, or have variant sensitivity, the result is Colour Vision Deficiency (CVD) [5]. Three main classes of CVD exist, each depending on how the cones are affected: 1) Anomalous Trichromacy (one variant-sensitivity cone; approximately 3/4 of all cases of inherited CVD), 2) Dichromacy (one cone missing; approximately 1/4 of all cases of inherited CVD), and 3) Monochromacy (two or three cones missing; very rare). Anomalous Trichromacy and Dichromacy have three different subtypes depending on the cone that is affected: Protanomaly and Protanopia for the long-wavelength-sensitive cone, Deuteranomaly and Deuteranopia for the medium-wavelength-sensitive cone, and Tritanomaly and Tritanopia for the short-wavelength-sensitive cone. CVD can be either inherited or acquired. Inherited CVD most commonly affects the long (1/4 of cases) and medium (3/4 of cases) wavelength cones, resulting in diminished colour distinguishability along the red-green axis and leading to the common misnomer 'red-green colourblindness'. Acquired CVD most commonly affects the short wavelength cone, and can be the result of age [12, 28, 31], chemical exposure [14, 36], brain injury [56], and disease [19, 45].

Figure 1: Illustrating Machado's Colour Vision Deficiency simulation (deutan): a) no simulation, b) severity = 0.3, c) severity = 0.6, d) severity = 1.0 (or dichromatic).

2.2 Simulating Colour Vision Deficiency

CVD simulations have been designed to help non-CVD people understand how people with CVD see the world, and are often used to help identify accessibility concerns that could affect people with CVD. In order to simulate CVD, the established approach is to first convert an RGB colour into LMS colour space (corresponding to long, medium, and short wavelength cone stimulation levels). Next, the data for the cone corresponding to the type of CVD to be simulated is replaced with the data from another cone (e.g., the middle-wavelength cone data is overwritten with the long-wavelength cone data for Deuteranopia). Finally, the modified LMS colour is converted back to RGB.

A variety of CVD simulations have been developed, ranging from academic research projects to personal-interest projects. Meyer and Greenberg [40] proposed the first algorithm for CVD simulation, in which they simulated CVD perception in CIE XYZ colour space after making the appropriate adjustments to account for the vision of individuals with dichromatic vision in LMS space.

Brettel et al. [6] developed a simulation to model the appearance of colours for people with dichromacy. They based their simulation on previous work that identified colours that are perceived identically by people with unilateral dichromacy (dichromatic vision in one eye and trichromatic vision in the other) [23]. For protanopia and deuteranopia, their simulation projects all colours onto two half-planes that stretch from the achromatic (greyscale) axis to either a 475 nm (blue) or a 575 nm (yellow) anchor. Viénot et al. [52] simplified the simulation process for Brettel's method.
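As a concrete sketch of the RGB-to-LMS-to-RGB pipeline described above (illustrative only, not any specific published implementation): the transform matrix and M-cone replacement coefficients below are the constants that circulate in common implementations of Viénot et al.'s simplified method, in which the deficient cone's response is re-estimated from the two intact cones (rather than copied verbatim) so that achromatic colours are preserved.

```python
import numpy as np

# Linearized-RGB -> LMS transform (constants as circulated in common
# implementations of Vienot et al.'s simplified dichromacy simulation).
RGB_TO_LMS = np.array([
    [17.8824,    43.5161,   4.11935],
    [ 3.45565,   27.1554,   3.86714],
    [ 0.0299566,  0.184309, 1.46709],
])
LMS_TO_RGB = np.linalg.inv(RGB_TO_LMS)

def simulate_deuteranopia(rgb):
    """Simulate deuteranopia for one linear-RGB colour in [0, 1]:
    convert to LMS, re-derive the deficient M-cone response from the
    intact L and S responses, then convert back to RGB (clipped)."""
    lms = RGB_TO_LMS @ np.asarray(rgb, dtype=float)
    l, _, s = lms
    m = 0.494207 * l + 1.24827 * s  # replacement for the missing M data
    return np.clip(LMS_TO_RGB @ np.array([l, m, s]), 0.0, 1.0)
```

Note that greys pass through essentially unchanged (the replacement coefficients reproduce the M response for achromatic stimuli), while reds and greens collapse towards similar hues; severity-adjustable models such as Machado et al.'s [38] instead provide a family of matrices spanning mild anomalous trichromacy up to a dichromatic endpoint of this kind.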
Kondo [32] developed an early simulation method that modelled the vision of anomalous trichromacy based on established dichromacy simulation methods. Yang et al. [55] also attempted to simulate anomalous trichromacy via LMS-space adjustments at the photoreceptor level.

Machado et al. [38] based their simulation method on physiological principles around how cones shift in sensitivity in anomalous trichromacy, taking into account a novel 'stage theory' of human colour vision. Their model simulates the vision produced by various nanometre shifts in the sensitivity of the anomalous cones, allowing the simulation of a variety of severities of anomalous trichromacy. Figure 1 contains a demonstration of Machado's deutan CVD simulation with increasing severity level. Flatla & Gutwin [18] developed a personalized simulation adapted to an individual via a calibration process, and later favourably compared their simulation to Machado's adjustable simulation [37]; however, their personalized simulation is not tunable to arbitrary severities.

The 'ColourMatrix' implementation is a method that roughly simulates CVD which, despite its questionable implementation and little scientific foundation [2], has gained in popularity. The HCIRN simulation¹ is another method that simulates CVD based on the Meyer and Greenberg method [40], and is the method employed by the popular Coblis online simulator². However, similar to the 'ColourMatrix' method, the HCIRN method has been noted to have issues with validity [39, 44].

¹http://colorlab.wickline.org/colorblind/colorlab/docs/acknowledgments.html
²https://www.color-blindness.com/coblis-color-blindness-simulator/

2.3 Criticisms of Using Disability Simulations

Other disability simulation methods have been present in the accessibility literature (e.g., using a blindfold to approximate blindness during design, development, or evaluation), with the goal being more inclusive designs, and what could be characterized as a 'better-than-nothing' attitude. Sears and Hanson [49] argue that it might be acceptable to do pilot testing with non-representative (i.e., simulated) users; however, in studies that target archival publication, representative users must be recruited.

More recently, Bennett & Rosner [4] discuss how empathy building through simulations might threaten inclusive design, because it can lead to the wholesale exclusion of individuals with impairments from the design work: the team already has an (artificially inflated) 'understanding' of those individuals through the simulation. Likewise, Tigwell [51] found that very few designers understand the problems that can arise from over-reliance on disability simulations, and that designers tend to believe that disability simulations provide 'sufficient validation' for their final design.

In summary, Bennett & Rosner [4] and Tigwell [51] both recommend that collaboration between individuals with disabilities and designers is necessary for the creation of designs and interfaces that better reflect the true life experiences of people with disabilities.

2.4 Colour Identification Aids and the Methods Used to Evaluate Them

Colour Identification Aids have been developed to assist people with CVD. There are three categories of CVD aids: haptic aids, visually modifying aids, and sound-based aids. Haptic colour identification aids generally use patterns of vibration and push feedback in order to signify a specific colour. Visual aids either: 1) recolour the view in order to clarify colours of confusion and increase contrast, 2) overlay patterns to signify specific colours or signify a difference, or 3) highlight a specific colour. Sound-based aids use different sounds to signify specific colours.

The common tasks for evaluating aids – to determine whether they are beneficial to individuals with CVD or whether they improve on previous methods – are: 1) the Ishihara plate test [3, 8, 13, 27, 29, 30, 33, 34, 41, 43, 46, 50, 53, 57, 58], 2) the FM-100 Hue test [15, 57], 3) the Colour matching test [16], 4) Image ranking [3, 13, 25, 26, 33, 43, 50, 54, 57, 58], 5) the Odd-one-out colour task [17], 6) the Selection Task [20], 7) the Transition Task [20], and 8) the Sorting Task [20].

Most tasks that are used in the evaluation of CVD aids do not adequately address all four types of colour challenges that people with CVD face, as defined by Cole [9]. These four challenges are:

(1) Comparative: where colours must be compared to find similarities or differences (e.g., matching two fabrics).
(2) Denotative: where colour is defined or referenced only by name (e.g., "Mine is the red car.").
(3) Connotative: where colour can have a meaning or be assigned a meaning (e.g., yellow bananas are ripe).
(4) Aesthetic: where colour can be used to convey beauty (e.g., in art or fashion).

Most CVD Aid evaluation tasks tend to only cover comparative colour challenges; however, some more recent tasks (Selection and Sorting) also cover denotative uses of colour.

However, a task that covers connotative uses of colour has not yet been developed, nor has a task that allows for an evaluation of the number of simultaneous variations of specific colour names that can be accurately distinguished. A task that assesses the number of distinct colours an aid enables to be distinguished gives a clear metric that not only allows for comparison of aids, but also determines how effective the aids can be in the most extreme of circumstances.

Finally, the common approach to evaluating colour identification aids is the 'indirect' evaluation [57], which bypasses recruiting participants with CVD by employing CVD simulations and showing them to non-CVD participants (sometimes just the research team or the reader of the paper). This problem is very pervasive – 'indirect' evaluations form a core component of almost every paper looking to assist people with CVD. Some papers do employ CVD participants, but often only by treating them as tokens for verifying the results of a tool rather than for determining effectiveness or obtaining feedback from the target population. Such tokenizing is employed in the most commonly used evaluation method involving CVD participants, the Image Ranking Task, which has CVD participants rank output images from several visual aids (usually recolouring) simply based on which looks the 'best'. We share the concerns expressed elsewhere in the accessibility literature (see Section 2.3) regarding this approach.

In this paper, we look to address the lack of a connotative CVD Aid evaluation task, plus the lack of an extreme task that allows for the determination of the number of colours that can be simultaneously distinguished. In addition, we also look to address whether 'indirect' evaluations are useful and what their limitations are. We
make use of recently developed colour identification tasks – Selection, Transition, and Sorting – to help us with this, and we make use of the Machado CVD simulation method [38] as it is the state of the art. We chose to utilize the three previous methods from [20] over the Ishihara Test because, although the Ishihara Test garners far more use in the related work, its value as a CVD Aid evaluation task is extremely limited due to the very fine-tuned colours used in the Ishihara Test (precisely chosen to identify different types of CVD) – essentially any chromatic modification of an Ishihara plate will render its hidden number visible to someone with CVD. This presents such a low barrier to 'success' that it leaves the Ishihara Test almost useless as a CVD Aid evaluation task. We did not employ the Image Ranking Task due to: the lack of a standardized set of images (previous studies vary widely in the images used), the 'entire-image' nature of the task (which prevents assessing component differences, such as colour naming performance), and its highly subjective nature (e.g., no clear definition of 'best' exists for this task).

3 PROPOSED CVD EVALUATION METHODS

3.1 Category Task

Figure 2: Example of Category Task. In this example, the correct category is Category 6 for the colour 'purple', which matches the slightly different purple in the top-right.

The Category Task was designed to address the lack of a task that featured connotative judgement of colour [9]. We designed the Category Task to function very similarly to a pH strip test, where a target category colour (indicated at the top of the task) must be matched to a corresponding coloured category in a coloured category grid. A key aspect of this task is that the colour to be matched will never exactly match the correct corresponding category. Figure 2 illustrates this idea, where the colour to match is a slightly more vibrant 'purple' than the corresponding 'purple' in the correct category (Category 6).

3.1.1 Category Task Scoring. We scored the Category Task simply based on the index of the selected category and the index of the answer for the task. This resulted in every trial being either correct or incorrect.

3.2 Distinction Task

Figure 3: Example of Distinction Task shown with nine total 'distinct' hues. In the above grid the hues of 'red', 'orange', 'yellow', 'green', 'blue', 'purple', 'pink', 'grey' and 'brown' are shown.

We designed the Distinction Task to assess the number of distinct colour names/hue categories an aid enables an individual with CVD to distinguish. This offers a unique opportunity to compare potential aids on the number of colour names they accurately allow individuals with CVD to distinguish simultaneously. Research looking to improve upon recolouring aids often uses subjective judgement and/or contrast measurements to determine the relative success of a novel technique over existing methods. While we believe the ability of an aid to improve the contrast between two colours is very important, we consider an assessment of the number of distinct colours that can be accurately named by an individual with CVD to be equally important. The design of the Distinction Task is inspired by the Selection Task: the Distinction Task features a large number of coloured swatches in a grid of colours, and the bottom features a grid of check boxes indicating the distinct colour names to be identified from the grid.

3.2.1 Distinction Task Scoring. We scored the Distinction Task very similarly to how the Selection Task was reported to be scored in [20], because both the Selection Task and the Distinction Task allow participants to select more distinct colours than the correct answer. If the participant selects a total number of colour names equal to or less than the number of 'true' answers, then accuracy is calculated by dividing the number of distinct colour names the participant got correct by the total correct number of distinct colours. However, if the participant selects a number of distinct colours greater than the number of 'true' colours, then the accuracy is calculated by dividing the number of distinct colours the participant got correct by the total number of distinct colours selected.
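The two scoring rules above can be sketched as follows (an illustrative reimplementation from the description, not the authors' code):

```python
def category_score(selected_index, answer_index):
    """Category Task: a trial is simply correct (1) or incorrect (0),
    based on whether the selected category index matches the answer."""
    return 1 if selected_index == answer_index else 0

def distinction_accuracy(selected, true_colours):
    """Distinction Task: if no more colour names were selected than exist
    in the ground truth, divide the correct selections by the number of
    true colours; otherwise divide by the number selected, so that
    indiscriminately over-selecting cannot inflate the score."""
    selected, true_colours = set(selected), set(true_colours)
    correct = len(selected & true_colours)
    if len(selected) <= len(true_colours):
        return correct / len(true_colours)
    return correct / len(selected)
```

For example, selecting {'red', 'green'} when the true set is {'red', 'green', 'blue'} scores 2/3, while selecting all nine names against that three-colour ground truth scores 3/9.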
Challenging and Improving Current Evaluation Methods for Colour Identification Aids ASSETS ’22, October 23–26, 2022, Athens, Greece

of our methods for evaluating CVD Aids by comparing the performance of participants with and without CVD on a collection of tasks. Simultaneously, we also evaluated the effectiveness of 'indirect' evaluations by obtaining results from non-CVD participants under a variety of CVD simulation severities to compare to the CVD participant data.

4.1 Participants
We conducted a remote online evaluation with a total of 284 participants recruited from Reddit and Facebook. Of our 284 participants, 26 had CVD, consisting of 16 deutan (mild = 4, moderate = 10, strong = 2) CVD participants and 10 protan (mild = 2, moderate = 4, strong = 4) CVD participants. CVD type and severity were self-reported by participants, as we recruited our participants from social media groups with members that are typically very knowledgeable about their colour vision. With this in mind, we sub-sampled the remaining 258 non-CVD participants (most of whom did the study under CVD simulation) to ensure similar distributions in CVD type and severity between the CVD group and the non-CVD simulation groups. We did this to avoid unfairly biasing our study for or against CVD simulations (e.g., by comparing performance under dichromatic simulations to the performance of participants with mild CVD). In particular, in each of the three level-of-information conditions, we ensured we chose a total of 26 (non-CVD under CVD simulation) participants such that 16 had used a deutan simulation, with 4 using mild (0.3), 10 moderate (0.6), and 2 strong (1.0) simulations, and 10 had used the protan simulation, with 2 having mild (0.3), 4 having moderate (0.6), and 4 having strong (1.0) simulations. We further tried to maintain consistency by attempting to match participant-reported ages, the device used, environment (indoor vs. outdoor), and lighting (natural vs. artificial).

After the subsampling, we ended up with 130 participants total (5 groups of 26 each: non-CVD control, CVD, plus three level-of-information CVD simulation groups). Self-reported sex was roughly balanced (male = 59, female = 61, and prefer not to disclose = 10), although the CVD group was likely predominantly biologically male due to inherited CVD being sex-linked. We had 55 participants between the ages of 18-24, 57 between the ages of 25-34, 10 between the ages of 35-44, and 8 between the ages of 45-64. Participants used a mix of devices to complete the study, with 76 using mobile devices, 27 using laptops, 25 using a desktop or monitor, and 2 using a tablet. Finally, all participants completed the study indoors, with 84 doing so under artificial lighting, 41 doing so under natural lighting, and 5 who reported they did not know.

Figure 4: Colours generated for the tasks (colour terms: Red, Orange, Yellow, Green, Blue, Purple, Pink, Brown, Grey, Black).

4.2 Task, Stimuli & Apparatus
Our study consisted of five colour-related tasks that tested various ways of interpreting and determining colours. The task order was fixed so that participants were always given tasks in the order of 1) Selection Task, 2) Transition Task, 3) Sorting Task, 4) Distinction Task, and 5) Category Task. The specific implementations and colours used for each task are described in each subsection below.

To evaluate the effectiveness of 'indirect' evaluations we utilized the Machado CVD simulation method [38]; specifically, we used both the Deutan and Protan simulations. We decided on the Machado simulation method as it has gained a lot of attention due to its severity specification potentially allowing for non-CVD people to better see how individuals with anomalous trichromacy see colours. We decided to add three different levels of severity of the Machado CVD simulation with roughly equal steps: 1) 0.3 - mild CVD, 2) 0.6 - moderate CVD, and 3) 1.0 - strong CVD.

To test the potential conditions for how simulations could be employed in studies utilizing 'indirect' evaluation, we decided on three levels of living-with-CVD information provided to the participant using the simulation: 1) None, 2) Minimal, and 3) Maximum (images for the Minimal and Maximum conditions are available in the supplementary material). In the condition where no information was provided, participants were not informed of any simulation that occurred within the study. In the minimal condition, participants were simply informed that they were under a CVD simulation and that colours may not appear as they normally do. Finally, in the maximum condition, participants were informed that they were under a simulation and were told which colours individuals with CVD tend to confuse (e.g., purples lose their 'redness', so are often seen as shades of blue).

4.2.1 Colours Generated. First, we modified the 'ColourIconizer_Props' colour name dictionary published in the supplementary material of [20] by splitting all 'grey' proportions into 'black', 'white', and 'grey'. We did this in recognition that those with red-green CVD can confuse reds and blacks, plus whites and cyans/aquamarines/teals. As such, we opted to include 'black' and 'white' to improve the generalizability of our evaluation tasks.

Next, we subsampled the dictionary resulting from the above step for RGB colours that had a proportion value of >= 0.90 for their dominant colour term. The resulting set thus contained only colours that people without CVD should easily and consistently name (i.e., at least 90% of the population should agree that it is the same name). Unfortunately, this eliminated 'white' and 'teal', as there were no RGB colours that had either of these terms above 0.90.
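The dominant-term filter in §4.2.1 can be sketched as follows; the dictionary layout and names here are our assumption, not the published 'ColourIconizer_Props' format:

```python
def high_confidence_colours(name_dict, threshold=0.90):
    """Keep only colours whose dominant colour term has a naming
    proportion >= threshold (the paper's 0.90 cut-off)."""
    keep = {}
    for rgb, proportions in name_dict.items():
        term, prop = max(proportions.items(), key=lambda kv: kv[1])
        if prop >= threshold:
            keep[rgb] = term
    return keep

# Illustrative entries only (not real dictionary values):
demo = {
    (200, 30, 40): {'red': 0.95, 'brown': 0.05},
    (90, 60, 30): {'brown': 0.55, 'orange': 0.45},
}
high_confidence_colours(demo)   # → {(200, 30, 40): 'red'}
```

Under this cut-off, ambiguous colours like the second entry drop out, which is exactly how 'white' and 'teal' were eliminated.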
ASSETS ’22, October 23–26, 2022, Athens, Greece Connor Geddes and David R. Flatla

As we wanted a set of colours that were easily and consistently named by people without CVD (which we now had) but also exhibited as much variation as possible to challenge people with CVD, we further subsampled our colour set by selecting the 21 most visually distinct colours (defined using Euclidean distance in CIE LUV space) for each colour term. Only eight RGB colours had 'black' >= 0.90 as their dominant colour term, so we included them all. These colours are shown in Figure 4.

Finally, to generate pairs of colours that are named differently by people without CVD but likely confused by people with CVD, we followed a process similar to that used in [17] by defining a CIE LUV 3D line between each Figure 4 colour and the protan and deutan copunctal points (as defined in [47]). Any Figure 4 colours with a different dominant colour term but within 5.0 CIE LUV units of the 3D lines defined above were considered candidates for CVD confusion for the type of CVD for that line's copunctal point.

4.2.2 Selection Task. The Selection Task was designed to assess a combination of both denotative and comparative colour judgement [20]. Our implementation of the Selection Task consists of a grid of 80 colour swatches. To choose colours for our task, we sampled from our generated colours (described above). We generated two distinct colour sets, one each for Protans and Deutans. Our Protan colour set consisted of Red & Brown, Orange & Green, and Blue & Purple. Our Deutan colour set consisted of Green & Brown, Pink & Grey, and Blue & Purple. Similar to the original implementation [20], all colours (for either Protan or Deutan) were presented in the grid even if they were not the target pair. Finally, in all cases participants completed each colour pair with each of 0-3 target colours. This resulted in the Selection Task being completed a total of 12 times.

4.2.3 Transition Task. We implemented the Transition Task as described in [20], with a progression of ten colour swatches presented horizontally. The extreme colours (far left and far right) were fixed and used as anchors to order the eight randomly-placed intermediary colours between them. To complete the Transition Task, only comparative colour judgement is required. We generated two distinct sets of colours, one each for Protan and Deutan CVD. Our Protan set consisted of 'yellow' to 'green', 'red' to 'green', and 'blue' to 'pink' transitions. Our Deutan set consisted of 'yellow' to 'green', 'red' to 'brown', and 'grey' to 'pink' transitions.

4.2.4 Sorting Task. The Sorting Task was designed to require only denotative judgement of colour by showing a single swatch of colour that is named by the participant, thereby removing any comparative colour judgement. We implemented the Sorting Task identically to the original implementation [20], with a single colour swatch presented at the bottom of the screen that needed to be sorted into one of ten colour names presented in a grid above the colour swatch. However, in this implementation (unlike the original) we removed 'teal' and added 'black'. We did this as we had difficulty generating high-confidence teal colours when we generated our test colours, so we replaced it with 'black' to maintain a consistent colour grid size (5x2). The colours we used for the Sorting Task were generated from our set of colours. We randomly chose five each of 'red', 'orange', 'yellow', 'green', 'blue', 'purple', 'pink', 'grey', and 'brown', resulting in 45 colours that needed to be sorted. To make sure we had comparable results between each run of the Sorting Task, we used the identical set of colours for each participant.

4.2.5 Distinction Task. To complete the Distinction Task, participants need to use a combination of denotative and comparative colour judgement, similar to the Selection Task but in a more demanding form (more colours are presented). Also similar to the Selection Task, we implemented the Distinction Task with a grid of 80 colour swatches, but with a grid of 10 (5x2) checkboxes with colour name labels below. We specifically ensured that we generated distinct colour sets of one to nine colours each once, resulting in a total of nine repetitions of the Distinction Task (one for each set of colours). We randomized the layout of the colours for each participant and the order of the specific distinct colour sets. With this we got the following nine colour distinction sets:

(1) blue
(2) orange and yellow
(3) red, blue, and grey
(4) red, orange, yellow, and blue
(5) red, orange, green, pink, and grey
(6) red, orange, green, blue, pink, and brown
(7) orange, yellow, green, blue, pink, grey, and brown
(8) orange, yellow, green, blue, purple, pink, grey, and brown
(9) red, orange, yellow, green, blue, purple, pink, grey, and brown

4.2.6 Category Task. The Category Task served as our connotative colour task, requiring users to match a colour swatch to its respective colour category (also represented as a different colour swatch). In the Category Task, a target colour swatch was presented in a prompt at the top of the screen, with a grid of 10 (5x2) colour swatches presented below. Importantly for this task, there was never an exact match for the target colour in the lower swatches. To ensure that each colour name was encountered throughout the study (similar to the Distinction Task), we ensured that every colour name of 'red', 'orange', 'yellow', 'green', 'blue', 'purple', 'pink', and 'brown' served once as the target category colour. This resulted in a total of eight trials of the Category Task.

4.2.7 Apparatus. To create our study, we used the cross-platform SDK Flutter (version 2.2.0), and we used Firebase to host our website online. The study is available at https://colourmethodevaluation.web.app, and the code of the study can be accessed at https://github.com/geddesc/ASSETS22-Challenging-and-Improving-Evaluation-Methods.

4.3 Procedure & Design
At the start of the study, participants completed a demographics questionnaire. If participants were not colour blind and were assigned one of the simulation conditions with information, they were then supplied with information telling them about the simulation. Next, participants completed the tasks in the following order: 1) Selection Task, 2) Transition Task, 3) Sorting Task, 4) Distinction Task, and 5) Category Task. Between every task, participants were asked if they struggled on the task and, if so, to optionally provide more information.
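The confusion-line candidate test from §4.2.1 (a colour within 5.0 CIE LUV units of the 3D line through a Figure 4 colour and a copunctal point) reduces to a point-to-line distance. A minimal sketch, with made-up LUV coordinates (real copunctal points come from [47]):

```python
import math

def distance_to_confusion_line(candidate, colour, copunctal):
    """Euclidean distance from `candidate` to the infinite 3D line through
    `colour` and `copunctal` (all CIE LUV triples):
    |(candidate - colour) x (copunctal - colour)| / |copunctal - colour|"""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def norm(a):
        return math.sqrt(sum(x * x for x in a))

    v = sub(copunctal, colour)      # direction of the confusion line
    w = sub(candidate, colour)
    return norm(cross(w, v)) / norm(v)

# Made-up LUV coordinates purely for illustration:
d = distance_to_confusion_line((50, 53, 13), (50, 20, 10), (50, 120, 10))  # → 3.0
is_candidate = d <= 5.0    # the paper's 5.0 CIE LUV unit threshold
```

Pairs passing this test (while carrying different dominant colour terms) are the likely-confused pairs used for the Selection and Transition sets.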

4.4 Results
For all the tasks except the Transition Task, we measured accuracy (higher is better) and completion time (lower is better). For the Transition Task, we also measured completion time but measured error score (lower is better) instead of accuracy.

4.4.1 Data Analysis. To start, we checked our data for normality using the Lilliefors test. We found that of 50 data distributions (5 conditions x 5 tasks x 2 dependent measures [accuracy/error score + completion time]), 22 rejected the null hypothesis that the data was normally distributed. As such, we used the non-parametric Kruskal-Wallis test for our analysis, with Dunn's test (which uses Bonferroni's correction) for post-hoc pairwise comparisons.

Figure 5: Selection Task accuracy & completion time results.

4.4.2 Selection Task Results. Analyzing accuracy of the Selection Task with Kruskal-Wallis indicated a significant main effect (H′(4) = 32.614, p < .0001), with posthoc tests indicating significant pairwise differences between the non-CVD condition and all other conditions. Notably, we did not observe a difference between the CVD condition and any of the simulation conditions. We believe this is in part due to the difficulty the task posed; the variable number of target colours presented seems to have equally challenged both CVD participants and participants under a CVD simulation. However, we did not explore deeper to see if there were any patterns in which colours were chosen (or not) between these groups, and plan to do this in future work. Specifically, we expect there could be consistency in the colours that CVD participants tend to confuse, but we expect the simulation condition to be highly variable, likely indicating random guessing with little justification or thought behind it. Selection Task results are shown in Figure 5.

For completion time, analysis with Kruskal-Wallis indicated a significant main effect (H′(4) = 29.785, p < .0001), with posthoc tests indicating significant differences between the non-CVD condition and the CVD condition, the non-CVD condition and the simulation condition with no information provided, the CVD condition and all three simulation conditions, and the no-information simulation condition and the other two simulation conditions. We believe the differences observed between the CVD condition and all simulation conditions indicate that CVD participants clearly behave differently in a colour task such as this, taking more time to complete the task, likely because they compare a set of colours they distinctly believe to be the target colour. In contrast, non-CVD participants under a simulation (regardless of the information condition) were likely to spend far less time, as they do not have the lifelong knowledge of being colour blind that they can apply to a task like this to help them know which shades of the colours presented before them are the most likely candidates to match the target colour name. This also provides more justification for our suspicion that a future investigation into the specific colours chosen in a task such as this could reveal more evidence of a difference in this specific task.

Figure 6: Transition Task accuracy & completion time results.

4.4.3 Transition Task Results. Our analysis with Kruskal-Wallis for error score indicates a significant main effect (H′(4) = 39.472, p < .0001), with posthoc tests indicating pairwise differences between the non-CVD condition and all other conditions, and between the CVD condition and all simulation conditions. The differences between the CVD condition and all the simulations show how the simulations struggled to emulate the challenge embodied in the colour distinction task, which uses very fine colour differences. This result does not mean that simulations cannot provide some useful insights, but they just cannot match the difficulty of the task as experienced by participants with CVD, which could also belie a weakness in the simulation algorithm itself. Transition Task results are shown in Figure 6.

For completion time, our analysis with Kruskal-Wallis did not indicate a significant main effect (H′(4) = 4.706, p = .3188), perhaps because all groups used a similar technique to complete the Transition Task (comparing individual swatches to each other one at a time).

Figure 7: Sorting Task accuracy & completion time results.

4.4.4 Sorting Task Results. Our analysis with Kruskal-Wallis for accuracy indicates a significant main effect (H′(4) = 53.557, p < .0001), with posthoc tests indicating pairwise differences between

the non-CVD condition and all other conditions, and between the CVD condition and all simulation conditions. The difference found with the non-CVD condition, despite us not specifically choosing colours along confusion lines for this task, shows the effectiveness of this task for testing potential colour identification aids' ability to assist individuals with CVD. The differences between the CVD condition and all simulation conditions in the Sorting Task raise the idea that CVD participants have greater knowledge of how colours in their specific perception are named; they may not always be correct, but they are most certainly always more correct than non-CVD participants under a simulation (regardless of information provided). This raises the idea that CVD simulations – regardless of information provided to participants on how people with CVD confuse colours – cannot hope to overcome the years of (sometimes socially painful) experience that an individual with CVD has in naming and distinguishing colours by name. Further, the lack of difference among the simulation conditions, together with the relatively stable accuracy data seen in Figure 7, suggests that the amount of information given to non-CVD participants under a simulation does not matter much at all, as they do not have the life experience needed to utilize that knowledge properly.

For completion time, our analysis with Kruskal-Wallis did not indicate a significant main effect (H′(4) = 4.449, p = .3487). This is interesting in that it probably reflects similarity in how each group completed the Sorting Task, which is surprising given that only the non-CVD participants were expected to not struggle with this task (and hence be able to complete it quickly).

Figure 8: Distinction Task accuracy & completion time results.

4.4.5 Distinction Task Results. Our analysis with Kruskal-Wallis for accuracy indicates a significant main effect (H′(4) = 36.351, p < .0001), with posthoc tests indicating pairwise differences between the non-CVD condition and all but the CVD condition, and between the CVD condition and all simulation conditions. First, notably, CVD participant results did not differ much from the non-CVD participants; we believe this is due to the selection of colours for the task not being chosen to specifically fall along a confusion line. For future studies implementing this task, it would be beneficial to ensure that colour name categories (e.g., red, blue) contain corresponding pairs of colours that are confused. Finally, the difference between the CVD condition and the simulation conditions is noteworthy, as this once again reinforces the idea that not only do non-CVD participants under a simulation not name colours in the same way as (or even make mistakes similar to) CVD participants, but also that increasingly descriptive information regarding the experience of people with CVD does not help at all in overcoming the gap.

For completion time, our analysis with Kruskal-Wallis indicates a significant main effect (H′(4) = 9.755, p = .0448), with posthoc tests indicating pairwise differences between the non-CVD condition and the simulation condition with minimum information, between the CVD condition and all simulation conditions, between the simulation condition with no information and the simulation condition with minimum information, and between the simulation condition with minimum information and the simulation condition with maximum information. These results are somewhat confusing, but might be due to the very high variance in the 'No Info' and 'Max Info' conditions. We are planning deeper analysis into these differences in the future.

Figure 9: Category Task accuracy & completion time results.

4.4.6 Category Task Results. Our analysis with Kruskal-Wallis for accuracy indicates a significant main effect (H′(4) = 21.984, p < .0002), with posthoc tests indicating pairwise differences between the non-CVD condition and all other groups. This result actually provides some support for CVD simulations achieving similar accuracy performance as participants with CVD.

However, our analysis with Kruskal-Wallis for completion time indicates a significant main effect (H′(4) = 10.797, p = .0289), with posthoc tests indicating pairwise differences between the non-CVD condition and the CVD condition, the non-CVD condition and the simulation condition with minimum information, and between the CVD condition and all simulation conditions. By and large, these results indicate that CVD participants behaved differently than the non-CVD and CVD simulation groups. This not only underlines that simulations do not allow non-CVD participants to act like CVD participants, but also that the non-CVD participants act very similarly, whether under simulation or not.

4.5 Differences in How Colours were Named
To get a better understanding of the differences that a lifetime of living with CVD can provide in colour naming compared to that of a non-CVD person using a CVD simulation, we looked at how these two groups of participants differed in how they named colours. To do this, we dug deeper into the individual-colour results for the Sorting and Distinction Tasks, as these tasks featured the widest

Figure 10: Sorting Task naming for each colour term for each participant group, showing for each 'correct' colour name (x-axis) how participants ended up naming it during the sorting task (stacked column). 'red', 'orange', 'green' and 'purple' illustrate clear differences between CVD and simulation participants.

Figure 11: Distinction Task colours participants selected for the one-distinct-colour condition (left) and the five-distinct-colour condition (right). The one distinct colour was 'blue'. The five distinct colours were 'red', 'orange', 'green', 'pink' and 'grey'.

variation of colours, required participants to identify colours by name, and used a consistent set of colours for every participant. We found two major differences in the data between CVD and non-CVD participants under simulations: 1) CVD participants were more consistent and correct when naming colours, and 2) the colours that were incorrectly named differed wildly between CVD participants and non-CVD participants under simulations (regardless of the amount of contextual information provided).

4.5.1 CVD Participants More Accurately Named Colours. In the Sorting and Distinction Tasks, CVD participants in most cases were more correct at naming colours than non-CVD participants under a simulation. This makes perfect sense, as CVD participants are most familiar with their perception of colour and have been living with and self-accommodating their CVD their entire lives. To directly see how CVD participants outperformed non-CVD participants under simulations, see Figure 10, specifically the 'red', 'orange', 'green', and 'purple' data, which illustrate CVD participants naming the colours far more accurately than non-CVD participants under simulations. In the case of how the 'reds' were sorted, we can see that they were sorted mostly to 'red', with very little to 'brown', for the CVD participants. However, for the participants under simulations, 'reds' were largely sorted to 'brown', with the remainder to a combination of 'red', 'orange', 'yellow' and, only in the case of the maximum information condition, 'green'.

Figure 11 illustrates that in the 'five distinct colours' condition of the Distinction Task, CVD participants more frequently chose the correct colours for 'red', 'orange', 'green', 'pink' and 'grey'. These findings again reinforce that CVD participants have a lifetime of experience seeing and naming colours in a largely non-CVD world. While it might be attractive and easier to evaluate CVD Aids using simulations, 'indirect' evaluations simply will never replicate the knowledge rooted in the lived experience of people with CVD.

4.5.2 Incorrectly Named Colours Differ Wildly Between CVD and Simulations. When colours were incorrectly named in the Sorting and Distinction Tasks, we again found that there were large differences between CVD and non-CVD participants under simulations. To illustrate this, we will discuss the errors made in the Sorting Task for 'orange', as can be seen in Figure 10. CVD participants mostly correctly sorted all the 'orange' colours in the Sorting Task; however, when they did incorrectly sort 'oranges', they named them mostly 'red' or 'green'. Looking at the results for the non-CVD participants under simulations, we can see that they did much worse overall and that they confused colours differently. In all simulation conditions, the major colours non-CVD participants incorrectly sorted 'orange' into were 'yellow' and 'brown'. There was a slight improvement in the maximum information condition; however, 'yellow' and 'brown' still remained the most common confused colours.
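Naming tallies like those plotted in Figure 10 can be computed as a simple confusion count per 'correct' colour term; a minimal sketch (the data and names here are ours, purely for illustration):

```python
from collections import Counter

def naming_confusions(trials):
    """Tally how each 'correct' colour term was actually named.

    `trials` is an iterable of (correct_name, chosen_name) pairs, e.g.
    one per Sorting Task swatch; returns {correct: Counter({chosen: n})},
    the per-term stacked-column data of a Figure-10-style plot."""
    table = {}
    for correct, chosen in trials:
        table.setdefault(correct, Counter())[chosen] += 1
    return table

trials = [("orange", "orange"), ("orange", "red"),
          ("orange", "yellow"), ("red", "brown")]
naming_confusions(trials)["orange"]["red"]   # → 1
```

Computing one such table per participant group makes the CVD-versus-simulation differences described above directly comparable.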

Turning to the Distinction Task and the ‘one distinct colour’ data people regarding how the world might look for those with CVD, for
in Figure 11, we can see that all participants across all conditions example to help build understanding between deeply-connected
chose ‘blue’ at least once – errors in this condition are represented people [18] (e.g., spouses, siblings) with diferent colour perception.
by choosing anything other than ‘blue’. CVD participants largely However, our results show that CVD simulations do not capture
did so with ‘purple’, however non-CVD simulation participants the lived experience gained from actually having CVD, and that
rarely chose ‘purple’ but instead opted for ‘black’. no amount of contextual information provided can compensate
Both the Sorting Task and Distinction Task examples outlined for those years of experience. Our fndings agree with the advice
above show clear diferences in how when naming colours incor- from Tigwell [51] and Bennett & Rosner [4], where they argue that
rectly CVD participants confused colours diferently than non-CVD disabled people should be incorporated into the design team and
participants, once again afrming our conclusion that CVD simu- that simulations simply cannot match the life experiences of those
lations cannot replace a lifetime of living with CVD. Furthermore, with disabilities.
we can also see that the provided CVD contextual information did Even if CVD simulations can allow an individual without CVD
not help non-CVD participants name colours in a similar fashion to obtain diagnostic results to ‘classify’ as an individual with CVD.
as CVD participants. This does not allow for that individual to understand the deep
contextual cues that people with CVD have developed over their
5 DISCUSSION entire lives to be able to better discriminate and interpret colours.
5.1 Summary & Explanation of Results We assert that studies that look to assist people with CVD must
understand and assist them not the tools or simulations that look
To summarize, our fndings indicate that the use of simulations to approximate them.
by individuals without CVD does not produce results equivalent
to those obtained from people with CVD. Further, even providing
more contextual information for the simulation does not make a
diference, as non-CVD people cannot simply adapt to a perception
of colour that is so foreign to them. What this means is that CVD 6 LIMITATIONS AND FUTURE WORK
simulations and information appended to those simulations are not The Sorting, Distinction and Category Tasks all featured colour sets
equivalent to the life experiences that an individual with CVD has that were randomly generated without specifcally choosing pairs
in their colour perception. along confusion lines. This had a side efect where it appeared as
The most notable illustration of these results are the tasks that though the colours were not always difcult for individuals with
involved selecting distinct colours (Distinction Task) or sorting CVD. Future implementations of these tasks should consider this
colours (Sorting Task) by name. There was a clear diference in and perhaps take an approach similar to what we did for both the
these conditions that highlighted the diference between someone Selection and Transition Tasks and ensure that there are pairs of
with CVD who has been identifying colours by name or distin- colours that lie along confusion lines.
guishing separate-named colours their entire lives, and someone Cole’s [9] four categories of colour judgement that individuals
who is experiencing a CVD simulation (with or without contextual with CVD tend to struggle with also includes aesthetic judgement
information) for the frst time. of colour. We have not yet implemented an aesthetic task as there
Finally, our results, specifcally those from the Transition Task, is very little understanding of what makes specifc colour com-
put into question the recommendations provided in a recent Nature binations aesthetically pleasing to individuals with CVD. Future
Communications paper [11], which justifes specifc continuous work should look to understand what makes colours aesthetically
colour maps as accessible for those with various forms of CVD pleasing to people with colour vision defciency and should develop
based on their appearance under CVD simulations. In our results, we saw that when non-CVD participants used a CVD simulation to complete the Transition Task, their results were always more accurate than those who had CVD. This means that using CVD simulations to evaluate whether a given continuous colour map is suitable for people with CVD does not actually assist in understanding how people with CVD will interact with any visualization that uses such a colour map. Unfortunately, using CVD simulations to justify accessible colour maps has already been adopted by both Matplotlib³ and R⁴.

5.2 What this Means for CVD Simulation Use

We do NOT recommend that CVD simulations be used to evaluate the effectiveness of CVD aids, or in any case to substitute for the lived colour perception experiences of those with CVD. If carefully employed, CVD simulations can provide some insights for non-CVD [...] a task that assesses potential aesthetic colour judgement of people with CVD based on that understanding.

We conducted our study online, and thus had no contact with our participants. The results we observed indicated differences in behaviour between CVD and non-CVD participants under simulations when completing our colour-related tasks. However, future work should look to ground and verify this in a controlled lab setting.

We ordered the tasks specifically so that it was always the Selection, Transition, Sorting, Distinction, and finally the Category task. This may have had a learning effect on the overall results, even though our tasks feature different types of colour judgement. Future studies should be cautious of the potential effect ordering could have on their results when utilizing these tasks.

Finally, there are also perhaps deeper differences that may be found on the basis of which specific colours individuals with CVD confused or struggled with for specific tasks (e.g., the Selection Task). We next plan to examine our data more deeply to see if more fine-grained patterns emerge.

³https://matplotlib.org/3.5.2/tutorials/colors/colormaps.html
⁴https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
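The colour-map check critiqued above is easy to reproduce, which helps explain its popularity. The sketch below shows the general pattern in Python: sample a colour ramp, push each colour through a published CVD-simulation matrix, and score how separable the colours remain. The 3×3 matrix is the severity-1.0 deuteranopia matrix attributed to Machado et al. as commonly redistributed; treat the exact values, the toy ramp, and the distance score as illustrative assumptions of this sketch, not part of the study above. As the paper argues, passing such a check does not mean a colour map actually works for people with CVD.

```python
# Illustrative sketch of a simulation-based colour-map check.
# Matrix values: severity-1.0 deuteranopia matrix attributed to Machado
# et al. (2009) as commonly redistributed -- treat as illustrative.
# Colours are linear-RGB triples in [0, 1].
DEUTERANOPIA = [
    [0.367322, 0.860646, -0.227968],
    [0.280085, 0.672501,  0.047413],
    [-0.011820, 0.042940, 0.968881],
]

def simulate(rgb, matrix=DEUTERANOPIA):
    """Apply a CVD-simulation matrix to one linear-RGB colour."""
    out = []
    for row in matrix:
        v = sum(m * c for m, c in zip(row, rgb))
        out.append(min(1.0, max(0.0, v)))  # clamp to the displayable gamut
    return out

def min_pairwise_distance(colours):
    """Smallest Euclidean distance between any two sampled colours --
    a crude 'distinguishability' score for a colour map."""
    best = float("inf")
    for i in range(len(colours)):
        for j in range(i + 1, len(colours)):
            d = sum((a - b) ** 2 for a, b in zip(colours[i], colours[j])) ** 0.5
            best = min(best, d)
    return best

# A toy red-to-green ramp: well separated for typical vision, but the
# simulation collapses the red-green axis and the score drops sharply.
ramp = [(1 - t, t, 0.1) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(min_pairwise_distance(ramp))
print(min_pairwise_distance([simulate(c) for c in ramp]))
```

A designer running this check might conclude the ramp is "bad for CVD" and pick another; the study's point is that this number says little about how people with CVD, with their lived experience of colour, would actually perform on the tasks.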
Challenging and Improving Current Evaluation Methods for Colour Identification Aids ASSETS ’22, October 23–26, 2022, Athens, Greece

7 CONCLUSION

This study illustrated that there were significant differences between the results of CVD participants and non-CVD participants under a CVD simulation in tasks relating to colour identification, comparison, and distinction. People with CVD have had years of experience and understand how specific colours can look to them; they are not always correct, but they are certainly always more correct than a non-CVD participant under a simulation. Further, the amount of CVD-context information provided with the simulation seems to make no difference; we saw that no matter how much information we provided – none, minimal, or a maximum amount – there was no clear difference between the results for those three conditions. Even assuming that non-CVD participants using CVD simulations 'see' what participants with CVD see, they cannot duplicate their behaviour and cannot fake the deep knowledge that individuals with CVD have developed over their lives.

ACKNOWLEDGMENTS

We thank our participants for taking part in our research. We would also like to thank our anonymous reviewers for their time and effort in the peer review of this paper.
ASSETS ’22, October 23–26, 2022, Athens, Greece Connor Geddes and David R. Flatla

ASL Wiki: An Exploratory Interface for Crowdsourcing ASL Translations

Abraham Glasser
atg2036@rit.edu
Computing and Information Sciences
Rochester Institute of Technology
Rochester, New York, USA

Fyodor Minakov
fyodorominakov@gmail.com
Microsoft Research
Cambridge, MA, USA

Danielle Bragg
Danielle.Bragg@microsoft.com
Microsoft Research
Cambridge, MA, USA
ABSTRACT

The Deaf and Hard-of-hearing (DHH) community faces a lack of information in American Sign Language (ASL) and other signed languages. Most informational resources are text-based (e.g. books, encyclopedias, newspapers, magazines, etc.). Because DHH signers typically prefer ASL and are often less fluent in written English, text is often insufficient. At the same time, there is also a lack of large continuous sign language datasets from representative signers, which are essential to advancing sign language research and technology. In this work, we explore the possibility of crowdsourcing English-to-ASL translations to help address these barriers. To do this, we present a novel bilingual interface that enables the community to both contribute and consume translations. To shed light on the user experience with such an interface, we present a user study with 19 participants using the interface to both generate and consume content. To better understand the potential impact of the interface on translation quality, we also present a preliminary translation quality analysis. Our results suggest that DHH community members find real-world value in the interface and that the quality of translations is comparable to those created with state-of-the-art setups, and they shed light on future research avenues.

CCS CONCEPTS

• Human-centered computing → Accessibility systems and tools; Accessibility technologies; Empirical studies in collaborative and social computing; • Information systems → Collaborative and social computing systems and tools; Web interfaces; Web searching and information discovery; • Applied computing → Digital libraries and archives; E-learning.

KEYWORDS

Deaf and Hard-of-Hearing, Sign Language, Bilingual, Interface, Education, Corpus, Crowdsourcing, ASL data collection

ACM Reference Format:
Abraham Glasser, Fyodor Minakov, and Danielle Bragg. 2022. ASL Wiki: An Exploratory Interface for Crowdsourcing ASL Translations. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3517428.3544827

© 2022 Association for Computing Machinery. ACM ISBN 978-1-4503-9258-7/22/10...$15.00. https://doi.org/10.1145/3517428.3544827

1 INTRODUCTION

Approximately 1 in 6 adults in the U.S. is Deaf or Hard-of-Hearing (DHH), and prior literacy research shows that over 17% of deaf adults have "low literacy" [1]. Signed languages are the primary languages of Deaf communities worldwide, and they are completely distinct from local spoken/written languages. For example, American Sign Language (ASL) is the primary signed language used in the U.S., but it is a completely different language from English – not a one-to-one mapping. As a result, if a person is fluent in ASL, they are not necessarily fluent in English reading and writing. ASL is often DHH signers' primary language, and they typically prefer ASL over English, are more comfortable with it, and understand content better in ASL [21]. Among this bilingual community, there is a wide range of literacy levels (e.g. studies have found fourth-grade reading levels among DHH high-school graduates [39]). Research has found that, as a result, there are lower educational outcomes among DHH individuals and lower rates of employment (and salaries) among DHH adults [2].

A major obstacle facing DHH signers is a lack of educational resources in sign language. Many educational resources are available in text (e.g. textbooks, literature books, online encyclopedias, etc.), but not in a signed language. As there is no standardized written form of ASL and sign language is typically in video form, these text-based interfaces do not adequately support users who prefer a signed language. Because of this lack of ASL content, DHH users often have to look up individual English words on a separate website or interface (e.g. English-to-ASL dictionaries) [10], and re-read the English content they are trying to consume [3]. Even though individual words can be looked up when necessary, this is not efficient, does not help to understand English grammar, and may be insufficient for DHH signers trying to understand English text. It would be helpful if an ASL version of the target content was available – having the entire sentence/article signed might be preferred by a DHH ASL signer rather than looking up individual words and/or re-reading multiple times.

At the same time, advancing sign language research and technology is currently impeded by a lack of sign language data [8]. Existing ASL datasets typically offer a set of individual ASL signs, with their respective English meanings, and/or ASL glosses. They do not have sufficiently representative and diverse signers – they often consist of homogeneous sign language interpreters, small sets of signers, and poorly labeled videos of unverified quality (listed in [6]). In order to more fully understand and model the language, labelled continuous signing (i.e. complete sentences with annotations) from
diverse signers is needed. However, creating such a dataset is extremely difficult. It is not only expensive to produce, but it also requires a massive amount of human labeling and annotation, since there is no automated system to do so. It is also hard to enable a large pool of contributors, since most in-person data collection efforts are limited to individuals who live close by within commuting distance and have time in their schedules to contribute. How to enable everyday signers to efficiently contribute labelled continuous content, and how DHH users might respond to crowd-generated content, remain open questions.

In this work, we present a novel interface that addresses two needs at once: 1) it provides a bilingual information resource, and 2) it simultaneously generates a continuous labelled signing dataset that could be used by artificial intelligence researchers, ASL linguists, ASL learners, DHH ASL signers, and others. Our interface provides a side-by-side ASL (video) and English (text) synchronized view, where users are able to read/view articles simultaneously in both languages. Users can also use this platform to contribute ASL translations of existing English texts in the communal database. For this exploratory work, we seed the interface with popular English Wikipedia articles, which are translated into ASL, and refer to this prototype system as "ASL Wiki". However, the same interface could be seeded with any long text, and could be used with any pair of written and signed languages. In terms of dataset creation, by enabling contributors to record segments of English text with known contents, the interface eliminates the need for humans to later segment and align the text and video. Such intensive labelling work is commonly done in creating parallel corpora containing signed language and spoken/written text (e.g. [17]).

To help understand the effectiveness of such an interface, we ran two exploratory studies. First, to better understand the user experience with the interface, we conducted an exploratory user study where 19 participants used the interface to consume and generate content, and shared feedback. Our results suggest that DHH individuals find real-world value in our interface, thought it was easy and intuitive to use, were excited to see further development, and identified several target audiences they would recommend the site to. Second, we conducted an exploration into the quality of translations that can be generated through our interface. Results suggest that the translation quality is comparable to the quality of translations created through state-of-the-art setups for sign language translation. We conclude by discussing future work that this initial exploration introduces.

2 RELATED WORK

In this section, we focus on work relevant to our two motivations: supporting bilingual content, and supporting sign language data collection efforts.

2.1 Sign Language Educational Resources

Existing resources that make information available in a signed language comprise a small number of dictionaries [11, 18], educational materials [12, 14, 20, 42], lexical databases [19, 37], and mobile vocabulary apps [27, 28]. Several examples of these are listed in [6]. The landscape of existing sign language resources is very small compared to the resources available for spoken and written language users, who are typically considered by default. There have been limited attempts to create browser tools that provide signed translations of written content, to create signing avatars, and to more generally create recognition and translation systems [6]. These tools and resources are not viable due to the very limited amount of labeled signing videos with diverse, well-representative signers. There are some DHH content creators that strive to support accessibility of information and support the DHH community, such as the Daily Moth – a group who "deliver news in video using American Sign Language" [29]. However, as these efforts are sponsored and often composed of a small group of people, they are limited in the amount of content they can create, and often have to selectively offer a handful of content options – for instance, the Daily Moth says "the deaf host, Alex Abenchuchan, covers trending news stories and deaf topics on new shows Monday-Fridays". Many people in the DHH community praise the Daily Moth due to the level of access it provides, being a bilingual information resource for selected news happening around the world [33, 41].

Our interface would enable crowdsourcing to address the problems of large-scale sign language data collection, diversity, and naturalness, all while serving as a bilingual educational resource.

2.2 ASL and English Bilingualism

Prior work suggests that bilingual resources are useful for DHH fluent signers, rather than having any negative information-overload effects. Psychology researchers have established that it is not costly to switch from single to dual lexical retrieval (using two languages at once), and have revealed a significant cost to turning off a language, which bilingual DHH users might do while trying to understand English text alone [16]. This suggests that an ASL and English bilingual interface, such as the one we have developed in this work, could be beneficial to DHH fluent signers by providing greater accessibility than English text alone.

The value of bilingualism in ASL and English has been further substantiated by Deaf-led organizations. The National Association of the Deaf (NAD) is a nonprofit organization whose mission "is to preserve, protect and promote the civil, human and linguistic rights of deaf and hard of hearing people in the United States of America." Internationally, NAD represents the U.S. to the World Federation of the Deaf (WFD), an international human rights organization. NAD supports bilingualism, using ASL and English, in the home and educational environment for DHH individuals. They advocate that bilingualism is important and effective because it fosters "positive self-esteem, confidence, resilience, and identity, factors necessary for lifelong learning and success" [30].

Despite the value of bilingual ASL/English resources, few exist. The Deaf Studies Digital Journal (DSDJ) "is the world's first peer-reviewed journal dedicated to advancing the cultural, creative, and critical output of published work in and about sign languages and Deaf culture" [22]. It is a bilingual and bimodal publication primarily presented in both ASL and English. It features academic work in other sign languages, and offers scholarly articles, commentary, literature, visual arts, film/video, interviews, reviews, and archival history footage and commentary. To date, there have been 5 issues (spanning 2009-2020) with about 150 articles total. In the most
recent issue, each article has a split side-by-side view showing ASL (or another sign language) on the left, and English text on the right. The content is synchronized so that the English sentence being signed is highlighted. The video has controls so that the user can control playback of the signed video. Our interface builds on this, similarly providing side-by-side views in both languages.

To the best of our knowledge, there has been only one past attempt to systematically provide sign language translations of existing text. Signly is a recent commercial effort to add "synchronous, in-vision, sign language translations on any webpage for any deaf sign language user". They enable website visitors to select English text they would like translated into British Sign Language (BSL), which is sent to a professional interpreter for translation. Once the translation video is created, website visitors can click on the English text to trigger a pop-up translation video at the bottom-right corner. While this company helps make English texts online accessible, users have to request translations, and website creators have to contact and pay Signly to incorporate and maintain their services. Scale is also limited, as the translations are done by the Signly team. Our interface is similarly motivated to provide access to English text online. However, we enable crowdsourcing translations to streamline and scale data collection, and to enable a more diverse and representative group to contribute. We also display the text and video side-by-side in a more bilingual manner.

2.3 Sign Language Technologies Need Data

A recent paper [8] summarized the state of sign language processing. The authors hosted an interdisciplinary workshop with 39 domain experts from diverse backgrounds, where they reviewed the state of the art and listed calls to action for the research community. These calls included, but were not limited to, focusing on real-world applications and creating larger, more representative, public video datasets. They emphasized the current lack of data, cited as the biggest obstacle in sign language technology research. Data collection is difficult and costly, yet "without sufficient data, system performance will be limited and unlikely to meet the Deaf community's standards".

Despite these challenges, groups have worked on sign language data creation and curation. Datasets exist for many signed languages, including but not limited to German, American, Argentinean, and Turkish (more are listed in [8, 13, 31, 32]). The main parameters of sign language datasets include the number of subjects, samples, language level, type, and annotations/labels. As explained in [8], existing sign language datasets greatly limit the robustness of systems trained on them. Current datasets are not sufficiently large – typically containing fewer than 100,000 articulated signs.

Also, many existing datasets contain individual signs, which may not be as useful for real-world use cases of sign language processing. For real-world applications, there needs to be natural [...] of the next), among other difficulties. Solving these challenges requires large amounts of continuous sign language data to learn from. Continuous signed sentences would also be useful for DHH individuals trying to understand content, especially new concepts, as it is natural and comfortable for them. There are some continuous signing datasets, such as [15], which help fill this void. However, these datasets are typically small and recorded in a laboratory or studio environment, rather than a natural setting, which makes generalizing to diverse users and real-world environments difficult.

Currently, the process of producing large ASL datasets is prohibitively expensive, due to the equipment needed and the time it takes to collect and label/annotate ASL data. Crowdsourcing is a more affordable alternative to traditional in-lab collection, and has been successfully used in other accessibility domains. For example, research has explored collecting images/videos and questions from blind and low-vision users, with answers provided by the crowd [4, 5]. Other accessibility crowdsourcing research has explored providing image alt-text [35, 36], transportation information [34, 38], and live captions [23, 24]. Our current work adds to this body of work by providing an initial exploration into crowdsourcing continuous sign language data.

Along with the lack of bilingual information resources, the lack of data motivates our "ASL Wiki" interface.

3 ASL WIKI PROTOTYPE

We have created "ASL Wiki" – a prototype site where people can crowdsource ASL translations of English articles, providing a community resource that supports accessibility as a bilingual information resource, while also tackling the lack of continuous ASL datasets with English labels. In this section, we describe our prototype and design process.

3.1 Design Process and Criteria

We engaged in an iterative design process to arrive at our "ASL Wiki" website design. We first identified design criteria the platform needed to meet (e.g. that the text used is available for use on the platform and in a dataset, that participants can contribute remotely without specialized hardware, and that translations are segmented and labelled). With these identified, we started with drawn designs, which were iteratively refined and implemented. Throughout the process, we continued to meet with stakeholders – a group of interdisciplinary Deaf and hearing individuals who have deep ties with the DHH community – and incorporated their input. These stakeholders tried out the evolving prototype, and also discussed the project and provided guidance.

Through our meetings, we chose to explore creating and reading bilingual versions of Wikipedia articles, rather than play scripts,
conversation with complete sentences, i.e. "continuous" sign lan- books, or other resources. We decided on Wikipedia articles be-
guage. Continuous sign language recognition and translation are cause they are generally neutral, publicly available, and popular
challenging due to epenthesis efects (insertion of extra features informational resources. There also exist other parallel corpora of
into signs) and co-articulation (ending of a sign afecting the start Wikipedia content which have been useful for natural language
processing and artifcial intelligence/machine learning.
1 https://signly.co/
Our iterative design process uncovered specifc user require-
2 https://www.phonetik.uni-muenchen.de/forschung/Bas/SIGNUM/
3 https://github.com/YAYAYru/sign-lanuage-datasets ments of our interface. We found that the interface needed to show
4 http://sedici.unlp.edu.ar/handle/10915/56764 ASL and English at the same time, so that users could see both
5 https://www.cmpe.boun.edu.tr/pilab/BosphorusSign/home_en.html and easily look at one or the other as they wished. Our interface
ASSETS ’22, October 23–26, 2022, Athens, Greece Abraham Glasser, Fyodor Minakov, and Danielle Bragg
also needed to show which English portion is being signed in the current ASL video, so that users can keep track of their position in both the video timeline and the English article. Users who are recording their videos should have an efficient, streamlined way to record sentences, meaning that the interface should not pose unnecessary overhead. Multiple people should also be allowed to submit recordings for the same English sentence, as different people might sign differently (e.g. regional accents or varied interpretations), or have preferred signs for specific English words.

3.2 "ASL Wiki" Design

3.2.1 Homepage. We took inspiration from the idea that Wikipedia is a "free content, multilingual online encyclopedia written and maintained by a community of volunteers" [40]. On the homepage of the ASL Wiki site, on the left-hand side, is a checkbox list of featured categories. Users can use these checkboxes to bring up relevant articles, which appear in the middle with fractions indicating how many sentences there are in the article, and how many of these sentences have been recorded by at least one user. Being clickable, the rows of article titles also display a "Record" button that takes you to the reading/recording interface (discussed further in the subsequent section).

On the top of the homepage is an introductory title and paragraph, along with an ASL video of someone signing this text. Once you are logged in, on the top right of the page is a button that allows you to view and edit your profile, or sign out. Next to this button is a gamifying trophy icon displaying the number of sentences the logged-in user has recorded. This was added as it is a common element of social media sites to display the number of "posts" a user has submitted. It potentially incentivizes the user by showing them how many sentences they have recorded.

In the middle of the page, between the top banner and article table of contents, is a numbered instruction summary to remind users how to navigate and use the interface. We added this because users are able to leave the site and come back later, and navigate into and out of specific articles, so they may need a persistent reminder of how to use the site. A screenshot of the homepage is shown in Figure 1. Once the user selects an article, they are taken to the reading/recording view. This view has a toggle on the top to switch between the recording view and the reading view.

3.2.2 Recording and Reading View. In both recording view and reading view, the main layout is the same: it is a split, side-by-side bilingual interface. On the left is a placeholder for an ASL video. On the right is the article in English.

In record view, once the user selects a sentence in the article, the ASL video placeholder becomes a self-view of the user's webcam, so that they can see themselves. Their self-view is overlaid with a head and body guide to encourage users to center themselves in the recording. A 3-second countdown commences, and then the user signs the English sentence in ASL. While they are recording, the corresponding English sentence is highlighted, to mark and keep track of their place in the article. When they are finished, clicking a stop button underneath their self-view stops the recording and displays their recorded video for playback. If the user approves, clicking "Keep" will submit the video and auto-progress to the next sentence in the article, while clicking "Redo" will prompt them to redo the recording. The English sentences that have been recorded show a video camera icon. There is also a guide on the top, above the English article, to remind users how to use the interface. Also, underneath the ASL video placeholder is a picture demonstrating a good recording setup and a bad example, to remind users that they should position their webcams so that their upper body is captured and their arms/hands do not go out of frame while signing. There is also an upvote/downvote button where users can give feedback on the ASL video.

In reading view, the site enables users to access parallel content in ASL and English. After toggling to reading view, the same English article is kept, and now shows a "play" icon next to the sentences that have been recorded. Clicking on a sentence will highlight that sentence and play the respective ASL video. Once the video completes, it auto-progresses to the next sentence. There is a playback control underneath the video so that the user can go back, forward, redo, pause/play, and control the playback speed. There is also a toggle to turn the auto-progression on or off. It is possible that multiple users would sign the same English sentence, so underneath the ASL video is a list of the users who submitted videos for the currently selected English sentence. The user has the option to switch between signers if they desire. A screenshot of a sample reading view is shown in Figure 2.

4 USER STUDY

To explore the usability of our ASL Wiki site design, we ran a remote user study, with Institutional Review Board (IRB) approval. In this user study, participants answered survey questions, tried out the reading and recording views, and discussed interview questions about their experience.

4.1 Participants

4.1.1 Recruitment. Participants were recruited via mailing lists, social media posts, and snowball sampling. The recruitment criteria were that they use ASL, are 18 or more years of age, and have a computer with a webcam. 19 participants were recruited in total. The sessions ran for about 1 hour, and participants were given a $30 (USD) Amazon gift card for their participation.

4.1.2 Demographics. Out of the 19 participants, 15 identified as Deaf, 3 as deaf, and 1 as Hard-of-Hearing. 11 identified as female, and 8 as male. The average age of all participants was 26.1 with standard deviation 2.2.

Participants self-reported their ASL fluency on a scale from 1 (I do not use ASL) to 7 (I am fluent). The average fluency was 6.4 with a standard deviation of 1. Generally, all participants were educated, with only 3 out of 19 not having a bachelor's degree yet at the time of participation. Participants were diverse, with 12 self-identifying as White (e.g. German, Irish, English, Italian, Polish, French, etc.), 5 as Asian (e.g. Chinese, Filipino, Asian Indian, Vietnamese, Korean, Japanese, etc.), 1 as Black or African American (e.g. African American, Jamaican, Haitian, Nigerian, Ethiopian, Somalian, etc.), and 1 as Middle Eastern or North African (e.g. Lebanese, Iranian, Egyptian, Syrian, Moroccan, Algerian, etc.). The 19 participants came from 8 different U.S. states.
ASL Wiki ASSETS ’22, October 23–26, 2022, Athens, Greece
Figure 1: Screenshot of ASL Wiki homepage.
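The homepage shown in Figure 1 lists each article with a fraction indicating how many of its sentences have at least one recording (Section 3.2.1). A minimal sketch of that computation follows; the function name and data representation are our own illustration, not taken from the paper:

```python
def article_progress(recordings_per_sentence: list[int]) -> str:
    """Format the 'recorded/total' fraction shown next to an article.

    recordings_per_sentence[i] counts the videos submitted for sentence i;
    a sentence counts as recorded once it has at least one video.
    """
    recorded = sum(1 for n in recordings_per_sentence if n >= 1)
    return f"{recorded}/{len(recordings_per_sentence)}"

# An article with 5 sentences, of which sentences 0 and 2 have videos:
assert article_progress([2, 0, 1, 0, 0]) == "2/5"
```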
Figure 2: Screenshot of reading view of article "Caramel".
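The reading view shown in Figure 2 plays the ASL video for a selected sentence and then auto-progresses to the next sentence, with a toggle to disable auto-progression (Section 3.2.2). A sketch of that progression logic is below; the function name is hypothetical, and the assumption that unrecorded sentences are skipped is ours, as the paper does not state it explicitly:

```python
from typing import Optional

def next_playable(current: int, has_recording: list[bool],
                  auto_progress: bool = True) -> Optional[int]:
    """Return the index of the next sentence to play in reading view.

    has_recording[i] is True when at least one ASL video exists for
    sentence i (these are the sentences that show a "play" icon).
    Returns None when auto-progression is off or nothing remains.
    """
    if not auto_progress:
        return None
    for i in range(current + 1, len(has_recording)):
        if has_recording[i]:
            return i
    return None

# Sentences 0, 2, and 3 are recorded; after sentence 0 finishes,
# playback jumps to sentence 2.
assert next_playable(0, [True, False, True, True]) == 2
```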
4.1.3 Prior Experience with ASL and English. All participants reported that they read English text online daily (n=5) or multiple times a day (n=14). It was reported in the demographics survey that participants read English text via websites, books, articles, video transcripts, and social media posts. Along with these 5 options, we had also listed podcasts (and "other") as answer-choices on this survey question, but nobody selected them.

Participants were asked the question "How often have you encountered websites you wish provided ASL videos instead of or in addition to English text?" 3 answered "multiple times a day", 5 "daily", 5 "weekly", 2 "monthly", 2 "less than once a year", and 2 "never".

All participants except one said that they watch ASL videos online frequently (1 said "yearly", 3 "monthly", 5 "weekly", 5 "daily", and 5 "multiple times a day"), typically through video blog (vlog)
Figure 3: Screenshot of recording view of article "Agriculture".
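The recording view shown in Figure 3 follows the flow described in Section 3.2.2: select a sentence, wait through a 3-second countdown, record, stop, review, then either "Keep" (submit and auto-advance) or "Redo" (re-record). That flow can be sketched as a small state machine; the state and event names here are illustrative, not from the paper:

```python
# Transition table for the recording-view flow. Unknown (state, event)
# pairs leave the state unchanged.
TRANSITIONS = {
    ("idle", "select_sentence"): "countdown",
    ("countdown", "countdown_done"): "recording",
    ("recording", "stop"): "review",
    ("review", "keep"): "idle",       # video submitted, next sentence selected
    ("review", "redo"): "countdown",  # discard the take and re-record
}

def step(state: str, event: str) -> str:
    """Advance the recording flow by one event."""
    return TRANSITIONS.get((state, event), state)

# One session: record a take, redo it, record again, then keep it.
state = "idle"
for event in ["select_sentence", "countdown_done", "stop", "redo",
              "countdown_done", "stop", "keep"]:
    state = step(state, event)
assert state == "idle"
```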
posts, YouTube videos, and other social media videos. Participants commented that they have seen content on various social media platforms where someone is signing in ASL with English captions visible, so they have seen bilingual/bimodal content before and are comfortable with it. 10 out of 19 participants said that they have at least once created content like this that had both ASL and English, and that they created the ASL video first and added English subtitles afterwards. 9 out of these 10 did this to post on social media, where they have both DHH and hearing friends, and 1 said they only did it for a homework assignment or class project in college.

4.2 Procedure

An online form walked participants through the study procedures while a DHH fluent ASL signer was on a video call with the participant. Each participant was scheduled for their own session, and the entire procedure took approximately 1 hour. The procedures were as follows:

(1) Consent: Participants engaged in a consent process with IRB-approved language through the online form. The researcher on the video call checked whether the participant needed any portion of the consent language signed in ASL so that it was fully understood.

(2) Background: Through the online form, participants were asked multiple-choice questions about their prior experience with using English and ASL online.

(3) Reading: After this, they followed instructions on how to access the ASL Wiki site and sign in, and were directed to the "Caramel" article, which had been entirely pre-recorded by a DHH ASL signer from our research team. They engaged with the interface to read this article until they were satisfied.

(4) Recording: Next, they were instructed to select any article of their choice and record themselves signing. Since we wanted to closely match a real-world experience of using our site, participants were given the flexibility to record as much (or as little) as they wanted to, but were told to use the recording interface until they were confident that they got and understood the full experience of recording and contributing to the site, and were told they would discuss their experience afterwards.

(5) Semi-structured interview: While the fluent DHH ASL signer continued to be on a video call with them, they engaged in a semi-structured interview with guiding questions spanning short-answer, long-answer, and Likert-scaled question items. The interview focused first on the reading view, asking about their experience and the understandability of the interface, and then participants were asked questions about their experience and challenges (if any) while recording. Lastly, questions were asked about the overall concept of the site, what they liked and disliked, and whether they would recommend the site to others. Appendix A.1 provides our interview questions.

(6) Demographics: After the interview portion, participants returned to the online form where they filled out demographics and compensation information.

5 USER STUDY RESULTS

We spoke with each of the 19 DHH participants during the experiment to gauge their reactions and experiences with the reading and recording views of the interface. We evaluated how they used the site, to understand their motivations, challenges, and strategies, and the benefits they took away from the site. We thematically analyzed the interview responses and performed statistical analysis of their responses to the questionnaire. We also collected feedback and identified several target audiences to whom the users would recommend the interface.

5.1 Reading View

5.1.1 ASL vs. English. Participants valued having both English and ASL versions of articles available for consumption. On average, participants self-reported that, while reading the "Caramel" article that our research team had entirely pre-recorded, they looked at the English part 65% of the time, and the ASL video 35% of the
time. Participants explained that the English part is faster to read, with P1 saying "...it's faster to read and skim through. It's more of a habit because I'm used to reading English articles". P8, who reported looking at the English part 50% of the time, said "I like ASL. [It] is more visual and I can visualize it better, but for English I can read it faster. If I just want to consume the content and save time, I would look at the English 100%. If I wanted to fully understand, learn, visualize, maybe 50/50 – I'd also be curious what it looks like in ASL".

Participants were asked to indicate how understandable the ASL content and the English content in the "Caramel" article they viewed were, on a scale from 1 (very difficult) to 5 (very easy). For the ASL content, the average was 4.6 (s.d. .7), and 4.8 (s.d. .4) for the English part. Even though participants said that the English part was very easy to understand, all 19 participants answered "Yes" to the question "Was it helpful to view the content in both English and ASL?" P11 explained "Yes, I can imagine how it would be helpful for the general. It's a nice tool for me to use, and I would like having it even if I don't use it much".

5.1.2 Interface Usability. On a scale from 1 (very difficult) to 5 (very easy), participants said that the interface was very easy to use, giving it an average of 4.5. Most of the difficulty came from not having prior experience and not knowing what to expect from the interface, e.g. P10 saying "I didn't think there was any information overload – in the beginning I wasn't fully sure what to do. Maybe the first sentence could be highlighted with the video, that would make it more clear there is ASL there". P2 commented "at first when I opened it, I wasn't sure what to do – my eyes caught the English part first, and I ignored the left half – and then it took me a while to realize that the left side was empty until I clicked on some text, and the video player showed up. [...] I think there should have been some kind of tutor/illustrations with directions of how to use this site before I went ahead and looked at an actual article".

Most participants (12 out of 19) did not use the upvote/downvote button that was available to them while viewing the "Caramel" article. Some participants said that they did not see it, while some did but decided not to use it. P12 said that they do not use such buttons in general, such as upvoting/downvoting on Reddit, or liking/disliking on Facebook or YouTube. P9 said "I didn't know about the feature until I arrived at this question. I normally skim through contents", and P14 said "I wasn't focusing on providing feedback on performance". Those who did use it generally said that they wanted to give feedback, with P4 saying "I wanted to give feedback on the video, so I clicked yes – I noticed the signing was clear and matched the English so I went ahead and clicked yes". Some participants such as P6 emphasized it was important: "I think it's important to use, yes I would use it, it gives feedback to other people and I can help this website advance and develop in the future and make sure it has good content", P16 suggested it would prevent misunderstandings, saying "I don't want some signers to use wrong signs or say it in the wrong way which will make viewers misunderstand. We want to avoid that", and P11 made an analogy to real-world applications they've seen: "Yes, it's the same as FAQs or other articles that say "was this article helpful?" – this is the same situation".

We collected some feedback about the interface, to understand how it could be improved and to help inform future work on such bilingual interfaces. This feedback typically consisted of user interface preferences and suggestions, such as coloring and layout styling. There were also some suggestions about the fundamental system. P7 suggested a different layout: "For me, I would prefer top and bottom rather than side by side, so it's kind of like captions. It was a little challenging for me to have it side by side". P13 suggested making the recorded videos easier to find: "One suggestion I have is that it might be nice to have a separate scrolling bar other than the browser one where it'll indicate the recorded statement bits. E.g. code changes in a code review". Besides this feedback, users also complimented the interface: P4 said "I liked the clarity, green highlight, follow each other, I liked the time/playback, matching", and P13 "What I liked about the interface is that each statement and section is reasonably spaced out which makes it easier to read and I like how there's a clear indicator whether if there's recordings for it or not".

5.2 Recording View

A total of 202 sentences were recorded by our 19 participants. On average, participants recorded 11 sentences. Participants recorded in 25 total articles from the Entertainment, Deaf Culture, Sports, Books, Mathematics, Technology, Food, Geography, Art, and Politics categories.

5.2.1 Challenges and Strategies. Participants were asked if they found any content challenging to record. They reported that they generally selected articles from topics they thought they were the most familiar and comfortable with. For instance, P11 said "I picked the content I was most comfortable with, and it was straightforward and just facts, so it wasn't challenging. I can imagine if I picked a STEM article or something complicated it would be challenging". Some participants commented that they felt it was challenging to actually translate the English content into ASL, because they were not sure how to sign some words, or were not sure how to produce a concept-to-concept translation rather than a word-for-word English-to-ASL translation – P18 said "it can be a bit challenging to keep it simple and brief yet informative", P16 reflected that there were "some words that I'm not sure if they have signs for them", and P3 summarized "sometimes I have to reread and think about how I will sign it to try not to be too English".

We asked participants if they had any strategies they employed while recording content. Most said they did not – they commented about trying not to be too "English" in their signing, with P1 saying "I would read first, and then think about my understanding of it, and try my best to explain it in ASL. I wanted to avoid one to one or exact English translations" and P7 commenting "I tried to find simpler sentences, but most sentences required a lot of fingerspelling. It was challenging to use it, I didn't really think through it, I just read the English part a couple times and then tried my best". We noticed that not all participants started at the top of their selected articles. It seems that some participants selected sentences throughout their articles and did not always record consecutively.

5.2.2 Interface Usability. To assess whether the interface itself caused any significant issues for participants trying to record themselves, we asked them to rate, on a scale from 1 (very difficult) to 5 (very easy), how easy the interface was to use. For the 19 participants, the average was 4.6. It did not appear that the interface caused any further challenges to the recording experience, with
P1 saying "I thought it was straightforward and simple", and P3 "...liked Redo/keep, add playback/review to watch it before deciding". Some participants had suggestions to make their experience better, such as changing the seemingly abrupt countdown that started as soon as they clicked on an English sentence to record. Responses to the countdown conflicted: some participants disliked it, e.g. P7 suggesting "maybe instead of auto countdown, I felt more pressure, I would rather click on the sentence and then have a record button" and P13 who said "I dislike that I can't manually start recording", while some liked it, e.g. P11 "... I liked the countdown, 3-2-1". Participants also made some suggestions for extra features, such as being able to trim the video before submitting, moving the placement of the self-view, and an explicit way to "skip" a sentence (rather than clicking "Redo" or leaving the page).

5.3 General Experience

5.3.1 Enthusiasm. After participants had tried out our bilingual interface, we asked whether they "wish more content online provided both English and ASL?" on a scale from 1 (strongly disagree) to 5 (strongly agree). The average response was 4.6 (s.d. .76). Participants gave examples of where they have wished they had access to both English and ASL. These examples included but were not limited to news, podcasts, articles, social media, and entertainment. Participants mentioned the Daily Moth, where they have seen both ASL and English captions or transcripts, but they noted that this covers selected specific news, and they wish they had access to a broader, more general selection of news from around the world. Some participants mentioned that they wish they had had this kind of bilingual resource when they were learning about things for their classes, projects, and homework. These findings suggest that people may want to use a tool similar to our novel interface in the real world.

Despite the 19 participants' desire for more content online provided in both English and ASL, they were not as interested in generating this sort of content themselves. When they were asked "Would you be interested in generating content available in both English and ASL? (1-5: Strongly disagree – Strongly agree)", their average response was 3.6 (s.d. 1.6). Some of their rationale included not wanting to record themselves and/or post publicly, with P1 saying "I personally would not, because I personally don't like recording myself and posting online publicly", P10 "No, because I feel like I'm signing wrong, or feel that people would judge my signing for being English, etc", and P17 "No, I'm a camera shy". Participants who indicated that they are interested were inclined to do so because they felt they would be giving back to the community and supporting this concept of accessibility, e.g. P2 "Yes, I wouldn't mind – because I feel like there is a lot of ASL content out there that is not neutral, where the people who are signing are biased, or give biased information. This would be nice and I would like to help increase access while still keeping neutral and spreading information in a neutral manner" and P3 "Yes, because if I can get access like this, why not I give back, I don't want others to miss out".

5.3.2 Personal Benefits. One participant mentioned that the site would benefit them because they can use it while teaching, to make sure their students have access and can understand the content fully. Many participants mentioned the site would help them understand content better and go through content faster, since they wouldn't have to spend time looking up specific English words in a separate interface and/or re-reading the English text multiple times. Some participants said that this would also help them improve their signing and presentation skills, since they could benefit from watching their own videos, or pick up new signs for unfamiliar words. For instance, P14 said "I can improve how best I can interpret English in ASL", with P4 similarly saying "If I record, I could benefit from watching my own videos, I will see if I signed it clear and understood it well. I would also benefit from reading myself, and others would benefit by reading my videos that I contribute". When we asked them what kind of content they would like to see on the website, they mentioned things they were studying, e.g. P2 "related to my major, tutorials on 3d design software, art, technology, art terminology, for example gothic art history, etc.", things they were interested in, as P11 brought up "nutrition, diets, women's health, for example there's a lot of things that are related to hormones, specific foods affecting things, having ASL there would be nice", and general news, information, and topics, with P8 saying "news, health, podcasts, could be a safe place for community involvement, like an area for people to post news around the world, gaming area, etc. Make subcommunities for gamers, etc, same concept as Facebook groups, Reddit subreddits, etc. But everyone is deaf and uses ASL", among several others. Many participants were very supportive of the idea, and did not care what kind of content is available, as long as a lot is available, with P19 saying "... every site should have this option, all kind of topics are welcomed", P8 agreeing "as much as possible, no limits", P5 "there's so many topics to choose from, I would just pick the best and most informative articles for education", and P7 "not that I can think of, general Wikipedia articles would be good". This shows that participants were very supportive of the site.

5.3.3 Concerns. We asked the participants whether they had any concerns about using a site like this. 7 out of 19 participants explicitly brought up the concern that there wasn't control over the quality of users' submitted videos. For instance, many commented that people may not have professional backgrounds, or that they may have something inappropriate or unintentional (such as other people) in the backgrounds of the videos they submit. People also mentioned that users have varying devices and webcam technologies, so the quality of the videos themselves may not be as good as they'd like – perhaps the lighting would be bad, or the video would be choppy or blurry. A few participants also mentioned that the site may attract users who misinterpret or inaccurately translate content. P18 said "It can be misinterpreted easily if the translator is not professional or a novice", and P11 brought up that "not everyone can translate well, so that would be my concern – there might be some bad videos. I recommend having STEM topics assigned to people who are specialists in that field". A participant also brought up the issue of privacy, stating that they are concerned about the privacy of their data, who would "own" it, and who would be able to access it, especially if it was public.

5.3.4 Participant Impression. Overall, participants said that they enjoyed using the website, and that they thought it was "cool to use". When asked "Would you want to use a website like this to read content in the future?", 14 participants said "Yes", and 5 said "No". The participants who said "No" said that they are already comfortable with reading English text alone, and do not require
ASL for reading comprehension. Despite this, the participants, on average, responded 4.5 to the question "How likely are you to recommend this website to others? (1-5: very unlikely – very likely)". Participants suggested many different groups of people to whom they would recommend this site. They would recommend it to DHH individuals, because of the communication barriers they face, as P2 said "I feel for DHH people, and others who are not good at English, and have communication barriers and have a lack of education, they can learn well through this site", P7 said "I would recommend it to people who I know grew up signing and struggle with English, they could improve their reading skills and understand content better", P16 "pretty much everyone with ASL especially for people who have weak English skills", and P4 bringing up "international friends who don't know English very well, it would help understand English and ASL, or other people whose first language isn't English".

Many participants mentioned they would recommend the site to people who are learning ASL, since the site is bilingual and has synchronized English and ASL content, as P3 says "friends who are interpreters, on their own time learn ASL/translation, receptive skills, signing skills", P15 "ASL students for learning and people who are thriving to learn ASL", P17 "If the website has enough recordings or gains popularly among users, I would recommend to a friend who isn't fluent in ASL", and P19 summarizing both ASL learners and the DHH community "this would be great for people learning ASL. They can practice their receptive skills, and learn how to follow the ASL grammar structure and sign placements. This is also good for every person in the deaf community who may prefer reading captions only some days and ASL other days, or anyone who has a preference in how they absorb information". Since the site has been seeded with articles from Wikipedia, which are normally informative and neutral, participants suggested people who often look up information, or use information in their profession, such as researchers, with P6 saying "school educational use, like for students to do research, or college students/professionals to record videos, k-12, community

and then paid two fluent ASL experts to evaluate all the recordings, and compared the results. Our results suggest that the quality of translations created through ASL Wiki is comparable to that of translations created through standard state-of-the-art setups, with potential slight improvements to translation accuracy and recording quality.

6.1 Procedure

6.1.1 Video Generation. We paid four Certified Deaf Interpreters (CDIs)⁶ to translate a set of 20 Wikipedia articles twice – with both our interface and with their standard translation setup. We chose to work with professional Deaf interpreters in order to enable comparison to state-of-the-art translations. Each CDI was assigned 5 articles to record twice, and we counterbalanced the procedure, so that two CDIs started with our interface and then used their standard setup, while the other two CDIs did the reverse.

In the standard recording procedure, the interpreters were given access to the plain text, and asked to record a translation of the text in sections. They were instructed to use their typical setup and procedures for such jobs – for example, referencing the text and/or personal notes and recording through a video camera app on their laptop or smartphone. This is a standard type of translation job taken on by professional ASL interpreters (e.g. to translate written questions in a survey, or to translate consent form language).

Each CDI translated their own set of 5 Wikipedia articles. Each set spanned a variety of topics, including both technical and non-technical topics. In total, 17 topics were covered in these 20 articles (identified through topic modeling on the 810 most popular English Wikipedia articles): Geography, Entertainment, Sports, Deaf Culture, History, Science, Mathematics, Medicine, Business, Politics, Technology, Military, Philosophy, Food, Books, Religion, and Art. Article length ranged from 105-627 words (avg 309), and from 4-29 sentences (avg 15). In total, we collected 308 recordings through our ASL Wiki interface (corresponding to individual sentences), and 111 recordings through state-of-the-art interpreter setups (cor-
college, ...", P5 suggesting teachers "I would recommend it to teachers. responding to sections).
I think the website would be best for education and is very educational
rather than recreational, so teachers could use it to record content and 6.1.2 Video Evaluation. To compare the quality of the two record-
provide information online", and P8 recommending learners: "Maybe ing sets, we paid two fuent ASL linguists to evaluate each video
people who want to learn more things, learners, people who typically along fve dimensions. These dimensions capture the accuracy of
look stuf up and read things". the translation from English to ASL (Q1), the quality of the ASL
independent of the English (Q2-Q3), and the completeness of the
data captured (Q4-Q5). The dimensions and exact questions that
6 TRANSLATION QUALITY EXPLORATION the experts answered for each video are listed below. In addition,
While our ASL Wiki site was designed to facilitate translation the experts had the opportunity to enter additional notes for each
contributions, how the interface design may impact translation video, and we also engaged in a debrief meeting to gather their
quality is unclear. Interpreters typically generate ASL translations feedback and observations about the video sets as a whole.
of English texts in large sections (e.g. paragraphs). In contrast, our Q1) Translation accuracy: How well does the ASL recording con-
interface elicits of text segmented into sentences to enable readers vey the meaning of the English? (Scale of 1-5)
to access spot-translations within long texts. Our interface also Q2) Linguistic correctness: How correct is the ASL execution
provides built-in mechanisms to facilitate the translation process (e.g. were there many mistakes with handshape, movement,
(e.g. marking completion progress within the text, providing the grammar, etc.)? (Scale of 1-5)
text and recording interface in the same tool). Q3) Signing naturalness: How natural is the ASL (i.e. how similar
To explore the potential impact of the interface on translation is it to ASL you might run into in real life)? (Scale of 1-5)
quality, we ran a small experiment comparing a set of recordings Q4) Recording quality: How good is the recording quality (e.g. is
generated through our interface to a comparable set generated it blurry, is the lighting good, etc.)? (Scale of 1-5)
through state-of-the-art recording setups. Specifcally, we paid four
professional Deaf interpreters to record 20 articles in both setups, 6 https://rid.org/rid-certifcation-overview/available-certifcation/cdi-certifcation/
ASSETS ’22, October 23–26, 2022, Athens, Greece Abraham Glasser, Fyodor Minakov, and Danielle Bragg

Q5) Signing captured: Is the full signing space captured in the video (i.e. hands, torso, surrounding area)? (Yes/No)

6.2 Results
The expert evaluations of the recordings generated through our interface and through the CDIs' standard setups were comparable across all five explored dimensions. Figure 4 shows the overall results – average score and standard error for Q1-Q4, and the percent of videos that were evaluated as having captured the full signing space for Q5. We ran two-sided Wilcoxon-Mann-Whitney tests with Bonferroni correction to compare evaluations of the two interfaces for Q1-4. For Q1 (Translation accuracy), Q2 (Linguistic correctness) and Q3 (Signing naturalness), there was no statistically significant difference (p>.0125). For Q4 (Recording quality), the test showed statistical significance (U=83175.5, p<.005). We also ran a χ² test to compare Q5 (Signing captured), which was not statistically significant (p>.05).

During our debriefing, the expert evaluators identified some patterns in the data. They noted that the recordings, in particular those created through ASL Wiki, contained straight translations rather than interpretations. For example, the interpreters did not tend to elaborate on concepts from the text to ensure that the meaning in ASL is clear, or to provide additional context not provided in the text. Instead, they tended to stick to the exact text. They also noted some examples where it seemed that the interpreters had not done the full prep work to understand the content they were translating. For example, this was evident to one expert in a translation of some plant anatomy, which lacked the appropriate visual representation. One expert also noted that they had expected to see a larger difference in quality between the recording sets, in particular due to the difference in text segmentation lengths. They were surprised that there was not a larger difference in translation accuracy and quality for the longer and shorter excerpts.

7 DISCUSSION
Generally, the results of our exploratory studies suggest that it may be possible to use specialized interfaces to crowdsource ASL translations of English text, to provide valuable bilingual resources to the community and to curate ASL data. Our user study results suggest that users would find value in such a bilingual ASL and English platform, and would be willing to contribute, especially if incentivized. At the same time, our translation quality exploration suggests that the interface enables high-quality translations. In this section, we provide further discussion of our exploratory work, the limitations of this initial work in this space, and related future work.

7.1 User Experience
Because participants were not incentivized further for contributing more videos during our user study, the majority of participants only contributed until they figured out and were satisfied with the user interface and experience for the recording view, with an average of 11 sentences per user. It seemed that participants generally chose to contribute to topics that were personally meaningful to them, especially those who contributed a larger number of recordings. It is possible that an expanded range of topics that interest more people would thus incentivize contributions from the community. Further incentivization, such as credit for class or monetary payment, could also be beneficial to deployment at scale.

Participants indicated that the reading and recording interfaces of our website design were easy to use. Even though participants all thought the site was easy and intuitive to use, several would rather use it only for reading bilingual content than for contributing ASL videos. They thought it was helpful to view content in both English and ASL, and mentioned several cases where they wished they had this level of accessibility in media. They talked about some of the challenges and strategies used while recording. The website was strongly supported, and all participants identified populations that they would recommend the site to. Participants also suggested many different topics that could be added to the interface that would benefit them and others.

Even participants who commented that they were fluent in English and ASL still indicated that seeing content in a bilingual, bimodal form was useful. Even if it did not help them understand the content itself better, some participants still mentioned that they could pick up new signs or improve their signing and presentation skills. Overall, participants enjoyed using the website, and identified several use cases and target audiences who they would highly recommend the interface to.

During the interview portion of the user study, we collected feedback from participants so that we could further iterate upon our design. This feedback would also be useful for future researchers who want to generalize our interface, and potentially use it for other signed or written languages. While our exploratory user study serves as a proof-of-concept, several research questions have arisen. We have identified several research avenues and next steps as a result of this work.

7.2 Translation Quality
In our translation quality exploration, it is possible that linguistic correctness was slightly more reliable with our interface due to reduced cognitive load. Our interface required shorter excerpts of text to be translated. It also simplified the recording task by keeping track of where the user was within the text, auto-progressing to the next excerpt, and providing the text and video feedback in a single interface rather than requiring the interpreter to manage two separate interfaces for these components. It is also possible that the recording quality was slightly better on average with our interface because the quality of the recording was less dependent on the quality of apps that the interpreter has available to them. While we did not provide hardware, we provided built-in recording software in our website, unlike state-of-the-art setups that are dependent on the recording software that interpreters have access to and know how to use.

While our exploration suggests comparable translation quality with ASL Wiki compared to state-of-the-art translation setups, it still leaves open questions about the impact of isolated interface components. For example, it would be interesting to examine the effect of different text segmentations within our interface, possibly ranging from individual words, to sentences, to paragraphs or sections. Similarly, it would be interesting to experiment with the effect of different types of visual cues for orienting the translator within
Figure 4: Comparison of expert evaluations of ASL translations of 20 Wikipedia articles, recorded by CDIs through ASL Wiki and a control state-of-the-art setup. For Q1-4 (Translation accuracy, Linguistic correctness, Signing naturalness, and Recording quality), the bar chart shows the average and standard error of expert evaluation. For Q5 (Signing captured), the bar chart shows the percent of recordings evaluated as having captured the full signing space.
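The statistical comparison summarized in Figure 4 – two-sided Wilcoxon-Mann-Whitney tests for Q1-Q4 with a Bonferroni-corrected threshold of .05/4 = .0125, and a χ² test for Q5 – can be sketched in pure Python. This is a minimal illustration with made-up score lists, not the study data, and the normal-approximation p-value omits the tie correction that a full implementation (e.g. scipy.stats.mannwhitneyu) would apply:

```python
import math

def mann_whitney_u(a, b):
    """U statistic for sample `a` vs `b`, using midranks to handle tied scores."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        for k in range(i, j):             # tied values at positions i..j-1
            ranks[k] = (i + 1 + j) / 2.0  # share the average of ranks i+1..j
        i = j
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    return rank_sum_a - len(a) * (len(a) + 1) / 2.0

def mann_whitney_p(a, b):
    """Two-sided p-value via the normal approximation (no tie correction)."""
    n1, n2 = len(a), len(b)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (mann_whitney_u(a, b) - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def chi2_2x2(table):
    """Pearson chi-squared statistic for a 2x2 table, e.g. Q5 yes/no by setup."""
    (a, b), (c, d) = table
    n = a + b + c + d
    expected = [[(a + b) * (a + c) / n, (a + b) * (b + d) / n],
                [(c + d) * (a + c) / n, (c + d) * (b + d) / n]]
    return sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))

# Illustrative 1-5 ratings (NOT the study data): one question, two conditions.
wiki_scores = [5, 4, 5, 3]
standard_scores = [4, 4, 3, 2]
alpha = 0.05 / 4  # Bonferroni correction over the four rating questions
p = mann_whitney_p(wiki_scores, standard_scores)
print(f"U={mann_whitney_u(wiki_scores, standard_scores)}, p={p:.3f}, "
      f"significant at corrected alpha: {p < alpha}")
```

In practice one would use scipy.stats.mannwhitneyu and scipy.stats.chi2_contingency, which also provide tie corrections and exact p-values for small samples.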

a page of text. It is also possible that the impact of the interface on the translation quality may vary depending on the experience or fluency of the user.

7.3 Limitations and Future Work
There is a need for a larger, more longitudinal study to see how users use the site over a period of time in the real world, rather than a short 1-hour session where they use the site for the first time and answer survey and interview questions with a researcher. Additionally, most of our participants already had a Bachelor's degree, which may have biased our results; as a result, it is important for future studies to capture more diverse participants from the DHH community. Such studies would allow for deeper insight into user participation and behavior, and the additional data collection would enable deeper linguistic analysis and open up several research questions.

Since some participants in our user study skipped sentences, selecting nonconsecutive sentences to contribute, there are gaps in the articles. Our user study participants supported the idea of the website and said it was easy to use, but many of them said they would personally not contribute themselves. To encourage users to contribute complete translations, further research is needed to investigate different incentivization methods. There are several ways we can imagine this happening, such as strengthening the gamification inside the website (emphasizing the experience points users earn as they contribute, displaying a leaderboard of the top contributors), or monetary compensation once a user reaches a milestone number of contributed ASL videos. Another possible avenue to investigate is educational tasks, e.g. ASL interpreting students could contribute to gain credits for certification or program requirements.

For this user study, we chose to implement a stand-alone website pre-populated with a sample of Wikipedia articles, limiting the type of content available for participants. We chose this implementation, rather than a web plug-in or other setup with broader content, for several reasons: ability to choose English texts that are open for public use, utility to users in having a complete translation as opposed to sparse translations across more content, and implementation feasibility. Still, user study participants brought up many different types of content and explained their experiences with other real-world content. Consequently, other types of content, and expansive interface designs including web plug-ins, should be explored. The utility of our interface with other signed/written language pairs, or exploring other potential user groups (e.g. those recommended by participants, such as K-12 students), could also be investigated. Different use cases may or may not require other interface changes, which could be explored in this research avenue.

There were two major concerns brought up during our user study. Users were concerned about the level of control over data quality – since this is a crowdsourced approach, it is the contributors' responsibility to have a good background in their signed videos, ensure there is good lighting, and that the video is not choppy or blurry. The other major concern was privacy. This is a very complicated topic [7, 9, 25], and more research is needed about privacy concerns when it comes to crowdsourced ASL datasets. Another data quality research question is whether the crowd would be able to control the data quality at a larger scale. We included an upvote/downvote button where participants could give feedback, but
we did not study this further, since 12 out of 19 participants did not use it. We also had a small number of sentences from each participant, but if a larger and more longitudinal study were conducted, how users use this feature could be investigated.

We have also run a small experiment comparing the quality of translation recordings made through our interface and through a state-of-the-art setup. This exploratory study suggested that the quality of translations created through ASL Wiki is comparable to that of translations created through state-of-the-art setups, and potentially might enable slight improvements. While this is promising, we have not evaluated the crowd-generated dataset from our participants (as we did not have a control dataset to compare to, since general community members do not normally engage in translation). It would be useful for future researchers to investigate this, as well as to conduct in-depth linguistic analysis. For instance, it is possible that our interface reduced the cognitive load of the signer, as well as the technical requirements, which may have elicited more natural and linguistically correct translations.

As mentioned above, privacy is another issue that may impact the design and use of ASL Wiki and future work. The research community has only recently begun to explore privacy concerns related to sign language videos and how they can be addressed [9, 26]. This initial work began to explore the impact of filtering videos, for example by blurring the video or anonymizing facial features. However, acceptability of these approaches is poorly understood, and their technical implementation is limited. Indeed, it is possible that the community might prefer different approaches altogether, for example protective licensing or enhanced security and transparency of data use. Once a better understanding of the privacy needs and appropriate solutions have been developed, such techniques could be incorporated into ASL Wiki and similar applications, making this a ripe area for future work.

8 CONCLUSION
The lack of sign language bilingual resources and the lack of sign language datasets are difficult problems to solve, mainly due to the cost, resources needed, and amount of human effort required to label and annotate data. In this work we have addressed both of these problems by presenting a novel interface. Our interface provides side-by-side synchronized ASL and English content, streamlines pre-labeled data collection, and enables a crowd to contribute to piecemeal translation. We pioneer exploration of the question of how to enable everyday signers to contribute to continuous content translation efforts, and how DHH users would respond to crowd-generated content.

ACKNOWLEDGMENTS
We thank Chinmay Singh for initial work on the prototype; Vanessa Milan for design guidance; Bill Thies and Naomi Caselli for thoughtful discussions; and Philip Rosenfeld, Paul Oka, and Mary Bellard for general support for the project.

REFERENCES
[1] Oliver Alonzo, Matthew Seita, Abraham Glasser, and Matt Huenerfauth. 2020. Automatic Text Simplification Tools for Deaf and Hard of Hearing Adults: Benefits of Lexical Simplification and Providing Users with Autonomy. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376563
[2] Oliver Alonzo, Jessica Trussell, Becca Dingman, and Matt Huenerfauth. 2021. Comparison of Methods for Evaluating Complexity of Simplified Texts among Deaf and Hard-of-Hearing Adults at Different Literacy Levels. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445038
[3] Larwan Berke, Sushant Kafle, and Matt Huenerfauth. 2018. Methods for Evaluation of Imperfect Captioning Tools by Deaf or Hard-of-Hearing Users at Different Reading Literacy Levels. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI '18). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173665
[4] Jeffrey P Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samual White, et al. 2010. VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd annual ACM symposium on User interface software and technology. 333–342.
[5] Erin Brady, Meredith Ringel Morris, and Jeffrey P Bigham. 2015. Gauging receptiveness to social microvolunteering. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 1055–1064.
[6] Danielle Bragg, Naomi Caselli, John W. Gallagher, Miriam Goldberg, Courtney J. Oka, and William Thies. 2021. ASL Sea Battle: Gamifying Sign Language Data Collection. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445416
[7] Danielle Bragg, Naomi Caselli, Julie A. Hochgesang, Matt Huenerfauth, Leah Katz-Hernandez, Oscar Koller, Raja Kushalnagar, Christian Vogler, and Richard E. Ladner. 2021. The FATE Landscape of Sign Language AI Datasets: An Interdisciplinary Perspective. ACM Trans. Access. Comput. 14, 2, Article 7 (jul 2021), 45 pages. https://doi.org/10.1145/3436996
[8] Danielle Bragg, Oscar Koller, Mary Bellard, Larwan Berke, Patrick Boudreault, Annelies Braffort, Naomi Caselli, Matt Huenerfauth, Hernisa Kacorri, Tessa Verhoef, Christian Vogler, and Meredith Ringel Morris. 2019. Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 16–31. https://doi.org/10.1145/3308561.3353774
[9] Danielle Bragg, Oscar Koller, Naomi Caselli, and William Thies. 2020. Exploring Collection of Sign Language Datasets: Privacy, Participation, and Model Performance. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS '20). Association for Computing Machinery, New York, NY, USA, Article 33, 14 pages. https://doi.org/10.1145/3373625.3417024
[10] Danielle Bragg, Kyle Rector, and Richard E Ladner. 2015. A user-powered American Sign Language dictionary. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing. 1837–1848.
[11] B Cartwright. 2017. Signing Savvy. https://www.signingsavvy.com/
[12] ASL Clear. 2022. https://aslclear.org/
[13] Helen Cooper, Eng-Jon Ong, Nicolas Pugeault, and Richard Bowden. 2012. Sign Language Recognition Using Sub-Units. J. Mach. Learn. Res. 13, 1 (jul 2012), 2205–2231.
[14] ASL Core. 2022. https://aslcore.org/
[15] Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, and Xavier Giro-i Nieto. 2021. How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language. In Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Karen Emmorey, Chuchu Li, Jennifer Petrich, and Tamar H. Gollan. 2020. Turning languages on and off: Switching into and out of code-blends reveals the nature of bilingual language control. Journal of Experimental Psychology: Learning, Memory, and Cognition 46, 3 (2020), 443–454. https://doi.org/10.1037/xlm0000734
[17] Jens Forster, Christoph Schmidt, Thomas Hoyoux, Oscar Koller, Uwe Zelle, Justus Piater, and Hermann Ney. 2012. RWTH-PHOENIX-Weather: A large vocabulary sign language recognition and translation corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12). 3785–3789.
[18] Ann Grafstein. 2002. HandSpeak: A Sign Language Dictionary Online. https://www.handspeak.com/
[19] Julie Hochgesang, Onno Crasborn, and Diane Lillo-Martin. 2022. Sign Bank. https://aslsignbank.haskins.yale.edu/
[20] Leala Holcomb and Jonathan McMillan. 2022. http://www.handsland.com/
[21] Matt Huenerfauth and Vicki Hanson. 2009. Sign language in the interface: access for deaf signers. Universal Access Handbook. NJ: Erlbaum 38 (2009), 14.
[22] Deaf Studies Digital Journal. 2009. https://www.deafstudiesdigitaljournal.org/
[23] Walter Lasecki, Christopher Miller, Adam Sadilek, Andrew Abumoussa, Donato Borrello, Raja Kushalnagar, and Jeffrey Bigham. 2012. Real-time captioning by groups of non-experts. In Proceedings of the 25th annual ACM symposium on User interface software and technology. 23–34.
[24] Walter S Lasecki, Christopher D Miller, Raja Kushalnagar, and Jeffrey P Bigham. 2013. Legion scribe: real-time captioning by the non-experts. In Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility. 1–2.
[25] Sooyeon Lee, Abraham Glasser, Becca Dingman, Zhaoyang Xia, Dimitris Metaxas, Carol Neidle, and Matt Huenerfauth. 2021. American Sign Language Video Anonymization to Support Online Participation of Deaf and Hard of Hearing Users. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3441852.3471200
[26] Sooyeon Lee, Abraham Glasser, Becca Dingman, Zhaoyang Xia, Dimitris Metaxas, Carol Neidle, and Matt Huenerfauth. 2021. American Sign Language Video Anonymization to Support Online Participation of Deaf and Hard of Hearing Users. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility.
[27] Colin Lualdi. 2022. Sign School. https://www.signschool.com/
[28] Matt Malzkuhn and Melissa Malzkuhn. 2022. The ASL App. https://theaslapp.com/
[29] The Daily Moth. 2022. The Daily Moth. https://www.dailymoth.com/
[30] National Association of the Deaf. 2022. Position Statement On ASL and English Bilingual Education. https://www.nad.org/about-us/position-statements/position-statement-on-asl-and-english-bilingual-education/
[31] Eng-Jon Ong, Helen Cooper, Nicolas Pugeault, and R. Bowden. 2012. Sign Language Recognition using Sequential Pattern Trees. 2012 IEEE Conference on Computer Vision and Pattern Recognition (2012), 2200–2207.
[32] Eng-Jon Ong, Oscar Koller, Nicolas Pugeault, and Richard Bowden. 2014. Sign Spotting Using Hierarchical Sequential Patterns with Temporal Intervals. In 2014 IEEE Conference on Computer Vision and Pattern Recognition. 1931–1938. https://doi.org/10.1109/CVPR.2014.248
[33] Jingnan Peng. 2020. Bringing light to the news, for those who can't hear it (video). https://www.csmonitor.com/The-Culture/2020/0731/Bringing-light-to-the-news-for-those-who-can-t-hear-it-video
[34] Manaswi Saha, Michael Saugstad, Hanuma Teja Maddali, Aileen Zeng, Ryan Holland, Steven Bower, Aditya Dash, Sage Chen, Anthony Li, Kotaro Hara, et al. 2019. Project Sidewalk: A web-based crowdsourcing tool for collecting sidewalk accessibility data at scale. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[35] Elliot Salisbury, Ece Kamar, and Meredith Morris. 2017. Toward scalable social alt text: Conversational crowdsourcing as a tool for refining vision-to-language technology for the blind. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 5. 147–156.
[36] Elliot Salisbury, Ece Kamar, and Meredith Ringel Morris. 2018. Evaluating and Complementing Vision-to-Language Technology for People who are Blind with Conversational Crowdsourcing. In IJCAI. 5349–5353.
[37] Zed Sevcikova Sehyr, Naomi Caselli, Ariel M Cohen-Goldberg, and Karen Emmorey. 2021. The ASL-LEX 2.0 Project: A Database of Lexical and Phonological Properties for 2,723 Signs in American Sign Language. The Journal of Deaf Studies and Deaf Education 26, 2 (02 2021), 263–277. https://doi.org/10.1093/deafed/enaa038
[38] Ather Sharif, Paari Gopal, Michael Saugstad, Shiven Bhatt, Raymond Fok, Galen Weld, Kavi Asher Mankoff Dey, and Jon E. Froehlich. 2021. Experimental Crowd+AI Approaches to Track Accessibility Features in Sidewalk Intersections Over Time. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–5.
[39] Carol Bloomquist Traxler. 2000. The Stanford Achievement Test, 9th Edition: National Norming and Performance Standards for Deaf and Hard-of-Hearing Students. The Journal of Deaf Studies and Deaf Education 5, 4 (09 2000), 337–348. https://doi.org/10.1093/deafed/5.4.337
[40] Wikipedia. 2001. https://www.wikipedia.org/
[41] Wikipedia. 2022. Deaf News. https://en.wikipedia.org/wiki/Deaf_News
[42] Alicia Wooten and Barbara Spiecker. 2022. https://www.atomichands.com/

A APPENDIX
A.1 Semi-structured user study interview questions
Below are the semi-structured interview questions that were discussed with participants as part of the user study:

• Role/relation to ASL: What's your role/relationship with ASL? (e.g. native speaker, primary language, ASL teacher, use ASL at work, etc...)

Reading
• Did you primarily look at the ASL or English part? [Follow up to estimate percentage (0% ASL 100% English vs 100% ASL 0% English)]
• How did viewing different signers affect your experience? (If applicable)
• On a scale from 1-5 (1-very difficult), how understandable was the ASL content you viewed? Can you explain why you chose this number?
• On a scale from 1-5 (1-very difficult), how understandable was the English content you viewed? Can you explain why you chose this number?
• Was it helpful to view the content in both English and ASL? Why or why not?
• Did you use the upvote/downvote feature? Why or why not?
• How easy was the interface to use? (1-5: Very difficult – Very easy) If difficult, did information overload contribute to difficulties?
• What did you like or dislike about the interface?

Recording
• Did you find any content challenging to record? If so, what made it challenging?
• Did you use any strategies while recording content? If so, what were they?
• On a scale from 1-5, how easy was the interface to use? (1-5: Very difficult – Very easy)
• What did you like or dislike about the interface?

Desirability
• Do you wish more content online provided both English and ASL? (1-5: Strongly disagree – Strongly agree) If so, can you give some examples of when you wanted content provided in both languages?
• Would you be interested in generating content available in both English and ASL? (1-5: Strongly disagree – Strongly agree)
• What benefits do you feel this site offers to you as a user, if any?
• What concerns do you have in using a website like this, if any?
• How enjoyable was using the website, overall? What did you like/dislike?
• Would you want to use a website like this to read content in the future? Why or why not? Is there different content you would want to read (e.g. movie scripts, podcasts, etc.)?
• What type of *Wikipedia* content would you want translated (i.e., picking from the list of topics on the Wikipedia landing page – food, math, Deaf culture, etc.)?
• Would you want to use a website like this to contribute recordings in the future? Why or why not?
• How likely are you to recommend this website to others? (1-5: Very unlikely – Very likely) If so, who would you recommend this to, and for what purpose (e.g. ASL students for learning, people with certain English/ASL fluency, etc.)?
Empowering Blind Musicians to Compose and Notate Music with
SoundCells
William Payne, New York University, USA, william.payne@nyu.edu
Fabiha Ahmed, New York University, USA, fsa253@nyu.edu
Michael Zachor, New York University, USA, mfz226@nyu.edu
Michael Gardell, New York University, USA, mg6950@nyu.edu
Isabel Huey, New York University, USA, ijh234@nyu.edu
R. Luke DuBois, New York University, USA, dubois@nyu.edu
Amy Hurst, New York University, USA, amyhurst@nyu.edu

ABSTRACT
Commercial technologies for notating music pose usage barriers to blind and visually impaired (BVI) musicians because they use graphic user interfaces and only produce visual, print scores. However, most research to date has studied how to make existing scores available in braille or large print rather than understanding the needs and workflows of BVI musicians who notate new music. To address this gap, we conducted a six-week remote study in which six BVI musicians with wide-ranging backgrounds wrote original music culminating in a live performance. To create their scores, participants used SoundCells, a product of ongoing co-design and testing with BVI musicians that uses text to generate audio, print, and braille music. Across three interviews, participants offered diverse and nuanced views of how text input could facilitate creative expression. We uncovered how vision ability, music experience, and assistive technology preference affected how music was accessed and traversed. From this research, we provide design recommendations for improving SoundCells' input and output systems, discuss how visual cues embedded in SoundCells' syntax make learning and remembering harder for people who can't view it, and reflect on how our chosen methods resulted in high engagement.

CCS CONCEPTS
• Human-centered computing → Accessibility technologies; Sound-based input / output.

KEYWORDS
accessibility, music technology, blindness, visual impairments, braille, music notation, creative expression, longitudinal study

ACM Reference Format:
William Payne, Fabiha Ahmed, Michael Zachor, Michael Gardell, Isabel Huey, R. Luke DuBois, and Amy Hurst. 2022. Empowering Blind Musicians to Compose and Notate Music with SoundCells. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544825

ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544825

1 INTRODUCTION
Recent HCI and accessibility research has studied how blind and visually impaired (BVI) people use technology to create content across wide-ranging mediums including photography [9], 3D modeling [64], making and fabrication [10, 15, 59], writing [16], web design [35], and podcasts [63], but the practices of BVI people who use computers to notate music are understudied. Music notations, including western print notation taught in most American schools, are predominantly visual and cannot be read by many BVI people. Many researchers [13, 25, 26, 36, 37] and community efforts [48, 72] have developed systems for converting visual notation into braille music and specially-formatted large print that can be read by BVI musicians. While this important work makes existing music more widely accessible, i.e. scores written by deceased, male composers, it only partially addresses the needs of BVI music makers, including composers, arrangers, students, and educators, who notate new music. Commercial music notation tools predominantly use graphic user interfaces (GUIs) with "what-you-see-is-what-you-get" interactions that range from completely inaccessible to complex and difficult to navigate for BVI musicians, while specialized braille music tools limit collaboration since they do not produce visual print scores [50, 52].

This study depicts the creative processes of BVI musicians as they learn and use SoundCells, a text-based music notation software that renders visual and braille music scores simultaneously [50].
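As a rough illustration of the kind of plain-text notation described above, consider the following sketch. The ABC tune and the `split_abc` helper are our own illustrative example, not code or material from SoundCells: an ABC tune is ordinary ASCII whose single-letter header fields and note letters a program, or a screen reader, can consume line by line.

```python
# Illustrative ABC fragment and helper (ours, not SoundCells' code).
# ABC headers are single-letter fields (X: index, T: title, M: meter,
# L: default note length, K: key) followed by the music body.
ABC_TUNE = """\
X:1
T:Example Tune
M:4/4
L:1/4
K:C
C D E F | G A B c |"""

def split_abc(tune: str):
    """Separate ABC header fields from the music body."""
    headers, body = {}, []
    for line in tune.splitlines():
        if len(line) > 1 and line[0].isalpha() and line[1] == ":":
            headers[line[0]] = line[2:].strip()
        else:
            body.append(line)
    return headers, "\n".join(body)

headers, body = split_abc(ABC_TUNE)
print(headers["T"], headers["M"])   # Example Tune 4/4
# Each '|' delimits a measure; notes here are single letters.
print([m.split() for m in body.split("|") if m.strip()])
```

Because every element is a printable character, the same source is equally legible to a sighted collaborator, a screen reader, and a conversion pipeline.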
SoundCells' design was motivated by prior research with experienced BVI musicians who discussed gaps in existing software and identified a need for converting between braille and print notations without complex tool chains [52]. SoundCells emerged first as a text-based music notation prototype validated through a design probe [4]. Then, it was co-developed with two experienced blind musicians and evaluated with five other blind musicians [50]. While prior research gathered BVI users' initial reactions to SoundCells' text input and multimodal output, this is the first evaluation of the system in use over time. Six BVI musicians, who range widely in vision ability and prior composition and technology experience, used SoundCells for six weeks and wrote original, short compositions that were played in a live performance.

The primary contributions of this research are as follows:

• We depict how BVI musicians with wide-ranging musical backgrounds, vision abilities, and equipment engaged meaningfully with SoundCells, a new system for creative expression.
• We compile participant ideas and design recommendations impacting music, text, and multimodal interactions.
• We demonstrate how our study design presented opportunities for self-expression, motivated engagement with technology towards a common goal, and utilized the web, increasing access across the six-week study.

We first identify areas of related research that SoundCells builds upon, including music technology accessibility, braille music, music coding, and accessibility of text interfaces (§2). Then, we provide a brief overview of the current SoundCells system (§3). We describe our methodology based around a six-week virtual class culminating in a performance (§4). Finally, we describe how participants engaged with SoundCells and gave us feedback for how to improve it (§5). We discuss and connect these findings to the design of text-based systems, multimodal interactions, and tools for creative expression (§6).

2 RELATED WORK
This study builds on recent efforts to broaden access to music technology for creation, performance, and education. Specifically, we draw on existing systems and workflows for producing braille music notation, research in accessible programming and text editing, and environments for programmatically making music.

2.1 Accessibility of Music Technology for Blind and Visually Impaired People
Researchers are increasingly committed to improving and measuring accessibility in music technology [38] and expanding designers' understanding of when, how, and why people who identify as living with a disability or access requirement use music technology [21]. BVI music learners who need accessible notation are well-studied, e.g. [3, 41, 44], and recent work depicts how online support networks and toolchains made up of commercial and custom hardware/software enable creative practices [27, 52, 63].

A 2019 review of accessible digital musical instruments found that few interactions have been designed and tested with BVI musicians compared to other populations with disabilities [20]. Past research exploring multisensory alternatives includes tactile interactions with MIDI [28], audio navigation [71], audio equalization [34], and tabletop collaboration [47]. Groove Pizza, a web-based drum machine, explores keyboard input and auditory output, including sonification and text-to-speech [51]. We build on past work by closely examining how BVI music makers with diverse expertise incorporate assistive technology and utilize multiple modalities to create original music.

2.2 Braille Music Notation and Access
Braille music is harder to write and obtain than print music. Despite its long history and standardization [24, 65], braille music is not typically sold by publishers or available in public domain catalogues such as the International Music Score Library Project (IMSLP) [53]. For example, IMSLP contains a catalog of more than 650,000 PDF scores compared to the NLS collection of instructional music materials and scores made up of about 25,000 braille, large print, and audio files [46]. If a braille version of a score cannot be obtained, schools, agencies, or end-users may translate print music to braille following a multi-phase, rigorous process: one or more sighted people copy a PDF in notation software, export to MusicXML, convert to braille with a commercial or open-source program [13, 14, 25, 26, 36, 37], and finally send to a braille music expert to finalize layout and correct errors caused by human or computer error [33]. Braille music notation software such as the Braille Music Editor and Braille Music Notator [18, 61] and generic braille translation software such as Duxbury [17] can be used to notate braille scores directly, but the above steps are required in reverse to generate print for sighted musicians. Lime Aloud is capable of producing both print and braille music [14], but it provides a limited view of the braille score during editing, only runs on Windows, and is expensive. SoundCells offers an alternative to existing efforts. It incorporates multiple forms of notation during composition instead of requiring additional work, software, and expertise to convert a score after it has been completed.

2.3 Code and Text Interface Accessibility
ABC text [73] is a shorthand ASCII music notation that can be directly interpreted by screen readers. Many commercial music applications use GUIs that are impossible or unwieldy to navigate with screen readers [52]. While text input may improve usability, it can introduce new challenges. As with developing software with code, notating music with text requires memorizing and understanding syntax, debugging mistakes, running mental calculations to track small and large structures, and planning approaches to open-ended problems. For example, Payne et al. interviewed two blind musicians who used LilyPond [23], a text-based music notation language comparable to LaTeX for musical scores [52], but found the syntax challenging and the functionality geared mainly towards producing PDF files. Because blind developers may experience difficulty navigating and conceptualizing code structures [42, 68], researchers have designed interfaces more suited for screen readers. For example, StructJumper and AudioHighlight present approaches for skimming structured documents [5, 8], SodBeans and Wicked Audio Debugger use audio cues to aid error debugging [66, 67], and
CodeTalk adds screen reader features to Visual Studio [54]. SoundCells is analogous to the Web-ALAP LaTeX environment designed for screen reader users [6]. While in Web-ALAP, users write math expressions, in SoundCells, users notate music with ABC text. In this study, we explore how BVI musicians understand and work with specialized text notation, and propose design opportunities for improving text-based environments.

2.4 Computational Music Making for Novices
Computer music environments are no longer only used by experts who possess significant music and code knowledge [74]. Today, many introductory computational music environments provide educational opportunities to make and remix music within different programming languages and musical styles. For example, Sonic Pi, a live-coding language based on Ruby, is used in algoraves, taught in primary school classrooms, and included in the Raspberry Pi OS [2]. EarSketch uses Python or JavaScript to control and alter playback of hip-hop grooves and samples within a Digital Audio Workstation (DAW) [19]. DAWs tend to use highly visual interfaces [52, 63]. Finally, JythonMusic has been used in cross-disciplinary college courses to teach music and programming [40]. Each of these systems uses text for editing music, but very few studies have incorporated BVI musicians who may benefit from improved screen reader access to text instead of a GUI. Sonic Pi was adapted by developers of the tangible coding environment Code Jumper, but brief tests found the syntax unapproachable for children ages 7–11 who used screen readers or text enlargement [43]. Furthermore, Sonic Pi uses a custom integrated development environment (IDE) that appears to support screen readers [1], but its accessibility has not been formally evaluated. This study takes place in a learning context and mainly engages novices who have little music technology, notation, and/or code experience. Accessible interaction findings are applicable to other music coding environments that use different languages and styles of music.

3 SOUNDCELLS SYSTEM
SoundCells is a web application for quickly notating print and braille music scores with text [4, 50]. SoundCells can be accessed across desktop and mobile operating systems, browsers, and screen readers. Built on open-source software, SoundCells' source code is available on GitHub [56], a stable prototype is deployed via Heroku [55], and we published a detailed technical description [50]. We describe the fundamental features of SoundCells relevant to this research below.

3.1 Text Input

3.1.1 ABC Text. SoundCells separates musical content from its presentation, making it unique from most other notation software. Music is generated from plain ASCII text written in a syntax called ABC [62] that can be read directly by screen readers. ABC, along with other text-based music notations such as LilyPond [23], is akin to LaTeX, while SoundCells is akin to Web-ALAP [6], a web-based LaTeX editor with built-in accessibility features and domain-specific tools and commands. In SoundCells, users edit text, while in Lime Aloud, a popular notation software developed and used by blind people [14], users edit a rendering of print music directly and generate the complete braille score outside the editing environment. ABC text is simpler than LilyPond in part because it is not intended for engraving print music and provides limited functionality for visual formatting and layout. Many online libraries contain free music in ABC text, especially folk tunes [12]. While learning ABC text has been suggested to be potentially useful for blind musicians, students, and novices [62], existing ABC editors do not support braille output and are not designed for screen reader users.

3.1.2 ABC Parser. SoundCells uses a custom parser that locates syntax errors and lists unsupported text fragments and suggested fixes in a screen-reader accessible diagnostics view. It tracks measure completeness, whether a measure has the appropriate number of notes, and lists incomplete measures in the diagnostics view. Finally, it connects the user's cursor position in the ABC text with their position in the generated score, enabling the text-to-speech and music feedback described below (§3.2). SoundCells uses music21 [13], an open-source Python library, for final braille and print conversion.

3.1.3 Editing Environment. The coding editor in SoundCells is built on CodeMirror 6, an open-source project designed to better support screen reader and keyboard users in web-based text editing [29, 31]. Keyboard commands and the diagnostics view are built directly into the editor. SoundCells includes an alternative HTML text area, as VoiceOver users have reported bugs in CodeMirror 6 [30]. However, the HTML text area lacks the same features as the CodeMirror editor, requiring the user to navigate elsewhere on the page to access similar information conveyed via key commands.

3.2 Multi-Modal Output
We designed SoundCells to leverage audio and tactile output during both composition and production.

Music Playback: As a user types, they hear corresponding notes played back immediately, making the QWERTY keyboard act like a MIDI piano keyboard. Audio playback reflects the current state of the score, not just the character typed, so if the user types D below a line indicating a key signature of Db, then they will hear Db. Additionally, the entire score may be played, or a single measure may be looped indefinitely as a user edits.

Text-To-Speech: "Tell-me" commands, or question mark commands, trigger local descriptions of the score based on cursor location. A user types a ? followed by a character to request information about a note, symbol, or region. For example, typing ? - m describes the current measure number and completeness.

Braille Music: The generated braille score may be viewed in the interface or felt through a connected refreshable braille display. In SoundCells, braille music can be displayed in both Unicode and ASCII formats.

Print Score: A visual print score located below the text editor updates as the user writes ABC text. Large print scores can be made using a slider to simultaneously modify multiple elements, including note size, font size, measures per system (row), and systems per page.

3.2.1 Sharing and Extensibility. SoundCells does not support advanced formatting and customization of braille or print, but it exports files in standard formats. For example, an ASCII braille file
Figure 1: SoundCells System: Main interface shown on the right (with documentation, text editor, diagnostics view, playback
controls, and print score) showing how ABC text yields print music and braille notation on a braille display. The Settings and
Save menus are shown expanded on the left.

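Among the interface elements pictured in Figure 1 is the diagnostics view, which, per §3.1.2, tracks whether each measure has the appropriate number of notes. The sketch below is our own minimal simplification of that idea, not SoundCells' actual parser; it assumes space-separated notes of equal length and '|' barlines.

```python
# Toy version of the measure-completeness check performed by
# SoundCells' diagnostics view (our simplification, not the real
# parser): report the 1-based numbers of measures whose note count
# differs from what the meter implies.
def incomplete_measures(abc_body: str, notes_per_measure: int):
    measures = [m.split() for m in abc_body.split("|") if m.strip()]
    return [i + 1 for i, m in enumerate(measures)
            if len(m) != notes_per_measure]

# In 4/4 with a quarter-note default length, each measure needs
# 4 notes; the second measure below is one note short.
print(incomplete_measures("C D E F | G A B | c d e f |", 4))  # -> [2]
```

A real parser must also weigh note durations, rests, and chords, but even this reduced check shows how a text representation makes "is this measure full?" a mechanical question a screen-reader-accessible view can answer.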
may be opened in Duxbury [17]. A PDF can be downloaded in standard page sizes (8.5x11 or 11x17) and layouts (portrait or landscape), with enlargement settings applied. A MusicXML file can be opened in commercial applications such as Sibelius [7]. Raw ABC text may also be downloaded or copy/pasted into an email or chat message.

4 METHODS
In this section, we describe our methods, in which six BVI participants used SoundCells to notate music during a six-week course and were interviewed three times to share their experiences and ideas to improve SoundCells.

4.1 Course Design
To learn SoundCells, participants enrolled in a six-week virtual course through The Filomen M. D'Agostino Greenberg (FMDG) Music School. The course, entitled "Composing Music with ABC and SoundCells," was advertised to students age 16 and older as an opportunity to learn a new technology for notating braille and large print scores. Enrollment was free, and the advertisement email made clear that participating in research was optional and not required. The course met once per week for one hour remotely over Zoom. Audio recordings of sessions were shared with students as a learning aid. These class recordings are not used for research because we wanted the learning environment to be a safe space where participants could make mistakes and ask questions. Instead, we noted topics and conversations from class to ask follow-up questions during the interviews. Finally, we designed a supplementary course website following web accessibility best practices [75] to post course topics and share participant work. It remains online (with participant names removed) as a resource for learning SoundCells [57].

The course was structured around learning ABC text and the SoundCells interface (Table 1). The first four weeks covered new topics, while Week 5 was an open-work period to prepare for a final performance in Week 6, in which a musician with low vision from FMDG School performed participant works live on soprano saxophone. The goal of this research and the course design was to provide an authentic music-making environment and clear motivation to become adept at using SoundCells. This research did not aim to measure the effectiveness of the curriculum or the extent to which participants learned music theory or composition concepts.

4.1.1 Take-Home Challenges. To increase engagement, we provided three options for participants to use SoundCells outside of class through notation, remix, and composition challenges. Notation challenges were prompts to practice new syntax (e.g. debugging incorrect excerpts, notating common chord progressions, etc.), remix challenges were prompts to integrate new syntax into existing work, and composition challenges offered guided activities for writing new music using the most recent syntax. Five of the six participants submitted work each week via email or an embedded form, and participants chose to work on different activities. For
Table 1: Class and Interview Schedule

Wk. | Main Concepts                       | SoundCells Features                | Interview
1   | Notes and Pitches                   | Interface Overview, Music Playback | -
2   | Rhythms                             | "Tell-Me" Commands                 | 1
3   | Chords                              | Settings and Save Menus            | -
4   | Decorations (Other Musical Symbols) | Diagnostics View                   | 2
5   | Work Day                            | -                                  | -
6   | Performances                        | -                                  | 3
example, P5 worked primarily on the Notation Challenges, finding composition "daunting," while P3 worked primarily on her final composition, extending it and integrating new syntax each week.

4.1.2 Final Performance. Participants were tasked with notating a short, 12–24 measure composition that they could perform, or that would be performed by a professional musician who has low vision on soprano saxophone. Our requirements indicated that the pieces had to be complete without mistakes and had to include certain information like a composer and title. Five participants chose to have the professional play their piece, and P3 chose to perform herself because she notated chords to be played on piano. The professional musician was not enrolled in the course. They asked for three systems, or rows of measures, per page using 11x17 paper in landscape orientation. Researchers used SoundCells to create these scores from ABC text submitted by the participants. We used the magnification slider to adjust the size of the notation until it reached three systems per page. The musician requested we shrink P1's score from five to four pages, allowing it to fit on two music stands and preventing a page turn. The performance of each piece was followed by a short discussion. All participants attended along with the musician and the head instructor from FMDG School.

4.2 Interview Protocols
We conducted three semi-structured interviews with all participants at the beginning, middle, and end of the course to capture initial reactions, in-progress experiences, and reflections. We wrote a generic script for each interview along with individualized questions determined by participant characteristics, prior interview responses, and/or class interactions. For example, we asked P5 about print music and not braille music, which she does not read; we asked P2 to compare SoundCells to other music notation software given his significant prior experience; and we asked P4 repeatedly to discuss her workflow for notating rhythm, which she indicated feeling challenged by.

4.2.1 Interview 1. We asked participants to describe their musical backgrounds, prior experience notating music with technology or by hand, and motivations for enrolling. Questions were adapted from previous interview studies with blind musicians [4, 52]. Participants new to SoundCells (P3, P4, P5) were asked to share initial reactions and where they felt challenged. Participants who contributed to SoundCells' design (P1) or tested SoundCells in an earlier study (P2, P6) were asked to consider their first experience with SoundCells.

4.2.2 Interview 2. This interview emphasized the 3–4 weeks of challenge activities participants had completed up to that point. We first asked participants to reflect on what they had learned so far and what they still hoped to accomplish. Most of this interview centered around a specific excerpt chosen by us or the participant. We then asked them to describe their process and either to find and correct an existing mistake in the excerpt, such as an incomplete measure, or to demonstrate how they would.

4.2.3 Interview 3. We asked participants to discuss their entire experience using SoundCells, including their process working on the final compositions, feelings after the live performance, ideas for improving SoundCells, and vision for using it in the future.

4.3 Artifacts
In addition to the interviews, we gathered materials created by participants across the course including exercises, improvisations, and final compositions. A form incorporated into SoundCells enabled participants to report bugs, issues, or questions, but most elected to communicate via email.

4.4 Participants
Six BVI musicians enrolled in the course, and all elected to participate in the study and engage in all three interviews. As shown in Table 2, the participants have diverse experience. P1 contributed to the design of SoundCells [50] and has a degree in Music Composition, and P2 teaches Music Technology and has decades of experience composing and performing professionally. Others had never composed original music and had little or no prior experience using music software (P4, P5, P6). Two participants identify as male (P1, P2) and the remaining four identify as female. P5 uses technology with magnification and currently reads large print music, while the others use screen readers and read braille music.

Participants used a range of AT to access SoundCells. P3 and P6 used both a screen reader and braille display, referencing the braille music often as they notated. P2 and P4 only used screen readers. P4 made heavy use of SoundCells features including tell-me commands, the diagnostics view, music playback, and looping, while P2 primarily played his composition back while he worked. P1 primarily used a braille display to toggle back and forth between braille music, and rarely enabled audio. P5 used a large screen with system magnification.
Table 2: Participants

Participant | School Role | Prior SoundCells Experience | Notation Experience | Notation Technology Experience | Identification | Assistive Technology
P1 | Technologist, Braille Music Transcriber | Co-Designer | College degree in composition. Braille music expert. | Lime, Duxbury. Uses daily. | Blind | JAWS, 40-cell display
P2 | Music Technology Instructor | Usability Tester | Decades of experience composing and arranging print and braille. | Lime, Sibelius. Uses multiple days a week. | Blind | VoiceOver, 40-cell display
P3 | Piano Instructor, Braille Music Instructor | None | Teaches braille music using Perkins Brailler. | Lime. Used over 20 years ago. | Visually Impaired | JAWS, 40-cell display
P4 | Adult Student | None | None | None | Blind | NVDA
P5 | Adult Student | None | Hand-notated music when much younger before vision loss. | None | Low Vision | Magnification
P6 | High School Student | Usability Tester | Learning braille music for about 3 years. | Lime. 4 months, once per week. | Blind | VoiceOver, 32-cell display
4.5 Analysis
We conducted a Thematic Analysis [11] on interview data gathered, recorded, and transcribed verbatim to identify trends and idiosyncrasies in how SoundCells was used and understood. Themes were identified, coded, and analyzed by the research team: two researchers conducted each interview, while a third listened to the recording asynchronously. After each round the research team discussed the interviews and thematic trends. Once all 18 interviews across the three rounds were completed, the team generated themes relating participant experiences to features of SoundCells. As this study builds on prior analysis of SoundCells [4, 50], we used a combination of deductive and inductive coding. Initial codes such as "Audio Interface," "Mistake Correction," and "Design Recommendation" seeded the coding process, while new codes, such as "Reaction to Performance" and "Perception of ABC Text," arose out of data gathered during this study. Finally, we refined themes and selected especially relevant examples and quotations for this publication. We considered the musical excerpts submitted by participants and measured the complexity of final compositions by counting musical events and unique instances of notation (shown below in Table 3). Challenges or works-in-progress sent to us by participants during the six weeks were not evaluated because participants shared their work voluntarily. We used these measurements to understand what was notated with SoundCells, not as a way to evaluate effort or creativity.

5 FINDINGS
Participants interacted with SoundCells differently and had a range of feedback, likely informed by their assistive technologies, prior experience, and personal preferences. Despite the diversity across participants, all successfully used SoundCells to compose original short works that were performed live in the final class. In addition to the valuable feedback about the design of this tool (§§5.1.4, 5.2.5), this research illustrates the value of making tools for creative expression accessible to an underserved population. For example, participants were enthusiastic about the physical copies of their scores and the live performance. On reading her embossed score for the first time, P6 exclaimed "I'm like, oh my goodness, this is my own piece. Oh wow, this is actually really nice.", and then on the experience of hearing the saxophone player interpret her piece that lacked rests, she said, "I was like that sounds really good, but wow I didn't give him time to breathe." Final compositions varied greatly, and as shown in Table 3, prior music technology and notation experience roughly correlated with music complexity, as estimated by the length of music and range of notation used. Multiple participants commented on hearing each other's works and recognizing the wide range of skill, experience, and aesthetic. For example, P5 said, "I enjoyed that everybody's pieces were very different. Each person had a very different way of expressing themselves. And also, you know I think it showed different peoples' proficiency with the system." Final compositions including ABC text, print, braille, and recorded audio may be accessed online [58].

In the remainder of this section, we describe how participants interacted with SoundCells' text input (§5.1) and multimodal outputs (§5.2.1), and we report their suggestions for improving SoundCells to support future use (§5.3).

5.1 Input: Using Text as Notation
Most participants enjoyed learning ABC text and found SoundCells to be an approachable tool for notating music. Surprisingly, we saw very few excerpts submitted with incorrect syntax, the most common error being mismatched brackets. When asked, participants expressed little concern about syntax, stating that they tended to notice if they typed the wrong thing, and felt the diagnostics view did a good job highlighting incorrect characters. Many challenged themselves to incorporate as many unique features of ABC text into their final compositions as possible. Throughout our interviews, participants reflected on how text could facilitate or complicate universal music notation processes: conceiving of musical ideas, inputting notes, and writing full, complete, and rhythmically interesting measures.

5.1.1 Perceptions of ABC Text as Music Notation. Our participants had a range of experience notating braille or print music, with P1, P2, and P3 possessing significant experience, P6 with some experience, and P4 and P5 with little or no experience. Only P1 had previously notated original music using ABC text. We believe
Figure 2: P1’s Final Composition. Left: ABC text excerpt from the middle of his piece. Middle: Corresponding notation in large
print with three systems on 11x17 paper and the entire score printed on 8.5x11 paper. Right: Embossed Braille score on large,
11x11.5 paper.

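The excerpt in Figure 2 is written in ABC text, whose octave marks §5.1.1 describes: a letter's case sets the starting octave (uppercase fourth, lowercase fifth), each trailing apostrophe raises the octave, and each comma lowers it. The toy decoder below is our own illustration of that convention, not SoundCells' parser, and handles only plain letters with octave marks (no accidentals).

```python
# Toy decoder for the ABC octave marks discussed in Section 5.1.1
# (our illustration, not SoundCells' parser). An uppercase letter
# starts in the fourth octave and a lowercase letter in the fifth;
# each trailing apostrophe raises the octave, each comma lowers it.
def abc_octave(token: str) -> int:
    octave = 5 if token[0].islower() else 4
    return octave + token.count("'") - token.count(",")

print(abc_octave("a'"))   # lowercase a raised once -> 6
print(abc_octave("C,,"))  # uppercase C lowered twice -> 2
```

The arithmetic is trivial for a program, which is part of the tension participants describe below: the mapping is compact to compute but offers little for a non-visual reader to anchor the symbols to.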
this experience informed participants’ initial positive and negative reactions to using ABC text. P3 expressed difficulty remembering syntax, “What is hard is to memorize what each key is, how would I get flat, sharp, what key to press.” P5 noted surprise at using text to represent music, “strange experience to type out letters and then get the music from that rather than writing notes on a staff.” P6 expressed enthusiasm for learning a new paradigm, “one of the things I actually like is the text editor.” As the course progressed, many felt more comfortable and able to express their ideas, but all participants, except for P1, felt that after six weeks they were not experts in ABC text.

Participants identified specific aspects of ABC text that were unintuitive and difficult to learn. Many highlighted pitch notation: In ABC text, a letter’s case describes a note’s starting octave, while characters placed before or after raise or lower it. For example, a' is an “A” in fifth octave raised to sixth octave while C,, is a “C” in fourth octave lowered to second octave. P2 understood the syntax but questioned the behavior of specific characters, “Octave-down, octave-up is comma, apostrophe. What’s the relationship between those keys, you know?” P3 also brought up these symbols, “comma, apostrophe is not used in music. It’s used in lyrics.” Furthermore, P5, the participant who reported feeling most overwhelmed in the class, “reached [her] saturation point and level at Class 3.” She completely avoided expression markings covered in Class 4 (Table 3). In contrast, P1, a programmer and experienced SoundCells user, found that text presented exciting musical possibilities because “ABC fundamentals allow you to expand the rhythmic complexity in a way that you just can’t do with standard Western notation, like large prime fractions like seven over eleven.” For him, the current syntax lacked the precision to notate microtones, or notes in between notes, so he left a comment in his final score to communicate what he was not able to notate, “approximation of Arabic ‘maqam sabah’ without 3/4 tones”.

5.1.2 QWERTY Keyboard as Input. Participants differed in how they viewed the mechanical process of typing musical notes on a computer keyboard. P1, P3, and P6 thought typing was simple and intuitive. For example, P6 explained her strategy: “Sometimes I improvise, mainly it’s like a no right or wrong type of thing... If I have something that comes in my head I’m like, ‘You know what, I hear something, let me type it.’ ” In contrast, P2, P4, and P5 described how typing ABC text remained tedious and required concentration after six weeks. Their attention to locating correct characters overwhelmed their ability to consider larger musical structures. For example, P2 said, “Most notation programs, we use a musical keyboard, we are playing the notes... You don’t have to think so much about typing octaves, sharps and flats. You are thinking about melody line, harmony line,” and later asked “Am I concentrated on the music or just on how to write the notation?”

The difficulties typing syntax led some participants to come up with new strategies or cut back on elaborate composition goals. P5 worked iteratively and divided her final composition process into multiple steps. First, she wrote down all the pitches, and later she incorporated other musical elements. “I started playing with what I had written down, and adding durations and rests and holds and ties, and actually that sort-of brought it all together for me.”

When asked to compare learning SoundCells to other systems, participants with prior experience reported that SoundCells requires memorizing syntax while GUI-based systems require memorizing keyboard shortcuts. P3 reflected on the difficulty of learning shortcuts in an early version of Lime Aloud more than 20 years ago and preferred SoundCells. In contrast, P2 had a strong preference towards keyboard shortcuts in GUI-based systems because keyboard shortcuts can be chosen based on physical location and not only their function in entering text. For example, P2 highlighted how Lime Aloud maps many important functions to the top row of the keyboard. While P2 generally preferred keyboard shortcuts, he appreciated how chords in SoundCells are typed out note-by-note instead of played simultaneously, which may be difficult for some.

5.1.3 Challenges Notating Rhythms. Nearly all participants shared exercises or drafts of their final compositions with incomplete measures in which the combined durations of notes made a measure too long or too short. The difficulty notating rhythm became apparent as the study progressed. With quarter notes in 4/4 time, the extent of rhythm covered in Class 1, all measures consist of four notes surrounded by two barlines, e.g. | a b c d |. However, when writing rhythms expressed as fractions or multipliers, one can’t simply count to 4, but instead needs to track the sum of all written durations, e.g. | a1/3 b2/3 c1/4 d1/4 e1/2 f3/2 g1/2 | is a
ASSETS ’22, October 23–26, 2022, Athens, Greece Payne et al.

Table 3: Characteristics of Final Compositions: This breakdown provides a rough indication of notation complexity and is not a measure of participant creativity or effort.

ID | Measures | Notes/Chords/Rests | Accidentals¹ | Unique Durations² | Expression Markings³ | Tempo, Meter, Key Changes | Comments
P1 | 13 | 173/5/0  | 44 | 8 | 30 | 5 | 5
P2 | 16 | 70/0/0   | 15 | 5 | 0  | 0 | 0
P3 | 16 | 230/39/9 | 8  | 5 | 8  | 0 | 0
P4 | 18 | 103/6/1  | 5  | 4 | 4  | 0 | 0
P5 | 8  | 24/0/1   | 0  | 5 | 0  | 0 | 1
P6 | 13 | 76/0/0   | 0  | 3 | 4  | 0 | 0

1. Accidentals are sharp, flat, or natural symbols that raise and lower notes, and can indicate melodic complexity.
2. Durations are the lengths of notes, e.g. quarter note, eighth note, and can indicate rhythmic complexity.
3. Expression markings are instructions to perform a note a specific way, e.g. accented, short, long, etc.
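For readers unfamiliar with ABC, the two mechanics participants wrestled with above, octave modifiers (§5.1.1) and measure completeness (§5.1.3), can be sketched in a few lines of Python. This is an illustrative sketch only, not SoundCells’ implementation, and the simplified checker assumes every duration is written as an explicit fraction:

```python
from fractions import Fraction

def abc_octave(token: str) -> int:
    """Octave of a bare ABC pitch token. Lowercase letters start in the
    fifth octave and uppercase in the fourth; each apostrophe raises the
    note one octave and each comma lowers it one."""
    octave = 5 if token[0].islower() else 4
    return octave + token.count("'") - token.count(",")

def measure_is_complete(durations, beats=4):
    """True when the written durations (fraction strings) sum to the
    meter's beat count, the bookkeeping participants had to do by hand."""
    return sum(Fraction(d) for d in durations) == beats

# The paper's examples: a' is a sixth-octave A, C,, a second-octave C.
print(abc_octave("a'"), abc_octave("C,,"))                               # 6 2
# | a1/3 b2/3 c1/4 d1/4 e1/2 f3/2 g1/2 | sums to 4 beats in 4/4.
print(measure_is_complete(["1/3","2/3","1/4","1/4","1/2","3/2","1/2"]))  # True
```

Real ABC adds multipliers, default unit lengths, chords, and grace notes on top of this, which is exactly the bookkeeping that overwhelmed participants.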

valid measure because the fractions sum to 4. None of the participants had any difficulty interpreting the syntax or adding numbers, but both tasks combined appeared to be significantly challenging. For example, P4 said, “people, everybody, not just me, are going crazy trying to figure out what the character is on the line, and then they have to make sure that they’re still keeping track, that they don’t lose their train of thought.” As evidence of her statement, both P2 and P5 told us they avoided complex rhythm in their final compositions, finding the effort not worth it. In her final composition, P4 requested help after two failed attempts to make a measure complete because, complicating things further, the grace notes she notated needed to be excluded from the count. P6 emailed us an early draft of her final composition with significant duration errors and chose to restart rather than revise after we highlighted the mistakes. P1 was the only participant who did not submit music with incorrect rhythms. His final composition also used the most complicated rhythms and meters (Table 3, Figure 2). As we discuss further below, P1 attributes his accuracy to strategies that incorporate text-to-speech (§5.2.2) and braille output (§5.2.3), derived from prior braille notation and programming experience.

5.1.4 Participant Recommendations for Improving Text Input. Our participants suggested several alternative modes of input to address these challenges. Those with more experience using music software (P1 and P2) suggested “musical typing,” a feature common in GUI-based music software, where the arrangement of white and black keys is superimposed on the QWERTY keyboard. If SoundCells supported a “musical typing mode,” the ‘a’ key would typically produce a letter ‘C’ or middle C, while the ‘w’ key would produce ‘_D’ or D-flat. When asked if separating keys from their designated character could get confusing, P2 responded that he didn’t think so, “I’m not thinking about the letter. I’m just thinking about the layout of the keyboard.” However, participants who were new to notation (P4 and P5) liked the current keyboard input, but wished the interface could offer more aid with note entry. For example, P5 said, “The idea of the piano as a keyboard was useless to me, but perhaps having some sort of popups asking what you want and just being able to click, that would make it easier to notate.” Separately, to support fluent braille music readers, P3 requested the ability to edit braille directly, citing an experience, “I found my mistake and I wanted to delete a wrong note while I was in the Show-Braille mode. I was unable to do that. I had to go back to the letter.”

5.2 Outputs: When and How Participants Used Different Modalities

Participants used audio and tactile outputs while editing to track progress and identify mistakes. We define “mistakes” in this context loosely as moments that break musical rules, such as measures that are incomplete (§5.1.3), or moments that don’t sound “right.” These mistakes exclude syntax errors, which, as indicated above (§5.1), were not a common issue for participants. Participants reported that real-time feedback and visual/braille score renderings were useful aids beyond the raw syntax. However, orientation in complex scores and mistake correction are nontrivial tasks in SoundCells. For example, P4 said, “the more complex it gets the longer it takes for me to explore it and to see what I can do to make it presentable.”

Participants identified affordances of different output modalities. For example, P6 said, “Braille music, I feel like it’s good whenever to read rhythms. Screen reader to check what bar you are on and whether it is incomplete. Music playback for rhythm check, and adding the notes you want to add.” The remainder of this section highlights features of each output before moving on to participant suggestions.

5.2.1 Music Output: Listening for Creative Expression. All participants used SoundCells’ music playback, and many highlighted the instantaneous output, in which notes sound as they are typed, as their favorite feature. For example, P3 said “Well, when you type something and you hear right away what you typed, it really motivates you and you hear it and it brings me in such a mood that, ‘Oh, I hear it, I will now do more and more.’ ” Most participants played their entire score back often to inform their notation process, while P1 tended only to use it near the end of completing a challenge: “Sometimes I use the audio if I want to know what it sounds like.” P4 and P6 are both musicians with perfect pitch, the ability to discern the exact pitch and octave of a note just by hearing its frequency. They found note playback especially helpful since it conveyed most information depicted in text-to-speech. P4 was the only participant who experimented with Loop Mode, in which a measure repeats indefinitely as it is edited, stating, “I love loop mode. Something

to meditate on.” Though we asked, none of the screen reader users indicated difficulty hearing note playback, a finding from an earlier study of SoundCells [50], or experimented with the option to delay notes until after a screen reader speaks the text. Despite their enthusiasm, participants identified some limitations in the built-in synthesizer. P2 stated that it does not sound as good as other programs, while P1 and P3 found it less audible in lower frequencies. P1 and P4 requested that we incorporate different instruments.

5.2.2 Text-To-Speech Feedback. All participants, except P5, operated SoundCells with screen readers. P1 only used a screen reader once while writing a complex part of his final composition. Multiple participants noted how difficult it could be to discern long sections of ABC text when voiced out loud, character by character or meaningless word, through their screen readers. For example, P4 expressed frustration, saying, “the screen reader reads letters, and it doesn’t read notes, and it doesn’t read music signs. It just reads even the whole words that are like nonsense.” She further compared the relationship between raw ABC text and its musical meaning to letters that constitute words in English, saying “It would be helpful if the screen reader would read actual names of notes and what kind of note... In a regular word document, if you wrote ‘I went to the store and I bought apples, bananas, clementines,’ it would say that. So it would be helpful if it read ‘A eighth note - A dotted quarter - Barline.’ You know, that kind of thing.” P6 felt similarly, wishing that her screen reader would convey meaning beyond raw syntax, but she did appreciate that VoiceOver recognizes fractions, “which is very helpful because it helps with the math side of music.”

Tell-me commands (§3.2) were thought to be helpful under certain circumstances. P4 used the diagnostics view and tell-me commands frequently. For example, when challenged to find an incomplete measure during Interview 2, P4 opened the diagnostics view, iterated to the first warning, jumped to the corresponding measure, extended one note’s duration, and finally used a tell-me command to verify the fix. She thought that the tell-me command for describing an entire measure was useful, but the command for describing the current note was unnecessary since she understood the syntax: “I have no problem remembering the notes and stuff like that.” P1, who did not use a screen reader for most of the challenge activities, indicated using tell-me commands for a very complex section of his composition saying, “that was actually a big example of where I did need to use the question mark to figure out where I was or if the measure was full or not because those were long measures. Yeah, with a complex meter I think it’s very useful.” P6 used the HTML text area, which didn’t support tell-me commands (§3.1.3), but instead she often navigated below the editor where the same information is displayed.

5.2.3 Braille Music Accessed via Braille Displays. P1, P3, and P6 found that braille displays were crucial to their notation processes. For example, P6 used her braille display to track her location in her own compositions and in others’ work, “I’ve tried using it without a braille display and I can’t tell what notes are in there already. I can still use my ear, but if I was working on something very hard it’s going to be a lot more complicated.” P1, who as indicated above (§5.1.3) used a braille display to notate rhythms, understood the difficulties faced by other participants to be exacerbated by their screen readers: “When I’m using a braille display, it’s relatively easy with small projects to just count measures and figure out how things are within because I can view multiple layers at one time. It’s not like only using speech and having to scroll through and try to analyze each note each time, and try to query where you are at any given time with the question mark commands.” P1 further reflected on using a braille display to track indentations in Python scripts where “speech was just too hard.”

While they seemed to aid navigation, braille displays did not universally solve difficulties. In Interview 2, we asked P3 to try and identify the location of an incomplete measure in an excerpt she submitted. P3 opened the braille music view and tracked measures across multiple lines. She told us how difficult it was to remember the measure she was on, since she only had access to one line of music at a time, and each line could have a different number of measures depending on their length. Ultimately, she could not find the subtle mistake until we suggested she open the diagnostics view. P2, who described himself as a slower braille reader compared to some of the other participants, had a braille display, but chose not to use it in SoundCells or any notation software. Generally, he finds that lengthy braille scores can provide unnecessary detail, “Sometimes half the line is just the explanation. But where’s the note? I want to know the note... So, I heard some people just make their own [custom Braille Music scores], just get to rid of those letters on the music line.”

5.2.4 Print (Visual) Music. P5, a participant who identifies as having low vision and who uses SoundCells with a screen magnifier, was the only participant to read the print rendering in conjunction with music. She described: “Mostly looked at the text. I listened to the score, and if I felt there was something wrong I went back to the text. I looked at the score if there were errors too, see if a measure was underfilled or overfilled and see how it looked as a score.” At times, the print score would reveal mistakes visible to P5 that were not immediately apparent in the text, “Obvious when I looked at the score, not obvious looking at the ABC.” Because of P5’s system-wide magnification preferences, she did not change the font size or score size within SoundCells. P5 felt that the layout of the webpage, in which the print score is located below the text, resulted in too much scrolling.

5.2.5 Participant Recommendations for Improving Outputs. Participants offered many suggestions to improve outputs, and focused on the modality they primarily used. Regarding music output, P2 requested that a metronome or click track play along to convey durations relative to the beat and time signatures, which can’t always be determined from hearing the music alone. Regarding text-to-speech, P4 requested more detailed information when using the tell-me command to debug incomplete measures: “It should specify WHY it’s underfilled, instead of making people go crazy trying to figure it out.” Regarding braille, P3 requested a view of the braille music better suited to single-line braille displays, in which measures are preceded by measure numbers and separated by new lines instead of spaces so that navigation keys iterate measure by measure. Finally, regarding visual output, P5 requested the ability to re-position the visual score and zoom in on the selected measure to prevent needing to repeatedly scroll up and down. An additional suggestion made by P2 and P5 was for feedback to occur the instant

they make a mistake, e.g. an error sound upon entering a note that is too long.

5.3 Envisioned Usefulness of SoundCells

Five participants envisioned continuing to use SoundCells in the future. They gave us suggestions for who else may use it and what features would be necessary to support themselves and others. P1 and P2 highlighted different benefits of the tool, including the shareability of text and the capacity to export different file types. Each hoped to keep testing SoundCells and contributing to its development. P2 thought SoundCells could be useful for notating short music and exercises in a classroom environment; however, for large projects, he intends to keep using Lime Aloud. Similarly, P3 noted that most BVI musicians use Lime Aloud for notation, but that she finds SoundCells easier, and envisions using SoundCells to transcribe piano music excerpts for her students. P6 not only aims to continue using SoundCells, she shared it with a blind musician in a different country, directing them to the built-in tutorial. P4 told us she is continuing to extend her final composition, and felt motivated after learning SoundCells to attend a webinar on accessibility and code and the American Printing House (APH) National Coding Symposium. Finally, P5, who enrolled in the course to learn something new, told us that she enjoyed the experience, but does not plan to continue using SoundCells or learning any music software, saying “I doubt it, but who knows?”

While SoundCells worked well as a tool for novices to create new compositions, more experienced participants found they could accomplish less in SoundCells than in the notation software Lime Aloud. These experiences indicate large potential for SoundCells if it supports generating more advanced braille and print music notation, or suggest that it could be better suited for novices who transition to more advanced tools. Currently, SoundCells supports single-line, single-instrument scores, while participants requested the ability to notate for multiple voices and instruments. For example, P3, a piano instructor, needs to be able to notate left-hand and right-hand piano parts along with fingerings that convey to students where to place their fingers. P1 and P6 identified a few musical symbols covered in Week 4 that appeared incorrectly or not at all in the braille music score. P1, a braille music expert, sent us a braille file containing his final composition in which he manually adapted the version generated by SoundCells. His file contains stylistic improvements including more measure numbers and the removal of some unnecessary text. Additionally, P1’s composition could not be processed by the music21 converter because a few very long measures in 10/4 time contained so many notes and symbols they could not fit on a 40-cell line of braille. (The embossed score in Figure 2 splits each 10/4 measure into five 2/4 measures.) P1 used this as an example of the limits of automation in braille music transcription: “According to the code book, the break should occur at a logical position in the measure.” Since what is logical is often subjective and context-dependent, in this case based on the cultural tradition P1’s rhythm draws upon, “that’s where [his] editing comes in handy.”

6 DISCUSSION

Participants overwhelmingly enjoyed composing music in SoundCells as they incorporated their own workflows and accessibility preferences. In this section, we suggest design recommendations, reflect on how the structure of the syntax within SoundCells hindered usability, and report how our methodology facilitated creative expression.

6.1 Input: Design Considerations for Text Entry and Navigation Inspired by Visual Music Software

SoundCells was designed to be an alternative to GUI-based music software immediately accessible to screen reader and braille display users. However, participants expressed different comfort levels with text (§5.1.2), while the unstructured editing environment introduced new challenges, including difficulties understanding and remembering notation (§5.1.1) and typing measures with the correct number of notes (§5.1.3). Most participants suggested improved outputs that aid in navigation and identifying mistakes, but new features for structured editing could help reduce challenges.

Blind producers who work in digital audio workstations (DAWs) commonly access musical data (MIDI) with an event view, or long list of MIDI events [52]. SoundCells’ diagnostics view could be extended to support a wider range of information beyond errors and warnings. Going beyond what DAWs support, SoundCells could provide different filters, e.g. notes, measures, errors, etc., verbosity options for how much detail to provide about the note, and the option to play back music and/or text-to-speech. Additional context-specific commands drawn from keyboard shortcuts present in other tools, like doubling/halving a duration or transposing a note, might support users’ workflows for editing music. An understanding of how users work between raw text and filterable data could have implications for other semi-structured text environments.

To aid in entering rhythms, participants suggested an error sound/message (§5.2.5), but GUI-based tools, in contrast with text editors, prevent incorrect measures from occurring in the first place. For example, Lime Aloud automatically fills an incomplete measure with rests and ignores notes that would make a measure too long. SoundCells could auto-complete underfilled measures with rests and prevent the entry of text that would overfill a measure.

New additions to SoundCells that aid text editing, such as an event view or automatic error prevention, could have implications for other coding environments. A significant trend in educational coding environments like Scratch [60] has been to use a visual canvas with draggable code blocks, but blocks-based environments introduce many accessibility barriers [39, 45]. Structured text interactions inspired by other tools for creative expression promote usability and accessibility within creative coding environments.

6.2 Designing Interactions Across Multiple Output Modalities

The incorporation of cross-modal music representations in SoundCells enabled accessibility and flexible use while introducing challenges. Specifically, we observed difficulties navigating and interpreting screen reader output.

6.2.1 Adapting Braille and Print Representations for Improved Navigation. SoundCells displays entire braille and print scores similar to how Overleaf displays a full document next to its LaTeX source. However, this adherence to replicating physical format impeded access and navigation in browsers. Instead, altered or reduced views of music may address difficulties. For example, P3 suggested a digital braille score containing one measure per line prefaced by its measure number (§5.2.5). This would benefit braille display users since they only read one line at a time, and buttons for iterating line by line would apply to iterating measure by measure. In the visual output, a reduced score containing only the current measure or local group of measures could be displayed. This miniature score would always include important context, like measure number and key signature, that can be obscured by enlargement.

Furthermore, designers should consider not only how users navigate within outputs but also between outputs. We observed how mistake detection could lead to complex workflows: P3 described hearing a mistake, then navigating multiple lines of braille music on a single-line braille display to find it, and then finally returning to the text to correct it (§5.2.3). Ultimately, systems designed to support multiple modalities must consider the affordances of each and ensure users are not constantly in a state of reorientation.

6.2.2 Context-Aware Screen Reader Output. Screen reader output presented additional challenges. Participants highlighted the concentration and extra effort required to understand streams of raw text spoken through their screen readers and interpret the state of their music accordingly (§5.2.2). Accessibility researchers developing systems for novice BVI coders [69, 70] have encountered a similar challenge in code debugging workflows and developed new screen reader paradigms. For example, the “talking debugger” inside SodBeans can describe behaviors during program execution, e.g. stating the underlying value of a variable rather than its name [67]. SoundCells, similarly, could interpret syntax and account for musical context after an event is typed. For example, ‘fifth octave B flat quarter note’ could be read when _b is typed in C major or b is typed in F major. Such a feature may require implementing custom screen reader behaviors into SoundCells. However, SoundCells, unlike SodBeans, is developed for browsers, where screen readers typically relay the content the user is focused on, and it isn’t clear how web developers might access system accessibility APIs.

6.2.3 Customization for Print and Embossed Music. Customization options for physical scores (printed and embossed) could provide significant performance and educational value. The musician that we partnered with for the final performance requested music with three systems per 11x17 page (Figure 2), but in one case requested we shrink the music to fit on four pages (§4.1.2). While the SoundCells’ score size slider made this possible, we still needed to use “guess and check” to resize the music and then reference the resulting layout of the PDF. Instead, end-users should be able to specify systems-per-page and fine tune the layout based on an estimated page count before needing to check the PDF. Prior music education research has shown that blind musicians do not engage with music the same way that sighted musicians do, and instead use notation as a memorization tool often along with audio recordings [22, 32, 49]. Variations of a score may be desirable across different stages of learning, memorization, and rehearsal. Furthermore, being able to generate multiple versions may have educational value in settings where braille music is being taught. For example, a teacher may want to remove expression markings before a student has learned them, or, as P3 suggested, add optional piano fingerings at the start of a learning process.

6.3 Ocularcentrism in ABC Syntax

While most participants did not express major concerns understanding ABC text within SoundCells, many found it to be unintuitive and hard to remember. Upon reflection, we think this is a consequence of ABC text’s fundamental design. P2 and P3 questioned why seemingly random characters – commas/apostrophes, underscores/carets – are used to lower and raise notes. These characters have no inherent musical meaning. Instead, their spatial position visually signifies their operations. For example, characters placed low relative to a pitch letter communicate lowering operations (__F,), while higher characters communicate raising (^a''). This visual cue benefits sighted readers but is lost in braille and text-to-speech output. Not only can syntax convey meaning differently across visual, auditory, and tactile modalities, it presents usability implications when inputted on a QWERTY keyboard. P2 described how in other GUI-based music software, the most common key commands for editing notes lie in one easy-to-reach part of the keyboard, e.g. the top row in Lime Aloud [14]. While ,/' and _/^ characters have parallel effects in ABC, they are located across the keyboard, and some require holding shift to input. The scattered layout of common symbols may contribute to the slow text entry P2 and others perceived while notating music in SoundCells (§5.1.2).

While the final compositions were relatively short, 24–230 notes (Table 3), many classical compositions contain thousands of notes. The ability to type quickly and efficiently is crucial to advanced notation workflows. ABC text was chosen for SoundCells in part due to its 20-year existence and widespread use [73]. An alternative notation would need to convert to ABC to make it universally shareable and would not benefit from existing online resources. P1 and P2 proposed alternative entry methods such as connecting a MIDI keyboard, or mapping notes on the QWERTY keyboard to resemble a piano (§5.1.2). While aiding speed, new entry methods would not aid legibility. Looking ahead, an important consideration for domain-specific languages is that visual cues alone should not define syntax behavior, and that keyboard position and other intuitive properties of characters should be considered. For example, P2 suggested that flats, sharps, and natural signs be added with -/+/= respectively. Not only are these keys adjacent on the QWERTY keyboard, they are meaningfully connected with their musical operations.

6.4 High Engagement Through Creative Expression, Clear Motivation, and Remote Study

Despite the potential for many challenges, this study yielded high engagement across diverse participants. We reflect on three features of our experimental design: 1) giving participants opportunities for self-expression, 2) positioning technology interactions towards a clear goal, and 3) conducting research remotely. Each of these features, we think, contributed to the success of this experiment. First, our study revolved around open-ended music expression, while

weekly creative prompts provided guidance and constraints. All participants, busy adults and one high school student, attended every class and devoted hours of effort working independently. This was entirely voluntary, as participants were only compensated $75 for research activities consisting of three interviews. We view the high engagement as validation that music making is something participants enjoy, but also that our bare-bones curriculum, largely documentation with weekly challenges (Table 1), gave them flexibility to learn and work on their own terms. This design expanded the amount of material we could cover during interviews and reduced pressure during synchronous interactions for participants to “be creative” under observation. Instead, they discussed ideas, shared complete and incomplete projects, and demonstrated issues they had run into. Second, the final concert motivated participants to test the boundaries of new technology and notate pieces that reflected their aesthetics. By recruiting a professional musician, the performance gave the course credibility and weight, but was not overwhelmingly high-stakes since it was online and internal.

Finally, we chose to run this experiment online despite the fact that our community partner FMDG School had returned to conducting many activities in-person. Being online reduced barriers and enabled participation from individuals who would not have joined otherwise. P4 and P5, who had never notated original music or used music software before, participated just because the topic seemed interesting. Being remote meant we had to ensure participants could use SoundCells independently. This included making course materials screen reader accessible and recording Zoom sessions for later listening. Ultimately, we felt remote research worked well in facilitating longitudinal, high-commitment activities with synchronous and asynchronous interactions.

7 LIMITATIONS AND FUTURE WORK

While this study was set within a learning environment, we did not evaluate our curriculum or student learning. More research is needed to evaluate educational gains. SoundCells currently supports a limited range of what ABC text and braille music can represent. For example, it supports music for solo string or wind instruments but not music for orchestra or choir. We saw significant participant interest in writing more complex compositions, especially for notating piano music and more expression markings. To support advanced notation, music21, the open-source Python library SoundCells uses for ABC parsing and braille conversion [13], would need further development. In the same way that this study extended prior SoundCells research, we would like to further test and evaluate the capabilities of text and multimodal output in SoundCells, following future system improvements, by commissioning BVI composers to notate even more substantial works with more time.

8 CONCLUSION

In this work, we shared findings from a six-week remote study in which six BVI musicians used SoundCells to notate music leading to a final live performance of their work. SoundCells proved to be accessible to participants who possessed a range of vision ability, music experience, AT preference, and composition style. However, participants identified difficulties working within the text editor, like typing unintuitive syntax and mentally calculating complex rhythms, while navigating and identifying mistakes across modalities could prove demanding. We discussed design recommendations for improving text input and multimodal output in SoundCells, identified visual signifiers as a cause of difficulty learning ABC syntax, and reflected on how our methods led to high engagement across all participants.

ACKNOWLEDGMENTS
We thank the FMDG School faculty and students for their support and insight into the design of SoundCells. We also thank members of NYU Ability Project and Vertically Integrated Projects, especially Katrina Lee.

REFERENCES
[1] Samuel Aaron. 2020. Sonic Pi v3.2 Released! https://www.patreon.com/posts/sonic-pi-v3-2-34414152
[2] Samuel Aaron, Alan F Blackwell, and Pamela Burnard. 2016. The development of Sonic Pi and its use in educational partnerships: Co-creating pedagogies for learning computer programming. Journal of Music, Technology & Education 9, 1 (2016), 75–94.
[3] Joseph Michael Abramo and Amy Elizabeth Pierce. 2013. An ethnographic case study of music learning at a school for the blind. Bulletin of the Council for Research in Music Education 195, 195 (2013), 9–24. https://doi.org/10.5406/bulcouresmusedu.195.0009
[4] Fabiha Ahmed, Dennis Kuzminer, Michael Zachor, Lisa Ye, Rachel Josepho, William Christopher Payne, and Amy Hurst. 2021. Sound Cells: Rendering Visual and Braille Music in the Browser. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 89, 4 pages. https://doi.org/10.1145/3441852.3476555
[5] Ameer Armaly, Paige Rodeghero, and Collin McMillan. 2018. AudioHighlight: Code Skimming for Blind Programmers. In 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME) (Madrid, Spain). IEEE, 206–216. https://doi.org/10.1109/ICSME.2018.00030
[6] Safa Arooj, Shaban Zulfiqar, Muhammad Qasim Hunain, Suleman Shahid, and Asim Karim. 2020. Web-ALAP: A Web-based LaTeX Editor for Blind Individuals. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–6.
[7] Avid. 2021. Music Notation Software - Sibelius. Retrieved May 1, 2021 from https://www.avid.com/sibelius
[8] Catherine M. Baker, Lauren R. Milne, and Richard E. Ladner. 2015. StructJumper: A Tool to Help Blind Programmers Navigate and Understand the Structure of Code. Association for Computing Machinery, New York, NY, USA, 3043–3052. https://doi.org/10.1145/2702123.2702589
[9] Cynthia L. Bennett, Jane E, Martez E. Mott, Edward Cutrell, and Meredith Ringel Morris. 2018. How Teens with Visual Impairments Take, Edit, and Share Photos on Social Media. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173650
[10] Cynthia L. Bennett, Abigale Stangl, Alexa F. Siu, and Joshua A. Miele. 2019. Making Nonvisually: Lessons from the Field. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 279–285. https://doi.org/10.1145/3308561.3355619
[11] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (1 2006), 77–101. https://doi.org/10.1191/1478088706qp063oa
[12] John Chambers. 2021. JC’s ABC Tune Finder. Retrieved May 1, 2021 from http://trillian.mit.edu/~jc/cgi/abc/tunefind
[13] Michael Scott Cuthbert. 2021. music21 Braille Translate. Retrieved May 1, 2021 from https://web.mit.edu/music21/doc/moduleReference/moduleBrailleTranslate.html
[14] Dancing Dots. 2020. Dancing Dots: Accessible Music Technology for Blind and Low Vision Performers since 1992. https://www.dancingdots.com/
[15] Maitraye Das, Katya Borgos-Rodriguez, and Anne Marie Piper. 2020. Weaving by Touch: A Case Analysis of Accessible Making. Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3313831.3376477
[16] Maitraye Das, Anne Marie Piper, and Darren Gergle. 2022. Design and Evaluation of Accessible Collaborative Writing Techniques for People with Vision Impairments. ACM Trans. Comput.-Hum. Interact. 29, 2, Article 9 (Jan 2022), 42 pages. https://doi.org/10.1145/3480169
Empowering Blind Musicians to Compose and Notate Music with SoundCells ASSETS ’22, October 23–26, 2022, Athens, Greece
[17] Duxbury Systems. 2021. Duxbury DBT: Braille Translation Software. Retrieved May 1, 2021 from https://www.duxburysystems.com
[18] Associazione Giuseppe Paccini ETS. 2021. Braille Music Editor. Retrieved May 1, 2021 from https://braillemusiceditor.com/
[19] Jason Freeman, Brian Magerko, Tom McKlin, Mike Reilly, Justin Permar, Cameron Summers, and Eric Fruchter. 2014. Engaging underrepresented groups in high school introductory computing through computational remixing with EarSketch. In Proceedings of the 45th ACM technical symposium on Computer science education. 85–90.
[20] Emma Frid. 2019. Accessible digital musical instruments—A review of musical interfaces in inclusive music practice. Multimodal Technologies and Interaction 3, 3 (2019). https://doi.org/10.3390/mti3030057
[21] Emma Frid and Alon Ilsar. 2021. Reimagining (Accessible) Digital Musical Instruments: A Survey on Electronic Music-Making Tools. In Proceedings of the International Conference on New Interfaces for Musical Expression. Shanghai, China, Article 28, 20 pages. https://doi.org/10.21428/92fbeb44.c37a2370
[22] Danni Gilbert. 2018. “It’s Just the Way I Learn!”: Inclusion from the Perspective of a Student with Visual Impairment. Music Educators Journal 105, 1 (2018), 21–27. https://doi.org/10.1177/0027432118777790
[23] GNU Project. 2021. LilyPond. Retrieved May 1, 2021 from https://lilypond.org
[24] David Goldstein. 2000. Music pedagogy for the blind. International Journal of Music Education 35, 1 (5 2000), 35–39. https://doi.org/10.1177/025576140003500112
[25] D. Goto, T. Gotoh, R. Minamikawa-Tachino, and N. Tamura. 2007. A Transcription System from MusicXML Format to Braille Music Notation. Eurasip Journal on Advances in Signal Processing 2007, 1 (Jan. 2007), 152. https://doi.org/10.1155/2007/42498
[26] Toshiyuki Gotoh, Reiko Minamikawa-Tachino, and Naoyoshi Tamura. 2008. A Web-Based Braille Translation for Digital Music Scores. In Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility (Halifax, Nova Scotia, Canada) (Assets ’08). Association for Computing Machinery, New York, NY, USA, 259–260. https://doi.org/10.1145/1414471.1414527
[27] Lucy Green and David Baker. 2017. Digital Music Technologies: The Changing Landscape. In Insights in Sound: Visually Impaired Musicians’ Lives and Learning. Routledge, Chapter 9, 158–173. https://doi.org/10.4324/9781315266060
[28] Thomas Haenselmann, Hendrik Lemelson, Kerstin Adam, and Wolfgang Effelsberg. 2009. A tangible MIDI sequencer for visually impaired people. In Proceedings of the seventeen ACM international conference on Multimedia - MM ’09. ACM Press, New York, New York, USA, 993. https://doi.org/10.1145/1631272.1631485
[29] Marijn Haverbeke. 2018. Code editor screen reader accessiblity survey. Retrieved May 1, 2021 from https://discuss.codemirror.net/t/code-editor-screen-reader-accessiblity-survey/1790
[30] Marijn Haverbeke. 2021. Screen reader response for backspace is flaky. Retrieved Dec 1, 2021 from https://github.com/codemirror/codemirror.next/issues/563
[31] Marijn Haverbeke. 2022. Screen Reader Demo. Retrieved May 1, 2021 from https://codemirror.net/6/screenreaderdemo/
[32] Virginia A. Jacko, Jin Ho Choi, Audrey Carballo, Brian Charlson, and J. Elton Moore. 2015. A New Synthesis of Sound and Tactile Music Code Instruction in a Pilot Online Braille Music Curriculum. Journal of Visual Impairment & Blindness 109, 2 (2015), 153–157. https://doi.org/10.1177/0145482X1510900212
[33] Nadine Jessel. 2015. Access to musical information for Blind People. In International Conference on Technologies for Music Notation and Representation. Institut de Recherche en Musicologie (IReMus), 232–237.
[34] Aaron Karp and Bryan Pardo. 2017. HaptEQ. In Proceedings of the 12th International Audio Mostly Conference on Augmented and Participatory Sound and Music Experiences. ACM, New York, NY, USA, 1–4. https://doi.org/10.1145/3123514.3123531
[35] Claire Kearney-Volpe and Amy Hurst. 2021. Accessible Web Development: Opportunities to Improve the Education and Practice of Web Development with a Screen Reader. 14, 2, Article 8 (2021), 32 pages. https://doi.org/10.1145/3458024
[36] Mario Lang. 2009. FreeDots: MusicXML to Braille Music translation. Retrieved May 1, 2021 from https://delysid.org/freedots.html
[37] Matthias Leopold. 2006. HODDER–a fully automatic braille note production system. In International Conference on Computers for Handicapped Persons. Springer, 6–11.
[38] Alex Michael Lucas, Miguel Ortiz, and Franziska Schroeder. 2019. Bespoke Design for Inclusive Music: The Challenges of Evaluation. Proceedings of the International Conference on New Interfaces for Musical Expression (2019), 105–109. https://novationmusic.com/keys/launchkey%0Ahttp://www.nime.org/proceedings/2019/nime2019_021.pdf
[39] Stephanie Ludi and Mary Spencer. 2017. Design Considerations to Increase Block-based Language Accessibility for Blind Programmers Via Blockly. 3, 1 (July 2017), 119–124. https://doi.org/10.18293/vlss2017-013
[40] Bill Manaris, Blake Stevens, and Andrew R. Brown. 2016. JythonMusic: An environment for teaching algorithmic music composition, dynamic coding and musical performativity. 9, 1 (May 2016), 33–56. https://doi.org/10.1386/jmte.9.1.33_1
[41] Christina Matawa. 2009. Exploring the musical interests and abilities of blind and partially sighted children and young people with Retinopathy of Prematurity. British Journal of Visual Impairment 27, 3 (9 2009), 252–262. https://doi.org/10.1177/0264619609106364
[42] Sean Mealin and Emerson Murphy-Hill. 2012. An exploratory study of blind software developers. In 2012 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). 71–74. https://doi.org/10.1109/VLHCC.2012.6344485
[43] Cecily Morrison, Nicolas Villar, Anja Thieme, Zahra Ashktorab, Eloise Taysom, Oscar Salandin, Daniel Cletheroe, Greg Saul, Alan F Blackwell, Darren Edge, Martin Grayson, and Haiyan Zhang. 2018. Torino: A Tangible Programming Language Inclusive of Children with Visual Disabilities. 35, 3 (Oct. 2018), 191–239. https://doi.org/10.1080/07370024.2018.1512413
[44] Frederick W Moss, Jr. 2009. Quality of Experience in Mainstreaming and Full Inclusion of Blind and Visually Impaired High School Instrumental Music Students.
[45] Aboubakar Mountapmbeme and Stephanie Ludi. 2021. How Teachers of the Visually Impaired Compensate with the Absence of Accessible Block-Based Languages. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 4, 10 pages. https://doi.org/10.1145/3441852.3471221
[46] National Library Service for the Blind and Print Disabled. 2021. Music Instructional Materials and Scores. Retrieved April 1, 2022 from https://www.loc.gov/nls/about/services/music-instructional-materials-scores/
[47] Shotaro Omori and Ikuko Eguchi Yairi. 2013. Collaborative music application for visually impaired people with tangible objects on table. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility - ASSETS ’13. ACM Press, New York, New York, USA, 1–2. https://doi.org/10.1145/2513383.2513403
[48] OpenScore. 2018. OpenScore: One Year On. Retrieved May 1, 2021 from https://openscore.cc/blog/2018/8/20/openscore-one-year-on
[49] Hyu-Yong Park and Mi-Jung Kim. 2014. Affordance of Braille music as a mediational means: significance and limitations. British Journal of Music Education 31, 2 (2014), 137–155. https://doi.org/10.1017/S0265051714000138
[50] William Payne, Fabiha Ahmed, Michael Gardell, R. Luke DuBois, and Amy Hurst. 2022. SoundCells: Designing a Browser-Based Music Technology for Braille and Print Notation. In 19th Web for All Conference (Lyon, France) (W4A ’22). Association for Computing Machinery, New York, NY, USA, Article 11, 12 pages. https://doi.org/10.1145/3493612.3520464
[51] William Payne, Alex Xu, Amy Hurst, and S. Alex Ruthmann. 2019. Non-visual beats: Redesigning the Groove Pizza. In ASSETS 2019 - 21st International ACM SIGACCESS Conference on Computers and Accessibility. Association for Computing Machinery, Inc, 651–654. https://doi.org/10.1145/3308561.3354590
[52] William Christopher Payne, Alex Yixuan Xu, Fabiha Ahmed, Lisa Ye, and Amy Hurst. 2020. How Blind and Visually Impaired Composers, Producers, and Songwriters Leverage and Adapt Music Technology. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–12.
[53] Petrucci Music Library. 2021. IMSLP: Sharing the world’s public domain music. Retrieved May 1, 2021 from https://imslp.org/
[54] Venkatesh Potluri, Priyan Vaithilingam, Suresh Iyengar, Y. Vidya, Manohar Swaminathan, and Gopal Srinivasa. 2018. CodeTalk: Improving Programming Environment Accessibility for Visually Impaired Developers. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3173574.3174192
[55] Ability Project. 2021. SoundCells. Retrieved Dec 1, 2021 from https://soundcells.herokuapp.com/
[56] Ability Project. 2021. SoundCells GitHub Repository. Retrieved Dec 1, 2021 from https://github.com/Huriphoonado/soundcells
[57] Ability Project. 2022. SoundCells Class at Community Music School for the Blind. Retrieved Mar 1, 2022 from https://huriphoonado.github.io/sc-workshop
[58] Ability Project. 2022. SoundCells Class at Community Music School for the Blind: Final Compositions. Retrieved Mar 1, 2022 from https://huriphoonado.github.io/sc-workshop/class/update/2022/04/03/week-6.html
[59] Lauren Race, Joshua A. Miele, Chancey Fleet, Tom Igoe, and Amy Hurst. 2020. Putting Tools in Hands: Designing Curriculum for a Nonvisual Soldering Workshop. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 78, 4 pages. https://doi.org/10.1145/3373625.3418011
[60] Mitchel Resnick, John Maloney, Andrés Monroy-Hernández, Natalie Rusk, Evelyn Eastmond, Karen Brennan, Amon Millner, Eric Rosenbaum, Jay Silver, Brian Silverman, et al. 2009. Scratch: programming for all. Commun. ACM 52, 11 (2009), 60–67.
[61] Toby W Rush. 2019. Braille Music Notator. Retrieved May 1, 2021 from https://tobyrush.com/braillemusic/notator/
[62] Marc Sabatella. 2013. ABC For Blind Musicians. Retrieved May 1, 2021 from https://accessiblemusicnotation.wordpress.com/2013/08/21/abc-for-blind-musicians/
[63] Abir Saha and Anne Marie Piper. 2020. Understanding Audio Production Practices of People with Vision Impairments. In The 22nd International ACM SIGACCESS
Conference on Computers and Accessibility. 1–13.
[64] Alexa F. Siu, Son Kim, Joshua A. Miele, and Sean Follmer. 2019. ShapeCAD: An Accessible 3D Modelling Workflow for the Blind and Visually-Impaired Via 2.5D Shape Displays. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 342–354. https://doi.org/10.1145/3308561.3353782
[65] Lawrence R. Smith, Karin Auckenthaler, Gilbert Busch, Karen Gearreald, Dan Geminder, Beverly McKenney, Harvey Miller, and Tom Ridgeway. 2015. Braille Music Code. American Printing House for the Blind, Louisville, Kentucky.
[66] Andreas Stefik, Roger Alexander, Robert Patterson, and Jonathan Brown. 2007. WAD: A Feasibility study using the Wicked Audio Debugger. In 15th IEEE International Conference on Program Comprehension (ICPC ’07). 69–80. https://doi.org/10.1109/ICPC.2007.42
[67] Andreas Stefik, Andrew Haywood, Shahzada Mansoor, Brock Dunda, and Daniel Garcia. 2009. SODBeans. In 2009 IEEE 17th International Conference on Program Comprehension. 293–294. https://doi.org/10.1109/ICPC.2009.5090064
[68] Andreas Stefik, Christopher Hundhausen, and Robert Patterson. 2011. An empirical investigation into the design of auditory cues to enhance computer program comprehension. International Journal of Human-Computer Studies 69, 12 (2011), 820–838. https://doi.org/10.1016/j.ijhcs.2011.07.002
[69] Andreas Stefik, Richard E. Ladner, William Allee, and Sean Mealin. 2019. Computer Science Principles for Teachers of Blind and Visually Impaired Students. In Proceedings of the 50th ACM Technical Symposium on Computer Science Education (Minneapolis, MN, USA) (SIGCSE ’19). Association for Computing Machinery, New York, NY, USA, 766–772. https://doi.org/10.1145/3287324.3287453
[70] Andreas M. Stefik, Christopher Hundhausen, and Derrick Smith. 2011. On the Design of an Educational Infrastructure for the Blind and Visually Impaired in Computer Science. In Proceedings of the 42nd ACM Technical Symposium on Computer Science Education (Dallas, TX, USA) (SIGCSE ’11). Association for Computing Machinery, New York, NY, USA, 571–576. https://doi.org/10.1145/1953163.1953323
[71] Atau Tanaka and Adam Parkinson. 2016. Haptic Wave. (2016), 2150–2161. https://doi.org/10.1145/2858036.2858304
[72] The DAISY Consortium. 2021. Music Braille. Retrieved July 1, 2021 from https://daisy.org/activities/projects/music-braille/
[73] Chris Walshaw. 2021. abc notation. Retrieved May 1, 2021 from https://abcnotation.com/
[74] Ge Wang. 2012. A History of Programming and Music. Cambridge University Press, 55–71.
[75] World Wide Web Consortium. 2021. Web Content Accessibility Guidelines (WCAG) Overview. Retrieved May 1, 2021 from https://www.w3.org/WAI/standards-guidelines/wcag/
Designing Gestures for Digital Musical Instruments: Gesture
Elicitation Study with Deaf and Hard of Hearing People
Ryo Iijima, University of Tsukuba, Tsukuba, Japan (ryoiijima@digitalnature.slis.tsukuba.ac.jp)
Akihisa Shitara, University of Tsukuba, Tsukuba, Japan (theta-akihisa@digitalnature.slis.tsukuba.ac.jp)
Yoichi Ochiai, University of Tsukuba, Tsukuba, Japan (wizard@digitalnature.slis.tsukuba.ac.jp)
Figure 1: Most frequent gesture proposals for each referent (AR = agreement rate): castanet (AR=0.33), triangle (AR=0.22), guiro (AR=0.40), maracas (AR=0.65), pellet drum (AR=0.29), tam-tam (AR=0.05), jaw harp (AR=0.05), recorder (AR=0.13), guitar (AR=0.31), cymbal (AR=0.40).
ABSTRACT
When playing musical instruments, deaf and hard-of-hearing (DHH) people typically sense their music from the vibrations transmitted by the instruments or the movements of their bodies while performing. Sensory substitution devices now exist that convert sounds into light and vibrations to support DHH people’s musical activities. However, these devices require specialized hardware, and the marketing profiles assume that standard musical instruments are available. Hence, a significant gap remains between DHH people and their musical performance enjoyment. To address this issue, this study identifies end users’ preferred gestures when using smartphones to emulate the musical experience based on the instrument selected. This gesture elicitation study applies 10 instrument types. Herein, we present the results and a new taxonomy of musical instrument gestures. The findings will support the design of gesture-based instrument interfaces to enable DHH people to more directly enjoy their musical performances.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility.

KEYWORDS
Deaf, hard of hearing, music, mobile, gesture elicitation study

ACM Reference Format:
Ryo Iijima, Akihisa Shitara, and Yoichi Ochiai. 2022. Designing Gestures for Digital Musical Instruments: Gesture Elicitation Study with Deaf and Hard of Hearing People. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3517428.3544828

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3544828

1 INTRODUCTION
Musical activities have been reported to provide a wide range of cognitive, social, emotional, and physical benefits [64]. Similarly, smartphones already provide users with opportunities to incorporate musical activities into their daily lives. Owing to the release of digital audio applications for mobile devices and desktop music, anyone can experience the process of creating music and virtually playing an instrument. Although deaf and hard-of-hearing (DHH) people can enjoy music in various ways [24], they typically find it difficult to enjoy digital musical instrument applications. When DHH people play real instruments, they typically feel the sound and rhythm from the vibrations transmitted from the instrument and their own body movements. However, this is not possible with smartphone digital musical instrument applications because they
cannot currently provide the same depth of vibrations as that provided by a real instrument. Furthermore, most applications are controlled using a touch screen, rendering it difficult to employ the body in ways beneficial to music appreciation. Previous research has helped DHH people play musical instruments using sensory substitution devices that convert sound into light or vibrations [52]. However, such approaches assume that a traditional instrument is available, and special hardware is required. As a result, a gap remains between DHH people and their daily musical satisfaction. To fill this gap, this study focuses on augmenting smartphone digital musical instrument applications accordingly.

In our previous work [26], we developed a smartphone-based musical instrument prototype for DHH users based on motion gestures, enabling users to feel the music through smartphone vibrations. This concept has been well received by DHH users. However, the design space remains ambiguous, and there is a lack of taxonomical and control parameters. This paper describes the empirical results of a gesture elicitation study (GES) of the physical expressions displayed when using a smartphone as a musical instrument. In our experiment, given the opportunity to obtain vibrational feedback through the smartphone from gestures, participants were asked to identify the most suitable motions to accompany the music. This work provides a quantitative and qualitative characterization of these gestures and a user-defined gesture set and taxonomy. Furthermore, design implications are summarized.

2 RELATED WORK

2.1 Music and DHH People
DHH people can and do enjoy music [16, 49]. A detailed study on how DHH people relate to music was conducted by Darrow et al. [6]. Glenny [23] and Roebuck [55] are DHH people who work as professional musicians. Beethoven, who is famous for his many masterpieces, suffered from hearing loss as well [11]. These examples suggest that people who are born deaf or have lost their hearing during life can still enjoy music. A few organizations are working toward making music more enjoyable and accessible to DHH people. “Music and the Deaf”¹ encourages DHH people to participate in musical activities. A musical activity called the “White Hand Chorus”² values the participation of children who are hearing and speech impaired or autistic. These projects also contribute to bringing DHH people closer to music. This study aimed to create digital musical instruments that can be enjoyed by DHH people using only existing devices. Therefore, it is expected to facilitate the daily musical enjoyment of DHH individuals.

¹ https://matd.org.uk/
² https://www.elsistemajapan.org/whitehands

2.2 Accessible Digital Musical Instruments (ADMIs)
To make music-making available to a more diverse population, ADMIs are gaining interest in the field of computerized music applications [27]. Previous research has resulted in systems that allow people with neurodevelopmental [38, 47, 54] and mental disorders [48], as well as motor [2, 3] and visual disabilities [32, 46, 50, 63, 78], to enjoy music. However, ADMI research for DHH people is scarce; the review by Ilsar et al. [27] reported that only four ADMIs for DHH people had been developed as of 2020.

Music and speech are fundamentally different in their waveforms; thus, it is difficult to enjoy music with hearing aids designed for speech [5]. Additionally, cochlear implants make it easier to listen to human conversations; however, they are not yet suitable for music [41]. Researchers have worked to improve such devices by designing accompanying music training programs and games [18, 37, 81]. However, melody and timbre identification remains difficult [9, 19]. Several approaches allow DHH users to perceive music without relying on their ears via sensory substitution [35]. Numerous sensory substitution systems have been developed to convert auditory information into visual [15, 17, 31, 42, 53, 68, 79, 81] and tactile cues [28–30, 33, 43, 51, 52]. Sensory substitution systems face the perennial problem of designing intuitive mappings [52].

This study is not an approach that allows DHH people to imitate the way hearing people enjoy music. The main objective was to create an application that allows DHH people to enjoy making vibrations such that they can appreciate the nuances of timbres and melodies.

2.3 Creating Music with Mobile Devices
Owing to their widespread use worldwide, mobile phones have the potential to help people overcome cultural and economic barriers. In 2004, Tanaka proposed a system for creating music with mobile devices [62], and development communities have emerged [20].

Since the development of the pocket Gamelan [59] in 2006, wherein a mobile phone is used with music by moving it around, several projects [13, 22, 36, 56, 57, 70, 71] dedicated to realizing musical interaction in this way have emerged. Smartphone music applications have enabled multiplayer performances. Notably, the Stanford Mobile Phone Orchestra [45] was founded in 2007, followed by an ensemble at the University of Michigan [12]. However, to the best of our knowledge, there are no examples of DHH people being involved in these activities.

Several mobile music projects address the importance of tactile feedback [21, 25, 61]. Unfortunately, haptic feedback tends to be monotonous on smartphones and does not directly provide the depth of musical structure required for DHH people. In our approach, instead of using a smartphone’s haptic sensations as an adjunct to music, we use them for purposes of musical expression. In summary, several smartphone sensory output methods are being applied to solo and group musical performances, and the inclusion of DHH people in these activities is required.

2.4 GES Method
The GES method is used to design gesture-enabled user interfaces that reflect the behavioral preferences of end users. In doing so, researchers and designers seek to identify the gestures that correspond to certain system operations (i.e., referents). Often, designers devise different gestures that correspond to all referents. However, by involving the end user in this process, we expect to build a gesture set that is easier to remember and use. In this study, the pleasing nature of movement is a novel application of a GES.
The GES was first introduced by Wobbrock et al. in 2005 [73]. Since then, GESs have been realized using various systems, and 216 peer-reviewed papers were compiled in 2020 [66]. GES has been applied to various contexts including driving [44, 77], drone operations [10, 60], and augmented-reality scenarios [1, 67, 72]. In addition to freehand types [69, 75, 76], smartphone [14, 34, 80] and smartwatch gestures [4, 8, 39] have been studied. Our research addresses the musical-instrument context for DHH users, noting that GES methods have been underutilized in the ASSETS community, and we hope this study will inspire them.

3 EXPERIMENT

3.1 Participants

We recruited 11 participants (five females, five males, and one "other") through email and social media channels (see Table 1). The eligibility criteria required DHH individuals with an iPhone 8 or a newer model, for which we designed the stimuli and study. This hardware limitation was necessitated by the requirement of the smartphone's Core Haptics engine and framework, which drives a linear resonant actuator to provide customized vibrational feedback at a fine level of detail. The criteria did not include any musical experience. The participants were 25.4 years old on average (SD = 5.39, range = 19–38). Eight participants were profoundly deaf, and the others noted hearing difficulties that interfered with their daily lives. Participants were paid ¥860 for a 1-h study period. The experiment reported in this study was approved by the authors' university's ethics committee.

3.2 Study Application

We created an iOS application using Xcode and Swift. The application first obtains the user's consent to participate and verifies the presence and functionality of the Core Haptics engine of the Apple smartphone. When launched, image buttons appear with the names of 10 musical instruments, as shown in Fig. S1. When the user makes a selection, haptic feedback is provided.

The 10 instruments used as stimuli were representative of the four main Hornbostel–Sachs musical classification hierarchy elements (i.e., idiophone, membranophone, chordophone, and aerophone) (see Fig. S1). The reason for this was to investigate how gestures differed depending on the category of instrument. This hierarchy was included such that the gestures could be coded and analyzed heuristically.

The vibrational pattern of each instrument was designed empirically to capture the characteristics of the instrument and make it easy to distinguish (see Appendix A for details).

3.3 Experimental Process

The study was conducted remotely using Zoom, owing to the COVID-19 pandemic and Japan's social-distancing restrictions. The iOS application was delivered to participants through TestFlight³. The experiment began with demographic questions. Prior to the experiment, participants were asked about their preferred communication channel (i.e., text chat or verbal), and we proceeded according to their preferences.

First, the participants were provided instructions on how to use the application. After they confirmed that all 10 instruments were working properly, they were asked to identify the instruments with which they were unfamiliar. Thereafter, they were presented with the referents and asked to invent gestures for the given instruments as the vibrations were created. They demonstrated the gestures over Zoom. To mitigate ordering effects, the referents were presented in a random order (Appendix B).

Following the study by Ruiz et al. [58], the participants were instructed to treat their smartphone as a "magic brick" that could recognize any gesture they might wish to perform. Hence, they were less influenced by the sensing and recognition technologies in creating the gestures. After demonstrating their gestures for each referent, the participants were asked to evaluate those gestures using five-point Likert-scale questions [7]: (1) Goodness of fit: "The gesture I performed is a good match for its purpose"; (2) Ease of use: "The gesture I performed is easy to perform."

Additionally, the experimenter(s) delved into how the participants produced each gesture and why they selected it. Although many GESs have employed a think-aloud protocol, it was difficult to apply this method because some participants preferred to communicate via text chat. Therefore, we asked them these questions immediately after they demonstrated their gestures, resulting in a more concise taxonomy.

3.4 Data Analysis

We evaluated user gesture preferences using the agreement rate (AR) for each referent, obtained using the formula provided by Vatavu and Wobbrock [65] (see Appendix C), and the thinking time in seconds required by participants to propose a gesture for a given referent after being asked.

4 RESULTS

4.1 User-Defined Gesture Set

With 11 participants and 10 referents, 110 total gestures were created and clustered into groups of similar types according to the criteria described in Appendix D. Of the 110 gestures obtained, 55 were distinct. Fig. 1 summarizes the ARs and most frequently proposed gestures for each referent. The mean AR value was 0.27, which is typical compared with values reported in previous studies. The ARs for the jaw harp and tam-tam were 0.05, which is considered small [65]. In contrast, similar gestures were proposed for the maracas, with an AR of 0.65.

4.2 Correlation Analysis

The mean AR across all referents was 0.27 (SD = 0.18). The highest rate was obtained for the maracas, and the lowest rate (0.05) was obtained for the tam-tam and jaw harp (see Fig. 1). The average thinking time was 62 s (SD = 59.1), goodness of fit was 4.02 (SD = 0.91), and ease of use was 4.12 (SD = 0.94).

Three pairs of significant correlations among the metrics are shown in Fig. S2. We found a significant negative correlation between ease of use and thinking time (Pearson's r (N = 10) = −0.63, p = 0.05; see Fig. S2 (a)). We also found significant positive correlations between goodness of fit and AR (Pearson's r (N = 10) = 0.66, p = 0.03) and between ease of use and AR (Pearson's r (N = 10) = 0.63, p = 0.05; see Figs. S2 (b) and S2 (c)).

3 https://developer.apple.com/jp/testflight/
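The two per-referent measures described in Section 3.4, and the correlations reported in Section 4.2, can be reproduced with a few lines of Python. This is an illustrative sketch, not the study's analysis code; the sample gesture labels are hypothetical.

```python
from collections import Counter

def agreement_rate(proposals):
    """Agreement rate AR(r) for one referent, after Vatavu and Wobbrock [65].
    `proposals` lists the gesture label each participant proposed."""
    n = len(proposals)
    group_sizes = Counter(proposals).values()  # sizes of identical-gesture groups
    return (n / (n - 1)) * sum((g / n) ** 2 for g in group_sizes) - 1 / (n - 1)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two per-referent metric lists
    (e.g., mean ease-of-use ratings vs. ARs across the 10 referents)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical proposals from 11 participants for one referent:
# 7 agree on "shake", the rest split into smaller groups.
print(round(agreement_rate(["shake"] * 7 + ["tap", "tap", "swing", "flip"]), 2))  # -> 0.4
```

AR is 1.0 when all participants propose the same gesture and 0.0 when every proposal is distinct, which matches the interpretation of the maracas (high agreement) versus the jaw harp and tam-tam (low agreement).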
ASSETS ’22, October 23–26, 2022, Athens, Greece Iijima, et al.

Table 1: Demographics of the DHH participants for the GES.

ID Age Gender Hearing loss Onset age Hearing device Musical Experiences
P1 23 Male Unknown 9 years None piano
P2 38 Female Profound 3 years Hearing Aids chorus, piano
P3 29 Female Profound Birth Hearing Aids chorus, percussion, piano
P4 26 Male Profound 1 year Hearing Aids chorus, drum, melodica, recorder, xylophone
P5 28 Other Profound Birth Hearing Aids chorus, clapping, piano, xylophone
P6 21 Female Profound 2 years Hearing Aids chorus
P7 21 Female Unknown 16 years None accordion, chorus, piano, recorder, xylophone
P8 25 Male Profound Birth None drum, flute, melodica, piano, recorder
P9 19 Male Profound 1 year Cochlear Implants chorus, koto, piano, recorder
P10 21 Female Unknown 2 years Hearing Aids chorus, recorder
P11 28 Male Profound Birth Hearing Aids chorus, piano
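As a rough illustration of how the per-instrument vibration patterns from Section 3.2 might be encoded, the sketch below represents a pattern as a list of timed events with intensity and sharpness parameters, mirroring the shape of a Core Haptics transient event. It is written in Python purely for illustration (the study app itself was implemented in Swift against Core Haptics), and the maracas values are hypothetical, not the patterns from Appendix A.

```python
from dataclasses import dataclass

@dataclass
class HapticEvent:
    """One transient haptic event: a start time (seconds) plus
    intensity and sharpness parameters in the range 0.0-1.0."""
    time: float
    intensity: float
    sharpness: float

def maracas_pattern(shakes=4, interval=0.15):
    """Hypothetical pattern: short, crisp pulses evenly spaced,
    roughly evoking the rattle of a maracas shake."""
    return [HapticEvent(i * interval, 0.8, 0.9) for i in range(shakes)]

pattern = maracas_pattern()
print(len(pattern))  # -> 4
```

In the real app, each such event list would be handed to the haptics engine to produce the distinguishable per-instrument feedback described in Appendix A.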

4.3 Gesture Taxonomy for Musical Instruments

Several GES taxonomies have been developed, including surface gestures [74] and mobile interaction types [58]. Taxonomies provide a means of understanding design spaces and, in this case, explaining and analyzing music-appreciation gestures. Our taxonomy is the first developed for DHH gesture-based instrument performance, and it is manually classified into the three dimensions of Nature, Form, and Function. Each dimension is further classified into smaller categories (see Fig. 2). The classification of all instruments together and those of each instrument are shown in Figs. S3 and S4, respectively.

The Nature dimension classifies gesture–meaning relationships. For the instruments, 70% of the gestures were classified as Motion and 22% as Vibration, together accounting for 92% of the Nature ratings. Motion accounted for more than 60% for all instruments apart from the jaw harp. The highest percentages of Vibration classifications were observed for the tam-tam (36%) and jaw harp (55%), which were the instruments with the lowest ARs.

The Form dimension distinguishes whether the smartphone's position changes and whether the hand not holding the smartphone moves. Overall, Smartphone, Hand, and Both accounted for 86% of the total. Thus, most gestures were accompanied by user arm movements. Both was the most common category for the cymbals (64%) and maracas (45%), as both are commonly played by holding objects of the same shape in both hands.

The Function dimension classifies the role the smartphone is treated as playing. The instruments with the most gestures classified as Non-sonorous are the tam-tam (55%), guitar (55%), triangle (45%), and guiro (36%). These four are played by holding and moving non-sonorous objects with one hand, such as a percussion mallet for the tam-tam and triangle, a pick for the guitar, and a stick for the guiro. The instruments played by moving sonorous objects have a high percentage of gestures classified as Sonorous: castanets (100%), cymbals (64%), maracas (82%), and pellet drum (91%).

5 DISCUSSION

5.1 Interpretation of Quantitative Analysis

5.1.1 Subjective ratings and thinking time. There was a negative correlation between ease of use and thinking time. This can be interpreted in two ways. First, when participants spend more time thinking about a task, they may come up with gestures that are more flexible and easier to demonstrate. Second, corresponding gestures are quickly imagined when the musical instrument is easy to play.

5.1.2 Subjective ratings and agreement rate. There was a positive correlation between goodness of fit and agreement rate and between ease of use and agreement rate. It can be interpreted that the gestures that most participants produce tend to be easy to perform and suitable for the task.

5.2 Future Work

5.2.1 User Studies for Playing to the Rhythm. The objective of this study was to use gestures to play music, not to select which instrument to play. To this end, this study aimed to define the gestures that are preliminary to playing a piece of music to a rhythm. For example, when learning how to play the drums, a series of steps is normally applied (e.g., hold the drumsticks, hit the drums, and play to a rhythm). Of these, holding the drumsticks and hitting the drums are the gestures we investigated, as they precede rhythmic playing. Future work should include experiments with playing to a rhythm using gestures.

5.2.2 Cultural differences. Previous research [40] has discussed user-defined gestures while focusing on cultural differences. Hence, some differences may appear between Deaf and other cultures. For example, P3 mimicked a gesture that meant "sutra" in sign language, associating the vibration of a wooden fish used in a Buddhist temple with that of the cymbal. Future research should include GESs for playing musical instruments with hearing people and other user groups, as it may provide deeper insights into inclusion.

5.3 Implications for Gesture Designs and Implementations

5.3.1 Instruments Played with Both Hands. The instruments played with the same object in both hands were the cymbals and maracas. In the Form dimension of the cymbals and maracas, Both accounted for 64% and 45%, respectively. When playing a musical instrument application, one hand holds the smartphone as the other creates sound. This form is asymmetric and considerably differs from the actual instrument. Nevertheless, our participants preferred to perform the same movement with both hands. Other instruments played with

Table 2: Taxonomy of gestures for musical instruments based on 110 gestures.

Dimension  Category      Description
Nature     Motion        Gesture that mimics the motion of playing a real instrument.
Nature     Vibration     Gesture that indicates something associated with the feel of the vibration.
Nature     Abstract      The mapping between instruments and gestures is arbitrary.
Nature     Name          Gesture that indicates something associated with the name of the instrument.
Form       Smartphone    User moves the smartphone.
Form       Hand          User moves the hand that is not holding the smartphone.
Form       Both          User moves the smartphone and the hand that is not holding the smartphone at the same time.
Form       Pose          User performs the gesture keeping the smartphone and hands in one location.
Function   Sonorous      The smartphone serves as a sonorous object.
Function   Non-sonorous  The smartphone serves as a non-sonorous object, such as a hand or stick.
Function   Others        The smartphone serves as something other than the above two cases.
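The category percentages reported in Section 4.3 are simple tallies over the coded gestures. A minimal sketch of that tally, using hypothetical gesture records rather than the study's data:

```python
def category_share(records, dimension, category):
    """Percentage of gesture records coded as `category` on `dimension`
    (Nature, Form, or Function)."""
    hits = sum(1 for rec in records if rec[dimension] == category)
    return 100.0 * hits / len(records)

# Hypothetical coded gestures (instrument plus the three taxonomy codes):
gestures = [
    {"instrument": "maracas", "Nature": "Motion", "Form": "Both", "Function": "Sonorous"},
    {"instrument": "maracas", "Nature": "Motion", "Form": "Smartphone", "Function": "Sonorous"},
    {"instrument": "jaw harp", "Nature": "Vibration", "Form": "Hand", "Function": "Others"},
    {"instrument": "tam-tam", "Nature": "Motion", "Form": "Smartphone", "Function": "Non-sonorous"},
]

print(category_share(gestures, "Nature", "Motion"))  # -> 75.0
```

Filtering the records by instrument before tallying yields the per-instrument breakdowns shown in Fig. S4.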

both hands include wooden clappers and several types of drums. For those, it is recommended that the hand not holding the smartphone be able to move as well. In that case, the hand not holding the smartphone would have difficulty receiving vibrotactile sensations. Notably, P3 expressed a desire to feel the vibrations with both hands.

5.3.2 Hand Poses are not Important in Motion Gestures. Participants did not insist on hand poses when demonstrating gestures that were not classified as Pose in the Form dimension. For example, all those who demonstrated the gesture of tapping the smartphone with a finger said that it was acceptable to tap it with another finger. When devising the gesture of swinging the smartphone downward, P1 said, "I hold my smartphone this way, but I think other users should hold their smartphone in the way they prefer." Motion gestures for playing a musical instrument should include various hand poses. Additionally, when developing a motion gesture recognizer, diverse hand poses should be assumed.

5.3.3 Association from Vibration. Gestures falling into the Vibration category were created by imitating daily actions that evoke pleasant feelings. The participants failed to agree on any gestures in the Vibration category. For example, P7 associated the vibration of the tam-tam with the motion of hitting a ball in tennis, whereas P3 associated it with the motion of accelerating a motorcycle. Designing a gesture based on vibrations may result in low agreement; hence, other design methods (e.g., Motion, Abstract, and Name in the Nature dimension) should be prioritized.

5.3.4 Influence of Daily Smartphone Usage. As mentioned in Section 3.3, participants were instructed to assume that the smartphone was a magic brick that could recognize any gesture. However, several participants were inspired by their daily use of smartphones. For example, when creating a gesture for the triangle, P7 mimicked the action of capturing a picture with a smartphone such that the position of the smartphone would be stable. Because smartphones are devices used on a daily basis, the way they are manipulated has been optimized for user comfort. Thus, it is recommended that gestures be designed not to disrupt the comfortable holding of the smartphone in normal situations, such as when holding it vertically or horizontally or when capturing pictures.

5.3.5 Smartphone Cases and Accessories. There are ring-type accessories on the market that attach to the back of a smartphone and make it easier to hold by inserting a finger through the ring. P7 and P8 emphasized that such accessories would allow certain gestures to be executed more safely. Some gestures are difficult to perform with notebook-type smartphone cases. When employing gestures that include vigorous movements, it may be safer to encourage the user to install a ring-type accessory or remove the notebook cover.

5.3.6 Referents with Low Agreement.

Unknown instruments: jaw harp. No participants were familiar with the jaw harp. With other instruments, participants often created gestures that imitated how they imagined they would play the real one. However, this approach did not work with the jaw harp. It is likely that the resulting gestures were diverse for this reason.

Instruments that can be played with only one hand: tam-tam. The agreement rate of the tam-tam was 0.05; however, all other idiophones had values greater than 0.2. This may be attributed to the fact that the real tam-tam does not require the performer to hold the sonorous object, and that it can be played with one hand. All other idiophones are two-handed instruments; thus, the roles of the hand holding the device and the other hand are easily fixed. Because this is not the case with the tam-tam, the gesture parameters appear to have been distributed in various ways, depending on whether the smartphone was held by one or both hands and whether it was moved vertically or horizontally.

6 CONCLUSION

We conducted a study that elicited motion gestures from 11 DHH participants based on 10 different musical instruments, using their smartphones as instruments. Based on the commonalities of the user-defined gestures, we created the first taxonomy of gestures for DHH people, so that developers of musical-instrument-emulating smartphone applications will have design guidelines. Our findings provide several implications that will eventually help DHH people enjoy various instruments on a daily basis.

ACKNOWLEDGMENTS

This work was supported by JST CREST Grant Number JPMJCR19F2, Japan.

REFERENCES
[1] Christopher R. Austin, Barrett Ens, Kadek Ananta Satriadi, and Bernhard Jenny. 2020. Elicitation study investigating hand and foot gesture interaction for immersive maps in augmented reality. Cartography and Geographic Information Science 47, 3 (2020), 214–228. https://doi.org/10.1080/15230406.2019.1696232
[2] Amal Dar Aziz, Chris Warren, Hayden Bursk, and Sean Follmer. 2008. The Flote: An Instrument for People with Limited Mobility. In Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility (Halifax, Nova Scotia, Canada) (Assets ’08). Association for Computing Machinery, New York, NY, USA, 295–296. https://doi.org/10.1145/1414471.1414545
[3] Emeline Brulé. 2016. Playing Music with the Head. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (Reno, Nevada, USA) (ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 339–340. https://doi.org/10.1145/2982142.2982146
[4] Thisum Buddhika, Haimo Zhang, Samantha W. T. Chan, Vipula Dissanayake, Suranga Nanayakkara, and Roger Zimmermann. 2019. FSense: Unlocking the Dimension of Force for Gestural Interactions Using Smartwatch PPG Sensor. In Proceedings of the 10th Augmented Human International Conference 2019 (Reims, France) (AH2019). Association for Computing Machinery, New York, NY, USA, Article 11, 5 pages. https://doi.org/10.1145/3311823.3311839
[5] Marshall Chasin. 2003. Music and hearing aids. The Hearing Journal 56, 7 (July 2003), 36–38.
[6] Alice-Ann Darrow. 1993. The Role of Music in Deaf Culture: Implications for Music Educators. Journal of Research in Music Education 41, 2 (1993), 93–110. https://doi.org/10.2307/3345402
[7] Nem Khan Dim, Chaklam Silpasuwanchai, Sayan Sarcar, and Xiangshi Ren. 2016. Designing Mid-Air TV Gestures for Blind People Using User- and Choice-Based Elicitation Approaches. In Proceedings of the 2016 ACM Conference on Designing Interactive Systems (Brisbane, QLD, Australia) (DIS ’16). Association for Computing Machinery, New York, NY, USA, 204–214. https://doi.org/10.1145/2901790.2901834
[8] Tilman Dingler, Rufat Rzayev, Alireza Sahami Shirazi, and Niels Henze. 2018. Designing Consistent Gestures Across Device Types: Eliciting RSVP Controls for Phone, Watch, and Glasses. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173993
[9] Ward R. Drennan and Jay T. Rubinstein. 2008. Music perception in cochlear implant users and its relationship with psychophysical capabilities. Journal of Rehabilitation Research and Development 45, 5 (2008), 779–789. https://doi.org/10.1682/jrrd.2007.08.0118
[10] Jane L. E, Ilene L. E, James A. Landay, and Jessica R. Cauchard. 2017. Drone & Wo: Cultural Influences on Human-Drone Interaction Techniques. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 6794–6799. https://doi.org/10.1145/3025453.3025755
[11] George Thomas Ealy. 1994. Of ear trumpets and a resonance plate: early hearing aids and Beethoven’s hearing perception. 19th-Century Music 17, 3 (Spring 1994), 262–273.
[12] Georg Essl. 2010. The Mobile Phone Ensemble As Classroom. In Proceedings of the International Computer Music Conference (ICMC), Stony Brooks/New York.
[13] Georg Essl and Michael Rohs. 2007. ShaMus – A Sensor-Based Integrated Mobile Phone Instrument. In Proceedings of the International Computer Music Conference (ICMC). 27–31.
[14] Sharif A. M. Faleel, Michael Gammon, Yumiko Sakamoto, Carlo Menon, and Pourang Irani. 2020. User Gesture Elicitation of Common Smartphone Tasks for Hand Proximate User Interfaces. In Proceedings of the 11th Augmented Human International Conference (Winnipeg, Manitoba, Canada) (AH ’20). Association for Computing Machinery, New York, NY, USA, Article 6, 8 pages. https://doi.org/10.1145/3396339.3396363
[15] David Fourney. 2012. Can Computer Representations of Music Enhance Enjoyment for Individuals Who Are Hard of Hearing?. In Proceedings of the 13th International Conference on Computers Helping People with Special Needs - Volume Part I (Linz, Austria) (ICCHP’12). Springer-Verlag, Berlin, Heidelberg, 535–542. https://doi.org/10.1007/978-3-642-31522-0_80
[16] David W. Fourney. 2015. Making the invisible visible: visualization of music and lyrics for deaf and hard of hearing audiences. https://doi.org/10.32920/ryerson.14664129.v1
[17] David W. Fourney and Deborah I. Fels. 2009. Creating access to music through visualization. In 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH). 939–944. https://doi.org/10.1109/TIC-STH.2009.5444364
[18] Qian-Jie Fu and John J. Galvin. 2007. Computer-Assisted Speech Training for Cochlear Implant Patients: Feasibility, Outcomes, and Future Directions. Seminars in Hearing 28, 2 (May 2007). https://doi.org/10.1055/s-2007-973440
[19] John J. Galvin III, Qian-Jie Fu, and Robert V. Shannon. 2009. Melodic Contour Identification and Music Perception by Cochlear Implant Users. Annals of the New York Academy of Sciences 1169, 1 (July 2009), 518–533. https://doi.org/10.1111/j.1749-6632.2009.04551.x
[20] Lalya Gaye, Lars Erik Holmquist, Frauke Behrendt, and Atau Tanaka. 2006. Mobile Music Technology: Report on an Emerging Community. In Proceedings of the 2006 Conference on New Interfaces for Musical Expression (Paris, France) (NIME ’06). IRCAM — Centre Pompidou, Paris, FRA, 22–25.
[21] Günter Geiger. 2006. Using the Touch Screen as a Controller for Portable Computer Music Instruments. In Proceedings of the 2006 Conference on New Interfaces for Musical Expression (Paris, France) (NIME ’06). IRCAM — Centre Pompidou, Paris, FRA, 61–64.
[22] Nicholas Gillian, Sile O’Modhrain, and Georg Essl. 2009. Scratch-Off: A Gesture Based Mobile Music Game with Tactile Feedback. In Proceedings of the International Conference on New Interfaces for Musical Expression. Zenodo, 308–311. https://doi.org/10.5281/zenodo.1177553
[23] Evelyn Glennie. 2015. Hearing Essay. https://www.evelyn.co.uk/hearing-essay/. (Accessed on 07/10/2022).
[24] Rumi Hiraga and Kjetil Falkenberg Hansen. 2013. Sound Preferences of Persons with Hearing Loss Playing an Audio-Based Computer Game. In Proceedings of the 3rd ACM International Workshop on Interactive Multimedia on Mobile & Portable Devices (Barcelona, Spain) (IMMPD ’13). Association for Computing Machinery, New York, NY, USA, 25–30. https://doi.org/10.1145/2505483.2505489
[25] Euyshick Hong and Jun Kim. 2017. Webxophone: Web Audio Wind Instrument. In Proceedings of the International Conference on Algorithms, Computing and Systems (Jeju Island, Republic of Korea) (ICACS ’17). Association for Computing Machinery, New York, NY, USA, 79–82. https://doi.org/10.1145/3127942.3127954
[26] Ryo Iijima, Akihisa Shitara, Sayan Sarcar, and Yoichi Ochiai. 2021. Smartphone Drum: Gesture-Based Digital Musical Instruments Application for Deaf and Hard of Hearing People. In Symposium on Spatial User Interaction (Virtual Event, USA) (SUI ’21). Association for Computing Machinery, New York, NY, USA, Article 25, 2 pages. https://doi.org/10.1145/3485279.3488285
[27] Alon Ilsar and Gail Kenning. 2020. Inclusive Improvisation through Sound and Movement Mapping: From DMI to ADMI. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 49, 8 pages. https://doi.org/10.1145/3373625.3416988
[28] Maria Karam, Carmen Branje, Gabe Nespoli, Norma Thompson, Frank A. Russo, and Deborah I. Fels. 2010. The Emoti-Chair: An Interactive Tactile Music Exhibit. In CHI ’10 Extended Abstracts on Human Factors in Computing Systems (Atlanta, Georgia, USA) (CHI EA ’10). Association for Computing Machinery, New York, NY, USA, 3069–3074. https://doi.org/10.1145/1753846.1753919
[29] Maria Karam, Gabe Nespoli, Frank Russo, and Deborah I. Fels. 2009. Modelling Perceptual Elements of Music in a Vibrotactile Display for Deaf Users: A Field Study. In Proceedings of the 2009 Second International Conferences on Advances in Computer-Human Interactions (ACHI ’09). IEEE Computer Society, USA, 249–254. https://doi.org/10.1109/ACHI.2009.64
[30] Maria Karam, Frank Russo, Carmen Branje, Emily Price, and Deborah I. Fels. 2008. Towards a Model Human Cochlea: Sensory Substitution for Crossmodal Audio-Tactile Displays. In Proceedings of Graphics Interface 2008 (Windsor, Ontario, Canada) (GI ’08). Canadian Information Processing Society, CAN, 267–274.
[31] Jeeeun Kim, Swamy Ananthanarayan, and Tom Yeh. 2015. Seen Music: Ambient Music Data Visualization for Children with Hearing Impairments. In Proceedings of the 14th International Conference on Interaction Design and Children (Boston, Massachusetts) (IDC ’15). Association for Computing Machinery, New York, NY, USA, 426–429. https://doi.org/10.1145/2771839.2771870
[32] Joy Kim and Jonathan Ricaurte. 2011. TapBeats: Accessible and Mobile Casual Gaming. In The Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (Dundee, Scotland, UK) (ASSETS ’11). Association for Computing Machinery, New York, NY, USA, 285–286. https://doi.org/10.1145/2049536.2049609
[33] Bruno La Versa, Isabella Peruzzi, Luca Diamanti, and Marco Zemolin. 2014. MUVIB: Music and Vibration. In Proceedings of the 2014 ACM International Symposium on Wearable Computers: Adjunct Program (Seattle, Washington) (ISWC ’14 Adjunct). Association for Computing Machinery, New York, NY, USA, 65–70. https://doi.org/10.1145/2641248.2641267
[34] Huy Viet Le, Sven Mayer, Maximilian Weiß, Jonas Vogelsang, Henrike Weingärtner, and Niels Henze. 2020. Shortcut Gestures for Mobile Text Editing on Fully Touch Sensitive Smartphones. ACM Trans. Comput.-Hum. Interact. 27, 5, Article 33 (Aug 2020), 38 pages. https://doi.org/10.1145/3396233
[35] Charles Lenay, Stephane Canu, and Pierre Villon. 1997. Technology and Perception: The Contribution of Sensory Substitution Systems. In Proceedings of the 2nd International Conference on Cognitive Technology (CT ’97). IEEE Computer Society, USA, 44.
[36] Yang Kyu Lim and Woon Seung Yeo. 2014. Smartphone-based Music Conducting. In Proceedings of the International Conference on New Interfaces for Musical Expression. Zenodo, 573–576. https://doi.org/10.5281/zenodo.1178851
[37] Charles J. Limb and Alexis T. Roy. 2014. Technological, biological, and acoustical constraints to music perception in cochlear implant users. Hearing Research 308 (2014), 13–26. https://doi.org/10.1016/j.heares.2013.04.009 Music: A window
Designing Gestures for Digital Musical Instruments: Gesture Elicitation Study with Deaf and Hard of Hearing People ASSETS ’22, October 23–26, 2022, Athens, Greece

into the hearing brain.
[38] Joana Lobo, Soichiro Matsuda, Izumi Futamata, Ryoichi Sakuta, and Kenji Suzuki. 2019. CHIMELIGHT: Augmenting Instruments in Interactive Music Therapy for Children with Neurodevelopmental Disorders. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 124–135. https://doi.org/10.1145/3308561.3353784
[39] Meethu Malu, Pramod Chundury, and Leah Findlater. 2018. Exploring Accessible Smartwatch Interactions for People with Upper Body Motor Impairments. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3174062
[40] Dan Mauney, Jonathan Howarth, Andrew Wirtanen, and Miranda Capra. 2010. Cultural Similarities and Differences in User-Defined Gestures for Touchscreen User Interfaces. In CHI ’10 Extended Abstracts on Human Factors in Computing Systems (Atlanta, Georgia, USA) (CHI EA ’10). Association for Computing Machinery, New York, NY, USA, 4015–4020. https://doi.org/10.1145/1753846.1754095
[41] Hugh J. McDermott. 2004. Music Perception with Cochlear Implants: A Review. Trends in Amplification 8, 2 (January 2004), 49–82. https://doi.org/10.1177/108471380400800203 PMID: 15497033.
[42] Jorge Mori and Deborah I. Fels. 2009. Seeing the music can animated lyrics provide access to the emotional content in music for people who are deaf or hard of hearing?. In 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH). 951–956. https://doi.org/10.1109/TIC-STH.2009.5444362
[43] Suranga Nanayakkara, Elizabeth Taylor, Lonce Wyse, and S H. Ong. 2009. An Enhanced Musical Experience for the Deaf: Design and Evaluation of a Music Display and a Haptic Chair. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 337–346. https://doi.org/10.1145/1518701.1518756
[44] Vijayakumar Nanjappan, Rongkai Shi, Hai-Ning Liang, Kim King-Tong Lau, Yong Yue, and Katie Atkinson. 2019. Towards a Taxonomy for In-Vehicle Interactions Using Wearable Smart Textiles: Insights from a User-Elicitation Study. Multimodal Technologies and Interaction 3, 2 (2019). https://doi.org/10.3390/mti3020033
[45] Jieun Oh, Jorge Herrera, Nicholas J. Bryan, Luke Dahl, and Ge Wang. 2010. Evolving The Mobile Phone Orchestra. In Proceedings of the International Conference on New Interfaces for Musical Expression. Zenodo, 82–87. https://doi.org/10.5281/zenodo.1177871
[46] Shotaro Omori and Ikuko Eguchi Yairi. 2013. Collaborative Music Application for Visually Impaired People with Tangible Objects on Table. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (Bellevue, Washington) (ASSETS ’13). Association for Computing Machinery, New York, NY, USA, Article 42, 2 pages. https://doi.org/10.1145/2513383.2513403
[47] Deysi Helen Ortega, Franceli Linney Cibrian, and Mónica Tentori. 2015. BendableSound: A Fabric-Based Interactive Surface to Promote Free Play in Children with Autism. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Machinery, New York, NY, USA, 315–316. https:
Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 104, 4 pages. https://doi.org/10.1145/3373625.3417077
[55] Janine Roebuck. 2007. I am a deaf opera singer. https://www.theguardian.com/theguardian/2007/sep/29/weekend7.weekend2. (Accessed on 07/10/2022).
[56] Michael Rohs and Georg Essl. 2007. CaMus²: Optical Flow and Collaboration in Camera Phone Music Performance. In Proceedings of the 7th International Conference on New Interfaces for Musical Expression (New York, New York) (NIME ’07). Association for Computing Machinery, New York, NY, USA, 160–163. https://doi.org/10.1145/1279740.1279770
[57] Michael Rohs, Georg Essl, and Martin Roth. 2006. CaMus: Live Music Performance using Camera Phones and Visual Grid Tracking. In Proceedings of the International Conference on New Interfaces for Musical Expression. Zenodo, 31–36. https://doi.org/10.5281/zenodo.1176997
[58] Jaime Ruiz, Yang Li, and Edward Lank. 2011. User-Defined Motion Gestures for Mobile Interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for Computing Machinery, New York, NY, USA, 197–206. https://doi.org/10.1145/1978942.1978971
[59] Greg Schiemer and Mark Havryliv. 2006. Pocket Gamelan: Tuneable Trajectories for Flying Sources in Mandala 3 and Mandala 4. In Proceedings of the 2006 Conference on New Interfaces for Musical Expression (Paris, France) (NIME ’06). IRCAM — Centre Pompidou, Paris, FRA, 37–42.
[60] Matthias Seuter, Eduardo Rodriguez Macrillante, Gernot Bauer, and Christian Kray. 2018. Running with Drones: Desired Services and Control Gestures. In Proceedings of the 30th Australian Conference on Computer-Human Interaction (Melbourne, Australia) (OzCHI ’18). Association for Computing Machinery, New York, NY, USA, 384–395. https://doi.org/10.1145/3292147.3292156
[61] Bradley Strylowski, Jesse Allison, and Jesse Guessford. 2014. Pitch Canvas: Touchscreen Based Mobile Music Instrument. In Proceedings of the International Conference on New Interfaces for Musical Expression. Zenodo, 171–174. https://doi.org/10.5281/zenodo.1178947
[62] Atau Tanaka. 2004. Mobile Music Making. In Proceedings of the 2004 Conference on New Interfaces for Musical Expression (Hamamatsu, Shizuoka, Japan) (NIME ’04). National University of Singapore, SGP, 154–156.
[63] Stephanie Valencia, Dwayne Lamb, Shane Williams, Harish S. Kulkarni, Ann Paradiso, and Meredith Ringel Morris. 2019. Dueto: Accessible, Gaze-Operated Musical Expression. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 513–515. https://doi.org/10.1145/3308561.3354603
[64] Maria Varvarigou, Susan Hallam, Andrea Creech, and Hilary McQueen. 2012. Benefits experienced by older people in group music-making activities. Journal of Applied Arts and Health 3 (08 2012), 183–198. https://doi.org/10.1386/jaah.3.2.183_1
[65] Radu-Daniel Vatavu and Jacob O. Wobbrock. 2015. Formalizing Agreement Analysis for Elicitation Studies: New Measures, Significance Test, and Toolkit. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing
//doi.org/10.1145/2700648.2811355 Systems (Seoul, Republic of Korea) (CHI ’15). Association for Computing Machin-
[48] Mikel Ostiz-Blanco, Alfredo Pina, Miriam Lizaso, Jose Javier Astráin, and Gonzalo ery, New York, NY, USA, 1325–1334. https://doi.org/10.1145/2702123.2702223
Arrondo. 2018. Using the Musical Multimedia Tool ACMUS with People with [66] Santiago Villarreal-Narvaez, Jean Vanderdonckt, Radu-Daniel Vatavu, and Ja-
Severe Mental Disorders: A Pilot Study. In Proceedings of the 20th International cob O. Wobbrock. 2020. A Systematic Review of Gesture Elicitation Studies: What
ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) Can We Learn from 216 Studies? Association for Computing Machinery, New
(ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 462–464. York, NY, USA, 855–872. https://doi.org/10.1145/3357236.3395511
https://doi.org/10.1145/3234695.3241016 [67] Panagiotis Vogiatzidakis and Panayiotis Koutsabasis. 2020. Mid-Air Gesture
[49] Carol A Padden and Tom Humphries. 1988. Deaf in America. Harvard University Control of Multiple Home Devices in Spatial Augmented Reality Prototype.
Press. Multimodal Technologies and Interaction 4, 3 (2020). https://doi.org/10.3390/
[50] William Payne, Alex Xu, Amy Hurst, and S. Alex Ruthmann. 2019. Non-Visual mti4030061
Beats: Redesigning the Groove Pizza. In The 21st International ACM SIGACCESS [68] Quoc V. Vy, Jorge A. Mori, David W. Fourney, and Deborah I. Fels. 2008. EnACT: A
Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Software Tool for Creating Animated Text Captions. In Computers Helping People
Association for Computing Machinery, New York, NY, USA, 651–654. https: with Special Needs, Klaus Miesenberger, Joachim Klaus, Wolfgang Zagler, and
//doi.org/10.1145/3308561.3354590 Arthur Karshmer (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 609–616.
[51] Benjamin Petry, Thavishi Illandara, Don Samitha Elvitigala, and Suranga [69] Benjamin Walther-Franks, Tanja Döring, Meltem Yilmaz, and Rainer Malaka.
Nanayakkara. 2018. Supporting Rhythm Activities of Deaf Children Using Music- 2019. Embodiment or Manipulation? Understanding Users’ Strategies for Free-
Sensory-Substitution Systems. Association for Computing Machinery, New York, Hand Character Control. In Proceedings of Mensch Und Computer 2019 (Hamburg,
NY, USA, 1–10. https://doi.org/10.1145/3173574.3174060 Germany) (MuC’19). Association for Computing Machinery, New York, NY, USA,
[52] Benjamin Petry, Thavishi Illandara, and Suranga Nanayakkara. 2016. MuSS-Bits: 661–665. https://doi.org/10.1145/3340764.3344887
Sensor-Display Blocks for Deaf People to Explore Musical Sounds. In Proceedings [70] Ge Wang. 2014. Ocarina: Designing the iPhone’s Magic Flute. Computer
of the 28th Australian Conference on Computer-Human Interaction (Launceston, Music Journal 38, 2 (06 2014), 8–21. https://doi.org/10.1162/COMJ_a_00236
Tasmania, Australia) (OzCHI ’16). Association for Computing Machinery, New arXiv:https://direct.mit.edu/comj/article-pdf/38/2/8/1855988/comj_a_00236.pdf
York, NY, USA, 72–80. https://doi.org/10.1145/3010915.3010939 [71] Gil Weinberg, Mark Godfrey, and Andrew Beck. 2010. ZOOZbeat: Mobile Music
[53] Michael Pouris and Deborah I. Fels. 2012. Creating an Entertaining and Infor- Recreation. In CHI ’10 Extended Abstracts on Human Factors in Computing Systems
mative Music Visualization. In Proceedings of the 13th International Conference (Atlanta, Georgia, USA) (CHI EA ’10). Association for Computing Machinery,
on Computers Helping People with Special Needs - Volume Part I (Linz, Austria) New York, NY, USA, 4817–4822. https://doi.org/10.1145/1753846.1754238
(ICCHP’12). Springer-Verlag, Berlin, Heidelberg, 451–458. https://doi.org/10. [72] Adam S. Williams and Francisco R. Ortega. 2020. Understanding Gesture and
1007/978-3-642-31522-0_68 Speech Multimodal Interactions for Manipulation Tasks in Augmented Reality
[54] Grazia Ragone. 2020. Designing Embodied Musical Interaction for Children with Using Unconstrained Elicitation. arXiv:2009.06591 [cs.HC]
Autism. In The 22nd International ACM SIGACCESS Conference on Computers and
ASSETS ’22, October 23–26, 2022, Athens, Greece Iijima, et al.

[73] Jacob O. Wobbrock, Htet Htet Aung, Brandon Rothrock, and Brad A. Myers. 2005. Maximizing the Guessability of Symbolic Input. In CHI ’05 Extended Abstracts on Human Factors in Computing Systems (Portland, OR, USA) (CHI EA ’05). Association for Computing Machinery, New York, NY, USA, 1869–1872. https://doi.org/10.1145/1056808.1057043
[74] Jacob O. Wobbrock, Meredith Ringel Morris, and Andrew D. Wilson. 2009. User-Defined Gestures for Surface Computing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 1083–1092. https://doi.org/10.1145/1518701.1518866
[75] Huiyue Wu, Jinxuan Gai, Yu Wang, Jiayi Liu, Jiali Qiu, Jianmin Wang, and Xiaolong (Luke) Zhang. 2020. Influence of cultural factors on freehand gesture design. International Journal of Human-Computer Studies 143 (2020), 102502. https://doi.org/10.1016/j.ijhcs.2020.102502
[76] Huiyue Wu, Weizhou Luo, Neng Pan, Shenghuan Nan, Yanyi Deng, Shengqian Fu, and Liuqingqing Yang. 2019. Understanding freehand gestures: a study of freehand gestural interaction for immersive VR shopping applications. Human-centric Computing and Information Sciences 9, 1 (2019), 43. https://doi.org/10.1186/s13673-019-0204-7
[77] Huiyue Wu, Yu Wang, Jiayi Liu, Jiali Qiu, and Xiaolong (Luke) Zhang. 2020. User-defined gesture interaction for in-vehicle information systems. Multimedia Tools and Applications 79, 1 (2020), 263–288. https://doi.org/10.1007/s11042-019-08075-1
[78] Ikuko Eguchi Yairi and Takuya Takeda. 2012. A Music Application for Visually Impaired People Using Daily Goods and Stationeries on the Table. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (Boulder, Colorado, USA) (ASSETS ’12). Association for Computing Machinery, New York, NY, USA, 271–272. https://doi.org/10.1145/2384916.2384988
[79] Hui-Jen Yang, Y.-L. Lay, Yi-Chin Liou, Wen-Yu Tsao, and Cheng-Kun Lin. 2007. Development and evaluation of computer-aided music-learning system for the hearing impaired. Journal of Computer Assisted Learning 23, 6 (2007), 466–476. https://doi.org/10.1111/j.1365-2729.2007.00229.x arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1365-2729.2007.00229.x
[80] Zhican Yang, Chun Yu, Fengshi Zheng, and Yuanchun Shi. 2019. ProxiTalk: Activate Speech Input by Bringing Smartphone to the Mouth. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 3, 3, Article 118 (sep 2019), 25 pages. https://doi.org/10.1145/3351276
[81] Yinsheng Zhou, Khe Chai Sim, Patsy Tan, and Ye Wang. 2012. MOGAT: Mobile Games with Auditory Training for Children with Cochlear Implants. In Proceedings of the 20th ACM International Conference on Multimedia (Nara, Japan) (MM ’12). Association for Computing Machinery, New York, NY, USA, 429–438. https://doi.org/10.1145/2393347.2393409

C AGREEMENT RATE

The agreement rate (AR) for each referent can be obtained using Vatavu and Wobbrock’s formula [65]:

    AR(r) = \frac{\sum_{i<j} \delta_{i,j}}{n(n-1)/2}

    \delta_{i,j} = \begin{cases} 1, & \text{if the } i\text{-th and } j\text{-th participants are in agreement over referent } r \\ 0, & \text{otherwise} \end{cases}

where n is the number of participants.

D GESTURE GROUPING CRITERIA

(1) Tapping on a smartphone is considered the same gesture, even when using different or multiple fingers. For example, “tapping with the index finger” and “tapping with the index and middle finger at the same time” are the same.
(2) The hand(s) used for gesturing are distinguished. For example, if a smartphone is held in the right hand, touching the screen with the right-hand thumb is different from touching the screen with the left-hand thumb.
(3) Proximity of the smartphone to the body is distinguished. For example, striking a stationary smartphone with one’s moving hand is different from striking a stationary hand with a moving smartphone.
(4) The contact locations on the phone include the touch screen, back, and sides. It is inconsequential where on the screen the participant taps.

E OTHER INSTRUMENTS PARTICIPANTS WOULD LIKE TO PLAY
After the experiment, participants were asked if there were any other instruments they would like to play on their smartphones. The following instruments were suggested: cello (P2); drum (P9, P10, P11); flute (P7); handbell (P3, P4); harp (P6, P11); metallophone (P6, P9); piano (P7, P9); sampler (P1); tambourine (P2); trombone (P7); trumpet (P6); violin (P2, P6, P9, P10); and xylophone (P6, P9, P11). P5 stated that he would like to see the inclusion of rare and unusual instruments that are unavailable on the market.

A few participants wanted to make use of samplers and other electronic music tools. This study focused on simple, typical acoustic instruments and did not cover instruments that produce electronic music (e.g., synthesizers), which many smartphone users can currently access. Therefore, exploring the designs of those instruments will be an interesting challenge in the future.

A DESIGN OF VIBRATIONAL FEEDBACK

Core Haptics provides two pattern types: transient (i.e., short and fixed duration) and continuous (i.e., customizable duration). The intensity and sharpness can be finely controlled to provide fine attenuation of intensity, frequency, and tempo. The vibrational patterns of the instruments were determined empirically using Haptrix 4, a graphical user interface tool used to generate haptic feedback. Haptrix for the Mac operating system provides graphically transient and continuous parameters with time variations. The vibration patterns were iteratively designed, tested, and improved in a laboratory environment with multiple researchers.

We designed the vibration profile of each instrument following the pattern guidelines provided by Apple 5 in the Apple Haptic and Audio Pattern file format, which can be rendered using the Core Haptics framework.
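For concreteness, the Apple Haptic and Audio Pattern (AHAP) format referenced above is a JSON document of timed haptic events. The sketch below shows the general shape of such a pattern; the event timings and intensity/sharpness values are invented for illustration and are not the instrument profiles used in the study.

```javascript
// Illustrative AHAP-style pattern: one short transient event followed by a
// continuous event, each with intensity and sharpness parameters.
// All numeric values here are made up for the sketch.
const pattern = {
  Version: 1.0,
  Pattern: [
    {
      Event: {
        Time: 0.0,
        EventType: "HapticTransient",
        EventParameters: [
          { ParameterID: "HapticIntensity", ParameterValue: 0.8 },
          { ParameterID: "HapticSharpness", ParameterValue: 0.5 },
        ],
      },
    },
    {
      Event: {
        Time: 0.1,
        EventType: "HapticContinuous",
        EventDuration: 0.6,
        EventParameters: [
          { ParameterID: "HapticIntensity", ParameterValue: 0.4 },
          { ParameterID: "HapticSharpness", ParameterValue: 0.2 },
        ],
      },
    },
  ],
};

// Serialised, this is the kind of .ahap file the Core Haptics framework renders.
const ahap = JSON.stringify(pattern, null, 2);
console.log(ahap.length > 0);
```

In an iOS app, a file with this structure would be loaded into a haptic pattern and played by a haptic engine; here it only illustrates the document shape.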

B RANDOMIZE PROCESS

At the beginning of each experiment, variables were assigned to the 10 instruments, shuffled into a uniformly random order, and presented to the participants.
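A uniformly random presentation order such as the one described above can be produced with a standard Fisher–Yates shuffle. This is a generic sketch, not the authors’ actual code, and the instrument list is hypothetical:

```javascript
// Fisher–Yates shuffle: every permutation of the input is equally likely.
function shuffle(items) {
  const a = items.slice(); // copy so the caller's order is preserved
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // random index in [0, i]
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

// Example: a randomized presentation order for 10 instruments
// (names here are illustrative, not the study's actual set).
const instruments = ["piano", "guitar", "violin", "drum", "flute",
                     "trumpet", "harp", "maracas", "handbell", "xylophone"];
const order = shuffle(instruments);
console.log(order.length); // 10, same elements in a random order
```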

4 https://www.haptrix.com/
5 https://developer.apple.com/design/human-interface-guidelines/ios/user-interaction/haptics/
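As a worked example of the agreement-rate formula in Appendix C, AR(r) for a single referent can be computed directly from the grouped proposals. This is an illustrative sketch, not the toolkit of [65], and the gesture labels are hypothetical:

```javascript
// Agreement rate AR(r) for one referent: the fraction of participant pairs
// (i, j), i < j, whose proposals for r fall into the same gesture group.
function agreementRate(proposals) {
  const n = proposals.length;
  if (n < 2) return 0; // agreement needs at least one pair of participants
  let agreeingPairs = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      if (proposals[i] === proposals[j]) agreeingPairs++;
    }
  }
  return agreeingPairs / (n * (n - 1) / 2); // divide by the number of pairs
}

// Example: 3 of 4 participants proposed the same "tap" gesture for a referent,
// giving 3 agreeing pairs out of 6 possible pairs.
console.log(agreementRate(["tap", "tap", "shake", "tap"])); // 0.5
```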
Accessible Blockly: An Accessible Block-Based Programming Library for People with Visual Impairments

Aboubakar Mountapmbeme, Department of Computer Science and Engineering, University of North Texas, Denton, Texas, USA. aboubakarmountapmbeme@my.unt.edu
Obianuju Okafor, Department of Computer Science and Engineering, University of North Texas, Denton, Texas, USA. obianujuokafor@my.unt.edu
Stephanie Ludi, Department of Computer Science and Engineering, University of North Texas, Denton, Texas, USA. stephanie.ludi@unt.edu

ABSTRACT

The visual and mouse-centric nature of block-based programming environments generally makes them inaccessible and challenging to use for users with visual impairments, who rely on assistive technologies to interact with computers. This prevents these users from participating in programming activities where these systems are used. This paper presents a prototype of an accessible block-based programming library called Accessible Blockly that allows users to create and navigate block-based code using a screen reader and a keyboard. This is an attempt to make the widely used Blockly library accessible through a screen reader and keyboard. In this paper, we present the design and implementation of Accessible Blockly. We also discuss the evaluation of the library for block-based code navigation in a study with 12 blind programmers. Analysis of the study results shows that Accessible Blockly effectively aids users with reading and understanding block-based code. Participants found Accessible Blockly easy to use and less frustrating for navigating block-based programs. The participants also expressed enthusiasm and interest in using the keyboard and screen reader to navigate block-based code and in the accessibility of block-based programming.

Figure 1: Blockly’s editor showing the toolbox and workspace

CCS CONCEPTS

• Human-centered computing → Accessibility; Accessibility systems and tools; Empirical studies in accessibility; • Software and its engineering → Software creation and management.

KEYWORDS

Block-based programming, Accessibility, Keyboard Navigation, Screen Reader, Students with Visual Impairments, Blind Programmers, Blockly

ACM Reference Format:
Aboubakar Mountapmbeme, Obianuju Okafor, and Stephanie Ludi. 2022. Accessible Blockly: An Accessible Block-Based Programming Library for People with Visual Impairments. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3517428.3544806

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544806

1 INTRODUCTION

Block-based programming is a type of programming where users code by snapping together visual elements called blocks [1], [2]. In a typical block-based programming environment (BBPE), a toolbox contains blocks that are usually grouped by categories. Users select blocks from the toolbox and drag them onto the workspace, where they are connected to one another to form a program (Figure 1).

Researchers and practitioners argue that these systems are suitable for teaching coding and computational thinking skills to novice students because their intrinsic design allows users to focus on concepts rather than language syntax [2] [3]. This has made block-based programming increasingly used in K-12 to introduce novices to programming [4]. However, the intrinsic design of block-based programming environments presents a considerable obstacle for those who cannot see or who cannot use the mouse. Block-based programming, in its mouse-centric and visual nature, is inaccessible to people with visual impairments [5] [6] [4]. As a result, novices with visual impairments often miss the opportunity to enjoy the learning facilities brought by these systems. Teachers of students with visual impairments (TVIs) usually have to look for alternatives for their students, as reported in [7].

Blockly [8] [9] [10] is the de facto library on top of which most if not all mainstream block-based programming environments are built. Popular BBPEs such as Microsoft MakeCode, Scratch 2.0, AppInventor, and many others are built using the Blockly framework. The Blockly library itself is not accessible to people with visual impairments, and as a result, all the mainstream BBPEs that rely on Blockly are not accessible [7]. Recent efforts by the Blockly team,

including collaboration with our research group, have led to the addition of a keyboard interaction mechanism to Blockly.

This paper builds on our previous work [11]. It presents the design and implementation of Accessible Blockly, a block-based programming library augmented with keyboard and screen reader interaction mechanisms. Accessible Blockly maintains the same visual and mouse-based interaction as the original Blockly and uses the WAI-ARIA guidelines [12] to add screen reader and keyboard interactions. Accessible Blockly is a natural extension to Blockly and can be used in similar ways as Blockly. We believe having one system accessible to all would allow sighted students and students with visual impairments to work on the same platforms, foster collaboration, and remove feelings of exclusion in settings where these systems are used. Accessible Blockly was designed with two main goals:

1. Enable keyboard interaction as an alternative to the mouse for navigating, creating, and editing block-based code.
2. Leverage the screen reader as an alternative output channel on Accessible Blockly.

We formulated the following research question to evaluate Accessible Blockly for block-based code navigation:

RQ1. How well does the keyboard and screen reader navigation strategy in Accessible Blockly aid people with visual impairments in navigating and understanding block-based code?

Our contribution in this paper includes a prototype of an accessibility library, an analysis and results of the evaluation of the library for navigating block-based code, and finally, reflections on how to improve the accessibility of block-based programming environments.

2 RELATED WORK

Research on the accessibility of block-based programming for people with visual impairments is a growing field and dates back to 2015. In [6], the author points out that block-based programming is inaccessible to people with visual impairments mainly because of its visual and mouse-centric design. The author discusses design considerations to improve the accessibility of BBPEs through keyboard and screen reader interactions. The authors in [4] further present the need to make these systems accessible to people with visual and motor impairments, given their increased popularity in introductory computer science courses.

In [5], the authors found that students with visual impairments lack access to mainstream block-based programming environments and rely on hybrid environments such as Swift Playgrounds. They also reported that students with visual impairments face several challenges, including difficulties in navigating and editing block-based code. Due to the inaccessibility of BBPEs, TVIs often have to resort to tangible alternatives to block-based programming [7] [13]. They, however, argue that these alternatives only support basic functionalities, unlike the mainstream BBPEs [7].

Blocks4All [3] is an accessible touchscreen-based block-based programming environment designed with students with visual impairments in mind. It runs on the iPad and allows users to build and run block-based code using touchscreen interactions and a screen reader. Blocks4All was designed to address the accessibility barriers in block-based programming environments, including accessing the output, accessing elements in the coding environments, conveying program structure and type information, and moving blocks [3] [5]. In [14], the authors discuss the extension of Blocks4All to support functions and variables. Our work differs from this in that while Blocks4All is a complete coding environment where users can build and run block-based code, Accessible Blockly, as a natural extension to Blockly, is a library that allows people to develop an accessible block-based programming environment. It is an extension of the Blockly library that adds keyboard and screen reader interactions. One of the primary objectives of developing Accessible Blockly is to provide a mechanism for making existing mainstream BBPEs that run on Blockly accessible.

Das et al. [15] present an accessible solution that uses Blockly’s keyboard interaction mechanism and custom speech rather than a screen reader. The authors generate custom speech that describes the blocks in Blockly. Our work differs from this in two ways: 1) Accessible Blockly supports screen reader output, and 2) we offer a novel keyboard interaction mechanism different from Blockly’s keyboard interaction scheme.

The keyboard and the screen reader have been used as alternative interaction mechanisms for people with visual impairments in text-based programming environments. Baker et al. developed StructJumper [16], which helps blind programmers navigate code in the Eclipse integrated development environment (IDE) using keyboard shortcuts and screen reader output. CodeMirror-Blocks [17] is similar to StructJumper and allows blind programmers to edit and navigate text-based programs. It also offers a block view of text-based code where code sections are converted into visual elements to facilitate navigation. The bulk of research on improving the accessibility of programming environments for people with visual impairments has primarily focused on text-based programming [18]. In [18], the authors discuss the evolution of research in this field and present a summary of all the accessibility tools that have so far been developed.

3 SYSTEM DESIGN AND IMPLEMENTATION

Before getting into the details of the design, it is important to briefly describe the morphology of a typical block and how block-based programs are constructed. Blocks are special visual elements that represent a particular programming construct. The shape of a block depends on the type of programming construct that it represents and on its intended functionalities. We can generally identify two types of blocks based on shape: non-container blocks and container blocks. As their name implies, container blocks are blocks that can accommodate other blocks within them. Examples include blocks that represent control flow statements such as the “if” blocks and “repeat” blocks (Figure 1 and Figure 2a). A block will have connection points (Figure 2). Connection points on a block allow other blocks to be attached to it, or they enable the block to connect to other compatible blocks. The type and number of connection points also depend on the intended functionality of a block. A typical block-based program grows vertically downwards and left to right. Users select blocks from the toolbox and connect them on a workspace to form a program (Figure 1). Constructing block-based programs typically occurs via drag-and-drop operations using the mouse.

Figure 2: Block samples
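The block morphology described in Section 3 (container and non-container blocks joined through connection points, with programs growing downward and to the right) can be modeled as a small tree. The sketch below is illustrative only and does not reflect Blockly's actual internal API:

```javascript
// Minimal model of a block-based program: each block may have a "next"
// block (vertical connection), inline value blocks, and, for container
// blocks, a body of nested child blocks.
class Block {
  constructor(label, { isContainer = false } = {}) {
    this.label = label;
    this.isContainer = isContainer;
    this.values = []; // inline value blocks, e.g. the "3" in "repeat 3 times"
    this.body = [];   // child statement blocks (container blocks only)
    this.next = null; // block connected directly below this one
  }
}

// Build: repeat 3 times { print 10 }, followed by an "if" block.
const repeat = new Block("repeat times", { isContainer: true });
repeat.values.push(new Block("3"));
repeat.body.push(new Block("print 10"));
repeat.next = new Block("if", { isContainer: true });

// Count all blocks reachable from a starting block (depth-first walk).
function countBlocks(block) {
  if (!block) return 0;
  return 1
    + block.values.reduce((sum, v) => sum + countBlocks(v), 0)
    + block.body.reduce((sum, b) => sum + countBlocks(b), 0)
    + countBlocks(block.next);
}

console.log(countBlocks(repeat)); // 4: repeat, 3, print 10, if
```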

Figure 3: Components of Accessible Blockly

Designing accessible BBPEs requires providing access to the rich visual information characteristic of BBPEs and also enabling an alternative interaction mechanism other than the mouse. We established five primary conditions while designing Accessible Blockly:

1. It should be usable by both students with visual impairments and sighted students alike. This includes collaboration between the two.
2. It should communicate information that is otherwise available visually in an alternate format such as speech and audio.
3. It should support keyboard interactions as an alternative to the mouse.
4. The library should be easy to integrate into existing mainstream block-based programming environments to enable accessibility on these systems.
5. It should be easy to learn and use.

Figure 3 shows the main components in Accessible Blockly.

3.1 Blockly Library

The first component is the Blockly library [8], on top of which accessibility is being added. As stated earlier, Blockly is an open-source library developed by Google for creating block-based programming environments. Blockly became popular in 2016 after former US president Barack Obama introduced CS for All using the platform to code [19]. Blockly is built using JavaScript and runs on the web. This makes it a good candidate for use, since accessibility guidelines for Web applications are well established through the W3C WAI group [20]. In addition, interactive coding platforms that use Blockly run on the web. With Blockly being open-source and widely adopted, anyone is free to add to or extend the Blockly framework. This also makes it suitable for our case. Another reason for using Blockly is that it already supports complex syntax such as functions and variables. Therefore, adding accessibility would allow users to have access to these rich sets of features by default. In [7], the authors report that TVIs find current accessible programming environments to be too basic to use at some point because they only contain basic coding concepts.

3.2 Keyboard Module

The second component is the Keyboard Module. Blockly, in its initial design, only supported mouse interactions. The keyboard is an alternative to the mouse for people with visual impairments and temporary disabilities. We designed Accessible Blockly to support keyboard interactions with the following principal guidelines:

1. Allow a user to browse block categories and individual blocks in the toolbox using keyboard inputs.
2. Enable drag-and-drop-like operations via keystrokes alone.
3. Mimic spatial navigation by allowing a user to navigate a program top-down and left-right, and to traverse in and out of nested or container blocks.

We believe adhering to these principles is essential to produce one platform that is usable by people with visual impairments and sighted people alike. We defined a virtual cursor on the programming environment controlled by keystrokes. This cursor can be on the toolbox or the workspace at any time. When the focus is on the toolbox, a user can open a flyout category by using keystrokes, move the cursor between categories, and explore the blocks within a category in the same way they would use a screen reader and a keyboard to access HTML menus on the Web.

The keyboard module also allows users to perform drag-and-drop-like interactions from the toolbox to the workspace. Prior research [3] and our experimental work [11] have shown that there can be more than one way to perform drag-and-drop-like operations with a keyboard. We designed the drag-and-drop-like functions to closely mimic the actions performed by a user using the mouse. To move a block from the toolbox to the workspace, a user first selects a location on the workspace where they want to attach a new block using the virtual cursor. Next, they go to the toolbox and choose the desired block to be added, and then with a final keystroke, the selected block is placed at the previously determined location on the workspace. This method is commonly referred to

Figure 4: The virtual cursors in Accessible Blockly

as the location-first select, then block select method [3]. To improve the user experience and minimize confusion, once a user selects a connection point on the workspace and goes to the toolbox, blocks that are incompatible with the selected connection point are marked as disabled and communicated to the user via the speech output.

The keyboard module operates in two modes: Navigate mode and Edit mode. Navigate mode allows a user to move the virtual cursor through the blocks or categories of blocks on the toolbox or to navigate the blocks that make up a program on the workspace. The navigation allows a user to explore a program on the workspace spatially. The virtual cursor determines the current location of focus of the user. The cursor can be moved vertically downwards or upwards across blocks. It can also be moved horizontally between blocks connected at the same level. Figure 4a shows the cursor as a yellow outline around the “repeat 3 times” block. The cursor can also be caused to move directly into the body of a container block, such as an “if” block. This allows a user to continue exploration within the body of the container block.

Edit mode is used to construct a new program or modify an existing program on a workspace. In this mode, the virtual cursor is used to select a connection point of a block on the workspace. The virtual cursor only moves between the connection points of the highlighted block. It allows a user to cycle through the connection points and select a connection point at which to attach a new block. Upon choosing a connection point, the user goes back to the toolbox and selects a compatible block that is automatically snapped at the connection point on the workspace. The cursor is taken back to the workspace onto the newly added block. Figure 4b shows the virtual cursor as a black outline on the top connection of the “repeat 3 times” block.

The Keyboard Module uses the WASD keys as the primary keys for interaction (Table 1). These keys were selected because the WASD keys are common among gamers for navigation and can allow users to interact with the system one-handed. We chose to use keys other than the arrow keys for navigation to avoid conflicts with existing screen reader functionalities and other applications.

As an example of using the WASD keys to navigate the program, consider Figure 4a, and suppose the virtual cursor is on the “if” block. Pressing W would take the virtual cursor up to the “set count to 7” group of blocks. Pressing S at this location would take the cursor back down to the “if” block. Now, if we are still on the “if” block, pressing F takes the cursor inline, or horizontally right, to the “count >= 5” block. The F key causes the virtual cursor to move to the first property block of the current block. In Blockly terminology, these blocks are referred to as value blocks. These are blocks that cannot start a statement on their own and must exist as properties of other blocks. As another example of using the F key, if we are on the “repeat 3 times” block as shown in Figure 4a, pressing F would take the virtual cursor to the number block “3”. Pressing A while at the number block “3” would move the cursor back to the “repeat 3 times” block. Pressing D while the virtual cursor is on the “repeat 3 times” block would cause the virtual cursor to move into the body of this container block, to the “print 10” block. Pressing A causes the virtual cursor to move out, back to the “repeat 3 times” block.

3.3 Screen Reader Module

This component was designed to serve as an alternative output channel to the visual channel. One of the ways in which people with visual impairments interact with computers is through screen readers. A screen reader is a program that reads out loud content on the screen. The screen reader module converts and conveys information otherwise only accessible visually into verbal descriptions. By default, blocks, which are visual elements, are not accessible through the screen reader. Unlike the HTML image tag, for example, which can easily be made accessible by adding alt text, blocks are JavaScript drawings created at run time. Currently, there is no standard mechanism like alt text to make blocks easily accessible through the screen reader. When the virtual cursor is focused on a block, the screen reader will output verbal descriptions based on what the block represents. Text descriptions for each block were carefully designed based on previous experiments [21] and using previously established guidelines for describing computer programs
Accessible Blockly: An Accessible Block-Based Programming Library for People with Visual Impairments ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 1: Keyboard navigation shortcuts for Accessible Blockly

Key Action
W Go vertically up to the previous block.
  Go to the previous connection point (in Edit mode).
A Go horizontally left to the block on the left.
  Go out of the body of a container block to the container block.
S Go vertically down to the next block.
  Go to the next connection point (in Edit mode).
D Go into the body of the container block to the first child block.
E Toggle between Edit mode and Navigate mode.
F Go horizontally right to the first inline block or first block that is a property of the current block.
J Jump to the first block on the workspace.
R Repeat current focus or location.
C Open the toolbox.
ENTER Add selected block to workspace.
Delete Delete selected block.
Ctrl + Z Undo delete.
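Table 1’s two-mode key mapping can be pictured as a small dispatcher keyed on the current mode. The sketch below is illustrative only: the `KEYMAP`/`dispatch` identifiers and action names are our own shorthand for this paper’s description, not code from the Accessible Blockly repository.

```javascript
// Hypothetical sketch of Table 1's two-mode key mapping.
// Action names and identifiers are illustrative shorthand,
// not taken from the actual Accessible Blockly source.
const KEYMAP = {
  navigate: {
    W: 'previous-block',
    A: 'left-or-out-of-container',
    S: 'next-block',
    D: 'into-container',
    F: 'first-inline-block',
    J: 'jump-to-first-block',
    R: 'repeat-focus',
    C: 'open-toolbox',
  },
  edit: {
    W: 'previous-connection-point',
    S: 'next-connection-point',
    Enter: 'add-selected-block',
    Delete: 'delete-selected-block',
  },
};

function dispatch(key, mode) {
  if (key === 'E') return 'toggle-mode'; // E switches Edit/Navigate in either mode
  const table = KEYMAP[mode];
  const action = table && table[key];
  return action || null; // unmapped keys fall through to the screen reader/browser
}
```

Returning `null` for unmapped keys reflects the design constraint described in Section 3.3: keys the module does not claim are left for the screen reader and browser to handle.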

in speech or audio form [22]. Navigating the toolbox also allows the screen reader to read out the categories of blocks and each block within a category. As the virtual cursor moves from one location to another in the coding environment, the screen reader outputs corresponding speech that describes the focused element. For example, if the focus is on the “repeat block” in Figure 4a, the user will hear “repeat three times block.” Another example: if the focus is on the connection point shown in Figure 4b, the user will hear “top connection.”

The speech for each block is designed to closely convey what the block represents. An aria-label text template is defined for each block type. The text label is updated at run time to reflect the actual state of the block. For example, there is an aria-label text “repeat N times block” for the repeat block. For the example in Figure 4a, the final text will be “repeat 3 times block,” where N is replaced by 3. The screen reader will then output this information to the user. Aria-labels at run time can also come from a combination of two templates. Consider the “set count to 7” group of blocks in Figure 4. The resulting aria-label “set count to 7” comes from the templates “set VARIABLE to block” and “NUMBER block.” “VARIABLE” is replaced by “count” and “NUMBER” by “seven”.

Accessible Blockly has been tested with the NVDA, VoiceOver, and JAWS screen readers. With NVDA and JAWS, it works in focus mode or pass-key-through mode. One of the design constraints was to maintain compatibility with existing applications while adding accessibility to Blockly. Having the screen reader work in these modes minimizes the chances of key conflicts between Accessible Blockly functions and other screen reader or browser functionalities.

Blockly is mainly based on JavaScript. Our Keyboard and Screen Reader modules are also implemented in JavaScript. The system listens for keypresses and calls special event-handling functions to execute the requested actions and generate the speech output. Our code is open source. The repository, alongside its wiki, is located at [23]. This repository was forked from the official Blockly repository.

4 EXPERIMENT DESIGN
To answer the research question, we conducted a study in which 12 blind programmers completed navigation tasks using Accessible Blockly and, as a control, Blockly’s alternate navigation scheme. Our institution’s IRB approved the study. This evaluation is the first phase in our evaluation series and involves navigation tasks only. We decided to focus on the navigation task because:

1. This is the first attempt to design an accessible block-based programming library that uses the keyboard and screen reader for interaction. Blockly’s alternate navigation scheme originated from work by the Blockly team in collaboration with our research group.
2. Navigation is crucial in writing code or building block-based programs. Therefore, we wanted to fully evaluate the tool for this activity before considering other coding tasks. Accessibility of code navigation is commonly assessed in isolation [16], [17].

Blockly [8] currently does not support screen readers. It only offers the alternate keyboard interaction mechanism. To use Blockly’s alternate keyboard navigation as a control in our experiment, we replicated this navigation scheme in our codebase, since our codebase already supports screen reader interactions. Unlike Accessible Blockly, which offers two navigation modes as discussed above, Blockly’s alternate keyboard navigation, referred to as the “Default Cursor” on the official website, has only one mode of interaction. In this single-mode interaction, the user uses the WASD keys to move the virtual cursor between blocks and connection points of blocks alike. A detailed description of Blockly’s alternate keyboard navigation scheme can be found at [24]. This work evaluates both navigation schemes through a formal user study. We chose this scheme as the control because it is the navigation scheme shipped with the official Blockly library.

4.1 Participants
A total of 12 people took part in the evaluation study. There were ten males and two females. Participants were recruited via mailing
ASSETS ’22, October 23–26, 2022, Athens, Greece Aboubakar Mountapmbeme et al.

Table 2: Summary of participants’ background information

Participant Gender Age Location Programming experience Screen Reader used Screen Reader proficiency Keyboard proficiency
P1 Male 51 US More than 3 years NVDA Expert Expert
P2 Male 51 US 2 to 3 years NVDA Advanced Advanced
P3 Male 40 Europe More than 3 years NVDA Expert Advanced
P4 Female 56 US More than 3 years NVDA Expert Expert
P5 Male 28 India More than 3 years NVDA Advanced Expert
P6 Male 57 US More than 3 years NVDA Moderate Advanced
P7 Male 26 Middle East More than 3 years NVDA Advanced Expert
P8 Female 40 Europe Less than 1 year NVDA Expert Expert
P9 Male 22 India Less than 1 year NVDA Expert Expert
P10 Male 29 Europe More than 3 years NVDA Expert Expert
P11 Male 60 US More than 3 years JAWS Expert Expert
P12 Male 35 Hong Kong 2 to 3 years NVDA Advanced Advanced

lists, snowballing, and personal contacts. Participants had varying levels of experience with programming, ranging from professional programmers to novices just learning how to code. Of the 12 participants, ten reported being blind, and the remaining two said they had no useful vision and relied on screen readers to interact with computers. Two of the participants were regular braille users, and the rest used screen readers daily. The average age of the participants was 41 (SD = 13). Only one participant (P1) out of the twelve had heard of block-based programming before the study. P1 had tried Scratch, Blockly, and MakeCode and reported these to be only partially accessible. Every participant signed an informed consent form before the study and received a $50 Amazon gift card as compensation after the study session. Table 2 summarizes the participants’ demographic data.

4.2 Set-up
All study sessions were conducted remotely via Zoom. Participants were encouraged to use their regular screen reader settings and other settings related to their daily computer usage. Participants shared their screen and computer audio throughout the study session, and the researchers watched as they completed the tasks. Sessions were recorded for later analysis and insights.

4.3 Procedure
Participants were asked to provide demographic information in a pre-study survey hosted on SurveyMonkey. This survey asked participants to provide information regarding their level of vision, programming experience, experience with block-based programming, and experience with screen readers (Table 2).

The study was organized in two phases based on the type of navigation tool used, i.e., Accessible Blockly vs. Blockly’s alternate keyboard navigation. Each phase included a training task, two exercise tasks, and qualitative questions regarding the navigation tool used in that phase. After completing both phases, participants were asked open-ended questions regarding their overall experience using both navigation schemes. Before starting each session, the participants were briefly introduced to block-based programming. Participants were briefed on the morphology of a block and how block-based programs are built. Participants were also informed about the usefulness of block-based programming and the importance of making block-based programming accessible to all.

An identical program was used for training purposes in each phase. This program contained non-container blocks with assignment operations and a container block (see Appendix A for details). During training, participants were taught how to navigate a program using the keyboard shortcuts for each navigation scheme. Participants executed the keyboard commands following the instructions from the researchers. Participants were given time to familiarize themselves with the navigation schemes after the instruction stage. Once participants felt comfortable using the navigation scheme in each phase, we moved on to the exercise tasks.

We designed two sets of similar programs for use during the exercise tasks. Each set comprised one simple block-based program without any container block. This program performed some basic arithmetic operations and updated the value of a named variable at the end of the program. The second program in each set had two container blocks: an “if” block and a “repeat” block. The program also included two print blocks. The participants completed the tasks on one set of programs using Accessible Blockly and on another set of programs using Blockly’s alternate keyboard navigation scheme. Appendix A contains details about the study tasks. The link to the study tasks can be found at [25].

An effective navigation scheme should allow a user to get to desired locations in the code while building an understanding of the code and an awareness of their location within the program. It should also offer an acceptable navigation time. The two exercise tasks were designed to evaluate Accessible Blockly on these criteria. For each set of programs, the two exercise tasks included:

1. Determine the final value of a variable: Navigate the program and determine the final value of the variable “X.” Here, X had different names on each set of programs, and the arithmetic operations were different on each set.

2. a) What is the expected output? Navigate the program and say what you believe will be printed on the screen if this program was executed. This program contained print blocks within container blocks.
b) Conditions: How will the program’s output in (2a) above change if the initial value of a variable in the program was set to a different value?

Task 2b was designed to test how well participants understood what the code was doing after navigating the program in Task 2a. The question asked participants to state how the program’s output in Task 2a changes when the initial value of a variable used in the “if” block was set to a new value. The modification was such that it would negate the value of the Boolean condition and thus alter the program’s output.

Participants were given 10 minutes to complete each task, except for Task 2b, a follow-up question to Task 2a. A timer was started after the researcher had finished reading out the question to the participant and the participant indicated they were ready to start. The timer was stopped the moment the participant said they had found an answer. All answers were recorded. Participants were allowed to take notes as they completed the tasks. Eleven participants used NVDA, and one participant used JAWS during the study sessions.

All tasks were scored. Task 1 was scored on a scale of 0-1. A participant received 1 if they could determine the correct value of the named variable. Task 2a was scored on a scale of 0-2. A participant received full credit if they determined the correct output of the program. They received 1 if their answer was incomplete but partially correct. Task 2b was scored on a scale of 0-1. A participant received 1 if they correctly answered the follow-up question and 0 otherwise. For the timed tasks, participants received a 0 if they could not provide an answer within the allotted time.

At the end of each phase, participants were asked to rate their experience using the navigation scheme in that phase on a 5-point semantically anchored scale. We asked the following three questions. Similar questions have been used in studies evaluating accessible navigation tools in text-based programming [16], [17].

1. On a scale of 1 to 5, 1 being very hard and 5 being very easy, rate how easy the tasks were to complete.
2. On a scale of 1 to 5, 1 being very frustrating and 5 being not frustrating at all, rate how frustrating the tasks were to complete.
3. On a scale of 1 to 5, 1 being “I had no idea where I was” and 5 being “I always knew where I was,” rate how well you knew where you were in the code while completing the tasks.

After a participant had finished both phases of the study, they were asked to share their experiences with the two navigation schemes and state if they had a preference for either and, if so, why. Participants were also asked to share suggestions for improving the accessibility of block-based programming languages.

4.4 Design and Analysis
We designed a 2x2 within-subjects factorial experiment with factors of the program set and the navigation scheme used. Our experiment was modeled on similar experiments that evaluate accessible navigation tools for people with visual impairments [16], [17]. Each participant completed two tasks using each navigation scheme. The order in which the navigation schemes were encountered was counterbalanced between participants. However, each navigation scheme was fixed to a particular set of programs because the programs were designed to be very similar. How this design affects the study results is discussed under limitations. Each participant completed four tasks for the entire experiment, for a total of 48 tasks. A single-factor ANOVA was used to analyze task completion time. Descriptive statistics were used to analyze the semantically anchored scale data.

5 EVALUATION RESULTS
We analyzed the results and compared the effectiveness of the two navigation schemes based on task completion time, the correctness of answers to the tasks, and the participants’ experience. Overall, participants were faster when completing the tasks with Accessible Blockly. Participants also scored higher on average with Accessible Blockly than with the alternate navigation scheme. Although most participants found both navigation schemes equally accessible, they tended to prefer Accessible Blockly because they found it easier and less frustrating to use.

5.1 Task completion time
Participants were faster when completing the tasks using the Accessible Blockly navigation scheme, with an average task completion time of 1 minute 30 seconds against 2 minutes 55 seconds for the alternate navigation scheme (Table 3). This difference was statistically significant (p = 0.0035). We note that fixing one navigation scheme to a particular set of programs throughout the study could also have an influence on the time difference; this is addressed as a limitation. Participants completed Task 1 with Accessible Blockly in an average of 1m30s vs. 2m44s for the alternate navigation scheme. The same pattern was observed on Task 2a, with an average task completion time of 1m30s for Accessible Blockly against 3m5s for the alternate navigation scheme.

All participants completed each task within the allotted 10 minutes. Longer task completion times came from participants who had less experience with programming. Figure 5 and Figure 6 compare the task completion times on both navigation schemes by participant for Task 1 and Task 2a respectively.

5.2 Task Score
The maximum score possible for completing tasks correctly with each navigation scheme was four. As shown in Table 4, participants were more successful in completing the tasks accurately with Accessible Blockly (avg. score = 3.17) than with the alternate navigation scheme (avg. score = 2.75). Looking at the scores by task, participants scored higher on Task 1 when using Accessible Blockly (avg. 0.83) than with the alternate navigation scheme (avg. 0.67). For both navigation schemes, some participants felt intimidated by the mathematical expressions in Task 1 and did not take time to properly evaluate the arithmetic expressions despite correctly navigating the programs.

The performance in Task 2a was similar. On average, participants scored higher with Accessible Blockly (avg. 1.58) than with the alternate navigation scheme (avg. 1.33). Eight participants scored

Table 3: Task completion time

Accessible Blockly Nav Blockly Alternate Nav


Mean SD Mean SD
Task 1 - Time 1m30s 59s 2m44s 1m43s
Task 2a - Time 1m30s 37s 3m5s 2m30s
Avg. Time 1m30s 48s 2m55s 2m6s
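The significance test behind the comparison in Table 3 was the single-factor (one-way) ANOVA described in Section 4.4. As a rough illustration of what that computation involves — with made-up numbers, not the study’s data — the F statistic for k groups of completion times can be computed as follows:

```javascript
// Minimal one-way (single-factor) ANOVA sketch: returns the F statistic
// for k groups of observations. Illustrative only; the study's actual
// completion-time data are not reproduced here.
function oneWayAnovaF(groups) {
  const all = groups.flat();
  const grand = all.reduce((s, x) => s + x, 0) / all.length;
  let ssBetween = 0; // spread of group means around the grand mean
  let ssWithin = 0;  // spread of observations around their own group mean
  for (const g of groups) {
    const mean = g.reduce((s, x) => s + x, 0) / g.length;
    ssBetween += g.length * (mean - grand) ** 2;
    for (const x of g) ssWithin += (x - mean) ** 2;
  }
  const dfBetween = groups.length - 1;
  const dfWithin = all.length - groups.length;
  return (ssBetween / dfBetween) / (ssWithin / dfWithin);
}

// Example with made-up data: oneWayAnovaF([[1, 2, 3], [4, 5, 6]]) === 13.5
```

With the F statistic and the two degrees of freedom in hand, the p-value comes from the F distribution; a statistics package would normally handle that last step.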

Figure 5: Task 1 task completion time by participants for both navigation schemes

Figure 6: Task 2a task completion time by participants for both navigation schemes

full points on Task 2a using Accessible Blockly, against five for the alternate navigation scheme (Table 5). Participants who did not get full points in Task 2a when using either navigation scheme forgot to explore the body of a container block in the program. Some participants attributed the miss to their unfamiliarity with the jargon of block-based programming. While it was easy to determine whether a block was a container or non-container block with Accessible Blockly, it was not the same with the alternate navigation scheme. With the alternate navigation scheme, some participants got confused by the connection points, which made them lose track of the level of nesting in the code. Because, to go one level into the body of the container block using the alternate

Table 4: Task score

Accessible Blockly Nav Blockly Alternate Nav


Mean SD Mean SD
Task 1 - Score 0.83 0.39 0.67 0.49
Task 2a - Score 1.58 0.67 1.33 0.65
Task 2b - Score 0.75 0.45 0.75* 0.45
Avg. Score 3.17 1.19 2.75 1.22

Table 5: Number of participants who got full points per task

Accessible Blockly Nav Blockly Alternate Nav


Task 1 10 8
Task 2a 8 5
Task 2b 9 9*
*After they were given hints and completed Task 2a

navigation scheme, you must go through the connection points, it was difficult for participants to keep track of their level of nesting while at the same time remembering the location of the virtual cursor among the connection points. Participants who got partial points in Task 2a thought one of the container blocks was nested inside the other, which was not the case. Again, participants with experience in programming did not have this confusion.

For both navigation schemes, participants who did not get Task 1 or Task 2a correct after the timer was stopped were given hints to complete the task. All participants who failed on the first attempt successfully completed the tasks with these recommendations. However, this was not accounted for in the task scores and task completion times above. For Task 2b, participants on average scored equally using Accessible Blockly (avg. 0.75) and the alternate navigation scheme (avg. 0.75). This suggests participants had a good understanding of the programs they navigated. However, more participants successfully completed Task 2a on the first trial with the timer on when using Accessible Blockly than with the alternate navigation scheme. The three participants (P2, P8, and P11) who did not respond correctly to Task 2b while using either navigation scheme reported they had limited programming experience. In addition, P8 and P11 were seasoned braille users and mentioned having a hard time keeping up with the screen reader output. These factors could also have influenced their performance on the tasks.

While observing participants complete the tasks, we noticed that all participants could explore all the blocks within the programs within the allotted time. No participant gave up in the middle of completing a task. For each navigation scheme, all participants could explore all the blocks in the programs for each task. We noticed excitement on the part of some participants, which made them impatient to reflect on their answers properly. In this haste to give an answer, some participants made mistakes. This was more apparent with Task 1, which involved arithmetic operations.

5.3 Participant Experience
Participants also shared their experience regarding the level of difficulty of the tasks, their degree of frustration while completing the tasks, and their awareness of their location in the program while completing the tasks. A higher number is better, as it indicates they found the tasks easy, not frustrating, and were more aware of their location within the program. Overall, participants felt Accessible Blockly was easier and less frustrating to use. Participants explained that it required fewer keystrokes to get to a desired location, and the fact that they did not have to deal with the connection points while navigating made their experience less frustrating. The participants also thought they were more aware of their location within the programs while navigating with Accessible Blockly.

The largest difference in the semantically anchored scale data between the navigation schemes was on the level of difficulty. Participants overall rated Accessible Blockly easier to use (avg. 4.33) than the alternate navigation scheme (avg. 3.75) (Figure 7). We note that all participants found both navigation schemes equally accessible and usable for navigating block-based programs.

6 QUALITATIVE RESULTS
After participants had completed the tasks with Accessible Blockly and the alternate navigation scheme, we asked them open-ended questions regarding their experience learning and using both navigation schemes. Participants were also asked whether they preferred either of the two navigation schemes. Participants were also allowed to provide suggestions regarding improving the usability of the navigation schemes and the accessibility of block-based programming in general.

Multiple steps of inductive analysis were applied to the transcripts from each session. Codes were identified from each transcript through multiple coding rounds. Using the affinity diagramming technique, these codes were grouped into themes. Several themes were identified and are discussed below.

6.1 Easy to Use and to Learn
Analyses of the open-ended questions revealed that participants found Accessible Blockly easy to use and learn. Nine out of the 12 participants explicitly stated that Accessible Blockly was easier to use to navigate the programs. The participants attributed this

Figure 7: Average score across participants on the semantically anchored questions

ease of use to the fact that you do not have to worry about the connection points while navigating the program with Accessible Blockly:

“I prefer the navigation two [Accessible Blockly]. Because I think it’s easiest to navigate because you don’t have to worry about the connection. So, I prefer the second one, phase two one.” (P3)

“Obviously, this [Accessible Blockly] is quite a bit more some, like simplified because you don’t have all the connection points to worry about so this is just simpler.” (P10)

On the other hand, five participants found the alternate navigation scheme challenging and less intuitive. Participants explained that navigating the connection points at the same time as blocks made learning and using Blockly’s alternate navigation scheme less intuitive.

“It [Blockly’s alternate navigation scheme] is easy not very easy. I mean here, the training right, how to use this navigation is much more important because it’s not very intuitive. It’s a bit difficult. So yeah, a lot of things to kind of understand properly.” (P9)

Other participants thought the alternate navigation scheme has a steep learning curve and that its ease of use could improve with experience. Because each block has its own morphology and type of connection points, it can be overwhelming for users at the beginning.

“I guess my umm I really feel like I wanna make it complicated but that’s because I’m just learning it [Blockly’s alternate navigation scheme]. I think you know I think after time this would be easier like I said because it actually goes left and right and I would be able to explain it better to other people you know like what I was doing. In one sense it’s more complicated to learn, I think.” (P1)

6.2 Shorter Navigation Time
The quantitative results already showed that Accessible Blockly was faster for navigating the programs. Additionally, six participants also mentioned in the open-ended discussions that they found Accessible Blockly to be faster when navigating the programs. The participants said they could quickly get to their target in the block-based code because they had fewer keystrokes to execute. This is because Accessible Blockly in navigation mode does not go through the connection points, saving extra keystrokes for the user.

“. . . I mean the other thing that I thought was different I thought the second one [Accessible Blockly] felt like it had a little less steps may be cause like you know when going to the hollow block before we did D and for this particular question there was one like one of 2 choices I didn’t really have to go deeper so in that aspect it was kind of like less layers to go into in the second session whereas in the first session [Blockly alternate navigation scheme] when we did the hollow blocks it seemed like there was more like cause I had to go into the do ummm connection and had to go in there and then come back out and come back out.” (P2)

In addition to being faster to navigate, participants thought Accessible Blockly provided a quick review of the programs.

“Phase 2 (Accessible Blockly) in terms of reviewing a program is easier because it allows you to navigate the programs quickly. You don’t have to go into the details of the connection points as with the Phase 1 [Blockly alternate navigation scheme].” (P6)

On the other hand, participants thought using the alternate navigation scheme to read the programs made it longer to navigate the

code. This longer navigation time is also evident in the quantitative data presented above.

“Getting into the different connections made it a bit longer than the first one, phase one [Accessible Blockly].” (P3)

6.3 Less Cognitively Demanding
While reflecting on their experiences, participants also found that, in contrast to Accessible Blockly, the alternate navigation scheme was more cognitively demanding. Six out of the 12 participants explained that using the alternate navigation scheme to read the programs increased their memory load and required extra attention.

“umm I think it could umm I mean I know when I was doing it, it [Blockly alternate navigation] required a little bit more time to go in there and then I had to focus a little bit more on where am I at, coming back out you know like like ummm It felt like the second one [Accessible Blockly] I didn’t have to worry, . . . as soon as I was in, I was back out. Whereas the first time [alternate navigation scheme] I was in then I was in then I was like okay when I came out where am I at when am I gonna come out again.” (P2)

Having to remember the connection points while navigating to a target location was most often cited as the reason for the increased cognitive load. Participants felt that the alternate navigation scheme exposes too much information during navigation, making it more challenging to explore the code. Exposing connection points during navigation increased navigation time and had participants retain more in their memory.

“In the phase one [Blockly’s alternate navigation scheme] you have to keep track of the connection points which you don’t need if you are not editing the program. I don’t know how it might be in terms of editing but the phase 2 [Accessible Blockly] was straight forward.” (P6)

Although some participants made remarks regarding the increased memory requirements, they thought the detailed information provided could be more helpful when navigating complex programs. P5, while providing his rating for the level of frustration with the alternate navigation scheme, noted:

“I gave 4.5 because I had to process bit more of an information but I believe like to thoroughly understand a complex program I might need all this information.” (P5)

On the other hand, five participants thought it was the opposite when using Accessible Blockly because they did not have to worry about the connection points when reading the programs.

“... Because it [Accessible Blockly] just gives you uhh enough information you want to understand the program. uhh it did not give you more details which can make things a bit uhhh like which can increase the cognitive load. The things you need to intake are more in the second phase [Blockly alternate navigation scheme]. So, in first phase [Accessible Blockly] like the cognitive load was very much less, and so as frustration and understanding would be a bit more. So, for a novice or for uhh like student uhhmm learning first phase was very much good.” (P5)

One participant mentioned it was easier to memorize the navigation keys with Accessible Blockly than with the alternate navigation scheme.

“I have a preference for navigation scheme applied to in phase one [Accessible Blockly], because that’s much more easier. Okay, it’s only few keys. So that’s much more easier to memorize because being a blind individual, always you need to memorize key shortcuts. So it’s, I mean, it should be as simple as that. So, in that case, I have a preference with phase one [Accessible Blockly].” (P9)

6.4 Suitable for Novices
As participants thought Accessible Blockly was straightforward, faster, and less cognitively demanding, they also stated that it seemed more suitable for novices and appropriate for simple programs. P5 explicitly mentions this in the previous paragraph. One of the primary objectives of block-based programming is to make learning programming easier for novices by removing the heavy syntax imposed by text-based languages [26], [13]. Looking at it from this perspective further strengthens the observations made by participants. We also observed from the qualitative data that participants with less programming experience performed better with Accessible Blockly than with the alternate navigation scheme, suggesting it was easier for them to learn and use.

On the other hand, participants who had more experience with programming thought the alternate navigation scheme was more appropriate for navigating complex programs because it provided more information, which would give them more control, especially when it comes to editing.

“So, for reading the program it was easier with the way it worked in phase one [Accessible Blockly], umm I would imagine when it came to editing you would need more control like what you have in phase 2 [Blockly’s alternate navigation scheme].” (P3)

6.5 Participant Preference
Participants were explicitly asked if they had a preference for one of the navigation schemes over the other, and their responses were interesting. Five participants explicitly stated that they preferred Accessible Blockly over Blockly’s alternate navigation scheme. Three participants said they liked both navigation schemes, citing various use cases in which they find both to be valuable and appropriate. The most common reason for preferring Accessible Blockly was its ease of use and design favoring novices. Blockly’s alternate navigation scheme, on the other hand, was mostly preferred by experienced programmers, who stated this was because it provided more information that might be valuable when navigating complex programs. One participant, P10, stated they preferred the alternate navigation scheme over Accessible Blockly. More intriguing were the three participants who said they liked neither of the navigation schemes. P12 explicitly stated this:
ASSETS ’22, October 23–26, 2022, Athens, Greece Aboubakar Mountapmbeme et al.

“I don’t have a preference for any. I feel they end up trying to map the WASD keys to the Arrow keys.” (P12)

Overall, we observed that both navigation schemes are equally usable and accessible for navigating and understanding block-based programs, with Accessible Blockly being preferred over the alternate navigation scheme because of its ease of use, faster navigation time, and its tendency to favor novices.

7 DISCUSSION

7.1 Participants with Programming Experience and Experience Using Screen Readers Perform Better

The individual reports and quantitative data show that participants with good programming experience performed best, having the fastest task completion time and highest score with both navigation schemes. These participants also found the tasks easier to complete. This contrasted with those who had little or no programming experience. The participants with limited programming experience scored low on the tasks. When watching the video recordings of participants who scored low on the tasks, we noticed that most missed exploring the body of the container blocks or control flow blocks in Task 2. Five participants missed exploring the body of container blocks in Task 2a, and this was more common when they completed the task with the alternate navigation scheme. When asked why they missed these blocks, some cited their limited programming knowledge, explaining that they did not know these blocks had blocks nested within them. This was the case for P2 and P8.

It could also be that these participants had not formed an accurate mental model of block-based code and their conceptualization of block-based code was still influenced by their text-based program reading strategy. At least the discussions with three participants suggested this. P4, who missed exploring the container blocks in both phase 1 and phase 2, said her familiarity with procedural languages influenced her thought process.

“Yeah, I think it’s just my training as you know umm because normally in a procedural language the thing to be repeated would be on the next line and I just haven’t gotten used to looking inside the block for what should be repeated.” (P4)

We believe that in addition to adding accessibility to block-based programming, training students with visual impairments to form an accurate mental model of how block-based code works would be essential in increasing their success in using block-based programming environments. Training students to use screen readers and other accessibility technologies would also improve their performance at block-based coding [5].

7.2 Enthusiasm Regarding both Methods and the Accessibility of BBPEs in General

We observed that almost all participants (9 out of 12) were enthusiastic as they used both navigation methods to complete tasks and welcomed the idea of adding accessibility to block-based programming environments. Participants displayed positive emotions and frequently smiled as they completed the tasks and reflected upon their experience. Most participants expressed interest in using the navigation schemes, and some even requested more complex programs as they were curious and eager to see how they work on larger programs. Overall, it was a feeling of enthusiasm, satisfaction, and enjoyment. For the participants who had not heard of block-based programming before, it felt like a new and exciting experience. The only participant familiar with the accessibility issues in mainstream BBPEs [5], [7] stated they were happy to see efforts being made to add accessibility to these systems. P1, who also teaches students with visual impairments how to code and has tried some mainstream BBPEs, stated:

“I really like this though you guys have done [. . .]. This is much better than the Google outline method that they had. Umm what was it? I don’t know how many years ago? Ummm.” (P1)

P1 was referring to a text-based version of Blockly that was designed by the Blockly team to be accessible to people with visual impairments. The problem with this was that it was a separate system and thus not suitable in situations where students with visual impairments and sighted students work together.

7.3 Suggestions to Improve the Accessibility of Block-Based Code

As discussed in the previous section, all participants in the study expressed enthusiasm and gave feedback regarding how we could improve the navigation strategy and the accessibility of block-based code in general. We examined all the suggestions made and noticed that they were similar across participants. Most of the recommendations had to do with the feedback provided by the screen readers. All participants found the keyboard navigation strategies accessible. However, participants thought the verbal descriptions provided by the screen reader could be improved and enriched with more feedback regarding the actions taking place in the coding environment. Participants suggested providing hints about container blocks: letting the screen reader alert them that a block is a container block would improve their success at navigating and reading block-based code.

Participants also requested a quick review mode that would allow one to listen to a description of the whole program or part of a program without navigating the program. This would give the user a head start before they begin exploring the individual blocks in the program.

We note here that adequate feedback from the programming environment is crucial to an effective and fully accessible programming environment for people with visual impairments. Therefore, researchers and practitioners should explore how to increase the amount of feedback a user can obtain from such interactive visual environments through alternate interaction channels. Sound and auditory cues have been used for similar purposes in text-based programming environments [22], [27], [28].
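The container-block hints and quick review mode that participants asked for could be prototyped by walking the workspace's block tree and generating one spoken-style summary. The sketch below is illustrative only: it models blocks as plain JavaScript objects rather than using Blockly's actual block API, and the `describeProgram` helper and its announcement wording are hypothetical choices of ours, not features of Accessible Blockly.

```javascript
// Minimal sketch of a "quick review mode": produce a single spoken-style
// summary of a whole program, flagging container blocks so a screen reader
// user knows to explore inside them. Blocks are plain objects here
// (type, fields, body) rather than real Blockly blocks.

function describeBlock(block, depth) {
  const indent = "  ".repeat(depth);
  const label = block.fields && block.fields.length
    ? `${block.type} ${block.fields.join(" ")}`
    : block.type;
  if (block.body && block.body.length > 0) {
    // Hint requested by participants: announce that this is a container.
    const lines = [`${indent}${label}, container with ${block.body.length} block(s) inside`];
    for (const child of block.body) {
      lines.push(describeBlock(child, depth + 1));
    }
    return lines.join("\n");
  }
  return indent + label;
}

function describeProgram(topBlocks) {
  return topBlocks.map((block) => describeBlock(block, 0)).join("\n");
}

// Example program: set count to 0; repeat 3 times { change count by 1 }
const program = [
  { type: "set variable", fields: ["count", "to", "0"] },
  {
    type: "repeat",
    fields: ["3", "times"],
    body: [{ type: "change variable", fields: ["count", "by", "1"] }],
  },
];

console.log(describeProgram(program));
// set variable count to 0
// repeat 3 times, container with 1 block(s) inside
//   change variable count by 1
```

A real implementation would instead read block types and field values from the live Blockly workspace and route the resulting text to the screen reader, for example through an ARIA live region.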
Accessible Blockly: An Accessible Block-Based Programming Library for People with Visual Impairments ASSETS ’22, October 23–26, 2022, Athens, Greece

8 LIMITATIONS

One obvious limitation of this work is the number of participants who took part in the study. Due to the difficulty in recruiting participants, we could only evaluate Accessible Blockly with 12 participants, although 12 is high when compared to similar studies [3], [16].

As stated before, each navigation scheme was fixed to one set of programs throughout the study. This might have also influenced the quantitative results and their statistical significance. However, the programs were designed to be very similar in length and content, and we believe this configuration had negligible influence on the results. Additionally, during the open-ended discussions most participants felt Accessible Blockly was faster for navigation.

9 FUTURE WORK

This study mainly evaluated Accessible Blockly for navigating and reading block-based code. This was the first study in our series of evaluation studies. Part of the next steps involves looking at how Accessible Blockly allows people with visual impairments to create and edit block-based code. Participants were very eager about this and almost all of them wanted to create block-based code.

Updating the screen reader output by improving the verbal descriptions and increasing the amount and type of information a user would get through the screen reader is also part of our future work. Participants gave suggestions regarding the generated speech, and we believe we can address these in our future iterations. Audio cues have been used to increase the amount of feedback and information that users get when navigating text-based programs [22], [28]. We plan to explore these mechanisms and other ways to improve the feedback generated by the screen reader.

10 CONCLUSION

We have presented a prototype of an accessible block-based programming library that uses the keyboard and screen reader for interaction. This library is a modification of the Blockly library used to build most mainstream block-based programming environments. The library was designed to improve the accessibility of block-based programming environments for people with visual impairments. Using the library, a user can navigate, create, and edit block-based code using the screen reader and keyboard only. This paper also reported a user study evaluating the library on code navigation tasks. The results show that people can effectively navigate block-based code with a keyboard and screen reader. Participants in the study found Accessible Blockly to be fast, easy to use, and less frustrating. They also showed enthusiasm about this initiative to improve the accessibility of block-based programming environments. Participants also expressed interest in future studies to evaluate the library for code creation and editing.

ACKNOWLEDGMENTS

We thank all participants who took part in the study and Google’s Blockly team for their collaboration.

REFERENCES

[1] D. Weintrop, “Block-based programming in computer science education,” Commun. ACM, vol. 62, no. 8, pp. 22–25, Jul. 2019, doi: 10.1145/3341221.
[2] D. Weintrop and U. Wilensky, “Comparing block-based and text-based programming in high school computer science classrooms,” ACM Trans. Comput. Educ., vol. 18, no. 1, pp. 1–25, 2017.
[3] L. R. Milne and R. E. Ladner, “Blocks4All: overcoming accessibility barriers to blocks programming for children with visual impairments,” in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018, pp. 1–10.
[4] L. R. Milne and R. E. Ladner, “Position: Accessible Block-Based Programming: Why and How,” in Proceedings - 2019 IEEE Blocks and Beyond Workshop (B&B 2019), Oct. 2019, pp. 19–22, doi: 10.1109/BB48857.2019.8941230.
[5] A. Mountapmbeme and S. Ludi, “Investigating Challenges Faced by Learners with Visual Impairments using Block-Based Programming / Hybrid Environments,” in The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’20), 2020.
[6] S. Ludi, “Position paper: Towards making block-based programming accessible for blind users,” in 2015 IEEE Blocks and Beyond Workshop (Blocks and Beyond), 2015, pp. 67–69.
[7] A. Mountapmbeme and S. Ludi, “How Teachers of the Visually Impaired Compensate with the Absence of Accessible Block-Based Languages,” in The 23rd International ACM SIGACCESS Conference on Computers and Accessibility, doi: 10.1145/3441852.
[8] “Blockly | Google Developers.” https://developers.google.com/blockly (accessed Apr. 09, 2022).
[9] N. Fraser, “Ten things we’ve learned from Blockly,” in Proceedings - 2015 IEEE Blocks and Beyond Workshop (Blocks and Beyond 2015), pp. 49–50, Dec. 2015, doi: 10.1109/BLOCKS.2015.7369000.
[10] N. C. C. Brown, J. Mönig, A. Bau, and D. Weintrop, “Panel: Future directions of block-based programming,” in Proceedings of the 47th ACM Technical Symposium on Computing Science Education, 2016, pp. 315–316.
[11] S. Ludi and M. Spencer, “Design Considerations to Increase Block-based Language Accessibility for Blind Programmers Via Blockly,” J. Vis. Lang. Sentient Syst., vol. 3, no. 1, pp. 119–124, 2017.
[12] “Accessible Rich Internet Applications (WAI-ARIA) 1.2.” https://www.w3.org/TR/wai-aria-1.2/ (accessed Sep. 09, 2020).
[13] A. C. Pires, F. Rocha, A. J. De Barros Neto, H. Simão, H. Nicolau, and T. Guerreiro, “Exploring accessible programming with educators and visually impaired children,” in Proceedings of the Interaction Design and Children Conference (IDC 2020), Jun. 2020, pp. 148–160, doi: 10.1145/3392063.3394437.
[14] J. S. Y. Ong, N. A. O. Amoah, A. E. Garrett-Engele, M. I. Page, K. R. McCarthy, and L. R. Milne, “Expanding Blocks4All with Variables and Functions,” in The 21st International ACM SIGACCESS Conference on Computers and Accessibility, 2019, pp. 645–647.
[15] M. Das, D. Marghitu, M. Mandala, and A. Howard, “Accessible block-based programming for k-12 students who are blind or low vision,” Lect. Notes Comput. Sci., vol. 12769 LNCS, pp. 52–61, 2021, doi: 10.1007/978-3-030-78095-1_5.
[16] C. M. Baker, L. R. Milne, and R. E. Ladner, “StructJumper: A tool to help blind programmers navigate and understand the structure of code,” in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 2015, pp. 3043–3052.
[17] E. Schanzer, S. Bahram, and S. Krishnamurthi, “Accessible AST-Based Programming for Visually-Impaired Programmers,” in Proceedings of the 50th ACM Technical Symposium on Computer Science Education, 2019, pp. 773–779.
[18] A. Mountapmbeme, O. Okafor, and S. Ludi, “Addressing Accessibility Barriers in Programming for People with Visual Impairments: A Literature Review,” ACM Trans. Access. Comput., vol. 15, no. 1, pp. 1–26, Mar. 2022, doi: 10.1145/3507469.
[19] “Computer Science For All | whitehouse.gov.” https://obamawhitehouse.archives.gov/blog/2016/01/30/computer-science-all (accessed Apr. 13, 2022).
[20] “WAI-ARIA Overview | Web Accessibility Initiative (WAI) | W3C.” https://www.w3.org/WAI/standards-guidelines/aria/ (accessed Apr. 09, 2022).
[21] S. Ludi, J. Simpson, and W. Merchant, “Exploration of the use of auditory cues in code comprehension and navigation for individuals with visual impairments in a visual programming environment,” in ASSETS 2016 - Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, Oct. 2016, pp. 279–280, doi: 10.1145/2982142.2982206.
[22] A. Stefik, C. Hundhausen, and R. Patterson, “An empirical investigation into the design of auditory cues to enhance computer program comprehension,” Int. J. Hum. Comput. Stud., vol. 69, no. 12, pp. 820–838, 2011.
[23] “RITAccess/blockly: The web-based visual programming editor.” https://github.com/RITAccess/blockly (accessed Jul. 06, 2022).
[24] “@blockly/keyboard-navigation Demo.” https://google.github.io/blockly-samples/plugins/keyboard-navigation/test/ (accessed Apr. 09, 2022).
[25] “Accessible block-based programs study.” https://mountapmbeme.com/study/home.html (accessed Jul. 07, 2022).
[26] D. Weintrop and U. Wilensky, “To block or not to block, that is the question: students’ perceptions of blocks-based programming,” in Proceedings of the 14th International Conference on Interaction Design and Children, 2015, pp. 199–208.
[27] S. A. Brewster, “Using nonspeech sounds to provide navigation cues,” ACM Trans. Comput. Interact., vol. 5, no. 3, pp. 224–259, Sep. 1998, doi: 10.1145/292834.292839.
[28] J. Hutchinson and O. Metatla, “An initial investigation into non-visual code structure overviews through speech, non-speech and spearcons,” in Conference on Human Factors in Computing Systems - Proceedings, vol. 2018-April, Apr. 2018, doi: 10.1145/3170427.3188696.

APPENDICES

A THE STUDY TASKS

A.1 Phase 1: Using Accessible Blockly Keyboard Navigation Scheme

A.1.1 Training Task. The following program will be used to teach the basic navigation keys for about 15 mins:
https://mountapmbeme.com/study/accessible/blockly/accessibleblockly/study/training.html

Screenshot 1: Training Task

A.1.2 Task 1. What is the value of the variable named result after the following program is executed? Use the keyboard and Screen Reader to read the program.
https://mountapmbeme.com/study/accessible/blockly/accessibleblockly/study/task1.html

Screenshot 2: Task 1 with Accessible Blockly

A.1.3 Task 2a. What is the output of the following program? Use the keyboard and Screen Reader to navigate and read the program.
https://mountapmbeme.com/study/accessible/blockly/accessibleblockly/study/task2.html

Screenshot 3: Task 2a with Accessible Blockly

A.1.4 Task 2b. How will the output of the program above change if the value of count was initially set to 2?

A.2 Phase 2: Using Blockly’s Alternate Navigation Scheme

A.2.1 Training. The following program will be used to teach the basic navigation keys for about 15 mins:
https://mountapmbeme.com/googles/blockly/accessibleblockly/study/training.html

Screenshot 4: Training Task

A.2.2 Task 1. What is the value of the variable named count after the following program is executed? Use the keyboard and Screen Reader to read the program.
https://mountapmbeme.com/googles/blockly/accessibleblockly/study/taska.html

A.2.3 Task 2a. What is the output of the following program? Use the keyboard and Screen Reader to navigate and read the program.
https://mountapmbeme.com/googles/blockly/accessibleblockly/study/taskb.html

A.2.4 Task 2b. How will the output of the program above change if the value of temp was initially set to 20?
Screenshot 5: Task 1 with Blockly Alternate Navigation Scheme

Screenshot 6: Task 2a with Blockly Alternate Navigation Scheme


CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development

Venkatesh Potluri∗ (vpotluri@cs.washington.edu), Paul G. Allen School of Computer Science and Engineering at University of Washington, Seattle, Washington, USA
Maulishree Pandey∗ (maupande@umich.edu), University of Michigan School of Information, Ann Arbor, Michigan, USA
Andrew Begel (abegel@andrewbegel.com), Microsoft Research, Redmond, Washington, USA
Michael Barnett (mbarnett@microsoft.com), Microsoft Research, Redmond, Washington, USA
Scott Reitherman (scott.reitherman@gmail.com), Microsoft Research, Redmond, Washington, USA
ABSTRACT

COVID-19 accelerated the trend toward remote software development, increasing the need for tightly-coupled synchronous collaboration. Existing tools and practices impose high coordination overhead on blind or visually impaired (BVI) developers, impeding their abilities to collaborate effectively, compromising their agency, and limiting their contribution. To make remote collaboration more accessible, we created CodeWalk, a set of features added to Microsoft’s Live Share VS Code extension, for synchronous code review and refactoring. We chose design criteria to ease the coordination burden felt by BVI developers by conveying sighted colleagues’ navigation and edit actions via sound effects and speech. We evaluated our design in a within-subjects experiment with 10 BVI developers. Our results show that CodeWalk streamlines the dialogue required to refer to shared workspace locations, enabling participants to spend more time contributing to coding tasks. This design offers a path towards enabling BVI and sighted developers to collaborate on more equal terms.

CCS CONCEPTS

• Social and professional topics → People with disabilities; • Software and its engineering → Collaboration in software development; • Human-centered computing → Synchronous editors.

KEYWORDS

software developers, collaboration, blind or visually impaired, accessibility, sound effects, workspace awareness

∗ Both authors contributed equally to this research.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10 . . . $15.00
https://doi.org/10.1145/3517428.3544812

ACM Reference Format:
Venkatesh Potluri, Maulishree Pandey, Andrew Begel, Michael Barnett, and Scott Reitherman. 2022. CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544812

1 INTRODUCTION

Synchronous software engineering activities like pair programming and code walkthroughs are useful for developers to share knowledge, improve and refactor the source code, and debug the code together. Developers have to remain closely synced to achieve effective collaboration and communication in these activities. If a developer moves to a new location in the code, i.e. line, function, or file, their collaborator should follow them immediately; real-time edits should become apparent right away to enable quick feedback.

When collocated, sighted developers work together on one system to observe and discuss the source code without expending additional effort to stay on the same page. However, referencing a collaborator’s screen is inaccessible for blind or visually impaired (BVI) developers, often requiring them to drive the collaboration on their computers [53].

This screen inaccessibility is magnified in remote synchronous collaboration. Developers typically either use screen shares or integrated development environments (IDEs) with integrated collaboration support (e.g. VS Code, JetBrains, Floobits, CodeTogether, etc.) to work synchronously (see Figure 1). These approaches assume that everyone can see their screens [53, 73]. However, BVI developers cannot access the screen share video or the visual awareness cues in IDEs through assistive technologies such as screen readers. They have to constantly request that their sighted colleagues speak code locations, such as line numbers, functions, file names, etc., out loud, in order to stay in sync. Much like collocated collaboration, BVI developers end up driving the activity. Sometimes, they even hand off their computer’s control to sighted colleagues in refactoring and debugging tasks, which reduces their own agency.

That task of providing accessible awareness information lies at the heart of facilitating effective remote collaborations in mixed-ability contexts [25, 34, 53]. Research has begun to explore making
ASSETS ’22, October 23–26, 2022, Athens, Greece Potluri and Pandey, et al.

Figure 1: VS Code is an IDE that offers integrated collaboration support through its Live Share extension. Live Share enables developers to work together on source code through document sharing and co-editing in their respective IDEs. It represents collaborators’ location and selection in the source code through colorful cursors.
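CodeWalk replaces such visual cues with sound effects and speech. As a rough illustration of the speech half (not CodeWalk's actual implementation; the event shape, collaborator name, and wording here are invented for the example), a collaborator's navigation, edit, or referential action can be mapped to a short utterance:

```javascript
// Illustrative sketch: map a collaborator's workspace event to a short
// announcement string that could be handed to a screen reader or
// text-to-speech engine, replacing the purely visual colored-cursor cue.

function announce(event) {
  switch (event.kind) {
    case "move":
      // Navigation: the collaborator moved their cursor.
      return `${event.user} moved to line ${event.line} in ${event.file}`;
    case "edit":
      // Editing: the collaborator changed the code.
      return `${event.user} edited line ${event.line} in ${event.file}`;
    case "highlight":
      // Referential activity: pointing at a region of code.
      return `${event.user} highlighted lines ${event.startLine} to ${event.endLine} in ${event.file}`;
    default:
      return `${event.user} did something in ${event.file}`;
  }
}

console.log(announce({ kind: "move", user: "Alex", line: 12, file: "main.js" }));
// Alex moved to line 12 in main.js
```

In an IDE-integrated version, such events would come from the collaboration session (e.g. Live Share) and the strings would be routed to speech output rather than printed.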

shared workspaces accessible to BVI users [18, 38]. However, these solutions are intended for general-purpose document co-editing; they do not cater to the unique needs of software engineering tasks like pair programming and code walkthroughs.

Prior to our work, no programming environment with accessible, remote, synchronous co-editing support was publicly available. We have created and released such an environment, in accordance with four design criteria (see §2.4), which include maintaining the agency of BVI developers and reducing their burden to drive the collaboration.

In this paper, we present CodeWalk, a set of features added to Microsoft’s Live Share VS Code extension¹ (available to all Live Share users since November 2021), with support for remote, synchronous code review and refactoring tasks. Our design derives from an investigation of relevant literature in remote collaboration and a set of formative design activities led by a BVI developer and researcher on the team. During our design process, we compared techniques for capturing a collaborator’s navigation, editing, and referential (i.e., pointing at or highlighting parts of the code) activities and presenting them to BVI users using a combination of sound effects and speech (see §4.1). We evaluated CodeWalk in a within-subjects controlled experiment involving 10 BVI professional developers (see §5). Our results show that CodeWalk increased study participants’ awareness of their collaborator’s actions and reduced the coordination overhead required to sync on code locations. Participants strongly preferred CodeWalk over the baseline — the unextended version of VS Code with Live Share which provides awareness cues visually (see §6).

Our work is an end-to-end demonstration of how to improve the accessibility of an IDE’s remote collaboration features. We make the following contributions to the HCI, accessibility, and software engineering design communities.

(1) A design for supporting tightly-coupled synchronous programming activities (see §3 and §4).
(2) CodeWalk, an implementation of a set of features added to Microsoft’s Live Share VS Code extension that makes synchronous programming tasks accessible to BVI developers (see §4).
(3) Validation of our design’s capability to increase shared awareness and facilitate efficient synchronization during remote collaboration, meeting our design criteria (see §5 and §6).

¹ https://docs.microsoft.com/en-us/visualstudio/liveshare/use/enable-accessibility-features-visual-studio-code

The COVID-19 pandemic has exacerbated the need for collaborative programming environments that can enable BVI developers, one of the largest physical disability groups of software developers [67], to participate in remote work at par with their sighted peers. CodeWalk addresses this timely need and provides a foundation for future software engineering tools to facilitate accessible collaborations.

2 BACKGROUND AND RELATED WORK

2.1 Awareness for Sighted People in Remote Collaborations

Groupware is only effective when it supports collaboration across time and space constraints in a shared workspace [32]. Buxton describes a shared workspace in terms of (1) person space [14], (2) task space [14], and (3) reference space [13]. Person space offers a strong sense of copresence with remote collaborators. For instance, teleconferencing platforms combine video, audio, and even chat to convey facial expressions, gestures, and spoken messages. Sharing the task space refers to being copresent in the context of the task itself. In collaborative software development activities, the shared source code forms the task space. Reference space is where the person and task overlap [13], allowing remote participants to gesture and point to reference one another as well as the task at hand. An example is using text highlighting during screen shares to direct collaborators’ attention to specific details.

Dourish and Bellotti identified two approaches to present awareness information in shared workspaces — active and passive [20]. Active approaches include role assignment (e.g. owner, reviewer, editor, etc.), audit trails, annotations, and messaging. Active mechanisms require explicit action on collaborators’ part (e.g. leaving a comment on the shared artifact like a document or source code). Conversely, conveying a collaborator’s whereabouts and edits automatically in real-time is a passive approach — collaborators do not have to make explicit efforts to communicate their actions.

Scholars and practitioners have blended different kinds of shared workspaces with active and passive approaches to communicate awareness information in remote synchronous software development. Consider the example of real-time co-editing of source code. A primary concern is that the shared activity should not introduce
bugs, preventing the code from successfully compiling [23]. IDE plugins like FASTDash [7] and Syde [26] summarize the real-time activity of collaborators to help them avoid editing conflicts. They also allow developers to leave annotations to inform collaborators of their actions. Systems like AtCoPE [21] and Collabode [23] allow programmers to concurrently edit the source code, enabling collaboration in a shared task space. CodePilot [76] extends the activities supported in the shared task space to allow collaboration across the software development process — editing, testing, debugging, and even version control.

Real-time activities like pair programming and code walkthroughs impose an additional requirement on developers to remain closely coordinated [74]. These activities often rely on explicit role assignment. In one common form of pair programming, one programmer drives (leader) the session and writes or explains the code; the other programmer follows (follower) to offer feedback or heed the explanations. To collaborate efficiently and maintain mutually recursive awareness of one another (also known as shared intentionality [74]), the participating developers must look at the same regions within the source code. Saros [60], an Eclipse plug-in, displays each programmer’s text cursor to communicate their location in the source code and provides a Follow mode for programmers to sync their IDE viewports during pair programming and code walkthroughs. D’Angelo and Begel used a novel gaze visualization technique to communicate the lines of code a collaborator was looking at during pair programming [16]. The visualization changed color when both programmers’ gaze overlapped, communicating awareness and co-presence passively in the shared task space. Their evaluation of the gaze visualization technique revealed that when the collaborators’ eye gaze was in sync, they more efficiently spoke about their source code using deictic references (e.g. terms such as this, here, that, etc.).

The research discussed above has focused solely on collaborations among sighted developers; they do not report anything about the needs of BVI developers. The tools noted above rely heavily on visual information, which leads to significant accessibility problems in mixed-ability contexts [36, 54], including software

often reach out to their sighted colleagues to solve breakdowns in programming tools, especially selecting teammates who understand workflows with assistive technologies [71]. However, help-seeking in the workplace can be complicated by the team and organization’s attitudes towards accessibility and inclusion [53].

Accessibility and HCI research has primarily studied BVI developers as individual contributors, resulting in limited understanding about the accessibility of collaborative programming activities. The exceptions are the research efforts focused on creating inclusive learning experiences for novice BVI programmers [29, 35, 41, 48, 49, 69]. Pandey et al. were the first to report on the accessibility of collaborative software development [53]. They highlight the sociotechnical challenges in activities like pair programming, code reviews, and code writing that limit the contributions of BVI developers during collaboration. The paper reported that BVI developers have to perform additional work, often invisible to their sighted colleagues [12], to address the accessibility challenges in the workplace [53].

2.3 Accessibility for Mixed-Ability Programmers in Remote Software Development

Given the limited research on awareness needs of BVI developers, we turn to the larger accessibility literature to identify how sighted and visually impaired people achieve real-time, remote collaboration. A common approach is asking collaborators to describe their actions, but sighted people often forget to verbalize the relevant details, resulting in incomplete collaborator awareness [17, 53]. BVI people hesitate to repeatedly request information to avoid slowing down the pace of the collaboration or imposing on their sighted collaborators [17].

In another workaround, collaborators work on their respective computers with the BVI developer sharing their screen using a video calling application to enable the sighted developer to follow them [53]. Collaborators rely on chat features to copy-paste text and share line numbers, etc., to collaborate more accessibly. While this
development [53]. workaround allows a sighted developer to track a BVI developer’s
location, the information is not reciprocated to that BVI developer.
It also places the onus of driving the collaboration session on BVI
2.2 Inaccessibility for BVI Programmers in developers. BVI people occasionally have to relinquish control of
Software Development their computers to let the sighted colleague control their screen and
Initial empirical studies identifed accessibility issues with program- make changes in real-time [73]. Since BVI and sighted computer
ming tools, like seeking information in IDEs and challenges in doing users navigate interfaces diferently [8, 55], the approach causes
UI development [1, 42, 50]. Albusays et al. found that IDEs rely on the screen reader to change focus unexpectedly without feedback
visual aids like indentation and syntax highlighting to structure to the BVI person, impinging on their agency, and raising privacy
the source code, which favor navigation for sighted developers [2]. concerns [73]. Latency issues also lead to unwieldy drag and click
On the other hand, command line interfaces (CLIs) present text in interactions for sighted collaborators.
unstructured form [61], which limit BVI developers in navigating Another strategy is to use NVDA Remote [75] or JAWS Tan-
the text efciently. Researchers have designed tools to improve nav- dem [63], screen reader addons that transmit announcements in-
igation [3, 4, 30, 62] and address the accessibility challenges in tasks stead of simply relaying the video of the collaborator’s screen during
like code editing [57] and debugging [68]. Potluri et al. designed screen share [9, 73]. Unfortunately, these addons often sufer from
CodeTalk, a Visual Studio extension, to improve code glanceability, long latency, causing the BVI person to receive announcements
navigation and debugging by combining UI enhancements, speech, after 15-30 seconds [73]. Plus, sighted collaborators have to set up
and sound efects [56]. the screen reader and the addon with matching confgurations at
Accessibility challenges are further complicated by workplace dy- their ends, giving them additional invisible coordination work and
namics and project management practices [28, 58]. BVI developers adding to the total collaboration time [53]. Tools like Sinter address
ASSETS ’22, October 23–26, 2022, Athens, Greece Potluri and Pandey, et al.
the latency issues and strict configuration requirements of remote screen reading, but have not yet been evaluated in collaborative contexts [10].

Das et al. designed auditory representations to support asynchronous collaborator awareness for activities such as commenting and editing in shared text documents [18]. The evaluation of their design elements revealed that the use of non-speech audio, voice modulation, and contextual presentation could improve awareness of BVI authors. Recently, CollabAlly [38] and Co11ab [19], both Google Docs extensions, have been designed to support synchronous collaborator awareness in shared document editing. The extensions use spatial audio and voice fonts to represent the actions of collaborators joining and leaving the document, addition and deletion of comments, and movement of their cursor into or away from the BVI user's paragraph. Co11ab also uses a variant of Follow mode [60] to sync collaborators' viewports. Shared document editing differs from pair programming and code walkthroughs in an important way, however. Shared document collaborators need real-time awareness to actively avoid each other's cursors in order to prevent overwriting collisions. Software development collaborators, on the other hand, intentionally work on the same lines of code together, requiring close and immediate coordination for extended periods of time.

In summary, the workarounds discussed above insufficiently convey collaborator awareness, suffer from delays, place a disproportionate burden on BVI people for driving the collaboration, and compromise their agency. Furthermore, some of the proposed non-visual techniques [18, 38], which are designed for collaborative writing, will not fulfill the unique needs of synchronous collaborative programming. We begin to address the accessibility of collaborative software development by synthesizing design criteria for CodeWalk.

2.4 Design Criteria

Based on this literature review, we synthesize the following design criteria for CodeWalk. We annotate each criterion with citations to the literature that inspired them.

• D1. To minimize the cognitive load on the BVI developer [18] (e.g., maintaining accessible workspace awareness [19, 38] while minimizing conflict with collaborators' conversations) during synchronous programming activities.
• D2. To maintain the agency of BVI developers [65] in mixed-ability collaboration [17, 53].
• D3. To reduce the burden on BVI developers [73] of driving the collaboration session in order to collaborate accessibly [53].
• D4. To support tightly-coupled collaboration between all collaborators [74].

3 DESIGN

In this section, we describe the formative design activities we conducted that led to the design of CodeWalk.

3.1 Formative Design Activity 1: Choosing a Baseline IDE

Our first design activity was to choose a good baseline IDE to build upon, one that was already accessible to BVI developers and had facilities for collaborative co-editing support. Several popular IDEs that offer collaborative co-editing support, such as JetBrains CodeWithMe [33], Sublime [24], and Atom [22], are unfortunately difficult for BVI developers to use. A few require BVI developers to perform additional setup steps, and others even lack the accessibility support screen reader users need to perform basic code editing [72].

By contrast, we found Microsoft's Visual Studio Code IDE (VS Code) to be both accessible and easily extensible. Its accessible command palette makes it easy for screen reader users to find commands. In addition, we learned that the VS Code team speaks with the BVI developer community regularly to improve its accessibility [44, 45]. VS Code supports collaborative work through its Live Share extension. Similar to Saros [60], Live Share supports synchronous collaboration through a Follow mode feature, which draws the leader's cursor in all of the followers' IDEs and keeps it in sync as the leader moves around the document. Live Share also supports co-editing, keeping a shared view of the source code in sync between the connected parties. Though there is little information on the accessibility of Live Share's features for BVI developers, there are enough features to make it a good choice for our project's baseline IDE.

Table 1: Descriptions of our code walkthroughs. Each walkthrough occurred between a pair of sighted and/or BVI developers along with a sighted observer watching a shared screen or listening to a BVI developer's screen reader.

ID  | Leader  | Follower | Sighted Observer Task
CW1 | Sighted | BVI      | Watched sighted developer's screen
CW2 | BVI     | Sighted  | Watched sighted developer's screen
CW3 | Sighted | BVI      | Listened to BVI developer's screen reader
CW4 | BVI     | Sighted  | Watched BVI developer's screen

3.2 Formative Design Activity 2: Code Walkthroughs

To assess the accessibility of VS Code with Live Share for teams with BVI and sighted developers, three of the authors (one of whom is also a BVI developer) conducted four code walkthroughs (see Table 1). Two walkthroughs were led by a sighted researcher and two were led by the BVI researcher-developer. We tried to cover all combinations of abilities in a pair, along with each taking on a leader or follower role. All of the walkthroughs involved mixed-ability teams (CW1, CW2, CW3, and CW4). In all of the walkthroughs, a third sighted researcher observed the shared screen; however, in code walkthrough CW3, the sighted observer simply tried to listen to and comprehend the (slowed down) screen reader audio used by the BVI developer.

Each code walkthrough looked at different example source code from VS Code's library of extensions. Each example extension consisted of multiple files written in TypeScript, several of which were read through during each walkthrough. Each walkthrough was 60 to 90 minutes long.

In addition to recording the code walkthrough sessions, the sighted observer took detailed notes during each code walkthrough, noting down accessibility breakdowns and workarounds. They
CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development — ASSETS '22, October 23–26, 2022, Athens, Greece
minimized their interruptions, limiting questions to clarifications of leader and follower actions, to locate one another in each file, and to help work around any non-obvious accessibility barriers. In total, the sighted observer took six pages of notes. Immediately after each code walkthrough, both the leader and the follower memoed, reflecting on their experiences [59]. As a group, the entire research team rewatched the recorded sessions and discussed the memos and notes in their weekly meetings.

We observed that conversations between leader and follower primarily focused on code discussions and clarification questions during breakdowns in accessibility. When in Live Share's Follow mode, the IDE drew each developer's cursor and synced their viewports on everyone's screen. Unfortunately, since this information was only visual, the BVI follower was not aware of any of it and was frequently lost. Consequently, the sighted leader had to speak their location out loud to the BVI follower to help facilitate tightly-coupled collaboration (Design Criterion D4). This active approach was error-prone because the sighted leader sometimes forgot to mention their location, especially when they were navigating quickly around the source code. The BVI developer initiated another workaround, asking clarification questions to sync with the leader. This often put the burden on the BVI developer (Design Criterion D3) to request enough accessible information to follow along in the code walkthrough.

When the BVI developer led the code walkthrough, they never got lost. However, they often became unsure whether Follow mode had really synced the pair's viewports, and had to ask their sighted follower to confirm they could see the expected code in their window. Finally, the sighted collaborator often talked at the same time as the BVI developer was trying to listen to their screen reader. This made it difficult for the BVI developer to listen to either audio stream. The research team discussed that some of the audio overlaps could be avoided if the sighted developer knew when the BVI developer's screen reader was speaking. But revealing the use of AT is a sensitive issue for many screen reader users, so we decided to design CodeWalk's features to judiciously and carefully make use of audio effects and speech to reduce the cognitive load (Design Criterion D1) experienced by the BVI developer.

3.3 Formative Design Activity 3: Synthesizing Code Walkthrough Scenarios

Inspired by our literature review and our code walkthroughs, we created 15 scenarios comprising short events that occurred (or we wished had occurred if the IDE were more accessible) across our code walkthroughs. In addition, we considered both sighted and BVI developers as leaders, but skipped scenarios involving solely sighted developers. Some scenarios explore possible communication mechanisms between leaders and followers (e.g., non-verbal, notifying the leader, notifying all collaborators, or not notifying at all and syncing up after the session). All of these scenarios are listed in Table 2.

Here is an illustration of Scenarios 1 and 2, which expose some inaccessible features of the baseline IDE and the design features we explored to address them. Blake, a sighted developer, wants to refactor a piece of code. He asks Mia, a BVI developer and his colleague, for advice. They set up an audio call to verbally discuss the code as they view the code in a Live Share collaboration session hosted by Blake. Blake shares the session link with Mia, who joins the session and is presented with Blake's code in her IDE. Mia invokes the Follow mode command to stay in sync with Blake's viewport. As Blake navigates in the IDE, Mia's IDE shows a copy of Blake's cursor in a distinct color, which unfortunately is not accessible to Mia. Though the viewport changes, Mia's cursor remains untethered from Blake's. Therefore, Mia has to occasionally interrupt Blake to ask him to speak his line numbers and keywords out loud so that she can navigate there herself and use her screen reader to read the code that Blake is referring to.

This scenario exposes the limitations and asymmetry of current IDEs in supporting tightly-coupled collaboration and shared awareness (Design Criterion D4) among sighted and BVI developers. To address the asymmetry, CodeWalk automatically tethers a BVI developer's cursor to the leader's (section 4.1), so that their cursors move in unison whenever the leader initiates a navigation action. However, this only happens in Follow mode. Now, Mia should normally have no doubts about being in sync with Blake, but can still detach (i.e., turn off Follow mode) from the leader if she wants to explore the code on her own. The feature can also be useful in Scenarios 9 through 13, where she leads the collaboration: she does not have to worry about whether the correct code segment is being displayed in Blake's IDE.

Furthermore, to reduce the burden on BVI developers (Design Criterion D3), to preserve their agency (Design Criterion D2), and to minimize cognitive load, we designed several features in CodeWalk to convey a collaborator's location, navigation, and edit actions accessibly to BVI developers using a passive, automated approach. We describe the detailed implementation of these features next.

4 CODEWALK

CodeWalk is a set of features released with Microsoft's Live Share VS Code extension that supports accessible, remote, synchronous code review and refactoring activities. We describe the cursor tethering and audible feedback features that power its capabilities and discuss some of the implementation details that we found needed careful design.

4.1 Features

4.1.1 Cursor Tethering. Live Share's Follow mode yokes the collaborators' editor viewports together, a passive visual mechanism that is inaccessible to BVI developers. CodeWalk facilitates tightly-coupled collaboration (Design Criterion D4) by tethering BVI collaborators' cursors to the host of a Live Share session. In designing this feature, we explored several options to toggle tethering, along with various levels of autonomy, ranging from always tethering cursors to only tethering cursors when the user toggles Follow mode.

Always tethering the BVI developer's cursor to their collaborator minimizes their cognitive load (Design Criterion D1), but reduces their agency (violating Design Criterion D2), as the sighted colleague would have total control over their BVI colleague's cursor. BVI and sighted developers navigate code and interfaces differently [2, 55, 56]; they may want to read a part of the code that their sighted collaborator is talking about by character or by word, a kind
Table 2: Mixed-ability code walkthrough scenarios that informed the design requirements for CodeWalk. Each scenario was inspired by at least one code walkthrough. Sighted+ and BVI+ indicate more than one developer. Following or watching "on the side" splits the VS Code editor and puts one in Follow mode.

Scenario | Leader  | Follower       | Walkthrough     | Activity
1        | Sighted | BVI            | CW1, CW3        | Follower joins collaboration session hosted by leader.
2        | Sighted | BVI            | CW1, CW3        | Follower tethers cursor to leader.
3        | Sighted | BVI            | CW1             | Follow leader "on the side" without tethering.
4        | Sighted | BVI            | CW1             | Follower restarts tethering after watching leader "on the side".
5        | Sighted | BVI            | CW1, CW3        | Follower tells leader they are lost.
6        | Sighted | BVI            | CW1, CW3        | Follower takes notes during collaborative session.
7        | Sighted | BVI            | CW1, CW3        | Follower fails to notice what command the leader just used.
8        | Sighted | BVI            | CW1, CW3        | Follower asks leader about the command they just used.
9        | BVI     | Sighted        | CW2, CW4        | Leader invites follower to join collaboration session.
10       | BVI     | Sighted        | CW2, CW4        | Leader jumps to follower's cursor, answers the follower's question, and jumps back.
11       | BVI     | Sighted        | CW2             | Leader asks follower to show them something.
12       | BVI     | Sighted        | CW2             | Leader asks follower a question to test if they are lost.
13       | BVI     | Sighted        | CW2, CW3        | Follower asks for help using an "I need help" command.
14       | BVI     | Sighted, BVI   | CW2, CW3, CW4   | Leader gets follower's cursor location from VS Code.
15       | BVI     | Sighted+, BVI+ | CW4             | Leader gets approximate location of multiple followers from VS Code.
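The tethering policy described in Section 4.1.1 (cursors tethered in Follow mode, temporarily untethered while the follower navigates on their own, and retethered after 10 seconds of inactivity) can be sketched as a small timestamp-based check. This is an illustrative sketch under stated assumptions, not CodeWalk's released source; the class and method names are ours:

```typescript
// Illustrative sketch only: the untether/retether policy of Section 4.1.1.
const RETETHER_AFTER_MS = 10_000; // retether after 10 seconds of inactivity

class TetherPolicy {
  private followMode = true;
  private lastFollowerActivityMs = -Infinity;

  setFollowMode(on: boolean): void {
    this.followMode = on;
  }

  // Called whenever the follower moves their own cursor, e.g. to re-read a
  // region by character or by word; this temporarily untethers them.
  onFollowerNavigation(nowMs: number): void {
    this.lastFollowerActivityMs = nowMs;
  }

  // The cursors are tethered iff Follow mode is on and the follower has
  // been idle for at least RETETHER_AFTER_MS.
  isTethered(nowMs: number): boolean {
    return this.followMode && nowMs - this.lastFollowerActivityMs >= RETETHER_AFTER_MS;
  }
}
```

Modeling the policy as a pure function of timestamps (rather than wall-clock timers) keeps the leader-follower autonomy trade-off easy to reason about and test.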
of fine-grained navigation a non-screen reader user has no idea about. To support this need, we temporarily untether the BVI follower's cursor whenever they move it around, giving them control to move the cursor to the code they want to read. After 10 seconds of inactivity, CodeWalk retethers the cursors.

4.1.2 Conveying Collaborator Actions via Audio. CodeWalk uses a combination of sounds and speech to passively communicate a tethered collaborator's location and their navigation and edit actions, reducing the burden on BVI developers to ask about them (Design Criterion D3). We explored several sound designs, drawing inspiration from audio cues used by popular accessible navigation apps and operating systems. We experimented with futuristic artificial sound effects as well as skeuomorphic sounds of keyboard clicks and scroll wheels. We felt that since BVI developers were already familiar with the sounds of standard computer hardware, the skeuomorphic sound effects would be the best ones to convey navigation actions. Navigation distance and direction lacked obvious skeuomorphic analogs, so we designed a set of artificial rising and falling tone sound effects to be played a short time after navigation activity ends. If the user clicks the mouse somewhere else in the codebase, CodeWalk plays an artificial "teleportation" sound instead of a mouse click to make it more obvious that something drastic has happened to the cursor location, which may invalidate the mental model the BVI developer has of the region where they thought the cursor was located.

In designing with speech and sound effects, our primary focus is to minimize cognitive load relative to the frequency and specificity of the information to be conveyed. We draw inspiration from accessible data visualization and programming efforts [40, 66, 70] and use speech to announce highly specific information like line numbers, which is needed less frequently. We use sound effects to convey less specific information, such as the actions performed and the navigation direction; these actions occur at a much higher frequency during collaboration. The use of speech and sound effects has been shown to improve awareness between collaborators [43]. Sound effects minimize cognitive load on BVI developers (Design Criterion D1) because they do not require conscious interpretation and can be heard even when a screen reader is actively speaking. However, they do not give enough information about a collaborator's location. To address this, after navigation activity has stopped for 1.5 seconds, CodeWalk uses computer-generated speech to tell the BVI developer what line of code (and file name, if it changed) they are now on. Most sound effects are around 200 ms long (though one is longer, at 550 ms). Similarly, speech announcements are kept short and precise. The complete business logic for CodeWalk's sound effects and speech can be seen in Table 3.

Sighted collaborators commonly use visual reference space gestures such as cursor location, text highlighting, and mouse waving to refer to code [16], gestures largely unavailable to BVI developers [36, 54]. CodeWalk supports selection awareness by speaking the portion of code highlighted by a collaborator. This simplifies the process of understanding what a collaborator wants to talk about and reduces the burden on BVI developers to ask sighted colleagues to verbally announce their selections. An example illustrating these sound effects and speech can be seen in Figure 2.

When a collaborator edits the code with their keyboard, sharper, shorter key click sound effects are played. If the collaborators are untethered (i.e., Follow mode is off), then they may both be editing the document simultaneously. The baseline VS Code Live Share gives no indicator that collaborators' edits may collide, other than drawing the two cursors near one another. This, of course, is inaccessible to BVI developers. In CodeWalk, whenever the collaborators are editing within 5 lines of one another, CodeWalk speaks a warning, "your collaborator is editing nearby." If the collaborator is on the same line, the warning repeats, "your collaborator is editing the same line as you," which should hopefully cause the collaborators to
Table 3: Use of audio cues to convey awareness indicators in Follow mode (unless specified otherwise in the row)

Information | Non-Speech Indicator | Non-Speech Frequency | Speech Indicator | Speech Frequency | Built-in Visual Indicator
Viewport scrolls | Click wheel sound | Every scroll event | None | None | Screen scrolls
Scroll direction | Falling or rising tone depending on direction | When scrolling stops | None | None | Can be inferred from the scrolling viewport
Current viewport | None | None | "Lines X to Y on screen" | When scrolling stops | Visible on screen
Cursor moves by single line to line N | Keyboard click | Every cursor move | "Line N" | 1.5 seconds after cursor moves end | Cursor moves on screen
Cursor moves multiple lines to line N | Falling or rising tone depending on move direction | Every event | "Line N" | 1.5 seconds after cursor moves end | Cursor moves on screen
Cursor moves by multiple lines to line N | Falling or rising tone depending on move direction | Every event | "Line N" | Every event | Cursor moves on screen
Selection | Depends on selection (keyboard/mouse) | Every event | "Selection on line N" | 1.5 seconds after selection is made | Selection visible on screen
Edits on follower's line | Keyboard type | For every character typed | None | None | Cursor moves on screen; edits visible on screen
Edits on follower's line (Follow mode off) | Keyboard type | For every character typed | "<collaborator> is editing the same line as you" | As long as edits continue on the same line | Cursor moves on screen; edits visible on screen
Edits within 5 lines of follower (Follow mode off) | Proximity sound | For every character typed | "<collaborator> is editing nearby" | As long as edits continue on the same line | Cursor moves on screen; edits visible on screen
Follow status | Pull and push sound | When follower starts and stops following leader | "You are now following <collaborator>" | Every event | None
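The event-to-audio mapping that Table 3 specifies can be approximated in code. The following is an illustrative sketch only: the event shape, sound file names, and function name are our assumptions, not CodeWalk's released implementation:

```typescript
// Illustrative sketch: approximate the Table 3 business logic.
type AwarenessEvent =
  | { kind: "scroll" }
  | { kind: "cursorMove"; lines: number; direction: "up" | "down"; toLine: number }
  | { kind: "selection"; line: number }
  | { kind: "edit"; tethered: boolean; sameLine: boolean; within5Lines: boolean; collaborator: string };

interface AudioFeedback {
  sound: string | null;   // non-speech cue, played per event
  speech: string | null;  // spoken once activity settles (about 1.5 s later)
  interruptible: boolean; // notifications may be cancelled; warnings may not
}

function feedbackFor(ev: AwarenessEvent): AudioFeedback {
  switch (ev.kind) {
    case "scroll":
      // Click-wheel sound per scroll event (the direction tone played when
      // scrolling stops is elided in this sketch).
      return { sound: "scrollwheel.wav", speech: null, interruptible: true };
    case "cursorMove":
      if (ev.lines === 1) {
        // Single-line move: skeuomorphic keyboard click, then "Line N".
        return { sound: "keyclick.wav", speech: `Line ${ev.toLine}`, interruptible: true };
      }
      // Multi-line move: rising or falling tone for direction, then "Line N".
      return {
        sound: ev.direction === "down" ? "tone-falling.wav" : "tone-rising.wav",
        speech: `Line ${ev.toLine}`,
        interruptible: true,
      };
    case "selection":
      return { sound: "select.wav", speech: `Selection on line ${ev.line}`, interruptible: true };
    case "edit":
      if (ev.tethered) {
        // Tethered edits land on the follower's own line: no warning needed.
        return { sound: "keytype.wav", speech: null, interruptible: true };
      }
      if (ev.sameLine) {
        return {
          sound: "keytype.wav",
          speech: `${ev.collaborator} is editing the same line as you`,
          interruptible: false, // warning: uninterruptible
        };
      }
      if (ev.within5Lines) {
        return {
          sound: "proximity.wav",
          speech: `${ev.collaborator} is editing nearby`,
          interruptible: false, // warning: uninterruptible
        };
      }
      return { sound: null, speech: null, interruptible: true };
  }
}
```

Expressing the mapping as a single pure function mirrors the table: each row becomes one branch, which keeps the speech/sound trade-offs auditable.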
Figure 2: Image shows a BVI developer's code editor as she follows a sighted leader. CodeWalk tethers the cursors of collaborators in Follow mode. When the sighted leader uses arrow keys to navigate, CodeWalk plays skeuomorphic keyboard sounds for each line moved. When they stop navigating at line 19, CodeWalk plays an artificial falling tone to indicate downward movement, followed by a line number announcement. Similarly, when they highlight a word, CodeWalk announces the selection.
stop what they are doing and negotiate their next actions together, verbally.

4.2 System Implementation

The basic architecture of CodeWalk can be seen in Figure 3. Each developer runs an instance of the VS Code IDE, extended by CodeWalk. CodeWalk extends four extensibility points provided by VS Code and Live Share (i.e., programming APIs enabling third-party developers to enhance specific features of the IDE), which we illustrate in the following scenario walkthrough.

Mia, a BVI developer (User 2 in Figure 3), installs CodeWalk along with Blake, a sighted colleague (User 1 in Figure 3). Blake and Mia enter into a joint collaboration session facilitated by VS Code Live Share's existing co-editing support. Various extension points are triggered as they collaborate. The first triggers when
[Figure 3: System Architecture Diagram for CodeWalk. The IDEs of User 1 (a sighted developer) and User 2 (a BVI developer) are connected through Live Share's existing co-editing support, which displays each user's cursor in the other's IDE. API extension points (e.g., a text cursor API to track User 1's locations, a document change API to track User 1's edits) send an audio cue filename and announcement text (e.g., keyclick.wav; "Line 22") to CodeWalk, which generates speech via the Microsoft Cognitive Services API and loads and plays the audio cue and speech through User 2's speaker.]
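As Figure 3 indicates, each cursor-location change travels with a tag explaining what action caused it, so the receiving side can choose the matching cue (a key click, a direction tone, or the "teleportation" sound for a mouse jump). The following sketch of how such a tag might be derived is our illustration; the tag names are hypothetical, not CodeWalk's actual protocol:

```typescript
// Illustrative sketch: classify a cursor-location change for the receiver.
type NavigationTag = "single-line" | "multi-line-up" | "multi-line-down" | "teleport";

function tagNavigation(fromLine: number, toLine: number, viaMouseClick: boolean): NavigationTag {
  if (viaMouseClick) {
    // A mouse jump elsewhere in the codebase: play the teleportation sound.
    return "teleport";
  }
  const delta = toLine - fromLine;
  if (Math.abs(delta) <= 1) {
    // Arrow-key style move: skeuomorphic keyboard click.
    return "single-line";
  }
  // Larger moves: rising or falling tone conveys the direction.
  return delta > 0 ? "multi-line-down" : "multi-line-up";
}
```

Sending a semantic tag rather than raw coordinates lets each client render the action in whatever modality suits its user.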
Blake changes his cursor location. It sends the new location to Mia's IDE along with a tag explaining what action caused it. CodeWalk then runs through its business logic (described in Table 3) to determine the kind of audio feedback to play (sounds and/or speech) and queues it for playback on Mia's computer. Each of Blake's navigation actions may trigger a tuple of one or two sounds along with a spoken message, each separated by a delay. Typically, the first sound is skeuomorphic (i.e., for key clicks, mouse clicks, or the scroll wheel). It is followed by a 1.5-second delay, a falling or rising tone (to indicate navigation direction), and a spoken announcement of the new line number. The 1.5-second delay avoids spamming Mia with additional sounds and speech if Blake pauses momentarily during his actions (e.g., pausing to adjust the mouse wheel when scrolling through a file). As there are no cross-platform APIs for asking screen readers to generate custom announcements, we generate CodeWalk's spoken announcements using the Microsoft Azure Cognitive Services Text-to-Speech (TTS) API. If too many sounds are requested to be played in a row, queued sounds and speech may be delayed. If a queued item is delayed over one second, it is considered out of date and CodeWalk ejects it from the playback queue. Additionally, CodeWalk categorizes sounds into notifications and warnings. Events associated with the former are interruptible, meaning that if a second event comes in before the first one is done playing, it will cancel the first and start the second right away. Warning sounds are uninterruptible. They are reserved only for edit actions, to prevent co-editors from overwriting one another's changes.

The second extension point tethers the co-editors' cursors together. When Mia follows Blake, her cursor will move automatically wherever Blake's cursor goes. When tethering is turned on, all of Blake's edits will always happen on the same line as Mia's, so we suppress any spoken warnings. When tethering is turned off, edit sounds and announcements only play when Blake is editing within five lines of Mia; otherwise, the sheer quantity of sounds would overwhelm her.

A third extension point tracks and conveys selection actions between the co-editors. Mia hears a verbal announcement of the selection whenever Blake selects some text in his editor, as long as she is tethered to Blake.

The final extension point queues sounds to be played whenever Mia toggles Follow mode on or off. Similar to what happens in Zoom or Microsoft Teams, a sound is played whenever a co-editor joins or leaves the collaboration session, followed by an announcement of the co-editor's name and their cursor location.

5 EVALUATION STUDY

We conducted a within-subjects study to understand and compare the effectiveness of VS Code Live Share with CodeWalk features against our baseline, plain VS Code Live Share [46]. Our study aimed at answering the following research questions: (1) How well does CodeWalk improve coordination during remote synchronous collaboration between sighted and BVI developers? (2) How does it affect the communication between developers about the source code? (3) How does it shape BVI developers' perceptions of their collaborative experience?

5.1 Participants

Eligible participants had to be 18 years or older, identify as blind or visually impaired, be comfortable with using screen readers, have at least a year of programming experience in one of the following languages: C, C++, C#, Python, JavaScript, TypeScript, or Java (i.e., the programming languages into which we translated our tasks), have collaborated on code, and be able to communicate about code in spoken English. We recruited participants by posting on social media platforms and mailing lists (e.g., Program-L) that primarily comprise BVI developers.

Our study accepted 10 BVI developers (P1–P10). Nine participants identified as male; one as female. Participants were between 21 and 47 years old (average age 33.6; median age 31.5). Table 4 summarizes the details of participants' demographics, country of residence, and current job title. Each participant was compensated with USD $100 (or its equivalent in local currency) for their participation in the study.

5.2 Tasks

We employed a 2x2 within-subjects experimental design. Each participant, in collaboration with a sighted confederate (one of the
Table 4: Demographic characteristics of the participants and their study session details.

Programming
ID Gender Age Country Profession Condition 1 Condition 2
Language
P1 Male 34 India Senior Software Engineer CodeWalk (Set B) Baseline (Set A) JavaScript
P2 Male 24 India Software Development Engineer CodeWalk (Set A) Baseline (Set B) Python
P3 Male 27 India Technology Analyst CodeWalk (Set B) Baseline (Set A) JavaScript
P4 Male 47 USA Software Engineering Manager Baseline (Set B) CodeWalk (Set A) JavaScript
P5 Female 29 USA Data and Applied Scientist CodeWalk (Set A) Baseline (Set B) Python
P6 Male 44 USA Senior Program Manager Baseline (Set B) CodeWalk (Set A) C#
P7 Male 21 USA Software Engineering Intern Baseline (Set A) CodeWalk (Set B) Python
P8 Male 46 Sweden Software Developer Baseline (Set B) CodeWalk (Set A) C#
P9 Male 35 USA Senior Software Developer Baseline (Set A) CodeWalk (Set B) Java
P10 Male 29 Netherlands Freelance Developer CodeWalk (Set A) Baseline (Set B) Python
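The condition and task-set orders in Table 4 are counterbalanced across participants. As an illustration of how such a schedule can be generated (this is our sketch, not the authors' actual assignment procedure, and the study's real orders do not follow a strict rotation), one common approach is to cycle through the four order combinations of the 2x2 design:

```python
# Sketch of counterbalancing a 2x2 within-subjects design: rotate each
# participant through one of the four (condition order x task-set order)
# combinations. Illustrative only; names and pairings are our assumptions.
from itertools import cycle, product

ORDERS = list(product(
    [("CodeWalk", "Baseline"), ("Baseline", "CodeWalk")],  # condition order
    [("Set A", "Set B"), ("Set B", "Set A")],              # task-set order
))

def assign(participants):
    """Map each participant ID to their two ordered sessions."""
    return {
        pid: [f"{cond} ({task_set})" for cond, task_set in zip(conds, sets)]
        for pid, (conds, sets) in zip(participants, cycle(ORDERS))
    }

schedule = assign([f"P{i}" for i in range(1, 11)])
# schedule["P1"] == ["CodeWalk (Set A)", "Baseline (Set B)"]
```

With ten participants and four order combinations, each combination occurs at least twice, balancing practice and fatigue effects across both conditions and both task sets.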

authors and the study coordinator), performed a series of tasks without CodeWalk (the baseline condition) and another set of tasks with CodeWalk (the experimental condition). As in prior HCI studies [11, 31, 51], the confederate was instructed to strictly and consistently follow the study protocol with all participants, which is known to lead to more generalizable results [27].

We developed two sets of tasks (henceforth, set A and set B) for the study. We randomized the order of task sets and conditions across participants. Each set comprised three tasks, for a total of six tasks. We designed the tasks to range from easy to difficult within each set, enabling participants to ease into the programming environment. In both sets, the first task was based on editing a string; the second task required editing code central to program execution; the third task required refactoring a specified set of lines into a function. The research team conducted multiple rounds of discussion to ensure that both task sets were of equivalent difficulty.

The confederate led the code walkthrough and asked the participant to follow them during the tasks. They asked the participant to recommend changes and solutions to complete each task. Participants could explore, edit, or verify the actions of the confederate as they wished, which made the collaboration feel more natural.

All tasks were based on Hangman,2 a common text-based game. The tasks were representative of software development activities and required participants to perform code reviews, bug fixing, and code refactoring. We downloaded publicly available source code for Hangman in C#, Java, Python, and JavaScript so that participants could perform the tasks in their preferred programming language. For internal validity, we selected code samples of similar length and modified their source code to have similar file names, function names, variable names, and code structure. All code samples included (1) a main code file representing the game’s logic, (2) a text file containing 851 words to play the game, and (3) a text file listing both sets of tasks in the order determined for the participant. The C# code sample included an additional file that represented the game’s UI; this file was referenced for the string editing task. The JavaScript code samples included HTML/CSS files which were not required for the study. Table 4 lists the order of conditions and task sets for each participant, along with the programming language used in their study session.

2 https://en.wikipedia.org/wiki/Hangman_(game)

We conducted a pilot study with one BVI developer (not included in the main study) to ensure that each task set could be completed within 20 minutes and that the total study time did not exceed 90 minutes. Based on their feedback, we needed to improve only two things: our instructions on how to connect remotely and the description of the extensions’ features in both conditions.

5.3 Procedure
We conducted the studies remotely over Microsoft Teams or Zoom, per each participant’s preference. Participants were not required to turn on their cameras for the study. Before the main tasks, the study coordinator explained the key features of the IDE used in the study, the baseline condition, and CodeWalk. We asked participants to share their screen without including the system audio so that the confederate would not hear the participant’s screen reader or CodeWalk audio output. We used the video conferencing tool’s recording feature to capture the conversation between the participant and the study coordinator, which we referred to during our analysis.

To facilitate switching between study conditions, we created a Windows 10 virtual machine (VM) with two different versions of VS Code: one with the baseline condition and another augmented with CodeWalk. Both versions had the same features, keyboard shortcuts, and UI settings. We installed JAWS (version 2020) and NVDA (version 2021) on the remote VM. We also set up Code Factory Eloquence [15], a popular text-to-speech (TTS) synthesizer used to customize screen reader voice and speech. Before each study session, we set NVDA as the default screen reader.

Participants connected to the VM using Microsoft Remote Desktop software. Upon login, we informed participants that they could modify the screen reader settings. Only P1 and P4 used JAWS; the others performed the tasks with modified settings for NVDA. We also turned on screen recording within the VM to record the screen reader speech. Due to technical glitches with the screen recording software, we were unable to record the screen reader usage for P4 and missed a portion of the screen reader usage for P1 and P2.

Participants were instructed to switch to the IDE window for the first condition (see Table 4 for the order) and invite the study coordinator (referred to as the ‘confederate’ in this paragraph) to the collaboration session. The confederate was under strict instructions to hide the participant’s shared screen so as not to look at their IDE
ASSETS ’22, October 23–26, 2022, Athens, Greece Potluri and Pandey, et al.

contents. Participants could ask questions about the IDE features or share their comments about the baseline and CodeWalk during the study. We believed this approach allowed the collaboration and the conversation to proceed more naturally. After twenty minutes, the participant and the confederate switched to the other experimental condition to perform the next set of tasks.

After each condition, participants verbally responded to a 12-statement Likert-scale questionnaire (see Table 6). The questionnaire was adapted from existing scales [16, 64] and assessed participants’ opinions regarding awareness and collaboration. Participants indicated on a five-point scale whether they strongly disagreed (1) or strongly agreed (5) with the statements. The study concluded with an informal interview about participants’ experience with CodeWalk and a short questionnaire about their personal and programming background.

5.4 Data Analysis
The confederate wrote analytic memos [59] after each study session to reflect on how each condition shaped their awareness and collaboration. One researcher reviewed all the video recordings and conversation transcripts to highlight the timestamps of sync operations, analyzed using a Poisson regression (see §6.1). Section 6.2 discusses how we adapted an existing list of codes from [16] to analyze the conversation between the confederate and the participants. Section 6.3 details our analysis of participants’ responses to the Likert-scale questionnaire. Lastly, two authors used descriptive coding [47] to analyze the interviews and organized the codes into themes around collaboration and feedback (see §6.3).

6 STUDY RESULTS
6.1 How well did CodeWalk improve coordination during collaboration?
To analyze how well the participants could follow the confederate, we compared the number of times they attempted to sync their location with the confederate’s location. We operationalized location syncing as attempts, including successful attempts, by participants to move their cursor to the confederate’s location using one of the following: (1) moving from one file to another; (2) going from one line to another; (3) using the find tool to search for a specific word to navigate to its location; (4) toggling the tether command if unsure of the tether status of cursors. Participants synced their locations to read the code that the confederate was referring to and to follow them during the collaboration. We hypothesized that participants would require fewer sync operations in CodeWalk because they would feel less lost compared to the baseline. We analyzed the screen share recording and participants’ screen reader speech to calculate the total number of sync attempts.

The median number of times participants tried syncing in CodeWalk was 1, compared to a median of 8 in the baseline, a large drop. Figure 4a visualizes the number of sync attempts for each participant in both conditions. As recommended for integer count data where rare occurrences are possible, we fit a Poisson regression [77]. We found that participants made significantly fewer attempts to sync locations in the CodeWalk condition (p = .000875 < 0.01). The result indicates that CodeWalk enabled the participants and the confederate to stay closely coordinated during collaboration.

Note P7’s outlier value in the CodeWalk condition. The follow-up interview and his screen share recording revealed that he had not realized that his cursor was tethered to the confederate’s. He interpreted the sounds and speech in CodeWalk as locations he should move to, resulting in similar behavior across both conditions.

Table 5: Reference Codes and Descriptions
Deictic: When a participant or the confederate uses a deictic reference such as this or here, e.g., “Let’s start with this task.”
Anaphora: When a participant or the confederate refers to a past action or location, e.g., “Can you go back?”
Abstract: When a participant or the confederate uses a broad category to refer to an object, e.g., “We need to understand the function.”
Reading: When a participant or the confederate reads a portion of the code aloud, generally done when the approximate location of the collaborator is known, e.g., “Press Enter to leave the game!”
Typing: When a participant or the confederate refers to the text being typed, e.g., “Let me confirm what you wrote.”
Specific: When a participant or the confederate uses a specific name to describe an object, e.g., “Let’s go to didGuessCorrect().”
Line number: When a participant or the confederate uses a specific line number, e.g., “I am on line 31.”

6.2 How did CodeWalk affect communication about the source code?
Since CodeWalk conveyed information on a co-editor’s location and actions, we hypothesized that CodeWalk would enable the participants and the confederate to converse about code using more abstract and deictic references compared to the baseline. We also hypothesized that they would use more line numbers and specific names in the baseline condition than with CodeWalk. Both hypotheses were informed by D’Angelo and Begel [16].

To analyze how the confederate and the participant referred to locations in the code, we adapted and extended the list of referents from D’Angelo and Begel [16] by making two additions (anaphora and reading). Table 5 shows the codes of all 7 referents along with their definitions and examples. Each time a participant or the confederate made a reference to the code, we recorded the referent and its category. We calculated the total number of referents uttered by the confederate and the participant in each study session. Thus, we ended with 14 total referent counts (7 for the confederate; 7 for the participant) in each condition per session. We normalized the data in each condition by calculating the percentage of referents in each category. We did not include P2’s data because he experienced significant lags in screen reader speech, causing the confederate and P2 to verbalize and read code aloud for a large portion of the study (an unlikely scenario for collaboration outside the study).

We visualize the fraction of referents in each category for each condition in Figure 4b. The figure shows that specific referents were used most heavily during the tasks in both conditions. We also note a greater usage of deictic and abstract referents in the

Figure 4: Results from video and conversation analysis. (a) Number of times participants attempted to sync locations with the confederate in each condition. (b) Percentage of referents of each type uttered in each condition. An * is shown above referent types that are significantly different across conditions.
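For the sync-attempt analysis in §6.1, a Poisson regression with a single binary condition indicator has a convenient closed form: the fitted coefficient is the log of the ratio of mean counts, and its Wald standard error is sqrt(1/sum(y0) + 1/sum(y1)). The sketch below uses invented per-participant counts (the paper reports only the medians of 8 and 1); it is not the study's data or analysis code:

```python
# Closed-form Poisson GLM with one binary predictor (condition): the MLE
# rate in each group is its mean count; the coefficient is the log rate
# ratio. The per-participant counts below are invented for illustration.
import math

baseline = [8, 9, 7, 8, 10, 6, 8, 9, 7, 8]  # hypothetical sync attempts
codewalk = [1, 0, 2, 1, 1, 3, 1, 0, 1, 1]

def poisson_rate_ratio(y0, y1):
    s0, s1 = sum(y0), sum(y1)
    beta = math.log((s1 / len(y1)) / (s0 / len(y0)))  # log rate ratio
    se = math.sqrt(1 / s0 + 1 / s1)                   # Wald SE, grouped counts
    return beta, beta / se                            # coefficient, z statistic

beta, z = poisson_rate_ratio(baseline, codewalk)
# here beta is about -1.98 (rate ratio ~ 0.14) and z falls far below -1.96
```

In practice one would fit this with a library routine (e.g., a Poisson GLM in statsmodels or R's glm), which also handles covariates and reports the p value directly.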

CodeWalk condition compared to the baseline. We carried out a one-way ANOVA to compare the percentage of referents in both conditions. The analysis revealed no significant difference in the percentage values of any category except the abstract referents (p = .037 < 0.05), which were greater in the CodeWalk condition.

6.3 How did participants perceive their collaboration experience with CodeWalk?
6.3.1 Responses to Likert-Scale Questionnaire. Table 6 lists the statements in the Likert questionnaire along with the p values of participants’ responses in both conditions. All statements had an equal or higher median value in the CodeWalk condition. A higher median indicates more agreement among the participants, implying a better overall experience with CodeWalk.

A one-tailed Wilcoxon signed-rank test indicated that the values of responses to ten out of twelve statements were significantly higher in the CodeWalk condition (p < 0.05). Their statement codes are followed by an asterisk in Table 6. The significantly different values confirm that participants felt more aware of the confederate’s locations and actions with CodeWalk. Furthermore, participants felt that the shared awareness was reciprocated by the confederate when using CodeWalk, i.e., the participants felt that the confederate was also aware of their actions (S4 in Table 6). This indicates greater shared intentionality with CodeWalk.

Table 6: Statements in the Likert-scale questionnaire along with their p values. All statements had an equal or higher median value in the CodeWalk condition. An * beside the statement code indicates p < 0.05.
S1: I was keenly aware of everything in my environment. (p = .0009*)
S2: I was conscious of what is going on around me. (p = .0035*)
S3: I was aware of what my teammate did and how it happened. (p = .0029*)
S4: I was aware that my teammate is aware of my actions. (p = .0197*)
S5: I am aware of how well we performed together in the team. (p = .0118*)
S6: I felt like my teammate and I were on the same page most of the time. (p = .0118*)
S7: I could tell what my teammate was thinking about/looking at/talking about most of the time. (p = .0328*)
S8: I felt like we shared common subgoals as we worked on the task. (p = .0294*)
S9: My teammate communicated clearly during the task. (p = .1284)
S10: I communicated clearly with my teammate during this task. (p = .0169*)
S11: It was fun to work with my teammate on this task. (p = .0294*)
S12: My teammate worked effectively with me to accomplish the task. (p = .1284)

Two statements (S9 and S12) were not significantly different across conditions. These focused on participants’ perceptions of the confederate’s communication style and effectiveness in collaboration during the tasks. Since the confederate remained unchanged in both conditions, participants may have felt that their communication style remained consistent across conditions. Participants’ responses to S9 and S12 may also have been subject to demand characteristics, cues that shape participants’ desire to form a positive impression on the experimenter [52]. Participants may have wanted to appear polite in their responses about the confederate’s communication and collaboration abilities.

6.3.2 Interview Results. Video analysis revealed that the participants felt aware of being in the confederate’s vicinity. They would highlight code or read code aloud to direct the confederate’s attention. In addition, we observed that the confederate could easily keep track of the participant’s cursor with CodeWalk’s tethering feature. The participant’s cursor was always visible in the confederate’s viewport, and if the participant moved out of the viewport to read code, the confederate would scroll to keep track. On the other hand, participants reported that they “leaned on the communication” with the confederate “pretty heavily” (P7) in the baseline condition. They had to either wait for the confederate to verbally announce their location using a line number or function name, or request the location information to sync cursors, as also indicated by Figure 4a.
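The one-tailed Wilcoxon signed-rank test applied to the paired Likert responses in §6.3.1 reduces to ranking the non-zero paired differences by absolute value (averaging ranks for ties) and summing the ranks of the positive differences. A minimal sketch with invented ratings, not the study's data; a real analysis would use a library routine that also computes the p value:

```python
# Sketch of the Wilcoxon signed-rank statistic W+ for paired ratings:
# drop zero differences, rank absolute differences (average ranks for
# ties), and sum the ranks of the positive differences.

def wilcoxon_w(codewalk, baseline):
    diffs = [c - b for c, b in zip(codewalk, baseline) if c != b]
    ranked = sorted(abs(d) for d in diffs)

    def rank(v):
        lo = ranked.index(v) + 1       # first position of this value
        hi = lo + ranked.count(v) - 1  # last position of this value
        return (lo + hi) / 2           # average rank for ties

    return sum(rank(abs(d)) for d in diffs if d > 0)

# ten invented ratings of one statement in each condition
w = wilcoxon_w([5, 4, 5, 5, 4, 5, 3, 5, 4, 5],
               [3, 4, 2, 4, 3, 3, 3, 4, 2, 4])
# w == 36.0: all eight non-zero differences favor the first condition
```

When every non-zero difference is positive, W+ reaches its maximum of n(n+1)/2 (here 8*9/2 = 36), the strongest one-sided evidence the test can give for that many pairs.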

We noted instances where participants used CodeWalk’s tethering feature to direct the confederate. For example, P5 asked the confederate to take her “to the line again” to revisit the source code. The confederate moved their cursor to the location P5 had specified; the sounds confirmed arrival for P5. Later in the interview, P5 shared that the “auto move [of cursors] was really useful”. Similarly, P6 directed the confederate to move their cursor to various lines during the refactoring task. After each move, he would explore the code at the destination line, make recommendations for improving the code, and then instruct the confederate to take him to the next location.

Participants used CodeWalk’s sound effects extensively to maintain awareness of the confederate’s actions. For instance, after the confederate finished typing, P9 mentioned, “Yeah, I can tell you are done ’cause the typing noise stopped”, and then went on to verify the changes made by the confederate. Participants used the speech announcements to keep track of location changes. Many participants phrased this as being aware that “things were happening” (P8). Even on the occasions when the confederate moved quickly, leading to a succession of sounds, participants felt that “at least [CodeWalk] conveyed a sense of movement” (P4). The increased awareness seemed to positively shape the participants’ feelings about collaboration and assuaged their worries about feeling lost: “Because I could just snap to wherever you were, I wasn’t worried about wandering off” (P4).

Furthermore, participants liked the design choice of primarily using audio cues to convey the confederate’s actions and relying on speech sparingly. They shared that the audio cues “packed a lot of info” (P7) without seeming verbose. In addition, participants did not seem to mind when the audio cues played simultaneously with the screen reader speech, but they indicated a preference for shorter sounds. Most participants were able to quickly map the skeuomorphic audio cues to their awareness indicators. It took a few participants longer to associate the non-skeuomorphic audio cues with their intended meaning of direction changes. However, they acknowledged that they had not “used it [CodeWalk] enough” (P6) to remember the sounds and believed that “some more sessions” (P1) would enable them to map all the audio cues to their respective meanings.

Every participant told us that they would like to use CodeWalk to collaborate with their teammates. P5 mentioned that using CodeWalk in code reviews would enable her to be on the “same page without lagging behind.” P7 shared that CodeWalk would be “absolutely instrumental” in his pair programming assignments, and he would “install it immediately” if it were released. P9 felt that it would allow him to mentor junior developers by letting them drive collaboration sessions: “When I’m collaborating, I’m the one driving and I share my screen and they look at it. It’s just easier that way [...] I would be much more likely with an extension like this to let them drive more often.” Participants also appreciated that CodeWalk was built for VS Code, a mainstream and accessible IDE that sighted “people might have” (P5). Upon its launch, they could use it without asking their colleagues to switch to a new IDE. These quotes suggest that CodeWalk can enable BVI developers to participate in collaborative activities without requiring them to manually manage the sessions on their own.

6.4 Threats to Validity
Our study employed a sighted research team member as a confederate for all study sessions. Employing a single confederate across all participants is common in HCI [11, 31, 51] and is recommended for maintaining the internal validity of the experiment [27]. Despite following the study protocol strictly, the confederate may have gained experience and improved as a communicator with each session. Thus, later participants’ collaboration experience may have been better than earlier participants’, resulting in fewer differences in metrics between conditions. We believe the within-subjects design choice would have addressed any learning effects on the part of the confederate.

Due to their research experience in accessibility, the confederate may better understand participants’ awareness needs than sighted people unused to collaborating with BVI developers. Participants commented that the confederate was “a very good communicator” (P7), also confirmed by the lack of significant difference in responses to S9 on the Likert-scale questionnaire (see Table 6). The confederate’s communication may have suppressed differences in referent counts between conditions. Therefore, in real-world conditions with more typical collaborators, CodeWalk may show even more improvement in communication metrics over the baseline.

We deployed CodeWalk on a cloud-based virtual machine (VM) to simplify installation for our participants. Using screen readers through a remote VM may have increased latency. Some participants reported lags in screen reader playback, which may have impacted their experience with the extension and shaped their feedback. The latency issues are unlikely to occur in real-world conditions, since the extension would be installed on the user’s own home system. Thus, we expect the experience of CodeWalk to be better upon its release.

7 DISCUSSION
Overall, we find CodeWalk successfully translates and conveys reference space gestures from sighted developers to their BVI colleagues, extending Buxton’s model [13] for effective remote collaboration to mixed-ability collaborations. In this section, we summarize our findings, consider the role of interdependence in our design, and relate our results to the two projects that are most similar to ours. We then reflect on our own research practices and propose future work.

7.1 Summary of Findings
Our study results show that significantly fewer attempts were needed by our BVI participants to sync locations with CodeWalk than in the baseline condition. This suggests that the coordination burden (Design Criterion D3), which often requires explicit communication of awareness cues between collaborators, is reduced through CodeWalk’s sound effects and speech. Automating the transmission of code location and navigation actions helps to ensure that the sighted colleague also benefits from a reduced coordination burden, since they need not remember to convey those actions verbally either. The participants’ increased use of abstract referents to code locations showed a corresponding decrease in the number of more specific referents (using line numbers and function

names). This suggests a reduction in cognitive load (Design Criterion D1) on the part of the BVI developer. It also shows an increased sense of shared awareness and shared intentionality between the participants, which ensures that the BVI developer had the capability to contribute equitably according to their ability rather than be sidelined by inaccessible collaboration tools.

From the Likert-scale statements, we learned that participants felt that CodeWalk improved their awareness of their environment, their teammate, and their teammate’s actions. They also felt that they were more likely to be on the same mental page most of the time and were able to work effectively together. BVI developers were more likely to highlight text on the screen using their keyboard, confident in the knowledge that their sighted colleagues would be able to notice it and react to it. BVI participants also could tell from the sound effects when their colleagues were navigating or when they had stopped, concluding that they were now free to engage in conversation and explore the source code. Not only did this increase improve their communication, it also minimized the cognitive load (Design Criterion D1) of trying to intuit what their sighted colleague might be doing without any audible feedback.

CodeWalk’s cursor tethering was designed to support tight coupling (Design Criterion D4) between participants at the task level, so that when one navigated through the code or edited some text, the other would immediately be made aware and be able to respond. Some participants responded by directing the confederate to move to additional locations, showing increased agency (Design Criterion D2), looking around the code with their own screen reader, and then driving the confederate to the next code location.

CodeWalk’s skeuomorphic sounds (e.g., key clicks and scroll wheel sounds) were straightforward for the participants to understand with no training. However, some sound effects, e.g., the rising and falling tones used to convey directionality of movement, were not immediately obvious to the listeners. While participants got better at distinguishing these during their study session, others may need more time to get better at this.3

3 See Cat_ToBI (http://prosodia.upf.edu/cat_tobi/en/ear_training/listening.html) to practice distinguishing rising and falling tones from one another.

CodeWalk’s spoken sentences were necessary to orient the BVI participants after their sighted colleagues navigated to new areas in the code. Sometimes, however, these spoken words collided with the participant’s own screen reader speech. An early version of CodeWalk played its sounds and speech by extending the NVDA screen reader, which enabled us to detect overlapping speech utterances and cancel one of them. However, to ensure CodeWalk worked with multiple screen readers on multiple platforms (including Mac and Linux), we used Microsoft’s Azure Cognitive Services to generate speech and platform-specific sound APIs to play it. It is possible to address the overlapping audio; however, due to time constraints, we were unable to program CodeWalk to cancel our audio while the screen reader was talking. We encourage screen reader and operating system manufacturers to offer extensible platform-agnostic APIs for integrating systems like CodeWalk with screen readers.

One interesting form of spoken collision remains. Sighted colleagues receive no indicators when BVI users are listening to their screen readers, and thus do not realize they should stop talking to the BVI user to avoid overlapping with the screen reader or CodeWalk. Participants expressed a need to avoid “double-speak” (P8) between the collaborator and the speech announcements from their screen readers and CodeWalk. We plan to explore designs of a visual indicator for non-screen reader users of CodeWalk to let them know when screen reader speech is active for any of their collaborators. This feature should require BVI users to opt in before it is turned on, because BVI users’ opinions on whether to reveal their use of AT to colleagues vary by culture [39] and may have significant workplace consequences [5].

7.2 Accessible Co-Editing
Lee et al. [37]’s CollabAlly tool developed similar sound effect and speech-based feedback for BVI writers in common co-editing environments (e.g., Google Docs). CollabAlly found success in identifying collaborators’ ongoing work and comments in the document, enabling collaborator awareness to avoid overwrites both synchronously and asynchronously. Our work examined the awareness needs found in the synchronous tasks of code walkthroughs and reviews and found the timeliness of push-based notifications vital to enable BVI collaborators to stay in sync with their sighted colleagues for extended periods of time. Many coding tasks fluidly switch between asynchronous and synchronous modes, leaving unanswered how best to support users’ cognitive load by conveying awareness information simultaneously using pull- and push-based modalities.

Das et al. [18, 19]’s Co11ab work supporting mixed-ability co-editing stops short of handling collaborators editing near one another. Prior to CodeWalk, these kinds of close edits would preferentially disadvantage the BVI collaborator, as their sighted colleagues could see the impending collisions and take their own steps to avoid them. CodeWalk’s use of non-interruptible warning messages as colleagues get too close served to encourage all parties to communicate using alternate, more accessible channels (e.g., a concurrent audio call) in order to appropriately synchronize their edits and avoid conflicts.

7.3 Interdependence
Bennett et al.’s reframing of the goals of assistive technology as interdependence instead of independence rings true in CodeWalk’s scenarios [6]. Collaboration between colleagues of mixed abilities encourages each to play to their own strengths, while requiring that each cede some of their own power and control to cooperate effectively with others. Working together in a code walkthrough or code review, a BVI developer who might have special expertise in accessibility can disseminate that knowledge to sighted non-specialists in situ and create a better result for their customers. As shown by Pandey et al. [53], long-term mixed-ability collaborators establish mutual reliance by learning how to work together: by paying attention, responding to, and adapting to one another’s task-related behaviors, habits, and needs. CodeWalk’s sound effects and speech events make a colleague’s navigation and edit work visible to BVI collaborators, enabling CodeWalk to be used by a BVI collaborator as an essential assistive technology for remote collaborative work. Finally, CodeWalk challenges the established hierarchy of sighted participants controlling the task, enabling BVI developers to lead

code walkthroughs and reviews instead of meekly defaulting to REFERENCES


follow. [1] Khaled Albusays and Stephanie Ludi. 2016. Eliciting Programming Challenges
Faced by Developers with Visual Impairments: Exploratory Study. In Proceedings
of 9th International Workshop on Cooperative and Human Aspects of Software
7.4 Researcher Reflections
This work improves our own practice to communicate accessibly and to advocate for our own accessibility when participating in collaborative software development activities. For example, the BVI member of the research team now always asks sighted collaborators to verbalize code locations. A sighted member realized that he needed to remember to stop speaking every so often to allow his BVI collaborator to “read” the code for themselves using their screen reader. The study coordinator recognized that each BVI developer’s access and communication needs are different and that expressing these needs can be tricky when collaborating with someone for the first time. They have become mindful of attuning their communication to the preferences of their BVI collaborators. Finally, as we collaboratively author this paper using the Overleaf LaTeX editor, we yearn for it to make use of auditory feedback in order to fully include our BVI co-author in our writing efforts.
7.5 Future Work
In the future, we would like to explore how to combine the lessons learned from CodeWalk, CollabAlly, and Co11ab in supporting mixed-ability remote collaboration, whether it be for document co-editing, software co-editing, or additional collaborative software development tasks. In particular, future designs should explore ways that BVI collaborators can most effectively and equitably lead interactions, in one-to-one and one-to-many scenarios, including collaborations involving two or more BVI developers.

Many participants said we should ensure that CodeWalk was accessible to deaf-blind programmers and usable with Braille displays. They felt that its reliance on the audio medium could exclude deaf-blind programmers. In future work, we will extend our design to support communication of awareness information through tactile media.

We recommend that application design standards, such as ATAG (Authoring Tool Accessibility Guidelines) and WCAG (Web Content Accessibility Guidelines), be extended to support mixed-ability teams and provide non-visual information about collaborators’ location, navigation, and edit operations. This could increase the use of accessibility practices in the design of collaborative authoring tools.
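As a minimal sketch of what the recommended ATAG/WCAG extension could standardize, consider a structured awareness event that a collaborative authoring tool might emit for assistive technologies to render in the user’s preferred modality. The class, field names, and version tag below are our own illustrative assumptions; neither ATAG nor WCAG currently defines such a payload:

```python
from dataclasses import dataclass, asdict

# Hypothetical illustration: the fields and the "awareness/v1" tag are
# invented for this sketch. The idea is that a collaborative authoring
# tool could emit structured awareness events that assistive technologies
# (screen readers, Braille displays) render however suits the user.
@dataclass
class AwarenessEvent:
    collaborator: str   # remote collaborator's display name
    kind: str           # "location" | "navigation" | "edit"
    file: str           # file in which the event occurred
    line: int           # 1-based line number
    detail: str = ""    # e.g. the inserted text, for "edit" events

def to_payload(event: AwarenessEvent) -> dict:
    """Serialize one event for consumption by assistive technology."""
    payload = asdict(event)
    payload["schema"] = "awareness/v1"  # invented version tag
    return payload

event = AwarenessEvent("Alice", "edit", "chapter1.tex", 42, "fixed citation")
print(to_payload(event))
```

Because the payload is structured rather than pre-rendered text, a screen reader could speak it, a Braille display could abbreviate it, and a tactile device could map it to vibration patterns, which would also serve the deaf-blind scenario raised above.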
8 CONCLUSION
Existing tools to facilitate tightly-coupled software development tasks rely on visual cues and create accessibility barriers to equitable collaboration for BVI developers. To address this accessibility gap, we designed, developed, and evaluated CodeWalk, a set of features added to Microsoft’s VS Code Live Share extension that makes a collaborator’s location in a code file and their actions accessible through cursor tethering, as well as sound effects and speech. CodeWalk’s features improved coordination between BVI developers and their sighted peers while reducing the explicit effort that BVI developers need to put in to stay coordinated. We hope CodeWalk can serve as an exemplar for IDE manufacturers to make their environments more accessible to blind and visually impaired software developers.
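The following sketch illustrates the general pattern the conclusion describes, mapping a remote collaborator’s actions to short sound effects (earcons) plus occasional speech. It is not CodeWalk’s actual implementation (which is a set of features on VS Code Live Share); the earcon file names and phrasing are invented for the example:

```python
# Illustrative sketch only -- not CodeWalk's implementation. Each remote
# action is mapped to a short earcon and, for rarer actions, a spoken
# description, so a collaborator's location and activity stay perceivable
# without visual cues. File names below are invented placeholders.
EARCONS = {
    "cursor_move": "tick.wav",
    "edit": "type.wav",
    "follow_start": "chime_up.wav",    # cursor tethering engaged
    "follow_stop": "chime_down.wav",   # cursor tethering released
}

def render_cue(action, collaborator, line=None):
    """Return an (earcon, speech) pair for one remote action.

    High-frequency actions (cursor moves) get only an earcon; rarer
    actions also get a short spoken description.
    """
    speech = None
    if action == "edit" and line is not None:
        speech = f"{collaborator} is editing line {line}"
    elif action == "follow_start":
        speech = f"Following {collaborator}"
    elif action == "follow_stop":
        speech = f"Stopped following {collaborator}"
    return EARCONS.get(action), speech

print(render_cue("cursor_move", "Sam", 10))   # earcon only, no speech
print(render_cue("follow_start", "Sam"))
```

Throttling speech to infrequent actions, as sketched here, reflects one likely design tension: audio cues must inform without talking over the user’s screen reader.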
CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development. ASSETS ’22, October 23–26, 2022, Athens, Greece

Designing a Customizable Picture-Based Augmented Reality
Application For Therapists and Educational Professionals
Working in Autistic Contexts
Tooba Ahsen (tooba.ahsen@tufts.edu)
Computer Science, Tufts University, Medford, MA, USA

Christina Yu (christina.yu@childrens.harvard.edu)
Autism Language Program & Augmentative Communication Program, Boston Children’s Hospital, Boston, Massachusetts, USA

Amanda O’Brien (amanda_obrien@g.harvard.edu)
Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, Massachusetts, USA

Ralf W Schlosser (Ralf.Schlosser@Northeastern.edu)
Communication Sciences and Disorders, Northeastern University, Boston, Massachusetts, USA

Howard C Shane (howard.shane@childrens.harvard.edu)
Boston Children’s Hospital; Harvard Medical School, Boston, Massachusetts, USA

Dylan Oesch-Emmel (dylan.oesch_emmel@tufts.edu)
Computer Science, Tufts University, Medford, Massachusetts, USA

Eileen T. Crehan (eileen.crehan@tufts.edu)
Eliot-Pearson Department of Child Study & Human Development, Tufts University, Medford, Massachusetts, USA; Department of Psychiatry, Rush University Medical Center, Chicago, Illinois, USA

Fahad Dogar (fahad@cs.tufts.edu)
Computer Science, Tufts University, Medford, Massachusetts, USA

ABSTRACT
This paper presents the design and evaluation of CustomAR – a customizable Augmented Reality (AR) application, designed in collaboration with therapists, that allows them to create and customize picture-based AR experiences for use in an autistic context. Using a 2-week diary study, we gauge whether the application’s customization options and features are sufficient to allow therapists and educational professionals to create AR experiences for the various learning activities they conduct with autistic children, and what challenges they face in this regard. We find that participants think the application would be suitable for creating AR experiences for a wide range of learning activities, such as choice-making and teaching daily living skills, and think that the application’s freeze feature can be helpful when working with children with limited attention. Towards the end of the paper we discuss the challenges users face when trying to incorporate picture-based AR in practical therapy exercises, and how they can be addressed.

CCS CONCEPTS
• Human-centered computing → Mixed / augmented reality; User studies; • Social and professional topics → People with disabilities.

KEYWORDS
Augmented Reality; Picture-Based; Marker-Based; Autism Spectrum Condition; Education; Therapy; Customization

ACM Reference Format:
Tooba Ahsen, Christina Yu, Amanda O’Brien, Ralf W Schlosser, Howard C Shane, Dylan Oesch-Emmel, Eileen T. Crehan, and Fahad Dogar. 2022. Designing a Customizable Picture-Based Augmented Reality Application For Therapists and Educational Professionals Working in Autistic Contexts. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544884

This work is licensed under a Creative Commons Attribution International 4.0 License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544884

1 INTRODUCTION
Autism Spectrum Condition (ASC) is a neurodevelopmental condition that is often characterized by difficulty in communication,

social-emotional reciprocity, and repetitive behavior [34]. Autistic children¹ often have varied learning and behavioral profiles – some may have more difficulty with expressive and/or receptive language, others may have difficulty understanding and regulating emotions and behaviors, and many demonstrate very specific interests [34, 52].

While there is no one-size-fits-all approach to teaching, therapists and teachers on an autistic child’s support team often adopt learning strategies that utilize visual supports (pictures, icons, photographs, etc.), as they have been shown to be effective in autistic contexts [50, 56]. Picture-based Augmented Reality (AR) is a logical extension to learning activities that rely on visual supports, as it allows pictures to be superimposed with additional information or virtual content (audio, videos, 3D models, etc.) – the virtual content appears on the screen when the picture comes into view of the user’s device camera. Augmented Reality, in general, has been shown to increase focus and engagement in children with autism [30], and picture-based AR, in particular, has helped autistic individuals complete daily living tasks more independently [22] and has helped them improve their communication [40] and emotion recognition skills [19].

However, practical use of these previously designed AR applications may be limited in therapeutic or learning environments because they lack support for customization. Therapists and educational professionals often need to customize learning exercises and content according to an autistic child’s interests [37, 41], to keep them engaged and encourage positive behaviors [16, 57]. Customization is key, and the lack of support for it could make it difficult for therapists and educational professionals to integrate picture-based AR into the day-to-day learning activities they conduct. Therefore, there is a gap between research into AR applications and the practical use of those applications, and a scarcity of work that bridges this gap.

Figure 1: (a) The CustomAR application, showing the creation window. Users can associate audio, videos, or 3D models with a target image to create an AR experience. The white plane to the center-left of the screen shows the image the user has uploaded. The user has added a Lego block 3D model to this AR experience and is customizing its position. (b) The AR-View window. Users can view their AR experience by going to the AR-View window. The user has printed the target image onto a piece of paper, and is holding up their device camera to the target image to elicit the associated AR experience. The 3D Lego block has appeared on their screen.

¹ A note on language: Adolescent and adult self-advocates in our studies and our advisory board have shared that they strongly prefer identity-first language (e.g., autistic person) [15, 46].

Therapists and educational professionals are also key players in the adoption of assistive or learning technology. Prior to introducing new technology into an autistic child’s educational plan, they must use their clinical and educational experience to evaluate the technology and determine how to effectively utilize it in an intervention [27]. Therefore, to design AR applications that are useful in practical autistic contexts, it is important for the research community to understand the customization needs of therapists and educational professionals, how they envision using AR in the learning activities they conduct with autistic children, and what features will be most useful in this regard.

Our contribution is as follows. We first highlight the design and implementation of CustomAR, a mobile AR application, created in collaboration with therapists, that allows them to create and customize picture-based AR experiences from scratch (refer to Figure 1). We highlight key lessons from each of the two phases of the design process – from engaging in participatory design with therapists to create the first version of the application for a specific symbolic development use-case, to modifying the application into a customizable authoring tool. The latter could potentially allow therapists and educational professionals to create AR experiences for a variety of learning contexts, and for a diverse population of autistic children.

Secondly, we conduct a 2-week diary study and follow-up semi-structured interviews with special education teachers, occupational therapists, and speech language pathologists, to gauge whether they think the application’s customization options and other features are suitable for use in autism-related learning contexts. Moreover, we shed light on the challenges that could hamper the use of picture-based AR in practical therapeutic or educational settings. Our study reveals the following:

• Therapists/teachers find the application’s customization options sufficient to create AR experiences for a myriad of learning activities, ranging from choice-making and teaching daily living skills, to teaching lessons on emotion recognition and for collaboration and group-work. Details of these learning activities and the customization options that supported their development are provided in sections 6.2 and 6.3.

• Therapists/teachers appreciate the ability to ‘freeze’ AR experiences on the screen and think that it would be useful when
Designing a Customizable Picture-Based AR Application For Therapists Working in Autistic Contexts ASSETS ’22, October 23–26, 2022, Athens, Greece

working with children with limited focus, or when showing AR experiences to children in groups. This is discussed further in section 6.4.
• Therapists/teachers are concerned about generalizing AR experiences or creating them 'just-in-time' during therapy sessions (section 6.5). Allowing users to share AR experiences with others, and recommending AR content to users, may alleviate these concerns. This is discussed further in sections 7 and 9.1 respectively.

Our work provides valuable insight into the features and customization options that will help therapists and educational professionals introduce AR into the learning activities they conduct. Although autistic individuals did not take part in this study, our findings may be useful for future studies that position autistic children as content creators and investigate which features/UI are suitable to allow them to independently create AR experiences. In the subsequent sections of this paper, we summarize related work, provide details about the two-phased design process and the user study, present and discuss our qualitative findings, and highlight limitations and future directions for this research.

2 RELATED WORK
This section leverages prior work to highlight how therapists and educational professionals use visual supports in learning exercises, how picture-based AR can extend them, and the significance of customization and interest-based learning in autistic contexts.

2.1 Using Visual Supports in Learning Exercises & How Picture-Based AR Can Extend Them
Therapists and teachers frequently use visual supports, such as pictures, symbols or photographs, in learning activities involving autistic children as they often exhibit strong visual processing abilities [50]. For example, teachers or therapists may tape pictures that represent the day's activities (visual schedules) to the walls of a classroom to make it easier for autistic children to transition from one activity to another [5, 25]. Similarly, they may print pictures depicting the steps involved in a daily living task, and tape them somewhere in the environment as reminders [1, 50]. Visual supports are also used during choice-making activities, reinforcer assessment activities, or activities involving speech and language [3].

Picture-based AR is a logical extension to some of these learning activities that rely on pictures/symbols, and can add layers of information and helpful hints on top of them. When users hold their devices over a picture to elicit the associated AR content, the context of the activity (the picture/photograph, the tabletop etc.) is preserved, and users can view additional information without shifting their focus away from the task at hand. For example, Cihak et al. took a traditional tooth-brushing exercise in which a picture of the steps in the task was taped to the bathroom mirror, and superimposed the picture with a video clip depicting how each step was performed. Autistic children accessed this additional information by holding their device cameras up to the picture and could replicate the steps using the toothbrush, toothpaste, and paper cups in front of them. The study found that autistic children performed the tooth-brushing task more independently when AR was used [22]. Prior studies have also used picture-based AR to enhance the learning effects of social stories [20], to teach emotion recognition [17, 19], and to improve children's communication abilities by reinforcing the meaning of pictures/symbols found in the Picture Exchange Communication System (PECS) [40, 47]. Picture-based AR, therefore, can be an effective teaching tool in an autistic context, and as such it is important to understand what tools and features could facilitate therapists and teachers in using picture-based AR during the learning exercises they conduct.

2.2 Autism, Interest-Based Learning & Customizable AR
Studies have shown that customizing learning exercises according to the unique learning needs and interests of autistic children [33, 35] leads to positive behaviors [16, 28, 57]. The ability to customize, therefore, may be invaluable to therapists and educational professionals working with autistic individuals. However, a recent survey study revealed that current computer-based technologies designed for autistic individuals lack robust support for customization [44]. This includes studies under the umbrella of 'augmented and virtual reality' - although AR interventions have been shown to support autistic individuals in various domains [38, 39, 45] including communication [40, 58, 59], daily-living [22], pretend play [12, 13, 26], and social and emotional learning [29, 43, 54], research into customizable AR is lacking [44].

Very few studies have provided users with the ability to customize picture-based AR experiences [24, 47], and even then users can only customize within the confines of an activity, e.g., being able to change the virtual content associated with predefined PECS symbols, but being unable to create an experience around new symbols [47]. Moreover, there is only one AR application on the Google Play Store (for Android devices) that allows users to create their own picture-based AR experiences [2], and this application has not been created with the needs of therapists/teachers and autistic children in mind.

In contrast, we involved therapists in the design process and drew on their clinical experience in working with autistic children. Subsequently, we created an application that not only gives them the agency to create and customize picture-based AR experiences from scratch, using their own images and virtual content, for learning activities that they themselves have conceptualized, but that also contains features that could potentially facilitate the use of AR in autistic contexts. Moreover, we investigate the practical challenges they could encounter when trying to incorporate picture-based AR in autism-related learning activities - something that no prior work has looked into thus far.

3 RESEARCH GOALS & HIGHER LEVEL RESEARCH QUESTIONS
Our first goal is to provide therapists and educational professionals with an application that allows them to create and customize picture-based AR experiences from scratch. Section 4 describes the two design phases that culminated in the development of our customizable AR application.
ASSETS ’22, October 23–26, 2022, Athens, Greece Tooba, et al.
Our second goal is to evaluate this application by conducting user studies with therapists and educational professionals. We aim to understand: (1) whether therapists can sufficiently use the application's creation and customization features to create AR experiences for their autistic students; (2) whether the freeze feature – a feature that was developed based on feedback from therapists during the design phase (section 4.2.3), and which allows them to freeze AR experiences on the screen while keeping the experiences interactive – is useful in the context of autism; (3) what challenges therapists face when trying to integrate picture-based AR in the learning activities they conduct.
The specific research questions (RQ) are listed below:
• RQ 1: Are the application's customization options sufficient to allow therapists/educational professionals to create AR experiences to meet the varied interests and learning needs of the autistic population they work with?
• RQ 2: How does the application's freeze feature facilitate the use of AR in the context of autism?
• RQ 3: What are the challenges in using picture-based AR in practical therapy/learning settings?
Section 5 highlights the results of the user study we conducted to answer these research questions.

4 APPLICATION DESIGN
We designed and developed our application in two phases. In phase I (section 4.2), we designed the application in collaboration with 4 researchers/Speech Language Pathologists (SLPs) from the Boston Children's Hospital's outpatient clinic in Waltham, Massachusetts. The goal was to use AR to enhance symbolic knowledge for the moderately to severely autistic children that visited the clinic. Our collaborating SLPs informally tested the application and realized that the lack of customization options was hindering its use. We then re-designed the application in phase II (section 4.3), to allow users to create/customize picture-based AR experiences for a variety of learning contexts and autistic individuals, not just those specific to the clinic. In this section, we highlight our design journey: we describe our initial symbolic development use-case, the features of the first prototype, and the lessons we learnt that spurred the redesign of the application into a more customizable tool. We then describe the features of the latest version of the application.

4.1 The Design Team & Participatory Design Process
Our multi-disciplinary design team comprised computer scientists and 4 researchers/SLPs from the Boston Children's Hospital's outpatient clinic in Waltham, Massachusetts. These clinicians primarily work with minimally verbal children with moderate to severe autism and provide diagnostic, and speech and language, evaluations and treatments. At the time of introduction, the SLPs were exploring AR in play contexts, but had no prior experience with picture-based AR.
The design process was as follows: the first author visited the clinic weekly for a month, and then bi-weekly over the next 3 months for design meetings. The first month was dedicated to outlining the features of the application, and the content of the AR experiences (e.g., the target images and 3D models used, the audio and animations to create for each model etc.). Over the remaining months, the first author coded the application's features and UI. When a feature was coded, the first author would demonstrate the feature during the bi-weekly in-person meetings. The SLPs would provide feedback and suggestions for improvement or redesign, and modifications would be made accordingly. The SLPs also informally tested the application amongst themselves (at the end of Phase I) and provided feedback that spurred the redesign of the application into a more customizable tool (Phase II).

4.2 Phase I - Designing an AR Application for a Specific Clinical Context
4.2.1 Symbolic Development Use-case: In phase I, the goal was to design an application that could enhance symbolic knowledge for the minimally verbal children with moderate to severe autism that visited the clinic. The SLPs often centered therapy around play, and these children were familiar with numerous toys present at the clinic.
Let's consider a common scenario at the clinic: Sam is a 7-year-old male who regularly visits the clinic. He has moderate-severe autism and very little functional speech (a few intelligible spoken words). Therapists at the clinic want to teach Sam how to request objects (toys, food, etc.) by pointing towards pictures of those objects, instead of trying to grab the objects themselves, physically dragging an adult towards where the objects are being kept, or showing disruptive behavior. When the therapist presents Sam with two objects that he likes, a toy (a Lego block) and a snack (goldfish crackers), he consciously points towards the object he wants. The therapist then puts the objects out of sight, and presents Sam with a picture of the Lego block and goldfish crackers instead. Sam is unable to point towards the picture of the object he wants. He does not understand that the 2D pictures are symbols that represent the 3D objects.
Since pictures and symbols are key to numerous alternate communication strategies [14, 31, 32, 49], it is important for these children to understand the symbolic relationship between 2D pictures and their 3D referents. In our example, Sam does not have a clear understanding of this relationship. We hypothesized that if we could momentarily convert the 2D into 3D, that is, superimpose the 2D pictures of the objects with highly similar 3D models of the objects, Sam may understand the 2D-3D symbolic relationship and be able to make a conscious choice between the available options.

4.2.2 Content & Features of the Application:
Basic AR Experiences: The first version of the application was very simple. We pre-programmed a few images and corresponding 3D models into the application (details present in the next bullet point). Therapists would open the application on an iPad and hold the device camera over a print-out of one of the pre-programmed images. The associated 3D models would then appear on the screen. Figure 2 shows a picture of Lego blocks and goldfish crackers superimposed with 3D models of the same objects. Names of the objects would appear on small banners beside the 3D models. When the 3D models were tapped, an audio recording of the object's name would play out. It is important to note that these AR experiences were built into the
application. The therapists could not alter the target images or the 3D models associated with them.

Figure 2: Basic AR Experience. (a) Target image of Lego block and goldfish crackers; (b) 3D models of the Lego block and goldfish crackers appearing over the target image.

Figure 3: Target Images. (a) Bubble wand and 'Thomas'; (b) Socks and soccer ball.

3D Objects and Target Images: The initial application was designed while keeping in mind the objects (e.g., toys) that the children at the clinic interacted with the most. The design team picked 6 objects - a bubble wand, a 'Thomas the train' toy, a large Lego block, a packet of goldfish crackers, a soccer ball and a pair of socks. Since the children were prompted to make a choice between 2 objects at a time, we took images of pairs of objects. These served as the 2D images the children would be presented with and the target images that the application would recognize. Examples of different pairs are shown in Figure 3.

Animations: Based on discussions with the therapists, we created an animation for each 3D model to promote engagement. For example, bubbles would appear in front of the bubble wand, the Lego blocks would start stacking on top of each other, and a few pieces of goldfish crackers would appear outside their packet (Figure 4a). Each animation was paired with relevant audio, such as a popping sound for the bubble wand, a stacking sound for the Lego blocks and the sound of a packet opening for the goldfish crackers.

Hiding Target Images: Sometimes when 3D models were superimposed over target images, there would be too many elements on the screen. Some individuals on the spectrum could perceive this as visual clutter or sensory overload [21]. To minimize this, we added a 'hide' feature to the application. A white plane would appear over the target image (Figure 4b), temporarily hiding the target image and drawing focus to the 3D models on top.

4.2.3 Informal Testing & Design Considerations Moving Forward: Our collaborating SLPs informally tested the application amongst themselves and provided feedback. They were given a hi-fidelity prototype of the application and instructed to trigger the various in-built AR experiences, try out the animations for each 3D model, and comment on the strengths and weaknesses of the application. After using the application for several months, and due to the overnight shift to remote learning practices caused by the Covid-19 pandemic [53], the SLPs faced various challenges. These challenges, and their implications for application design, are outlined below.

Predefined Content and Varied Interests - The Need for Customization Options: We tried to ensure that the target images and 3D models found within the application were representative of (and highly similar to) the objects found at the clinic that were interesting to the clinic's autistic population. However, the move to
online therapy due to the Covid-19 pandemic presented a challenge. If a child did not have these specific objects at home or did not find them interesting, the application would not be engaging/useful for them [41]. This feedback made it clear that our AR application had to be more customizable, in order to be usable outside the clinic and to cater to children's diverse interests.

Dynamic Therapy Environments and Children's Limited Focus - The Need to Freeze Content: Owing to the nature of AR technology, the AR experience is only visible on the screen when the device camera has the target image in its sight. However, constantly holding the iPad over the target image proved to be difficult for the SLPs, as therapy environments are dynamic; the child may be unable to sit at a table (and therefore near the target image) for long periods of time, or they may get distracted by another item in the room and wander off. The therapist may have to abandon the iPad to follow after the child and may find it difficult to make the child sit still long enough to trigger the AR experience again. We therefore needed to 'freeze' the AR experience on the screen, that is, retain the AR experience on the screen even after the target image was no longer in view of the device camera.

Figure 4: Animations and 'Hide Target Image' Feature. (a) Animations - Lego blocks stack and goldfish crackers appear; (b) Target image hidden by creating a white plane over the image.

Figure 5: (a) Creation window; (b) Users can use the slide-in menu to add content to an AR experience; (c) The selected target image appears in the middle of the workspace.

4.3 Phase II - Re-designing the Application to Make it More Customizable
In Phase II of the design process, we added various customization options to the application to give therapists and educational professionals the ability to create and customize picture-based AR experiences according to the needs of the autistic population they work with. We also added the freeze feature to make it easier to view a completed AR experience. In this section, we provide details about these features and the UI of the application.

4.3.1 Customization Options: Users can avail themselves of the following customization options:
• Custom Target Images - Users can select custom images to serve as targets for an AR experience by either taking an image in real time, or uploading an image from their device gallery. This gives each user the flexibility to create
AR experiences around the specific visual supports they use during learning/therapy exercises.
• Audio Prompts - Users can record an audio prompt for an AR experience. The audio prompt plays automatically when the target image is recognized by the application. Since users will decide on the contents of the audio recording themselves, they may choose to record short hints or longer, more detailed prompts, depending on the activity they have in mind.
• Videos - Users can associate videos with an AR experience by either recording them in real time, or uploading videos from their device gallery. We posit that this may be an important customization option, as video modelling is widely used as an instructional method when working with autistic children [11, 23, 51].
• 3D Models & Animations - Users can add one or more 3D models to their AR experience from an in-built library of 3D models that contains 40 models spanning the categories of food, hygiene, toys, etc. Each model has a default audio prompt (the model's name) and at least one default animation associated with it. Users can record their own audio prompts for a model, and choose between animations if more than one is available for a particular model. Although we initially added 3D models because they were essential for the symbolic development use-case described in section 4.2.1, we were interested to know how users would utilize them in other use-cases or learning activities.

To simplify the process of creating and viewing an AR experience, we decoupled the creation and viewing windows. The customization options described above can be accessed through the creation window. The two windows are described below.

4.3.2 Creation Window (Authoring Tool): The creation window (Figure 5a) allows users to create their own picture-based AR experiences from scratch. The user must first upload a target image, which appears in the middle of the workspace (Figure 5c). Users can use the slide-in menu to add either audio, video, 3D models, or a combination of the three, to any AR experience (Figure 5b). The creation window shows what the target image and virtual content would look like if users were viewing them head-on, instead of from above. This is similar to how the AR experience will appear in the AR-view and helps users adjust the position of the virtual content in relation to the target image.

Figure 6: (a) Users can upload existing videos or take them in real time; tapping on the video object (shown in the center of the screen) plays or pauses the video. (b) Users can select one of the two animations (stacking or lining up the Lego blocks), using the animation window shown on the right. (c) Users can re-position, re-size and rotate 3D models and videos, using the window on the right.

Miscellaneous options: Users can change the animation associated with a 3D model (Figure 6b), and can re-position, resize, and rotate video objects or 3D models within their AR experience (Figure 6c). They can also clear the work-space, load/edit previously created AR experiences, and shift the camera to look at the work-space from above. Moreover, they can temporarily hide content that is present within their AR experience to facilitate adding and customizing more content. Users can also access video-based tutorials that explain how to create AR experiences and how to use all the miscellaneous features/options just described.

4.3.3 AR-View: The AR-View is where users can view their AR experiences by bringing their target images in front of their device cameras (Figure 7). This window contains features that facilitate the use of AR within therapeutic/learning contexts.

Freeze Feature: The freeze button allows users to freeze the video background on the screen (Figure 7a), while keeping the AR content interactive. This allows them to retain the context of the AR experience (the video background and the target image), and move the device away from the target image without the AR experience disappearing. This could be useful in dynamic therapy/learning environments where
a child is unable to stay near the tabletop. The therapist could activate the AR experience, freeze it, and then bring the iPad/device closer to the child for better viewing. Prior work has also shown the effectiveness of such a feature in mobile-AR contexts [42].

Settings Menu & On-screen Buttons: The settings menu allows users to change the speed of animations and the light intensity, hide target images, and switch between 'sound mode' and 'animation mode' for the 3D models (Figure 7c). The latter determines what happens when a user taps a 3D model in an AR experience - either the associated audio or the chosen animation plays out. In situations where a user has added more than one type of content (audio, video, 3D models) to an AR experience, they can toggle through the content using the 'Next' and 'Prev' prompt buttons present at the bottom of the main screen (Figure 7a). By default, the 3D model appears first, then the video, and then the audio. Users can reverse this order using the settings menu.

Figure 7(a): The 3D Lego block is superimposed over the target image. The Freeze, Next, and Prev prompt buttons are present at the bottom of the screen.

This version of the application, at the end of phase II, was used when conducting the formal user studies with therapists and educational professionals (section 5).
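To make the Next/Prev toggling order concrete, the snippet below sketches the content-cycling behaviour described in the Settings Menu paragraph above, in Python for illustration (the application itself is written in C#; the class and method names here are our own and are not taken from its code):

```python
class ContentCycler:
    """Cycles through the content types attached to an AR experience.

    By default the 3D model is shown first, then the video, then the
    audio; the order can be reversed from the settings menu.
    """

    DEFAULT_ORDER = ["3d_model", "video", "audio"]

    def __init__(self, attached_content, reversed_order=False):
        # Keep only the content types the user actually added,
        # in the configured display order.
        order = list(reversed(self.DEFAULT_ORDER)) if reversed_order else self.DEFAULT_ORDER
        self.items = [c for c in order if c in attached_content]
        self.index = 0  # the first item is shown initially

    @property
    def current(self):
        return self.items[self.index]

    def next(self):
        # 'Next' button: wrap around past the last item.
        self.index = (self.index + 1) % len(self.items)
        return self.current

    def prev(self):
        # 'Prev' button: wrap around before the first item.
        self.index = (self.index - 1) % len(self.items)
        return self.current


cycler = ContentCycler({"audio", "3d_model", "video"})
print(cycler.current)  # 3d_model
print(cycler.next())   # video
print(cycler.next())   # audio
print(cycler.prev())   # video
```

The wrap-around arithmetic means both buttons always land on an attached content type, even when the user added only one or two of the three.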
Figure 7(b): A video object superimposed over the target image.

4.4 Implementation Details
The application was created using Unity [55] and C# scripting. We used the ARCore [8] and ARKit [36] XR plugins to provide the necessary AR functionality for the Android and iOS versions of the application respectively. We used the interface provided by AR Foundation [6], within Unity, to communicate with these two plugins. This allowed us to use the same code-base to build both an Android and an iOS version of the application.
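In a Unity project, a cross-platform plugin setup like the one described above corresponds to declaring the AR Foundation, ARCore, and ARKit packages in the project's Packages/manifest.json. A minimal sketch is shown below; the version numbers are illustrative assumptions, not the ones the authors used:

```json
{
  "dependencies": {
    "com.unity.xr.arfoundation": "4.2.7",
    "com.unity.xr.arcore": "4.2.7",
    "com.unity.xr.arkit": "4.2.7"
  }
}
```

With this setup, the same AR Foundation calls compile for both build targets, and the platform-specific plugin (ARCore or ARKit) is selected at build time.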
Figure 7(c): The different settings options.

5 USER STUDY
To answer our research questions (section 3), we conducted a user study that was approved by our university's Institutional Review Board. Recruitment was done over the course of five months, from June 2021 to October 2021. The details are as follows:

5.1 Participants
We recruited participants using word-of-mouth and snowball sampling. All our participants were therapists or educational professionals who had prior experience working with autistic individuals. We recruited a total of 10 participants; 7 completed the study, 1 withdrew because she was not comfortable with downloading the application on her personal device, and 2 did not complete the study.

For simplicity, we refer to participants using ID numbers P1 to P7. P1 and P2 belonged to the same clinic as the SLPs that helped design the first version of the application (Phase I). However, these participants were not present during the design discussions in phase I and did not have access to the application prior to their enrollment in the study. Since the first version of the application was created for a specific use-case for the clinic, we wanted to see how the latest, more customizable version could be useful for the same use-case. Recruiting therapists who had negligible involvement in the design and creation of the application, but were familiar with the use-case, allowed us to get an unbiased opinion.

Please note, no autistic children participated in this study. Our goal was to get feedback from therapists and educational professionals first and refine the application, before testing it with autistic children in future studies.

5.2 Study Method
Each participant was enrolled in a diary study spanning 2 weeks. Participants first took part in a one-on-one information session via Zoom [9], where they were introduced to the application and downloaded it on their preferred devices. They were instructed to
use the application over the next 2 weeks to create their own AR experiences, and to provide feedback about its strengths/weaknesses and what learning activities or contexts of use they had in mind. We built feedback options into the application to encourage participants to provide feedback. The types of feedback they provided are discussed in section 5.4.

At the end of their 2 weeks, participants took part in a one-on-one semi-structured interview with a member of the research team. This helped us get an in-depth understanding of how the participants used the application and in what contexts, what challenges they faced, and what improvements they would like to see. At the end of the interview, participants were asked to permanently delete the application from their devices.

5.3 Procedures
5.3.1 Information Session: In the information session, a member of the research team first explained what picture-based augmented reality was and then played demo/screen-capture videos of the application to explain its layout and features. Interested participants were taken through an informed consent process in the presence of a witness. The participants then downloaded the application on their preferred devices. All our participants used iOS devices. For iOS users, we distributed the application through Apple's TestFlight [7].

5.3.2 Diary Study and Follow-up Interview: During the 2-week diary study, we sent emails twice a week to remind and encourage our participants to continue testing the application and provide feedback. At the end of the 2 weeks, we conducted a one-on-one interview with them via Zoom. These interviews typically lasted an hour and were either audio or audio/video recorded based on the participants' preferences (indicated in the consent forms).

5.4 Data Collection & Analysis
5.4.1 Types of Data Collected: We collected the following qualitative data:
• Written feedback - Participants provided written feedback by filling out a short Qualtrics survey [4] that opened up in a browser when they clicked the 'Survey' button in the application. This short survey asked participants to indicate which features they used, how they used them, and any other comments they had.
• Voice notes - Participants provided verbal feedback through the application by recording voice notes and pressing the 'Send Voice Note' button to upload them to an AWS Simple Storage Service (S3) bucket [10]. Similar to the survey, participants had to elaborate on which features they used, and in what contexts.
• Interview recordings - The audio/video recordings of interviews were manually transcribed. In the interviews, we asked our participants to describe their overall experience with the application, what AR experiences they created and for which learning exercises, what features may be more useful in autistic contexts, etc.

5.4.2 Analysis Methods: We performed a thematic analysis on the written feedback, and the transcripts of the voice notes and interview recordings. Braun & Clarke's [18] approach to thematic analysis was used. A member of the research team first transcribed the data, generated initial codes, gathered data/quotes pertaining to each code, collated them into themes, discussed the themes with other members of the research team, and then reviewed and refined them. A second coder (from outside the initial study team) validated the themes and the data belonging to each particular theme.

6 RESULTS
We asked our participants to explore the application while keeping in mind the autistic population they work with, and the learning exercises they conduct. As such, some participants only explored features relevant to the context they were developing their AR experiences for, while others explored as many features as possible. In this section, we first highlight the participants' general experience/progress with the application and provide details about the learning activities they had in mind. We then analyze their feedback and interview responses to answer our research questions.

6.1 Participants' Progress With the Application
All our participants encountered a learning curve when testing the application, but some made more progress than others. These participants either got better after repeatedly using the application, were able to intuitively figure out what to do, or took advantage of the in-app tutorials.

6.1.1 Getting Better After Repeated Use: Some participants mentioned getting better at using the application over time. For example, P1 said that she was able to 'figure things out' after she 'played around with it' and that with repeated use, she would be 'more comfortable' with the application. Similarly, P2 got faster at using the application over time. She stated, 'I definitely got faster at it. And so once I was able to do it like a couple times, it was pretty seamless.'

6.1.2 Using One's Intuition: P5 mentioned that the initial tutorial in the one-on-one information session was enough for her to use the application 'independently', and she 'didn't really have to go back to the tutorials that are built in'. She thought the buttons 'were clear', 'made sense', and if she was not sure what to do then, as an 'intuitive user of apps like this', she was able to 'figure it out'.

6.1.3 Leveraging the In-app Video Tutorials: P1 mentioned that the in-app tutorials were 'definitely helpful' and P7 said that they were 'great reminders'. P6 had trouble remembering what the miscellaneous buttons on the screen did (e.g., the 'load' button) and sought help from the in-app tutorials. She stated that she was not 'good at remembering steps' so she watched the tutorials multiple times, which were 'very clear and very easy to understand'.

Participants who made less progress either failed to notice the in-app video tutorials, or were unclear on how to trigger the AR experiences they had created.

6.1.4 Lack of Prior Experience With AR & Failure to Notice In-app Tutorials: P4 enjoyed 'exploring' the application, but was 'new to AR' and so 'had some difficulty' with creating full AR experiences and viewing them. She did not notice the 'tutorial' button in the application and was therefore unable to get help.
ASSETS ’22, October 23–26, 2022, Athens, Greece Tooba, et al.

6.1.5 Confusion About How to Trigger an AR Experience: P3 misunderstood how to trigger an AR experience and instead of using the target images to trigger the experience, she tried to use the actual objects she took images of. For example, instead of holding her camera to the picture of her driveway, she held it up to the actual driveway and had a hard time making sure the camera was in the ‘right place’ to trigger the experience. This negatively impacted her progress with the application. Similarly, P6’s AR experiences did not appear on the screen because her target images were so big that the entire image was not in view of the device camera. We identified and corrected this issue during the interview, and she was able to trigger her AR experiences right then. That said, both participants provided feedback based on the features they successfully explored, and brainstormed improvements. Despite encountering issues, P6 ‘really liked’ using the application and, at the end of the study, said that it was ‘too bad’ she could not have it again.

In section 7.3 we discuss some improvements we can make to help reduce the application’s learning curve.

6.2 Details of Learning Activities
Our participants envisioned creating AR experiences for a range of learning activities. Table 1 provides background information about each participant, the features they explored, and what learning contexts they had in mind.

P1 and P2 considered the same symbolic development use-case that was described in section 4.2.1 – they wanted to help children make intentional choices using pictures of objects (toys, food etc.) by superimposing them with 3D models of the objects to strengthen the link between 2D images and their 3D referents. P3 wanted to use the application to give children situational cues – such as how to behave when entering a classroom. P3 and P4 both wanted to teach daily living skills, such as ‘hair-brushing’ or ‘tooth-brushing’, by using the application to superimpose pictures representing the steps of an activity with videos of the activity being performed. Children would trigger these videos when they forgot the steps in a task. P5 was confident that her high-school students could independently use the application during group-work to create an AR battle game using picture cards. Students would collaborate to make the rules of the game, would decide which pictures to use, and would associate actions with each picture (using audio or 3D models). Players could trigger these actions by holding their iPads over each picture. This spurred the discussion on whether any changes need to be made to the UI so that autistic children could create AR experiences on their own (section 7.4). P6’s school used worksheets (called ‘News2You’) that paired action words, such as kicking, with pictures depicting the action. She wanted to superimpose these static pictures with videos showing what the actions looked like. Lastly, P7 envisioned using the application to give ‘reminders or second prompts or visual cues’ to children during group activities or if a paraprofessional was not nearby to repeat instructions. He also wanted to reinforce the visual schedules in his classroom with audio prompts. Children could then ‘independently go up and check the schedule’ by holding their iPads over the images in the schedule.

6.3 Participants’ Experiences With The Various Customization Options
As mentioned in 6.2, our participants envisioned using the application for a plethora of learning activities, for autistic children ranging from younger kids with severe autism to those who were independently navigating high-school. In this section, we discuss our participants’ experiences with the application’s customization options and answer RQ 1.

6.3.1 Recording Custom Audio: Participants could record audio prompts that would either play automatically when a target image was recognized, or would play when the associated 3D model was tapped. P1 added the sound of a train honking from a YouTube clip to the 3D model of ‘Thomas the train’ for her choice-making activity and said that she could see ‘a kid really enjoying it because it’s so realistic’. P5 highlighted that children ‘may be stimulated by having their voice recorded’ and wanted her students to record audio prompts for actions associated with each card in their battle game. For example, a card could say, ‘You lost five points’. P6 and P7 used videos for their primary use-cases (teaching language and creating reminders), but used audio for secondary activities. They imagined using audio to pair instructions for a math lesson, or sounds of musical instruments with worksheets and pictures respectively. Children could hold up their devices to the worksheets/pictures to listen to the instructions or the musical notes.

6.3.2 Adding A Custom Video: Participants liked the option to add videos into the AR experience because videos either grasp a child’s ‘attention’ better than ‘static’ content (P1) or because autistic individuals ‘imitate video models more’ than if a person performed the same action in front of them (P2). The latter especially could help with teaching daily living skills. P2 added a video of someone brushing their teeth, over an image of a toothbrush, to teach this daily living task, and P6 imagined using videos to demonstrate the meaning of action words and emotions. For example, when conducting a lesson on frustration, she could superimpose an image of a frustrated kid with a video showing what ‘frustration looks like’. P7’s goal was to foster independence in the children he worked with – if stuck, they could hold their iPads up to their worksheets to trigger a video of P7 giving them instructions, such as ‘please write three complete sentences’. While an audio prompt may have achieved the same purpose, it seemed the participants valued the visual nature of videos. Participants who had trouble using this feature (P4) or who found it irrelevant to their use-case (P5) also thought that it had ‘potential’.

6.3.3 Adding 3D Models: Some participants used 3D models as an integral part of their AR experience (e.g., P1 and P2 used the train, bubble wand, Lego block etc. for choice making activities), while others explored them out of curiosity (P3 to P7). P5 and P6 liked the ability to add multiple 3D models to an AR experience – P5 could then use different 3D models at the same time to demonstrate a theme, for example adding a coffee cup and toast together to represent ‘breakfast’, and P6 could add multiple 3D models to an AR experience for a math activity (e.g., 2 + 5). That said, participants agreed that having ‘more options’ (P1) or a ‘broader range’ (P2) of 3D models would be helpful. P2 worked with kids ‘in play contexts’ and would have liked to see 3D models of other toys and P5 said
Designing a Customizable Picture-Based AR Application For Therapists Working in Autistic Contexts ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 1: Participants’ Backgrounds

ID | Participant Occupation | Population in Mind | Learning Activity | Features Tested | Intended Users of the Application
P1 | Speech Language Pathologist | Younger kids with moderate-severe autism who are minimally verbal | Choice making & symbolic development | Audio, Video, 3D models (a few) & animations, Freeze | The therapist holds the device and shows the screen to the child; the child may interact with the virtual content themselves or with the therapist’s help
P2 | Speech Language Pathologist | Younger kids with moderate-severe autism who are minimally verbal | Choice making & symbolic development, teaching daily living skills | Audio, Video, 3D models (a few) & animations, Freeze | Same as above
P3 | Speech Language Pathologist | 7th/8th graders with Autism who have articulation or significant communication difficulties | Teaching daily living skills | Video, 3D models (a few) | The therapist creates the AR experiences beforehand, and the kids use the AR-View to view and interact with the virtual content independently
P4 | Special Education Teacher | High-school kids with moderate-severe Autism | Teaching daily living skills | 3D models | —
P5 | Speech Language Pathologist | High-school kids who are independently navigating a public school setting | Group work / collaborative learning, creating AR games together | Audio, Video, 3D models (a few) & animations, Freeze | The therapists or the kids independently create their own AR experiences and use the AR-View to show others what they have created
P6 | Occupational Therapist | 14 y/o with Down’s syndrome & a child who suffered stroke in utero – kids who have difficulty speaking, low cognition and are emerging readers | Worksheets with action sentences, connecting words and language, teaching emotions, teaching math concepts | Audio, Video, 3D models (a few) & animations | The therapist creates the AR experiences, and the kids view them independently
P7 | Special Education Teacher | Autistic children in 4th grade | Giving children reminders or directions about individual or group-related class activities, visual schedules and pairing musical notes with images of musical instruments | Audio, Video and 3D models (a few) | The therapist creates the AR experiences, and the kids view them independently

that categories related to daily living, such as ‘transportation’ or ‘morning routine’, and for her use case, categories like ‘battle ships’ would be helpful. P7 typically puts clipart on his slides to show children what they need for an activity and could have done the same with AR if 3D models related to school supplies had been available in the in-app library, such as rulers, pencils, and crayons. Therefore, providing users with the ability to add 3D models themselves or search from a larger database of 3D models would be helpful. We discuss this more in section 6.5.1.

6.3.4 Using Animations: Some 3D models had simple animations, such as an apple rotating, while others had functional animations, such as a pair of hands getting soap from a soap dispenser. Participants found functional animations – animations that show the function of a particular 3D model – to be more useful. For example, if the toy train had an animation of the train gliding over, P1 could use it to teach concepts like ‘pushing the train’. Similarly, P2 stated that the animations would be great for teaching ‘choice-making’, as they depict the function of the object. Although the 3D models of day-to-day objects, such as the toothpaste, had functional animations, P2 thought that having a sequence of animations that depict the item’s use (such as all the steps involved in putting toothpaste onto a toothbrush) could be helpful in teaching daily living skills. Thus, while simple animations attract children’s attention, functional animations are more useful from a teaching point of view. While animations may not be useful for every child – P3 stated that they would not be necessary for the 7th/8th graders she works with – they can be ‘really important to have’ when teaching
children with severe autism, or to make the application useful for ‘a wide variety of clients’ (P3).

6.3.5 Are The Given Customization Options Sufficient? Recall that RQ1 considered whether the application’s customization options were sufficient to create AR experiences for a range of autistic children and learning exercises. While participants suggested improvements for existing customization features, they did not find anything ‘missing’, and although participants were considering different learning activities, they were each able to find some customization options that were useful for their context. P6, for example, stated the following: ‘I could definitely use the audio. I could definitely use video. I could definitely use an image. I can use all your features to teach a lesson on an emotion’. There is, however, a question: how much customization is enough? P5 thought the application provided a good balance of ‘flexibility’, without the ‘possibilities being endless’. P5 envisioned a scenario where her students would create AR experiences themselves, instead of the teacher/therapist. To that end she stated that some previous tools she had used with her students (e.g., 3D printing software) had ‘too many options’, and students got ‘bogged down in the details’, were ‘too frustrated’, or were ‘working too independently without consulting each other’, thus defeating the purpose of group exercises. Our application was simple enough that she envisioned her students ‘running with it’ and even ‘coming up with something more creative’ than her game idea, but without getting too bogged down. That said, a user may come up with a unique or niche use-case which requires more customization than what is currently available in the application. Future studies will need to investigate this. In the next subsection, we attempt to answer RQ 2.

6.4 Can the Freeze Feature Facilitate the Use of AR in an Autistic Context?
Our second goal was to understand whether the freeze feature could facilitate the use of AR with autistic individuals. Participants liked the freeze feature because it removed the need to continuously hold the iPad/device over a target image to keep an AR experience interactive. P5 saw its utility in group settings – she could ‘capture’ the AR experience on the screen and then show it to children who were sitting further away from her in a group. She also thought that taking a screenshot of the frozen AR experience and putting it in a visual schedule would help prepare children for future activities involving the application. Moreover, it could be helpful in scenarios where children were unable to focus or stay still. P1 mentioned that ‘attention is sometimes very impacted’ for children on the spectrum, so they may ‘wander the room’ or ‘walk away’ when she tries to show them something on the iPad. The freeze feature would be ‘useful’ in these situations as she could trigger the AR experience, leave the target image on the desk, and go where the child is. P3 echoed this sentiment and said that having the AR experience not ‘disappear once you move to a different spot’ would be helpful.

Therefore, it seems that the freeze feature may be especially important to have when using AR in an autistic context. In contrast to the freeze feature, most participants skipped over the settings options in the AR-View or only tried a few options. When asked why they did so, some participants said that they either did not feel the need to edit the default settings, or they were not needed for the activity being considered. In the next subsection, we talk about some of the practical challenges that users may face when using picture-based AR in day-to-day settings (RQ 3).

6.5 Practical Challenges of Using AR in Therapy / Learning Settings
When thinking of using AR in day-to-day settings, our participants were concerned about being able to access a wide range of 3D models, being able to create AR experiences on-the-fly or ‘just-in-time’, and generalizing AR experiences. We discuss these concerns below.

6.5.1 Access to a Wider Range of 3D Models: As mentioned in section 6.3.3, most of our participants thought that having a larger library of 3D models at their disposal would help them create AR experiences for a wider audience. P3 thought that being able to add one’s own 3D models would be helpful. For example, participants could search for and download 3D models from the internet. However, these models may not have sounds or animations built-in; we had to build the animation and sound associated with each model in the application’s library. Moreover, other participants may feel the same as P2, who stated that this study was her ‘first time really exploring AR’, and she was unaware that one could search for and download 3D models from the internet. Therefore, building a library of models into the application, or hosting one online that users can access from within the application, may be preferable. If users are unable to find a relevant 3D model, they could potentially make requests for 3D models and associated animations through the application, which would then be fulfilled by members of the research team.

6.5.2 Creating AR Experiences On-the-fly or Just-in-time: Preparing resources or creating exercise material according to each child’s interests can be a difficult and time-consuming process. Being able to ‘customize an AR experience on-the-fly’ during a therapy session would make a ‘huge difference’ as it would reduce the ‘prep time’ participants have to put in beforehand (P1). P3 echoed this sentiment, stating that one must ‘make things quickly’ while working with some children, and that it currently requires a ‘lot of planning’ to create relevant AR experiences. Therefore, some degree of intelligence / automation may be helpful. For example, P2 thought that it would be helpful if the application could detect the contents of a target image and quickly provide options for appropriate 3D models and audio prompts that label the objects. This would allow therapists and educational professionals to quickly create AR experiences during a therapy session, based on what the child is interested in at that moment. In section 9.1 we present a prototype for ‘just-in-time’ content creation as a proof of concept.

6.5.3 Generalizing AR Experiences: P3 pointed out that ‘generalization is a big area of focus when working with people on the spectrum’. It involves learning a skill or concept and using it in any environment, not just the one in which it was taught. P3 thought it would be great if an AR experience could be generalized. For example, in the context of hand-washing, the application would recognize that a sink was present and would pull up the AR content (audio, hand-washing videos etc.) that she had previously associated with another sink. While she was focused more on recognizing
3D objects and using those as triggers for AR experiences instead of pictures, her idea could be extrapolated to work for pictures as well. Moreover, she thought that if a clinician could create an AR experience and share it so that ‘parents can use it at home’, it would help generalize the skills taught in school to a home setting. We discuss this further in section 7.

7 DISCUSSION
While users found the application’s features and customization options to be useful, there is room for improvement. In this section, we use insights from our study to propose design implications for future applications (AR or otherwise) that target therapists and autistic contexts. We also discuss ways of reducing the application’s learning curve and position autistic children as content creators.

7.1 Sharing AR Experiences & Just-in-Time Creation - Implications for Content Creation Apps
Sharing AR experiences was initially brought up in the context of generalization (6.5.3), but could also facilitate just-in-time content creation and collaboration/group-work. For example, therapists/teachers could share their AR experiences with parents, or vice versa, by uploading them to an online portal. This would allow children to be exposed to the same prompts/AR experiences in different environments, which could help with generalizing ideas/skills. Additionally, therapists could download and edit pre-made AR experiences or templates to speed up the creation process and reduce their own time and effort. Moreover, in class/group activities, teachers could upload a half-finished AR experience for their students to download and edit. Students could also share AR experiences with each other to collaborate or show what they have created so far. Therefore, applications (AR or otherwise) that involve content creation and target therapists/teachers and learning settings should provide support for content-sharing and collaboration.

7.2 How Much Customization is Enough - Implications for Autism-Focused Customizable Applications
Recall that in section 6.3.5, we discussed whether or not our customization options were sufficient. Improvements to 3D models notwithstanding, perhaps one reason our approach to customization was successful was because we provided sufficient flexibility without the application becoming too overwhelming. For example, users only had the option of adding audio, video or 3D models, but could decide themselves which of the three they wanted to add to an AR experience and what the content of the audio and/or videos should be. This allowed them to create AR experiences for a variety of learning contexts. Moreover, for future AR applications that position autistic children as content creators, it might be helpful to provide a limited set of options but provide flexibility within those options, so that children can be creative without getting too ‘bogged down’ in the details. Of course some learning contexts may require support for a lot more customization, and by no means do we claim that our approach fits every context. However, as a starting point, future applications might consider providing flexibility within a limited set of customization options, or gradually introducing customization options to users. The latter is discussed in sections 7.3 and 7.4.

7.3 Reducing the Application’s Learning Curve - Tutorials & Gradually Exposing Functionality
Recall that some participants made more progress in using the application’s features than others. Although we added in-app video tutorials, interactive tutorials, where the application helps the user create and trigger their first AR experience and points out useful features, may further reduce the learning curve. P7 also advocated for gradually exposing the application’s functionality to users. He stated that a lite version of the application, where users can only add audio to a target image, may be helpful for users who are not very technologically savvy. He stated that ‘a broader range of people are more comfortable’ with taking a photo or recording something, so limiting the customization options can help ease them into the creation process. Users can later upgrade to a version with videos and 3D models.

7.4 Autistic Children as Content Creators - Will Our Current UI Suffice?
We initially envisioned therapists and educational professionals as the main creators of AR experiences, while the autistic children they worked with would simply view the AR content. P5, however, stated that her high-school students could easily navigate the features of the application to create their own AR experiences. This begs the question: is the application’s UI sufficient to allow autistic children to create their own AR experiences, or are modifications necessary? For example, some children may wish to hide some buttons on the screen to reduce visual input. Other children may have difficulty remembering the steps involved in creating an AR experience and may benefit from an interface that sequentially takes them through each step of the creation process. P7’s idea of gradually exposing functionality to users, as mentioned in section 7.3, may be useful here as well; children can take their time and become comfortable with using one customization option before moving onto another. To make the application accessible to a wide range of content creators, we need to conduct user studies with autistic individuals, particularly those who can independently navigate similar applications and are interested in AR. We leave this for future work.

8 KEY TAKEAWAYS
The following bullet points summarize this work’s key takeaways:

• The customization options we provided (audio, video and 3D models with animation) allowed therapists/educational professionals to create AR experiences for a variety of learning exercises. Participants appreciated our approach of providing flexibility within a limited set of customization options as it prevented users from getting too bogged down. Future researchers/designers creating applications in an AR and
autistic domain may benefit from using this approach to customization (as a starting point).
• The freeze feature can be helpful when working with autistic students with limited attention, or when working in group settings. Future researchers/designers creating AR applications should consider providing a way to freeze AR experiences on the screen to make it easier to use with autistic children.
• Therapists and educational professionals may need to create content just-in-time during therapy sessions or may need to generalize the same content to different situations. Future researchers/designers who are developing content-creation applications for autistic contexts (AR or otherwise) should provide avenues for creating content quickly, or sharing content/AR experiences with others.

9 LIMITATIONS & FUTURE WORK
Our study had some limitations. Firstly, participants had a limited selection of 3D models. Future prototypes should allow users to upload their own 3D models, or provide a larger library to choose from. Secondly, we cannot make any claims about whether the application’s current UI will be sufficient for autistic children to create their own AR experiences. We posit that a lite version, or a version that either gradually exposes functionality or systematically takes the user through each step of the creation process, might be helpful. That said, we have yet to conduct user studies with autistic children to confirm this hypothesis. The next subsection discusses a proof-of-concept prototype that could be used in future studies to answer some of these questions.

9.1 Just-In-Time Creation and Sequential Interface
Based on the discussion in 6.5.2 and 7.4 about just-in-time content creation and the benefits of a sequential interface, we made some modifications to the application’s UI. The third version of the application takes users through each step of the creation process sequentially. Users can move between steps using the buttons at the bottom of the screen (figure 8).

The application uses the AWS Rekognition [48] service to identify labels for the objects, scenes or actions present in a target image (figure 8a). Based on the label that a user selects, the application provides recommendations for 3D models (figure 8b). Users can still explore all the categories manually, but we posit that providing them with some recommendations might speed up the creation process. Moreover, we use the AWS Simple Storage Service (S3) [10] to host a larger library of 3D models. Users can download these 3D models through their application. This allows us to provide a larger library of 3D models without encroaching on the user’s device storage.

Figure 8: (a) Sequential Interface. A user has uploaded a target image of fruits. The application used the AWS Rekognition service to provide the following labels for the contents of the image: ‘Plant’, ‘Fruit’, ‘Food’ etc. The labels appear in the window at the bottom-right and the user can select the most appropriate label. (b) 3D Model Recommendations. The application has recommended several 3D models to the user based on the label they chose in the previous step (‘Fruits’). The recommendations are shown in the window on the left. The user can explore all the other categories of 3D models using the buttons at the top-right of the screen.

This third prototype is a small proof-of-concept. We hope to use it in future design meetings/studies with therapists, educational professionals, and autistic children to understand whether they prefer UIs where the customization options are exposed gradually/sequentially or all at once, and whether the recommendations given by the application are helpful when creating AR content. Other features, such as sharing AR experiences through an online portal, are more complex and require further thought.

10 CONCLUSIONS
This paper describes the design and evaluation of CustomAR, a mobile application that allows therapists to create picture-based AR experiences from scratch for use in autism-based learning exercises. A diary and interview study revealed that therapists found the application’s customization options to be sufficient for creating AR experiences for a variety of learning exercises, ranging from choice-making activities to reminders, visual schedules and groupwork. We posit that our approach was successful because we provided a fair amount of flexibility, without inundating users with customization options. Participants also thought that the freeze feature would be useful when working with children with limited attention. Moreover, our study highlighted some challenges to using picture-based AR in practical therapy settings, such as finding
3D models that capture the children’s interests, generalizing concepts, and creating AR experiences quickly during therapy sessions. Moving forward, application developers looking to create customizable applications for autistic contexts should allow users to share their AR experiences or create experiences quickly – the latter may require incorporating some degree of automation or intelligence into the application.

ACKNOWLEDGMENTS
We thank Hedi Skali, a high-school intern who worked with us during the summer of 2019. We also thank the reviewers for their efforts in reviewing this work and providing constructive feedback.

REFERENCES
[1] 2017. Daily living skills: Strategies to help sequence & achieve personal hygiene tasks. https://www.toolstogrowot.com/blog/2017/08/09/daily-living-skills-strategies-to-help-sequence-achieve-personal-hygiene-tasks
[2] 2021. AugmentedClass! augmented reality for education - apps on Google Play. https://play.google.com/store/apps/details?id=com.AugmentedClass.AClass
[3] 2021. The benefits of visual supports for children with autism. https://www.autismparentingmagazine.com/benefits-of-autism-visual-supports/
[4] 2021. Qualtrics XM - experience management software. https://www.qualtrics.com/
[5] 2021. What is Visual Scheduling? https://www.appliedbehavioranalysisprograms.com/faq/what-is-visual-scheduling/#:~:text=Visualschedulingisasystematic,andachievesuccessinlife.
[6] 2022. About AR Foundation: AR Foundation 4.1.7. https://docs.unity3d.com/Packages/com.unity.xr.arfoundation@4.1/manual/index.html
[7] 2022. Apple. https://testflight.apple.com/
[8] 2022. Build new augmented reality experiences that seamlessly blend the digital and physical worlds. https://developers.google.com/ar
[9] 2022. Video conferencing, Cloud Phone, Webinars, Chat, Virtual Events: Zoom. https://zoom.us/
[10] 2022. What is Amazon S3? - Amazon Simple Storage Service. https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html
[11] Mohammed Alzyoudi, AbedAlziz Sartawi, and Osha Almuhiri. 2015. The impact of video modelling on improving social skills in children with autism. British

chain task for elementary students with autism. Journal of Special Education Technology 31, 2, 99–108.
[23] Blythe A Corbett and Maryam Abdullah. 2005. Video modeling: Why does it work for children with autism? Journal of Early and Intensive Behavior Intervention 2, 1, 2.
[24] Camilla Almeida da Silva, António Ramires Fernandes, and Ana Paula Grohmann. 2014. STAR: speech therapy with augmented reality for children with autism spectrum disorders. In International Conference on Enterprise Information Systems. Springer, 379–396.
[25] Sarah Dettmer, Richard L Simpson, Brenda Smith Myles, and Jennifer B Ganz. 2000. The use of visual supports to facilitate transitions of students with autism. Focus on Autism and Other Developmental Disabilities 15, 3, 163–169.
[26] Mihaela Dragomir, Andrew Manches, Sue Fletcher-Watson, and Helen Pain. 2018. Facilitating pretend play in autistic children: results from an augmented reality app evaluation. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 407–409.
[27] Yao Du, LouAnne Boyd, and Seray Ibrahim. 2018. From Behavioral and Communication Intervention to Interaction Design: User Perspectives from Clinicians. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 198–202.
[28] Carl J Dunst, Carol M Trivette, and Tracy Masiello. 2011. Exploratory investigation of the effects of interest-based learning on the development of young children with autism. Autism 15, 3, 295–305.
[29] Lizbeth Escobedo, David H Nguyen, LouAnne Boyd, Sen Hirano, Alejandro Rangel, Daniel Garcia-Rosas, Monica Tentori, and Gillian Hayes. 2012. MOSOCO: a mobile assistive tool to support children with autism practicing social skills in real-life situations. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2589–2598.
[30] Lizbeth Escobedo, Monica Tentori, Eduardo Quintana, Jesus Favela, and Daniel Garcia-Rosas. 2014. Using augmented reality to help children with autism stay focused. IEEE Pervasive Computing 13, 1, 38–46.
[31] Lori Frost and Andy Bondy. 2002. The picture exchange communication system training manual. Pyramid Educational Products.
[32] Jennifer B Ganz, Theresa L Earles-Vollrath, Amy K Heath, Richard I Parker, Mandy J Rispoli, and Jaime B Duran. 2012. A meta-analysis of single case research studies on aided augmentative and alternative communication systems with individuals with autism spectrum disorders. Journal of Autism and Developmental Disorders 42, 1, 60–74.
[33] Barbara C Gartin and Nikki L Murdick. 2005. IDEA 2004: The IEP. Remedial and Special Education 26, 6, 327–331.
[34] Martin Guha. 2014. Diagnostic and statistical manual of mental disorders: DSM-5. Reference Reviews.
[35] Individual Education Plan IEP. 1978. Special educational needs.
Journal of Special Education 42, 1, 53–68. [36] Apple Inc. 2022. ARKit - augmented reality. https://developer.apple.com/
[12] Zhen Bai, Alan F Blackwell, and George Coulouris. 2013. Through the looking augmented-reality/arkit
glass: Pretend play for children with autism. In 2013 IEEE International Symposium [37] Chloe Jennifer Jordan and Catherine L Caldwell-Harris. 2012. Understanding
on Mixed and augmented reality (ISMAR). IEEE, 49–58. diferences in neurotypical and autism spectrum special interests through internet
[13] Zhen Bai, Alan F Blackwell, and George Coulouris. 2015. Exploring expressive forums. Intellectual and developmental disabilities 50, 5, 391–402.
augmented reality: The FingAR puppet system for social pretend play. In Proceed- [38] Kamran Khowaja, Dena Al-Thani, Bilikis Banire, Siti Salwah Salim, and Asadullah
ings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. Shah. 2019. Use of augmented reality for social communication skills in children
1035–1044. and adolescents with autism spectrum disorder (ASD): A systematic review. In
[14] Andy Bondy and Lori Frost. 2011. A picture’s worth: PECS and other visual 2019 IEEE 6th International Conference on Engineering Technologies and Applied
communication strategies in autism. Woodbine House. Sciences (ICETAS). IEEE, 1–7.
[15] Monique Botha, Jacqueline Hanlon, and Gemma Louise Williams. 2021. Does lan- [39] Kamran Khowaja, Bilikis Banire, Dena Al-Thani, Mohammed Tahri Sqalli,
guage matter? Identity-frst versus person-frst language use in autism research: Aboubakr Aqle, Asadullah Shah, and Siti Salwah Salim. 2020. Augmented reality
A response to Vivanti. Journal of Autism and Developmental Disorders, 1–9. for learning of children and adolescents with autism spectrum disorder (ASD): A
[16] Brian A Boyd, Maureen A Conroy, G Richmond Mancil, Taketo Nakao, and Peter J systematic review. IEEE Access 8, 78779–78807.
Alter. 2007. Efects of circumscribed interests on the social behaviors of children [40] I Kurniawan et al. 2018. The improvement of autism spectrum disorders on chil-
with autism spectrum disorders. Journal of autism and developmental disorders dren communication ability with PECS method Multimedia Augmented Reality-
37, 8, 1550–1561. Based. In Journal of Physics: Conference Series, Vol. 947. IOP Publishing, 012009.
[17] Jorge Brandão, Pedro Cunha, José Vasconcelos, Vítor Carvalho, and Filomena [41] Aaron Lanou, Lauren Hough, and Elizabeth Powell. 2012. Case studies on using
Soares. 2015. An augmented reality gamebook for children with autism spectrum strengths and interests to address the needs of students with autism spectrum
disorders. In The International Conference on E-Learning in the Workplace. 1–6. disorders. Intervention in School and Clinic 47, 3, 175–182.
[18] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. [42] Gun A Lee, Ungyeon Yang, Yongwan Kim, Dongsik Jo, Ki-Hong Kim, Jae Ha Kim,
Qualitative research in psychology 3, 2, 77–101. and Jin Sung Choi. 2009. Freeze-Set-Go interaction method for handheld mobile
[19] Chien-Hsu Chen, I-Jui Lee, and Ling-Yi Lin. 2016. Augmented reality-based video- augmented reality environments. In Proceedings of the 16th ACM Symposium on
modeling storybook of nonverbal facial cues for children with autism spectrum Virtual Reality Software and Technology. 143–146.
disorder to improve their perceptions and judgments of facial expressions and [43] I-Jui Lee, Chien-Hsu Chen, Chuan-Po Wang, and Chi-Hsuan Chung. 2018. Aug-
emotions. Computers in Human Behavior 55, 477–485. mented reality plus concept map technique to teach children with ASD to use
[20] Chi-Hsuan Chung and Chien-Hsu Chen. 2017. Augmented reality based social social cues when meeting and greeting. The Asia-Pacifc Education Researcher 27,
stories training system for promoting the social skills of children with autism. 3, 227–243.
In Advances in ergonomics modeling, usability & special populations. Springer, [44] Roberto E Lopez-Herrejon, Oishi Poddar, Gerardo Herrera, and Javier Sevilla.
495–505. 2020. Customization support in computer-based technologies for autism: A
[21] Seungwon Chung and Jung-Woo Son. 2020. Visual perception in autism spectrum systematic mapping study. International Journal of Human–Computer Interaction
disorder: A review of neuroimaging studies. Journal of the Korean Academy of 36, 13, 1273–1290.
Child and Adolescent Psychiatry 31, 3, 105. [45] Anabela Marto, Henrique A Almeida, and Alexandrino Gonçalves. 2019. Using
[22] David F Cihak, Eric J Moore, Rachel E Wright, Don D McMahon, Melinda M augmented reality in patients with autism: A systematic review. In ECCOMAS The-
Gibbons, and Cate Smith. 2016. Evaluating augmented reality to complete a matic Conference on Computational Vision and Medical Image Processing. Springer,
ASSETS ’22, October 23–26, 2022, Athens, Greece Tooba, et al.

State of the Art in AAC: A Systematic Review and Taxonomy
Humphrey Curtis, Timothy Neate and Carlota Vazquez Gonzalez
Department of Informatics, King’s College London, London, UK
humphrey.curtis@kcl.ac.uk, timothy.neate@kcl.ac.uk, carlota.vazquez_gonzalez@kcl.ac.uk

ABSTRACT
People with complex communication needs (CCNs) can use high-tech augmentative and alternative communication (AAC) devices and systems to compensate for communication difficulties. While many use AAC effectively, much research has highlighted challenges – for instance, high rates of abandonment and solutions which are not appropriate for their end-users. Presently, we lack a detailed survey of this field to comprehend these shortcomings and understand how the accessibility community might direct its efforts to design more effective AAC. In response to this, we conduct a systematic review and taxonomy of high-tech AAC devices and interventions, reporting results from 562 articles identified in the ACM DL and SCOPUS databases. We provide a taxonomical overview of the current state of AAC devices – e.g. their interaction modalities and characteristics. We describe the communities of focus explored, and the methodological approaches used. We contrast findings in the broader accessibility and HCI literature to delineate future avenues for exploration in light of the current taxonomy, offer a reassessment of the norms and incumbent research methodologies, and present a discourse on the communities of focus for AAC and interventions.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility; Accessibility technologies.

KEYWORDS
AAC, Alternative and Augmentative Communication, Systematic Review, Taxonomy, Accessibility.

ACM Reference Format:
Humphrey Curtis, Timothy Neate, and Carlota Vazquez Gonzalez. 2022. State of the Art in AAC: A Systematic Review and Taxonomy. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 22 pages. https://doi.org/10.1145/3517428.3544810

1 INTRODUCTION
Approximately 2.2 million people in the UK and 1.3% of the US population experience a form of communication impairment [18]. Critically, challenges in human social and communication skills subject individuals to many risks, such as negative social interactions [122], employment challenges [156], educational access [48], mental health disorders [76] and a myriad of other challenges [131]. Equally, communication and freedom of speech are protected under UN legislature [120] and revered as the very “essence of human life” [105]. Aided and unaided AAC strategies and systems serve to remediate communication difficulties experienced by individuals and communities with complex communication needs (CCNs) [126]. In particular, high-tech aided AAC devices encapsulate the most advanced electronic AAC technology, for example speech generating devices (SGDs) or voice output communication aids (VOCAs) [42]. Data on the prevalence of specifically high-tech AAC is limited; however, there will be an increase in the number of individuals requiring AAC interventions [9, 107, 135]. There has also been sustained academic research into high-tech AAC devices and interventions since the formation of the International Society for AAC (ISAAC) in 1983 and the AAC journal in 1985 [155, 183]. Since then, high-tech AAC has been developed to support wide age ranges [106] and serve many communities [135]. In many cases, high-tech AAC devices and interventions have contributed to positive and successful outcomes for individuals with CCNs, whilst advances in computer technology, machine learning (ML) and artificial intelligence (AI) hold much promise for future high-tech AAC [140, 185].

1.1 High-tech AAC Device Abandonment and Fellow Systematic Reviews
Numerous HCI researchers, including Bircanin et al. [20], Ibrahim et al. [81] and Norrie et al. [140], have noted that high-tech AAC devices too often experience a high rate of abandonment amongst their target community. The reasons for abandonment of high-tech AAC devices are far-ranging and vary depending on the community and intervention circumstances [140]. However, research has found that high-tech AAC devices can be frustrating to use, unreliable, slow [92, 140] and ineffective for certain common communication interactions, leading to breakdowns and misalignments [81, 92]. Other problems noted by research include that high-tech AAC devices carry a stigma [128], are too expensive [65], hard to program [33] and inconsiderate of cultural factors [100] – making high-tech AAC devices simultaneously challenging for their users, close caregivers, specialists and wider communities to adopt. At the same time, high-tech AAC has often been found to inadequately adapt to its users’ communication strengths and weaknesses, or to offer pathways for multimodal or embodied forms of communication [81]. In light of these criticisms, we believe it is important to reflect on the full body of high-tech AAC research with a systematic review (SR) of the literature and taxonomic overview of devices
fostering novel pathways for improving high-tech AAC devices, interventions and research.

There are several pre-existing SRs into AAC, which typically focus on research interventions for specific communities and groups. Beukelman et al. studied AAC interventions for adults with neurological conditions [17], Biggs et al. reviewed interventions for children with CCNs [19], both Holyfield et al. and Logan et al. examined AAC interventions amongst people with autism (ASD) [77, 111], van der Meer et al. analysed interventions for individuals with developmental disabilities (IDDs) [182] and Simacek et al. focused on AAC interventions amongst individuals with multiple disabilities [165]. Lastly, Moorcroft et al. assessed research on barriers in the provision of low-tech and unaided AAC [130]. Some SRs have focused specifically on high-tech AAC devices and interventions: Baxter et al. conducted two reviews of high-tech AAC interventions, barriers and facilitators [11, 12], whilst the remaining high-tech AAC reviews focus on interventions in specific communities and groups. Morin et al. provided a meta-analysis of AAC interventions for people with IDDs [131], Still et al. examined interventions for people with ASD [170] and lastly Ju et al. considered high-tech AAC interventions amongst ICU patients [89].

Despite this previous research, at present there is not a broad survey of the literature including analysis of the earliest and latest high-tech AAC research. Consequently, this review and taxonomy looks to fill the following three research gaps left by the previous SRs. Firstly, previous SRs focused on clinical practice over HCI and device-orientated development, failing to establish a taxonomy of high-tech AAC which supports the development of future generations of devices. Secondly, previous SRs did not track the prevailing research methods or capture data on the studies undertaken. Thirdly, previous SRs have not captured broad data on which communities have comparatively received the most high-tech AAC research. Additionally, the SRs predate the recent criticisms of high-tech AAC (c.f. [20, 81, 140]) and offer no comparison with the Mack et al. findings from reviewing accessibility research [112].

1.2 Research Questions and Contributions
To focus our contribution we initially developed three research questions:
RQ1: What is the current taxonomy and dominant characteristics of high-tech AAC?
RQ2: What research methods are used to contribute towards the design and study of high-tech AAC devices and interventions?
RQ3: Who do high-tech AAC devices and interventions focus on?

The three research questions support the review and analysis of previous high-tech AAC research, quantitatively evaluating the design of devices, methodologies and communities of focus. They capture data to enable the identification of notable research gaps, the development of strategic directions for future research, and cross-comparison with the Mack et al. research [112], thereby supporting more novel, successful and long-term high-tech AAC interventions. To answer these three research questions, mapped in Figure 1, we make four contributions in this paper:

(1) We have built an open-source dataset of 562 coded papers from 1978–2021 focused on research into high-tech AAC devices and interventions, encompassing peer-reviewed articles using the 2021 PRISMA guidelines [153].
(2) We present the first SR and taxonomy of high-tech aided AAC devices within the ACM literature. The taxonomy of high-tech AAC was developed in three ways: (a) collecting an inventory of the interaction experience through coding high-tech AAC devices’ inputs and outputs; (b) understanding the features of high-tech AAC devices through coding the interface layout and scalar attributes; (c) understanding the communication facilitated via coding the communication model/type facilitated by the device.
(3) We provide an analysis of the methodologies, roles, and communities of focus within high-tech AAC interventions and research. This data is then cross-analysed with the Mack et al. SR of accessibility research [112], which we consider to represent normative standards for accessible computing research, to gain an understanding of similarities and differences with wider accessibility research.
(4) We provide implications for future research on high-tech AAC devices and interventions. In the discussion, we identify directions for future scientific investigations and high-tech AAC development. With almost 40 years of preceding research, we consider this contribution significant for new researchers to improve the development of future high-tech AAC devices and decrease the current high rates of abandonment [81].

2 BACKGROUND AND RELATED WORK
We outline the key related work on SRs of high-tech AAC and frame our work in communication research, discussing the models and types of communication used to develop the codebook.

2.1 Related Work in High-tech AAC
Although no SR and taxonomy of high-tech AAC has been published within the ACM literature, previous research has explored the design and barriers of high-tech AAC [11, 12, 166]. Previous research has also considered specific sub-areas, including high-tech AAC interventions for aphasia [181], autism [170] and ICU patients¹ [89].

2.1.1 Related Taxonomies and Investigations of AAC Design and Barriers. The closest contribution to a taxonomy for high-tech AAC is a paper by Belani presenting a usability requirements taxonomy for mobile AAC services [14]. Belani’s taxonomy inherits from systems engineering and focuses on accessible software principles: the context of AAC usage, user relations and the principles of simplicity, supplementing and trustworthiness [14]. Furthermore, this taxonomy tries to develop a set of augmentative requirements, differing from our systematic investigation of the pre-existing literature and consequent device taxonomy² [14, 30].

¹ Critical research has considered the efficacy of high-tech AAC through consumer perspectives [24] and as an evidence-based practice [131].
² See Brudy et al.’s taxonomy of the cross-device computing domain [30].
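To make the three coding dimensions of contribution (2) concrete (interaction inputs/outputs, interface features and scalar attributes, and communication model/type), one entry of such a codebook could be represented as a record type like the following. This is only a minimal sketch under assumed names: the classes, fields and category values here are our illustrative assumptions, not the authors’ actual codebook schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class CommunicationModel(Enum):
    """Models of communication a device can support (Section 2.2.1)."""
    LINEAR = "linear"                 # Shannon-Weaver sender-receiver
    INTERACTIVE = "interactive"       # non-simultaneous feedback flow
    TRANSACTIONAL = "transactional"   # co-created, instantaneous feedback

class CommunicationType(Enum):
    """Types of communication a device can enrich (Section 2.2.2)."""
    VERBAL = "verbal"
    NONVERBAL = "nonverbal"

@dataclass
class CodedPaper:
    """One hypothetical row of the coded dataset."""
    title: str
    year: int
    inputs: list[str]         # (a) interaction inventory: device inputs
    outputs: list[str]        # (a) interaction inventory: device outputs
    interface_layout: str     # (b) e.g. "grid of symbols", "keyboard"
    models: set[CommunicationModel] = field(default_factory=set)  # (c)
    types: set[CommunicationType] = field(default_factory=set)    # (c)

# Example: coding a hypothetical SGD study.
paper = CodedPaper(
    title="Example SGD study",
    year=2019,
    inputs=["touchscreen", "switch scanning"],
    outputs=["synthesised speech"],
    interface_layout="grid of symbols",
    models={CommunicationModel.LINEAR},
    types={CommunicationType.VERBAL},
)
print(CommunicationModel.TRANSACTIONAL in paper.models)  # prints "False"
```

Encoding the communication model as an explicit category, as sketched here, is what allows the kind of cross-cutting query the review performs, e.g. counting how many coded devices support only linear exchange.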

Barriers and facilitators of high-tech AAC are explored in two SRs by Baxter et al. [11, 12]. In their first SR of 27 papers, they identified several key themes: 11 papers described the limited reliability of high-tech AAC devices, 8 papers highlighted the difficulties of learning how to operate devices, and 6 papers discussed inappropriate voices/words generated by devices within specific cultural contexts³ [11]. Turning to their second and more extensive SR, of 65 papers and interventions from 2000–10 [12]: in order of frequency, Baxter et al. report that the most common communities of focus for high-tech AAC interventions were aphasia from non-progressive causes (14 studies), autism (ASD) (13 studies) and adults with cerebral palsy (12 studies), whilst other communities received a smaller subset⁴ [12].

Figure 1: Flow diagram presenting an overview of the figure and tabular contributions towards each research question. RQ1: Figure 4 and Tables 2, 3, 4. RQ2: Figures 5, 6 and Tables 6, 7, 8. RQ3: Tables 9, 10, 11.

2.1.2 Sub-areas of High-tech AAC. SRs considering sub-areas of high-tech AAC interventions report, to a greater extent, optimistic results. For people with aphasia (PWA), Sandt-Koenderman’s SR emphasised the need for AAC devices to be “tailor made”, taking advantage of PWAs’ residual language skills and communicative strengths [181]. In contrast, Still et al. perform an SR focused on high-tech AAC interventions for individuals under the age of sixteen with ASD [170]. They found portable SGDs have been frequently favoured for interventions – in particular, iPod- and iPad-based applications – and intervention results were positive for teaching requesting skills [170]. Within a hospital setting for voiceless ICU patients, Ju et al. performed an SR considering the adoption and effects of high-tech AAC interventions [89]. From the 18 studies qualitatively synthesised with the TAM model, they found that high-tech AAC was easy to learn and use in most studies, with customisation and portability of devices most important for patients [89]. For future positive development of high-tech AAC technologies, they encouraged further collaboration directly with ICU staff and patients [89]. Although not specifically high-tech AAC research, broader accessibility scholarship has offered meaningful reflection that should shape future high-tech AAC interventions. Scholars have accepted the importance of not “medicalizing” disability⁵ [45, 112]. In this vein, AAC devices should not serve as assistive technologies to fix people living with communication disorders. Bennett et al. [15] have effectively conceptualised this form of technology through a notion of independence, in which the high-tech AAC device itself incorrectly serves as a dependent interface to communicating with the environment and others⁶ [15]. Instead, we believe high-tech AAC should be guided by Bennett et al.’s interdependence framing of assistive technology (AT), reframing high-tech AAC as an AT that all can freely engage and leverage to interface and communicate with the environment [15].

2.2 Models and Types of Communication
High-tech AAC is developed based upon our understanding of communication and models of successful communication exchanges between human parties. Here, we discuss three recognised models of communication and two types of communication – this scholarship directly influenced the categories and sub-codes used in our taxonomy of devices and high-tech AAC codebook.

2.2.1 Models of Communication. The most basic model of communication that high-tech AAC can support is the original model of communication, i.e. the Shannon-Weaver sender-receiver or Linear model [161]. The model consists of just four parts⁷ and mirrors the functioning of radio and telephone technology – for instance, the sender delivers a message using a high-tech AAC device, the channel is the AAC device speaker, and the receiver hears the sender’s message [161]. Within this model, information is solely transmitted between the two parties. Yet Ibrahim et al. have criticised high-tech AAC devices that exclusively support sender-receiver and linear communication [81]. Emphasising Kraat et al.’s research [98], Ibrahim et al. argue that high-tech AAC communication should be much more dynamic: made up of people, the setting, rules of language use, and situated within the context [81]. In contrast, the Interactive model offers more dynamic communication.

³ Secondary themes from the research include needs for staff training, limitations of technical support, decision difficulties faced by families whilst selecting an AAC device and negative communication rates with the AAC device [11].
⁴ Other findings of the research include that devices were found to be beneficial in enhancing communication across a broad range of diagnoses and age ranges [12].
⁵ Mankoff et al. have called for greater representation of disabled people in accessibility research – in the same vein as Mack et al., we provide some empirical metrics of representation within our dataset [112, 114].
⁶ Limiting the AAC users’ autonomy, perpetuating social stigma and increasing marginalization by exacerbating social differences.
⁷ These parts are: (1) sender, (2) message, (3) channel, and (4) receiver.
Here, high-tech AAC supports the feedback flow between sender and receiver – feedback can be verbal (i.e., “yes” or “no”) or nonverbal (i.e., a nod or smile) [190]. However, the feedback provided by the high-tech AAC is not simultaneous and can be potentially slow or indirect. Lastly, the Transactional model is the most dynamic: senders and receivers are now considered communicators [10]. Here, high-tech AAC communication is a co-created process with instantaneous feedback between parties [10]. Although challenging, high-tech AAC must look to support interactive and even transactional communication, where communication is an embodied, meaningful and multimodal experience for its users [81].

2.2.2 Types of Communication. Broadly, there are two exclusive types of communication which high-tech AAC can enrich: non-verbal and verbal⁸. Verbal communication is the use of words to convey a message, whether written or spoken [84]. Examples of verbal communication include oral forms, i.e. speech, and written forms, e.g. letters and text messages [84]. In contrast, non-verbal communication covers the processes that convey a message in the form of non-linguistic representations [75]. Examples of non-verbal communication include gestures, sign language, facial expression, and eye contact [75]. Most high-tech AAC devices do not equally support both types of communication: high-tech AAC to augment and enhance users’ verbal communication has taken precedence over supporting non-verbal forms [180]. Instead, high-tech AAC should consider and enhance both types of communication for successful outcomes [80]. For instance, Valencia et al.’s physical expressive objects successfully increase augmentative communicators’ (ACs) agency in conversations with unfamiliar partners using solely non-verbal messages and signals [52, 180].

3 METHODS
In this paper we followed the PRISMA 2021⁹ procedures for SRs [153], with secondary support from Siddaway et al.’s [162] and Silva et al.’s SR guidelines [164]. Firstly, we consider the scope of our investigation via defining high-tech AAC. Secondly, we discuss the methods for creating a dataset using PRISMA guidelines. Thirdly, we describe the qualitative, quantitative and programmatic analysis methods of this study.

3.1 Scope
We start by presenting our definition of high-tech AAC devices and interventions. AAC comes in many forms to support a variety of communities and needs; therefore a clear definition of the scope is significant [18]. Unaided AAC interventions use no equipment, i.e. signing or body language [130]; for unaided AAC research there are several pre-existing SRs (e.g. [130]). By contrast, aided AAC interventions encompass any enabling aids or technologies used to support or replace communication for those with CCNs, thereby enriching the production or comprehension of communication [126]. However, this equipment comes in a wide variety of form factors [18]. Aided AAC devices are categorised as no, low, medium or high-tech [18]. Low-tech AAC devices do not require electricity or battery power [130]. Typically, they are simple props to foster communication, such as head/mouth sticks to enable pointing, pen, paper, and eraser boards for drawing or writing, or even customised analogue systems such as communication boards and books [130]. Medium-tech AAC devices encapsulate very simple electronic devices with basic technology [142]. Examples include battery-powered switches or buttons that communicate one or two messages, an LED light, or appropriated technology devices such as motion toys, radios, and fans [142]. Finally, high-tech aided AAC devices encapsulate the most advanced electronic AAC technology with multi-message vocabularies – for example, SGDs or VOCAs [17]. High-tech AAC devices vary in shape and size, potentially with dynamic displays for depicting letters, words, phrases, pictures and/or symbols that the communicator navigates to express messages [4]. They differ in size, weight, and portability as well as access methods – either direct selection of a screen/keyboard with a body part, pointer, or eye-gaze-adapted mouse/joystick, or indirect selection forms such as switches and scanning [18, 46, 142]. Our operational definition of high-tech AAC is established by the following three requirements:

Req1: The paper must address AAC research. AAC encompasses tools and strategies that individuals use to supplement communication [42]. Communication is multimodal and takes many forms, i.e. speech, glance, text, gestures, facial expressions, sign language, symbols, pictures, SGDs etc., contingent upon the context and communication partner [98].
Req2: The paper must research aided AAC interventions and devices. Aided AAC serves as hardware or software that is used to supplement, enrich, or replace communication. Aided AAC can take many forms and is categorised as low, medium or high-tech.
Req3: The paper must address high-tech aided AAC interventions. High-tech AAC refers to any AAC hardware, device, tools, software or technologies powered by electricity that permit the storage and retrieval of multiple electronic messages to support or enrich the user’s communication.

Requirements Req1 to Req3 specify the scope of our investigation. We wanted to focus on AAC research for individuals with CCNs. However, within this broad domain, we narrowed towards aided AAC tools and devices deliberately designed to enrich verbal and non-verbal communication, rather than unaided forms or systems. Equally, we wanted to focus specifically on high-tech aided interventions, meaning our scope avoids appropriated devices or low-tech forms of AAC.

3.2 Dataset Establishment
There are many ways to conduct SRs, but to derive solutions for our three research questions (RQ1–RQ3) we harnessed the methods detailed in the PRISMA 2021 guidelines for reporting meta-analyses [153]. We describe in detail the following stages, depicted in Figure 3: identification, screening, eligibility and snowballing.

3.2.1 Identification. The role of the identification stage is to capture work that addresses the three research questions by performing queries of the relevant scientific databases. We chose the ACM DL¹⁰

⁸ Sometimes re-conceptualised with more types of communication considered, such as written or visual.
⁹ The PRISMA acronym stands for Preferred Reporting Items for Systematic reviews and Meta-Analyses.
¹⁰ https://dl.acm.org
State of the Art in AAC ASSETS ’22, October 23–26, 2022, Athens, Greece

Figure 2: Flow diagram providing the scope of our investigation on high-tech AAC devices and interventions. For greater detail please refer to Req1, Req2 and Req3 as detailed within the text.

Figure 3: The PRISMA diagram [153] illustrating the reference frequencies of the identification, screening, eligibility, snowballing and inclusion stages of our SR and taxonomy of high-tech AAC devices and interventions.
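The stage frequencies summarised in Figure 3 can be cross-checked arithmetically; a quick sanity check in Python, using only the counts reported in Section 3.2:

```python
# Numbers reported in Section 3.2 (identification through snowballing).
identified = 462 + 1531          # ACM DL results + SCOPUS results
duplicates, off_topic, not_full_text, not_english = 39, 595, 67, 4
screened = identified - duplicates - off_topic - not_full_text - not_english
assert identified == 1993 and screened == 1288

included, excluded, unsure = 490, 723, 75       # first-pass eligibility labels
assert included + excluded + unsure == screened
eligible = included + 44                         # 44 of the 75 unsure papers kept
assert eligible == 534 and excluded + 31 == 754  # 754 papers excluded in total

final_dataset = eligible + 28                    # 28 papers added via snowballing
assert final_dataset == 562
```

Every assertion holds, so the counts quoted at each stage of the pipeline are mutually consistent.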

and SCOPUS as two major electronic databases for Computer Science and AAC research. Initially, we performed some preparatory searches to broadly investigate the search space before committing to a definitive search strategy. From this initial investigation, we identified that the ACM database provided essential HCI papers yet failed to include key research papers from the academic literature, such as the AAC journal. In contrast, SCOPUS offered a broader search space containing literature from the AAC journal and several other prominent venues for high-tech AAC literature. For deciding keywords, we performed an analysis of the research questions and extracted initial keywords; utilising these, we then collectively iterated several times to produce synonyms for more keywords, including acronyms and commonly named high-tech AAC devices, e.g., voice output communication aids and speech generating devices. We then developed methods to avoid returning high quantities of false-positive papers – for instance, we decided to search just title, abstract and keywords. Additionally, within the search string we concatenated the keywords with OR and AND connectors to ensure the ordering of the terminology was coherent, and asterisks were liberally used to enable multiple and plural forms of the keywords. For the ACM DL, the final query was:

11 https://www.scopus.com/
ASSETS ’22, October 23–26, 2022, Athens, Greece Curtis et al.

"query": were then only removed from the dataset once the two authors had
{
Title: ((aac) OR (augment* AND alternat* AND communicat*) OR (comput* AND
come to agreement. This screening process resulted in 1,288 papers
assist* AND communicat*) OR (voice* AND output* AND communicat* AND aid*) remaining in the dataset.
OR (alternat* AND augment* AND communicat*) OR (speech* AND generat* AND device*)) OR
Abstract: ((aac) OR (augment* AND alternat* AND communicat*) OR (voice* AND output*
AND communicat* AND aid*) OR (alternat* AND augment* AND communicat*) OR (speech*
3.2.3 Eligibility. Following PRISMA guidelines – we formulated a
AND generat* AND device*)) OR Keyword ((aac) OR (augment* AND alternat* AND communicat*) set of four eligibility criteria (3.2.3 through to 3.2.3) to further flter
OR (comput* AND assist* AND communicat*) OR (alternat* AND augment* AND communicat*) OR
(voice* AND output* AND communicat* AND aid*) OR (speech* AND generat* AND device*)) out work irrelevant to the scope of our systematic literature review
} and domain:
The ACM query returned 462 results. For the SCOPUS database EC1: Availability of full-text. The full-text research paper is avail-
we queried solely the title feld because initial testing with the able and written in English to be accessible for the research
abstract and keyword felds resulted in too many false positives – team.
meaning featureless, large search spaces exceeding 20,000 papers. EC2: Peer-reviewed academic research. The research must be peer
Additionally, we performed an abstraction based on venue for the reviewed literature and academic. E.g., journal articles, con-
SCOPUS query to limit the state space and number of papers. The ference papers, and PhD dissertations. Other media forms
fnal relevant selected venues were titled with: AAC, technology, would be excluded.
aphasia, language and communication disorder. Meaning the fnal EC3: Aided AAC research and interventions. The paper had to
SCOPUS query was: clearly focus on aided AAC interventions – an AAC aid
( is any device that is electronic or not, which is harnessed
TITLE-ABS-KEY(aac) OR TITLE-ABS-KEY(voice* AND output* AND communicat* AND aid*)
OR TITLE-ABS-KEY(speech* AND generat* AND device*) OR
to transmit and receive messages. In contrast, unaided AAC
TITLE-ABS-KEY(alternat* AND augment* AND communicat*) OR TITLE-ABS-KEY(augment* interventions are those that do not require external tools
AND alternat* AND communicat*)) AND (
LIMIT-TO ( EXACTSRCTITLE,"AAC Augmentative And Alternative Communication" ) e.g., sign languages.
OR LIMIT-TO ( EXACTSRCTITLE,"Lecture Notes In Computer Science Including Subseries EC4: High-tech AAC interventions. The paper must include re-
Lecture Notes In Artificial Intelligence And Lecture Notes In
Bioinformatics" ) search on high-tech aided AAC interventions. As defned
OR LIMIT-TO ( EXACTSRCTITLE,"Augmentative And Alternative Communication" )
OR LIMIT-TO ( EXACTSRCTITLE,"Disability And Rehabilitation Assistive Technology" )
earlier, high-tech refers to the most advanced forms of aided
OR LIMIT-TO ( EXACTSRCTITLE,"Communication Disorders Quarterly" ) AAC – that being electronic devices that enable the retrieval
OR LIMIT-TO ( EXACTSRCTITLE,"Assistive Technology" )
OR LIMIT-TO ( EXACTSRCTITLE,"International Journal Of Language And Communication Disorders" )
and storage of electronic messages to support and enrich
OR LIMIT-TO ( EXACTSRCTITLE,"Advances In Intelligent Systems And Computing" ) communication. There are a variety of high-tech AAC de-
OR LIMIT-TO ( EXACTSRCTITLE,"Conference On Human Factors In Computing Systems Proceedings" )
OR LIMIT-TO ( EXACTSRCTITLE,"Topics In Language Disorders" ) vices in a multitude of diferent forms: such as a dedicated
OR LIMIT-TO ( EXACTSRCTITLE,"Technology And Disability" ) device like a DynaVox e.g., [4], tablet and smartphone apps
OR LIMIT-TO ( EXACTSRCTITLE,"Aphasiology" )
OR LIMIT-TO ( EXACTSRCTITLE,"Assistive Technology Research Series" ) e.g., [124], brain computer interfaces (BCI) e.g., [59] and even
OR LIMIT-TO ( EXACTSRCTITLE,"Communications In Computer And Information Science" )
OR LIMIT-TO ( EXACTSRCTITLE,"Journal Of Special Education Technology" )
wearables e.g., [54]. We wanted to incorporate as much of
OR LIMIT-TO ( EXACTSRCTITLE,"Studies In Health Technology And Informatics" ) this emergent literature as possible to provide an encompass-
OR LIMIT-TO ( EXACTSRCTITLE,"Communication Sciences And Disorders" )
OR LIMIT-TO ( EXACTSRCTITLE,"Pervasivehealth Pervasive Computing Technologies For Healthcare" )
ing analysis of high-tech AAC research.
OR LIMIT-TO ( EXACTSRCTITLE,"ACM Transactions On Accessible Computing" )
OR LIMIT-TO ( EXACTSRCTITLE,"International Journal Of Speech Technology" )
The eligibility process involved the frst author initially labelling
) papers as in, out 12 or unsure. Consequently from the 1288 papers
The SCOPUS query returned 1,531 papers in total. All queries – 490 papers were included, 723 papers were excluded for failing
were run on the 05/11/2021 and returned a combined 1,993 papers to meet eligibility criteria (3.2.3 to 3.2.3) and 75 labelled as unsure.
in total from 1978-2021. Checks on the labelling were performed by the second author and
followed by lengthy discussions on the unsure papers. Then from
3.2.2 Screening. A quick scan of dataset titles revealed that some the unsure papers 44 were included with a further 31 papers ex-
of the 1,993 were out of scope and required manual removal during cluded. Once all procedures were completed, 534 papers remained
the subsequent process of screening. Firstly, we discovered 39 du- in the dataset for manual qualitative coding meaning in total 754
plicates and eliminated them from our dataset. Secondly, we read papers had been excluded.
titles, abstracts and keywords – removing papers not focused on
3.2.4 Snowballing. Snowballing identifed a further 28 papers on
high-tech AAC research. Consequently, 595 of the results were
the topic of high-tech AAC consisting of peer reviewed literature
not papers within the research area and focused on factors such
from 1992-2021. Following guidance from Wohlin, we performed
as: advanced audio coding e.g., [38], digital content distribution
snowballing iterations until dead ends were reached and no new
e.g., [95], communication routing algorithms e.g., [139] and pro-
candidate papers were identifed from forward or backward snow-
teins amino acid composition e.g., [188]. A further 67 results were
balling every paper [196]. The process of snowballing took 3 itera-
clearly not full-text papers i.e., conference proceedings and 4 pa-
tions until Wohlin’s efciency metric reached 0% [196]. Through
pers were removed for not being in English. Throughout screening,
this, we identifed 44 potential candidate papers from titles and
an explanation fle was developed where the frst author provided
abstracts, yet upon further inspection all did not meet our eligi-
an index, reason, and explanation for the suggested removal of a
bility criteria for inclusion. After applying these, just 28 papers
paper from the dataset. Subsequently, the second author reviewed
were added to our dataset. Backward snowballing gave 18 papers
the dataset and explanation fle for quality control purposes – to
by studying the references of selected papers. Forward snowballing
jointly discuss any highlighted papers promulgating uncertainty
and verify removal suggestions made by the frst author. Papers 12 Each paper labelled as out – failed eligibility criteria was noted.
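The wildcard-and-connector semantics used in the queries above can be mirrored locally when spot-checking screening decisions; a minimal sketch, assuming a trailing `*` behaves as a word-suffix wildcard (the `matches` helper is illustrative, not the authors' tooling):

```python
import re

# A trailing '*' matches any word suffix, terms within a group are AND-ed,
# and groups are OR-ed together, mirroring the query structure above.
def make_matcher(term):
    pattern = r"\b" + re.escape(term).replace(r"\*", r"\w*") + r"\b"
    return re.compile(pattern, re.IGNORECASE).search

def matches(text, or_groups):
    return any(all(make_matcher(term)(text) for term in group)
               for group in or_groups)

query = [["aac"],
         ["augment*", "alternat*", "communicat*"],
         ["speech*", "generat*", "device*"]]
matches("Augmentative and alternative communication devices", query)  # True
matches("Advanced audio coding for streaming", query)                 # False
```

The second title illustrates why screening was still needed: the acronym "aac" keyword matches papers on advanced audio coding when it appears as a standalone token, even though this sketch rejects it here because "aac" never occurs as its own word.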

presented 10 papers by studying the works cited, utilising Google Scholar. After snowballing, our dataset was 562 papers in total.

3.3 Analysis
Analysis involved qualitative coding of the entire dataset and calculating Fleiss's Kappa inter-rater reliability (IRR) [53] to reach agreement between the three authors. Following this, we programmatically examined paper counts over the 43-year period, performed quantitative data extractions for participant counts, developed an inventory of key high-tech AAC devices and analyzed corresponding results found by the Mack et al. dataset [112].

3.3.1 Qualitative Analysis of the Dataset. We qualitatively coded all 562 papers from 1978-2021. The process of coding and analysing the dataset was performed by 3 scholars. We note that the procedure of building and analyzing the dataset involved subjectivity, and we appreciate that our scholarship reflects our own biases and beliefs. Authors and coders identified as male and female of Western and Southern European backgrounds, with no accessibility needs. The codebook is synthesized to provide data for our three research questions (i.e., RQ1, RQ2 and RQ3). Shown in Table 1, the final codebook included 16 categories with 2-10 subcodes each. Out of the 16 categories, 8 were developed by the research team and 8 were based upon the Mack et al. codebook for accessibility research [112]. The 8 categories adapted from Mack et al. were used to cross-analyse high-tech AAC research with current normative accessibility standards. In contrast, the 8 categories from the research team focused on establishing a taxonomy and understanding of the dominant characteristics of high-tech AAC devices. We wanted to code the features and role of the device – its commercialism, inputs/outputs, scalar attributes, communication types/model and scenarios of usage. This data would taxonomise and provide an understanding of the current characteristics of high-tech AAC devices. The codes were iterated by two authors using an iterative inductive analysis approach, in which codes were independently developed and finally agreed between both parties. Indeed, the two authors would regularly meet to refine and eliminate existing codes or add new codes. Qualitative coding took three months and IRR was calculated between the three authors to mitigate against bias and fatigue. The second and third author retrospectively coded a random sample of 10% (N=60) of the dataset to provide a Fleiss Kappa IRR and an opportunity for disagreements to be resolved through consensus.

3.3.2 Quantitative and Programmatic Analysis. For quantitative analysis, we programmatically analyzed paper counts for the full 43-year period and developed novel visualisations to demonstrate our findings. Following this, we performed an analysis of participant counts for user studies – calculating mean, median, interquartile range and standard deviation – providing analysis for the entire dataset and each community of focus. Then we constructed an inventory of key devices regularly mentioned within the research and analyzed corresponding results from the Mack et al. SR into accessibility research [112]. However, we accept that a limitation of the comparison is that the Mack et al. dataset is from a more recent period of 2010-19.

4 RESULTS
4.1 A Taxonomy of High-tech AAC Devices and Interventions
We present results from the following 7 categories within our dataset of 562 papers: interaction input, output modality, scalar attributes, layout, scenarios/communication partners, communication model and communication type.

4.1.1 Input and Output Interaction. We successfully coded the inputs and outputs of high-tech AAC devices within the dataset (see Figure 4). Starting with inputs, the most frequent locus of research is mechanical (N=216, 38.4%) followed by tactile (N=184, 32.7%). Since the earliest high-tech AAC systems, mechanical inputs have come in a wide variety of forms. Examples include switches [180], keyboards [157], button presses [125], mechanical pointing devices [125], trackballs [187], and joysticks [157]. Tactile inputs serve as a key input modality for touchscreen-controlled hardware, e.g., smartphones [124], tablets [124] and smartwatches. In contrast, camera (N=77, 13.7%) and gestural input (N=48, 8.5%) techniques have been explored to a lesser extent. However, cameras have been leveraged for different interactions: primarily to enable eye-gaze, blink-activated and head-motion-controlled high-tech AAC systems [17]; equally to better design and configure AAC systems, to provide gesture recognition for sign language [70], and for photography to support contextual word discovery [91, 141], personalisation of VSDs [192] and even storytelling. Body gestures have been leveraged in different ways to control high-tech AAC devices. Systems have been designed that are controlled by tongue movement [143], musculature contractions [21], breath control [46], heart rate signals and brain-computer interfaces (BCIs). BCIs have been developed for different brain signals such as EEG and EMG [86]. Recent advances have been made using BCIs such as P300-based systems, the RSVP Keyboard and BrainGate to enable individuals with a physical disability to use assisted communication interfaces [59, 148, 149, 178].

Contextual (N=31, 5.5%) and verbal inputs (N=29, 5.2%) are less explored in high-tech AAC. Contextual input enables smart leveraging of environmental knowledge to act as an input to the high-tech AAC device [22, 99]. Knowledge of the context, i.e., environment, location or communication partner, has been harnessed in high-tech AAC systems for: natural language generation (NLG) [152], synthesised vocabulary searches, discourse prediction, improved adaptation to topics and to capture experiences [90, 91]. Verbal inputs can be used as another input for high-tech AAC devices – to provide utterance recognition [174], perform NLP on the communication partner's dialogue [195], function as a voice-input voice-output device [72] and offer prosody on the wearer's dialect. Lastly, orientational input (N=7, 1.2%) serves as an underexplored input technique. Accelerometers have been harnessed to calibrate and improve the conversation rate of people with motor impairments [66], develop wearables to translate sign language in real time [8], for

13 An extension of Cohen's Kappa for three raters or more [53].
14 The identification of differences, advantages and limitations of high-tech AAC versus wider accessibility standards will, we believe, improve future high-tech AAC devices and interventions.
15 The third author was new to the dataset and not involved in the code development process – resulting in a marginally lower IRR versus other papers [112].

Table 1: The final codebook is represented by 85 subcodes across 16 code categories. Retrospective qualitative Fleiss Kappa IRR was calculated across the subcodes for each category between the three authors. IRR ranged from almost perfect to fair, with high pairwise agreement throughout.

Category | Codes | Pairwise agreement | Fleiss Kappa IRR | Level
Participatory design | Yes; No | 90% | 0.8 | Almost perfect
Interaction input | Verbal; Camera; Tactile; Gestural; Mechanical; Orientational; Contextual | 94.8% | 0.789 | Substantial
Ability-based comparison | Yes; No | 85.6% | 0.709 | Substantial
Output modality | Audio; Visual; Motion; Gustation; Thermoception | 91.3% | 0.693 | Substantial
Participant groups | No user study; People with disabilities; People without disabilities; Specialists; Caregivers | 88.4% | 0.667 | Substantial
Communication model | Linear; Interactive; Transactional | 85.9% | 0.644 | Substantial
Communication type | Verbal; Non-verbal | 81.7% | 0.622 | Substantial
Use of commercial AAC | Yes; No | 77.8% | 0.553 | Moderate
Contribution type | Empirical; Artifact; Methodological; Theoretical and opinion; Dataset; Survey | 84.3% | 0.543 | Moderate
User study method | Controlled experiment; Randomized control trials; Survey; Usability testing; Interviews; Focus groups; Case study; Field study; Workshop; Other | 90.4% | 0.501 | Moderate
Use of proxies | Yes; No | 75% | 0.5 | Moderate
Community of focus | BVI; DHH; Motor/physical impairment; Autism; IDD; Other cognitive; Older adults; General disability; Other | 88.2% | 0.462 | Moderate
Location | No user study; Near/at researchers lab; Home, residence or school; Neutral; Online/remote; Other | 85.2% | 0.455 | Moderate
Scalar attributes | Morphable; Customisability; Automaticity; Expressivity; Adaptive; Practicality; Combined; Parallel | 88.9% | 0.447 | Moderate
Interface layout | Symbols; Pictographic; Text; Animation; Grid; VSD; Novel; No layout | 80% | 0.442 | Moderate
Scenarios and communication partners | Fellow AC; Family/friends; Professionals; Groups; Strangers; Anyone; Virtual; Unclear | 87.9% | 0.388 | Fair
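The Fleiss Kappa IRR values in Table 1 follow the standard computation over a subjects-by-categories count matrix; a compact sketch with made-up ratings (not the study's data):

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters assigning subject i to category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    # Per-subject observed agreement.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from marginal category proportions.
    totals = [sum(row[j] for row in counts) for j in range(len(counts[0]))]
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)

# Three raters labelling four papers into two subcodes (hypothetical data).
ratings = [[3, 0], [0, 3], [2, 1], [3, 0]]
fleiss_kappa(ratings)  # ≈ 0.625
```

With three raters, as here, this reduces to the extension of Cohen's Kappa noted in footnote 13.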

speech therapy, and to make apps more usable amongst children with disabilities [25]. Unlike inputs, outputs for high-tech AAC devices have been less widely explored. Audio signals dominate communication output (N=353, 62.8%) – high-tech AAC devices tend to serve as SGDs, VOCAs and for speech synthesis. High-tech AAC outputting visual signals for the purpose of communication has also been well explored (N=157, 27.9%) – including printable text interfaces, direct screens with captions, photographs, and graphics for the communication partner [40, 150]. More abstract visual signal systems have been researched, including LED lights and graphics [166]. Motion-based AAC is an emerging research area (N=11, 2%): for instance, the motion of robots to communicate [85], including LEGO adaptable forms for children [1] and co-designed novel sidekicks [180]. In addition, haptic feedback has been briefly explored for individuals with deaf-blindness [171]. Our dataset did not include AAC that uses taste or heat as a modality for communication.

4.1.2 Typical Features of High-tech AAC. In Table 2, the feature space of high-tech AAC is rich – here we coded for the scalar attributes and interface layouts of high-tech AAC. We found that accessibility researchers have prioritised developing high-tech AAC that is customisable (N=310, 55.2%) and automatic (N=262, 46.6%). Customisability has been honed as a significant factor for making high-tech AAC more usable and personalized, thereby increasing adoption and acceptance of the high-tech AAC [34]. Automation has been a well-explored and key area for high-tech AAC – automation of keystrokes [61], predictive text, abbreviation expansion [158] and leveraging AI [22] have been researched to improve users' communication rates. Following these two subcodes, high-tech AAC has been developed to be expressive (N=110, 19.6%) for different genders [68], cultural groups [82], and age ranges [35]. To a lesser extent, AAC has been developed to receive combinations of inputs (N=57, 10.1%) to improve usability. Research has considered making high-tech AAC more practical (N=45, 8%) – through making the

Figure 4: Sankey diagram proportionally representing the inputs and outputs of AAC found in the review. The first number indicates the frequency; the second, the number of instances this code was found without other inputs/outputs.
Inputs: Mechanical 216 (38.4%) / 140 (24.9%); Tactile 184 (32.7%) / 111 (19.8%); Camera 77 (13.7%) / 34 (6%); Gestural 48 (8.5%) / 28 (5%); Contextual 31 (5.5%) / 5 (0.9%); Verbal 29 (5.2%) / 8 (1.4%); Orientational 7 (1.2%) / 3 (0.5%).
Outputs: Audio 353 (62.8%) / 280 (49.8%); Visual 157 (27.9%) / 87 (15.5%); Motion 11 (2%) / 4 (0.7%); Thermoception 0 (0%) / 0 (0%); Gustation 0 (0%) / 0 (0%).

Table 2: Frequency of applied codes for high-tech AAC scalar attributes and high-tech AAC interface layouts mentioned in the dataset.

Scalar attributes | Papers w/code | This code only
Customizable | 310 (55.2%) | 81 (14.4%)
Automatic | 262 (46.6%) | 39 (6.9%)
Expressive | 110 (19.6%) | 8 (1.4%)
Combined | 57 (10.1%) | 3 (0.5%)
Practical | 45 (8%) | 4 (0.7%)
Adaptive | 42 (7.5%) | 3 (0.5%)
Parallel | 17 (3%) | 0 (0%)
Morphable | 14 (2.5%) | 0 (0%)

Interface layout | Papers w/code | This code only
Text | 322 (57.3%) | 84 (14.9%)
Symbols | 223 (39.7%) | 18 (3.2%)
Grid format | 157 (27.9%) | 2 (0.4%)
Pictures/drawn | 127 (22.6%) | 8 (1.4%)
No layout | 29 (5.2%) | 29 (5.1%)
VSDs | 29 (5.2%) | 2 (0.4%)
Video or animation | 23 (4.1%) | 4 (0.7%)
Novel | 5 (0.2%) | 2 (0.4%)
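The "Papers w/code" and "This code only" columns above can be derived from per-paper code sets; a small sketch using hypothetical codes rather than the actual dataset:

```python
from collections import Counter

# Hypothetical per-paper scalar-attribute code sets, not the real dataset.
papers = [
    {"Customizable"},
    {"Customizable", "Automatic"},
    {"Automatic"},
    {"Expressive", "Automatic"},
]

with_code, code_only = Counter(), Counter()
for codes in papers:
    with_code.update(codes)        # every code the paper carries
    if len(codes) == 1:            # code appeared with no other code
        code_only.update(codes)

# with_code["Automatic"] == 3; code_only["Automatic"] == 1
```

Dividing each counter by the dataset size (562 papers in the review) yields the percentages reported in the tables.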

device discreet, wearable, small and lightweight [193]. Accessibility researchers have considered leveraging AI systems to improve the adaptiveness (N=42, 7.5%) of the AAC system – to improve feature recognition, i.e., eye gaze/landmark detection, to orient to the user over time [6], and to improve frequency prediction algorithms for vocabulary dependent on location and usage history [62].

A small amount of research has considered making high-tech AAC parallel (N=17, 3%) and morphable (N=14, 2.5%). Examples of parallel AAC include a high-tech device that receives eye-gaze and simultaneous input suggestions from the communication partner to increase communication rates during exchanges [51]. Limited high-tech AAC is morphable, thereby physically adaptive according to the consumer's preferences or designed to run on different hardware and devices. Some examples include do-it-yourself (DIY) AAC kits [71], Lego robots physically configurable by the child [2] and an AAC device that changes form factor by running on either a wrist wearable or a head-worn display [193]. In terms of interface, we find that most high-tech AAC interfaces use text (N=322, 57.3%) and symbols (N=223, 39.7%). Historically, grid formats (N=157, 27.9%) for AAC, often with accompanying symbols, have been popular [4]. To a lesser extent, high-tech AAC has leveraged pictures or been drawn (N=127, 22.6%). Few papers had no or a limited interface, meaning the AAC operated off sensors (N=29, 5.2%) [151]. A notably small number of papers (N=29, 5.2%) provided research on high-tech AAC with a visual scene display (VSD). Lastly, a small selection of papers (N=5, 0.2%) had entirely novel interfaces including LED morse code [78], augmented reality [94], and tactile surfaces [171].

4.1.3 Communication Supported by High-tech AAC. High-tech AAC is designed for different scenarios and contexts of usage (see Table 3) – here we coded whether papers explicitly mentioned contexts in which the device could be used. Significantly, devices have mainly

Table 3: Frequency of applied codes for scenarios of high-tech AAC usage mentioned within the dataset.

Scenario | Papers w/code | This code only
Professionals | 315 (56%) | 155 (27.8%)
Family/friends | 221 (39.3%) | 56 (10%)
Strangers | 65 (11.6%) | 14 (2.5%)
Groups | 39 (6.9%) | 1 (0.2%)
Virtual | 37 (6.6%) | 21 (3.7%)
Fellow AAC users | 11 (2%) | 1 (0.2%)
Anyone | 7 (1.2%) | 5 (0.9%)
Unclear | 22 (3.9%) | 22 (3.9%)

been built for professional contexts, typically with specialists, teachers or carers (N=315, 56%). Additionally, high-tech AAC has often been designed to be used with family and friends (N=221, 39.3%) at home or in social settings. A smaller number of papers (N=65, 11.6%) mentioned designing high-tech AAC for use with strangers or unfamiliar communication partners. In addition, an even smaller number of papers discussed high-tech AAC to be used with groups (N=39, 6.9%) and virtual communication (N=37, 6.6%). Group scenarios involve several people, making the communication environment pose potentially significantly more challenges for the high-tech AAC device [180]. Also, high-tech AAC will need to facilitate users' self-expression and communication in virtual environments (N=37, 6.6%) – a small subset of papers has considered this for phone calls [50], videoconferencing [96], and online gaming [169]. Lastly, we find little high-tech AAC aware of supporting communication exchanges between fellow AAC users (2%) with the same or a different high-tech AAC device – yet it is potentially common for two high-tech AAC users to directly communicate with each other regularly, e.g., within special needs schools [81, 140]. In Table 4, we find that most high-tech AAC is designed to augment and enrich verbal communication (N=519, 92.3%) – high-tech AAC supporting solely users' non-verbal forms of communication is neglected in the literature (N=8, 1.4%) [52]. Furthermore, we find limited high-tech AAC devices operating at communication rates beyond interactive. Instead, linear is by far the most common communication model (N=491, 87.4%) – here, using the device to provide feedback is restrained and the AAC user must take significant time to construct messages [11]. Interactive high-tech AAC devices (N=61, 10.9%) increase the communication rate by enabling the AAC user to offer feedback in a restrained manner. We find some high-tech AAC (N=8, 1.4%) could potentially offer pathways to transactional communication for its user. Here, high-tech AAC leverages snippets of discourse, utilising advanced technologies to take in contextual information [63, 195], significantly diminishing the reciprocity gap between communicators [27].

4.2 The Incumbent Methodologies Used to Design and Study High-tech AAC
4.2.1 High-tech AAC Contribution Counts and Types. Depicted in Figure 5, paper counts for high-tech AAC have grown steadily over the period of 1978-2021, yet this growth surges with counts over 25 for each year since 2011. This replicates wider trends noting the increasing prominence of accessibility research [112]. Notable market-available hardware innovations have also fed this growth. High-tech AAC devices have become more readily available, with designers able to shift from dedicated hardware, e.g., Dynavox DynaMyte 3100 (1999), to increasingly developing apps, e.g., Proloquo2Go (2009), for accessible touch-screen devices such as smartphones, e.g., iPhone (2007), and smart-tablets, e.g., iPad (2010). Further analysing the input/output data over time in Table 5, we find trends have to a greater extent reciprocated evolutions in the market-available hardware and show a general increase over time. Predictably, in terms of inputs, camera (0% to 22.4%) and tactile touchscreen technology (0% to 43.3%) have increased with time, whilst mechanical inputs have steadily fallen (100% to 22.4%). In Table 6, we found the majority of high-tech AAC contributions were empirical (N=410, 73%) and artifact (N=224, 39.9%) contributions. The two contributions regularly occurred in conjunction (N=169, 30.1%). Similarly, in the Mack et al. dataset these were both the two most popular contribution types [112]. Equally, the following four contribution types occurred comparatively less often: survey (N=60, 10.7%), methodological (N=43, 7.7%), theoretical and opinion (N=19, 3.4%) and dataset (N=6, 1.1%). However, a notable difference with Mack et al. was that literature survey contributions were much higher for high-tech AAC (+10.1%) and typically published within the AAC journal (N=31/60, 51.6%) – literature surveys are a regular practice to provide essential reflection on current medical practices [155].

4.2.2 Location of User Studies. Incorporating locations beyond the lab may enable accessibility researchers to reach more users and test if high-tech AAC functions in settings where it will be used day-to-day. Our results, noted in Table 7, quite closely replicate trends found in the Mack et al. research [112]. A substantial proportion of studies take place at home, residence or school (N=181, 43.6%) and near or in the researchers' laboratory (N=178, 42.8%). High-tech AAC researchers have been successful at recruiting participants for studies that take place at home or schools – places where participants are comfortable and visit frequently [55]. Furthermore, user studies in these locations may broaden participation amongst vulnerable groups, e.g., testing high-tech AAC with children [55]. Lab studies are almost as popular, enabling the research team to carefully control the variables and environment for testing – supporting the collection of observation data [173]. Neutral locations (N=71, 17.1%) feature relatively prominently and are diverse, such as: day programs [39], intervention camps [32], clinics and medical centres [163], fast food restaurants [44], extracurricular clubs and community centres [191]. Some of these locations have the advantage of testing the high-tech AAC in live, realistic and natural conditions that are hard to replicate within a lab. Often neutral locations are selected by the participants themselves due to personal preference [118]. Furthermore, these settings may imbue the user with confidence that the AAC can be used in a public location. Online and remote participation (N=50, 17.1%) has been used to

16 We also note that this review was done approximately 1/5 of the way through the 2020s, so it accounts for a smaller number of papers. Extrapolating the number (i.e., multiplying the number of papers in the 2020s by 5), we get 335 papers – suggesting a continuing increase in the number of papers on AAC.
State of the Art in AAC ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 4: Frequency of applied codes for high-tech AAC communication type and model within dataset.

Comms type Papers w/code This code only Comms model Papers w/code This code only
Verbal 519 (92.3%) 517 (92%) Linear 491 (87.4%) 430 (76.5%)
Non-verbal 8 (1.4%) 6 (1.1%) Interactive 61 (10.9%) 0 (0%)
Transactional 8 (1.4%) 0 (0%)
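As a concrete note on how these figures are derived: each percentage in Table 4 (and the similar tables that follow) is simply the share of the 562-paper dataset carrying a code, rounded to one decimal place. A minimal sketch of the calculation, using the counts from Table 4:

```python
# Share of the 562 high-tech AAC papers carrying each communication-model
# code, reproducing the "Papers w/code" percentages of Table 4.
TOTAL_PAPERS = 562
comms_model_counts = {"Linear": 491, "Interactive": 61, "Transactional": 8}

def share(count, total=TOTAL_PAPERS):
    # percentage of the dataset, rounded to one decimal place
    return round(100 * count / total, 1)

model_shares = {code: share(n) for code, n in comms_model_counts.items()}
print(model_shares)  # {'Linear': 87.4, 'Interactive': 10.9, 'Transactional': 1.4}
```

Note that codes are not mutually exclusive, so the shares need not sum to 100%.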

Figure 5: Frequency of the 562 high-tech AAC papers within the dataset by year, 1978-2021.

test with large participant numbers and also may remove travel burdens for participants [168].

4.2.3 Methods in User Studies. Reflecting the user-centred nature of high-tech AAC, user studies are overwhelmingly favoured (N=415, 73.8%) – despite the potential difficulties of obtaining participants with CCNs, who often cannot easily provide verbal consent. In contrast, studies which do not incorporate user studies (N=147, 26.2%) involve artifacts with no user testing (N=57, 10.1%), survey (N=53, 9.4%), methodological (N=25, 4.4%), theoretical (N=15, 2.7%), empirical (N=7, 1.2%) and dataset (N=4, 0.7%) contributions. These studies without formal user testing typically involve prototype development [177], exploratory studies [108] and analyses of user studies conducted in other research [154]. Noted in Table 8, the dominant preference for studying high-tech AAC is controlled experiments (N=188, 45.3%) and usability testing (N=119, 28.7%). Interviews (N=67, 16.1%) rank to a lesser extent and typically involve interviewing speech and language therapists (SLTs) and family members, due to the challenges of directly interviewing someone with CCNs [132]. Field studies (N=75, 18.1%) and case studies (N=48, 11.6%) also rank quite highly – researchers often try to learn of the success of deployment by observing high-tech AAC usage in naturalistic settings [67]. Lastly, we found no contributions using randomized control trials (RCTs) (N=0, 0%). Surveys and questionnaires (N=65, 15.7%) have been regularly deployed to assess over large groups of users and gain an understanding of potential patterns and commonalities [168]. Furthermore, surveys enable high-tech AAC users to communicate at their preferred rates without feeling pressure to provide feedback quickly [133], and AAC is a well-established community with foundations/charities with extensive mailing lists [47, 121, 183]. Focus groups (N=21, 5.1%) and workshops (N=4, 1%) feature quite lowly, due to the complexities of gaining feedback from users with communication barriers [16, 90]. Comparing to the broader Mack et al. dataset [112], high-tech AAC research comparatively favours more case studies (+7.6%) and controlled experiments (+10.7%), at the expense of interviews (-26%), workshops (-17.4%) and usability testing (-13%).

4.2.4 Participatory Methods. Participatory design (PD) provides a method for involving users of technology directly in its design [57,
ASSETS ’22, October 23–26, 2022, Athens, Greece Curtis et al.

Table 5: Incremental 10-year binning of frequency of applied codes for input/output data of AAC over time. Percentage calculation involves dividing the number of papers with a code by the paper count within the period. Data is weighted towards AAC from the 2000s and 2010s, with a steady year-on-year increase over time. We see a gradual diffusion in input types, away from only mechanical to a more equal split of form factors. We see audio and visual dominating as outputs consistently over time.

Input 1970s 1980s 1990s 2000s 2010s 2020s


Paper count 1 11 49 136 298 67
Verbal (%) 0 9.1 2 8.8 5 0
Camera (%) 0 0 0 7.4 17.5 22.4
Tactile (%) 0 0 0 27.9 39.3 43.3
Gestural (%) 0 18.2 2.1 6.6 9.7 10.5
Mechanical (%) 100 81.8 71.4 55.2 27.2 22.4
Orientational (%) 0 0 0 0 1.7 3
Contextual (%) 0 9.1 2.1 4.4 6 7.5
Output 1970s 1980s 1990s 2000s 2010s 2020s
Audio (%) 0 45.5 65.3 71.3 59.1 64.2
Visual (%) 100 45.5 28.6 22.8 28.9 29.9
Motion (%) 0 0 0 0 2.4 6
Gustation (%) 0 0 0 0 0 0
Thermoception (%) 0 0 0 0 0 0

Table 6: Frequency of applied codes for high-tech contribution types within dataset versus Mack et al. accessibility data [112].

Contribution types Papers w/code % dif. This code only % dif.


Empirical 410 (73%) +12.7 226 (40.2%) +6.4
Artifact 224 (39.9%) -15.6 52 (9.2%) -26.8
Survey 60 (10.7%) +10.1 51 (9.1%) +9.1
Methodological 43 (7.7%) +4.5 18 (3.2%) +2.8
Theoretical and opinion 19 (3.4%) -5.3 12 (2.1%) +0.9
Dataset 6 (1.1%) -0.3 6 (1.1%) +0.7
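The “% dif.” columns here and in Tables 7-10 are simple differences: our dataset’s share for a code minus the corresponding Mack et al. share [112]. A small sketch; the Mack et al. shares below are back-derived from this table purely for illustration:

```python
# "% dif." = our dataset's share minus the Mack et al. accessibility
# dataset's share for the same code. The Mack et al. values here are
# back-derived from Table 6 for illustration only.
our_share = {"Empirical": 73.0, "Artifact": 39.9, "Survey": 10.7}
mack_share = {"Empirical": 60.3, "Artifact": 55.5, "Survey": 0.6}  # back-derived

pct_dif = {code: round(our_share[code] - mack_share[code], 1) for code in our_share}
print(pct_dif)  # {'Empirical': 12.7, 'Artifact': -15.6, 'Survey': 10.1}
```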

Table 7: Frequency of applied codes for location for 415 user study papers within dataset versus Mack et al. accessibility
data [112].

Study location Papers w/code % dif. This code only % dif.


Home, residence or school 181 (43.6%) +14.7 139 (33.5%) +15.7
Near/at researchers’ lab 178 (42.9%) +15.6 145 (34.9%) +15.4
Neutral location 71 (17.1%) +10.4 37 (8.9%) +5.8
Online/remote 50 (12%) -8.5 34 (8.2%) -1.9

112]. However, PD methods have not been widely adopted in high-tech AAC user-study papers (N=20, 4.8%) – lower than the Mack et al. dataset (-5.5%). Indeed, for users with CCNs, traditional PD methods are more inaccessible compared to other communities and groups [28, 136]. PD is cognitively demanding and requires people with CCNs to have high levels of speech and language proficiency [136, 194]. Prior research has outlined challenges in engaging populations with autism [57], aphasia [28, 90], Parkinson’s/dementia [26] and IDD [136] in PD. Researchers have developed solutions to mitigate against this through proxies [28] – e.g., using SLTs for the PD of high-tech AAC. Other studies that did not use co-design engaged in other rich design activities such as user-centred design [186], intervention programs [187], use of probes [129] and academic workshops [195].

4.2.5 Category of Devices. Figure 6 shows that research has regularly provided interventions using expensive standalone commercial high-tech AAC devices, including DynaVox models (N=63), e.g., the Dynavox 3100 (N=9), and their eye-tracking technologies, the Tobii models, e.g., Tobii T60 (N=5) and Tobii X120 (N=1). Other standalone high-tech AAC devices used in research include the Lightwriter (N=14), Pathfinder (N=14) and Liberator (N=13). Yet recently, research with downloadable software, e.g., EZ keys (N=6), SentenceShaper (N=7), or apps, e.g., Proloquo2Go (N=16), Go Talk Now (N=11) – which convert laptops, smartphones, iPads (N=46)

Table 8: Frequency of applied codes for methods for 415 user study papers within dataset versus Mack et al. accessibility
data [112].

Method Papers w/code % dif. This code only % dif.


Controlled experiment 188 (45.3%) +10.7 128 (30.8%) +19.3
Usability testing 119 (28.7%) -13 45 (10.8%) +1.2
Field studies 75 (18.1%) +0.3 25 (6%) +1.4
Interviews 67 (16.1%) -26 18 (4.3%) -1.4
Surveys and questionnaires 65 (15.7%) -9.9 23 (5.5%) +4.2
Case studies 48 (11.6%) +7.6 32 (7.7%) +7.5
Focus groups 21 (5.1%) -0.8 7 (1.7%) +0.9
Workshops 4 (1%) -17.4 1 (0.2%) -2.9
Other 2 (0.5%) -15.6 2 (0.5%) -0.3
Randomized control trials 0 n/a 0 (0%) n/a

[Figure 6 data – papers per device/technology: Dynavox related devices (63), iPad high-tech AAC (46), Tobii T60 and eye-tracking technologies (21), Proloquo2go app (16), Lightwriter models (14), Pathfinder models (14), Liberator models (13), DecTalk synthesizer (13), Tablet (11), DeltaTalker models (11), GoTalk Now app (11), Dynavox 3100 (9), BIGmack switches (7), SentenceShaper software (7), EZ keys software (6), TouchTalker models (6), Minspeak icon system (6), Vantage models (6), Vanguard models (5)]

Figure 6: Inventory of 19 key devices and technologies with a frequency of at least 5 papers within the dataset.

or tablets (N=11) into a high-tech AAC device – has grown in prominence. Elsewhere, we find a slight preference towards testing with commercial (N=227, 54.7%) versus non-commercial (N=188, 45.3%) high-tech AAC within user-study papers. Indeed, for SLTs and non-computer-science research groups, it is often easier to program and customise pre-existing, available commercial high-tech AAC devices than to develop new non-commercial high-tech AAC. Another advantage of commercial high-tech AAC is that it is a more reliable intervention, as the device is maintained by an external corporation [4]. Non-commercial offerings involve the research team actively building and developing new high-tech AAC solutions for the end users within the study. An advantage of the non-commercial solutions is that the high-tech AAC is often free to its user and does not involve external purchase, subscription or maintenance fees [186].

4.3 Communities of Focus and Participants for High-tech AAC Research

4.3.1 The Communities of Focus. Shown in Table 9, high-tech AAC research has largely focused on users categorised as Other (N=219, 39%) and motor impairments (N=182, 32.4%). Users categorised as “Other” do not belong to a specific community of focus and

Table 9: Frequency of applied codes for community of focus within dataset versus Mack et al. accessibility data [112].

Community of focus Papers w/code % dif. This code only % dif.


Other 219 (39%) +29.9 209 (37.2%) +33.2
Motor impairments 182 (32.4%) +18.2 146 (26%) +14.3
Other cognitive 74 (13.2%) +4.1 59 (10.5%) +4.8
Autism 71 (12.6%) +6.5 46 (8.2%) +4
IDD 67 (11.9%) +9.1 29 (5.2%) +3.6
DHH 14 (2.5%) -8.8 5 (0.9%) -7.6
BVI 11 (2%) -41.5 2 (0.4%) -40.2
Older adults 4 (0.7%) -8.2 0 (0%) -5.7
General disability 3 (0.5%) -8.6 1 (0.2%) -5.9

the research is contributing to a broad community of high-tech AAC users, e.g., “children” or “AAC users” – this is despite each disabled group having very specific requirements17. Motor impairments have also received a significant research contribution (N=182, 32.4%). In particular, disabilities such as cerebral palsy (N=107, 19.1%) and ALS (N=20, 3.6%) have been a specific focus for the motor-impairment high-tech AAC research. Groups that have received less research include cognitive impairments (N=74, 13.2%), autism (N=71, 12.6%) and IDD (N=67, 11.9%). Within other cognitive impairments, the dominant contribution has been towards people living with aphasia (N=47, 8.4%) and people with TBIs (N=10, 1.8%). Many of the autism and IDD contributions have built high-tech AAC with child and adolescent users, e.g., [55]. Groups with little research include BVI (N=11, 2%), DHH (N=14, 2.5%), Older adults (N=4, 0.7%) and General disabilities (N=3, 0.5%) – these are not mainstream users of high-tech AAC with CCNs. Our data versus the Mack et al. dataset for accessibility research reveals some differences [112]. In broader accessibility research BVI is the dominant focus (-41.5%), whilst in high-tech AAC it is Other (+29.9%) and motor impairments (+18.2%). Also, Autism (+6.5%), IDD (+9.1%) and Other cognitive impairments (+4.1%) received a slightly higher proportion of contributions within our dataset. However, DHH (-8.8%), Older adults (-8.2%) and General disability (-8.6%) received a slightly lower contribution within our dataset.

4.3.2 Study Participants. Table 10 shows the participant data for user studies. The majority of studies include people with disabilities (N=305, 73.5%), albeit in these studies the median participant counts are predominantly low (N=5) and imbalanced, with a high standard deviation (SD) and range. Equally, fewer papers include people with disabilities only (N=178, 42.9%). A proportion of the papers include caregivers (N=75, 18.1%) and specialists (N=62, 14.9%), particularly often to complement other users involved in the user study18. A sizeable proportion of the user studies include people without disabilities (N=130, 31.3%), with even N=82 papers (19.8%) involving only people without disabilities. For studies including people without disabilities we find participant counts are much higher, with a median of N=18. Due to the difficulties of designing for users with CCNs, proxies are often favoured [28]. Indeed, it is even quite common for papers to test with proxies to gain initial data on high-tech AAC before considering a disabled group of users [28]. Albeit, sometimes it makes legitimate sense to use participants without disabilities because the research is examining perspectives on high-tech AAC – monitoring its social acceptance amongst peers, e.g., [101].

A significant proportion of high-tech AAC user-study papers include specialists (14.9%) and caregivers (18.1%). Additionally, in these studies the median participant count positively increases for specialists (N=9.5) and caregivers (N=8.5). Indeed, this is to be expected, as people with CCNs in user studies sometimes require caregivers and specialists to assist with communicating and expressing themselves [136]. Equally, these groups can provide key insights and feedback on the high-tech AAC within the user study [28]. Comparing these findings versus the wider Mack et al. research, we find a lower number of user studies coded with people with disabilities (-11.2%) and a higher number of user studies using people without disabilities (+8.2%). In addition, we find a marginally lower number of papers with specialists (-2.1%) but much higher usage of caregivers (+8.7%). In terms of community of focus, median participant counts are highest for the non-specific category of Other (N=16). However, median participant counts are lower for Motor impairments (N=6) and Other cognitive (N=6), and participant counts are very low for high-tech AAC user-study papers involving people with IDD (N=5) and Autism (N=4)19. Overall, the low median participant counts, high ranges and standard deviations reflect the methods favoured in high-tech AAC research – very large surveys versus interventions with just one or a small group of users.

4.3.3 Usage of Proxies and Ability-based Comparisons. From our 415 user-study papers, we identified 86 papers where proxies were used (N=86, 20.7%) – higher than the Mack et al. dataset (+12.7%). To a certain degree this trend was expected, as proxies are often used to circumvent communication barriers [28]. However, many of the proxy papers (34.9%, N=30/86) were for motor impairments, which compromises the validity of findings, as it is difficult for a proxy user to accurately replicate the role of a motor-impaired user [113]. Sometimes a member of the research team has acted as a proxy high-tech AAC user to obtain social perspectives from

17 E.g., the high-tech AAC requirements of a user living with aphasia are very different to those of a user living with cerebral palsy.
18 Indeed, only N=15 papers are coded with just specialists and N=6 coded with just caregivers.
19 Typically, these studies can often include vulnerable child users of high-tech AAC, resulting in a lower participant count.

Table 10: Frequency of applied codes for study participants within 415 user study papers versus Mack et al. accessibility
data [112].

Participant group Papers w/code % dif. This code only % dif.


People with disabilities 305 (73.5%) -11.2 178 (42.9%) -2
People without disabilities 130 (31.3%) +8.2 82 (19.8%) +18.8
Caregivers 75 (18.1%) +8.7 6 (1.4%) +0.6
Specialists 62 (14.9%) -2.1 15 (3.6%) +1.7

Table 11: Quantitative analysis of participant counts for 415 user-study papers by community of focus and participant groups.

Group Median IQR SD Mean Range Total papers


Older 78 79 80.3 94.7 24–182 3
People without disabilities 18 20.25 84.6 32.7 1–893 130
Other 16 27.5 1188.5 146.1 1–12,776 121
Specialists 9.5 18.75 63.8 31.6 1–320 62
DHH 9.5 37.8 3678.5 1085.6 1–12,776 12
Caregivers 8.5 26 87.2 37.6 1–624 75
General 8 4.5 4.7 6.3 1–10 3
All user-studies 7 17 635.1 54.2 1–12,776 415
Other cognitive 6 9.5 12.4 10.5 1–75 63
Motor impairments 6 12 1024.3 105.4 1–12,776 158
People with disabilities 5 11 735.7 62.5 1–12,776 305
IDD 5 8 1719.9 243.9 1–12,776 55
Autism 4 8.75 1735.8 248 1–12,776 54
BVI 2 20 22 14.9 1–60 8
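The summary columns of Table 11 (median, IQR, SD, mean, range) can be derived from raw per-paper participant counts using the standard library; a minimal sketch over a hypothetical list of counts (not actual data from our dataset):

```python
# Descriptive statistics of the kind reported in Table 11, computed from
# a hypothetical list of per-paper participant counts.
import statistics

participant_counts = [1, 2, 4, 5, 5, 7, 9, 12, 24, 60]  # hypothetical

median = statistics.median(participant_counts)
q1, _, q3 = statistics.quantiles(participant_counts, n=4)  # quartile cut points
iqr = q3 - q1
sd = statistics.stdev(participant_counts)  # sample standard deviation
mean = statistics.mean(participant_counts)
rng = (min(participant_counts), max(participant_counts))
print(median, iqr, round(sd, 1), mean, rng)
```

As in the real data, a single large study (here 60 participants) inflates the mean and SD well above the median, which is why the table reports both.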

peers on the device and its user [176], and proxies are used for initial pilot testing of devices to quickly determine the effectiveness and usability of the device [197]. We identified just 25 cases of ability-based comparisons (N=25, 6%), in which disabled and non-disabled participants were compared in the user-study papers – sometimes this is to have able participants act as the control group for determining if the high-tech AAC SGD offers the same intelligibility as natural human vocals [127] and whether the communication rate is equivalent [88]. Additionally, testing has been performed in dyads to determine cross-task performance [159]. Comparing to the Mack et al., ability-based comparisons are used less frequently (-7.6%) within our dataset [112].

5 DISCUSSION

Despite four decades of research, abandonment of high-tech AAC devices and interventions continues [140, 185]. Therefore, this first SR and taxonomy of ACM high-tech AAC research serves as a reflection for more successful future high-tech AAC development and intervention outcomes. Based on our taxonomy and results, we formulate directions for further research in high-tech AAC and interventions in light of our three research questions: 1.2 to 1.2.

5.1 Expanding the Characteristics of High-tech AAC Devices and Interventions

Our first research question concerns the pre-existing taxonomy and dominant characteristics of high-tech AAC. From our examination and codification of the literature, we find that mechanical (38.4%) was the dominant input modality for standalone high-tech AAC devices before a surge in tactile (32.7%) inputs with the decentralisation of high-tech AAC through capacitive touchscreens (e.g., smartphones, tablets) and easily downloadable apps (e.g., through app stores). However, audio (62.8%) and visual (27.9%) have remained the most well-explored output modalities. In response to these findings, high-tech AAC designers should more systematically explore the input/output interaction possibilities of high-tech AAC. For example, with regards to inputs: camera, gestural, verbal and contextual inputs serve as still largely unexplored input modalities (5.2% to 13.7%), which we believe hold much promise for future high-tech AAC. For instance, Fiannaca et al. increased gaze-based and eye-tracking communication rates by leveraging contextual inputs with AACrobat [51]. Additionally, Wisenburn et al. used verbal inputs from partner speech recognition, providing contextually relevant utterances and increased communication rates [195]. Lastly, with regards to gestural interaction, Ascari et al. explored non-invasive, personalised gestural interaction for the purposes of communication [7], whilst BCI-AAC remains an emerging research area which enables practitioners to entirely bypass users’ motor systems – albeit the successful deployment and calibration of BCIs is currently challenging [119, 148, 149]. High-tech AAC devices that have leveraged novel input characteristics have offered promising results for their user groups, for instance using camera input for eye-tracking [97], storytelling [3] and understanding the environment [141]. Harnessing pre-existing

interdisciplinary AI research, such as natural language processing (NLP) [73] or smart environments [23, 36, 64], might support the development of contextual and verbal inputs for high-tech AAC to improve communication rates and enable easier interactions, more agency and autonomy. Complementary research on outputs finds that high-tech AAC typically outputs audio (62.8%) and visual (27.9%) channels to communicate – future work might explore the feasibility of motion (2%) and even thermoception (0%) as outputs. Indeed, these more subtle and less information-dense channels can still be leveraged to support communication and engaging human-to-human interaction, particularly non-verbal communication (cf. “sidekicks” [179] and physical expressive objects [180]).

With respect to common scalar attributes and interface layouts of high-tech AAC, we find that AAC is often customisable (55.2%) and offers automation (46.6%), typically for prediction to foster increased communication rates [74]. High-tech AAC layouts often use text (57.3%), symbols (39.7%), grids (27.9%) and pictures (22.6%) – for instance, the DynaVox 3100 serves as a typical grid, symbol and text-based interface [4]. To better expand these characteristics, future AAC that just offers customisation and automation is likely not enough. Indeed, too often using high-tech AAC does not feel natural but restrained for the end user’s autonomy, resulting in abandonment [81, 140] – the voice of the device does not reflect their personhood [117], the device is impractical to carry around day-to-day [11], indiscreet, limiting social acceptance [145], difficult to use with few modes of input [20] and not adaptive to the constantly evolving communication abilities of the user [20, 80]. Additionally, future AAC research could explore more VSDs, video and animation – these are under-researched interfaces with the ability to convey meaning [83]. Importantly, high-tech AAC designers must be aware that text-based interfaces are challenging for some groups – e.g., people with aphasia [189].

Turning to scenarios of usage, communication types and models – we find that high-tech AAC devices are typically designed to be used with professionals (56%) and families/friends (39.3%), supporting verbal communication (92.3%). However, high-tech AAC designers must be aware that users with CCNs often have larger social groups than anticipated [180]; users also require the ability to use their high-tech AAC to communicate virtually (6.6%) on social media [34, 123] or, as widely appreciated during the pandemic, on videoconferencing software [96, 137, 180]. Little high-tech AAC is designed to function in group settings (6.9%), particularly if the communication partner is a fellow AAC user (2%) with the same or a similar high-tech AAC device – research has shown that in this case breakdowns are common unless both users have analogous vocabulary libraries pre-installed prior to the exchange [20, 81]. In addition, there has been an under-representation of high-tech AAC that empowers the user’s non-verbal communication (1.4%) – this compounds into most high-tech AAC offering linear communication interactions (87.4%) [180]. Indeed, high-tech AAC must show an appreciation of the user’s pre-existing communication abilities to increase communication rates beyond linear – encouraging embodied forms of communication and re-framing high-tech AAC as an accepted interdependent assistive technology [15, 81]. Too often, high-tech AAC restrains communication by acting as a physical independent barrier – restraining the user from freely employing their own unique communication skills, including non-verbal signals, to provide feedback more quickly [15, 52, 81].

5.2 Reassessment of Incumbent Research Contributions and Methodologies in High-tech AAC Research and Interventions

The second research question of this paper focused on what methods are used to contribute towards high-tech AAC research and interventions. Versus other contributions, dataset contributions (1.1%) are comparatively under-explored; thus high-tech AAC research could certainly have more contributions focused on AI and ML datasets. As proposed by Mack et al. and Kane et al., looking to the future, AI and ML datasets should certainly be rooted in accessibility research [112] and equally have the potential to greatly enrich the effectiveness of high-tech AAC research and interventions [92, 140]. Already, work at ASSETS’21 by Theodorou et al. [172] explores the concept of a “Disability-First” dataset, which supports an accessible method for visually impaired users to create their own datasets for personal object recognition. Furthermore, as rightly noted by Park et al., there is a definitive lack of data from disabled populations, and consequently they have provided a robust set of design guidelines for developers looking to gather data from disabled populations [147]. Indeed, the properly considered deployment of AI within high-tech AAC might support quicker, empathetic and more seamless communication opportunities for people living with CCNs [140]. For instance, Vertanen et al. successfully deployed crowd-sourced datasets to offer improved word prediction and better high-tech AAC communication rates – yet critically, the underlying data did not originate specifically from AAC users and disabled communities [184].

User-study papers (73%) comprise a sizable proportion of our dataset, and we believe this must continue and even increase for successful high-tech AAC outcomes [140]. The location of user studies is significant for high-tech AAC – researchers have successfully tested widely in residences (43.6%), labs (42.9%), neutral (17.1%) and even remote (12%) locations. Nonetheless, researchers must look to test high-tech AAC performance in the very conditions where it will be deployed. Key reasons preventing high-tech AAC adoption include that the devices promise an unrealistic “magic wand” [20], fail to adapt to contextual environmental challenges (e.g., noisy environments) [96, 117] or fail to consider peer perceptions of the high-tech AAC device [5] – the potential accompanying stigmas and their effect on conversational power dynamics [15, 101]. Indeed, the high-tech AAC device should be shaped by its user and context of usage [20] – high-tech AAC performance cannot be assumed on the basis of unnatural controlled conditions rather than the actual contexts of daily usage [140]. With regards to high-tech AAC research methods, we found no cases of randomized control trials (RCTs) (N=0, 0%) within our dataset. Indeed, in 2018 Kent-Walsh et al. reported that no RCTs had been published in the AAC journal [93]. However, further searching outside the realm of this SR within the medical literature20, we do find some RCTs with AAC, e.g., [79, 134]. Furthermore, Todman et al. have strongly advocated for their usage

20 Beyond the scope of our investigation, RCTs with AAC were found via medical literature databases, i.e., https://pubmed.ncbi.nlm.nih.gov/ and Google Scholar.

as AAC devices are categorised as medical equipment, requiring random assignment procedures for treatment efficacy [175]. Therefore, we suggest further RCTs to mitigate against bias and to ensure the clinical efficacy of high-tech AAC interventions [176].

Elsewhere, usability testing (28.7%), interviews (16.1%), workshops (1%) and PD methods (4.8%) all rank lower than in the Mack et al. dataset for the general accessibility community. Consequently, we find a smaller use of methods in which AAC users with CCNs are directly engaged and the development process is cooperative, e.g., [23, 90]. We accept that it is difficult to recruit for, and engage in, interviews, participatory design, usability testing [16] and workshops [57] with users that have severe CCNs, struggle to communicate independently [90] and provide consent [28, 136]. Nonetheless, we echo Mankoff et al. [114] – we must, as a community, actively strive to include end users within the design and development process of AAC. Researchers have, in summary, not received enough feedback21 from empowered end users during the development process. Future research should build upon prior PD which has focused on motor impairments [41], autism [57, 167], dementia [109], aphasia [60, 138, 194] and older individuals [110]. Further participatory-designed high-tech AAC solutions for people with CCNs could result in better outcomes and more long-term adoption [136]. Indeed, examples of PD found within our high-tech AAC dataset include Kane et al.’s TalkAbout – a context-aware, adaptive AAC system co-designed in collaboration with 5 adults with aphasia [90] – and de Faria Borges et al.’s PD4CAT, a customized communication device developed in collaboration with a child with cerebral palsy [41].

5.3 Engaging with Communities and Greater Representation in High-tech AAC Research and Interventions

For the third research question, understanding who high-tech AAC research focuses upon, we get very different results from the Mack et al. research. Within our dataset, Other (39%) and Motor impairments (32.4%) feature prominently versus the Mack et al., yet comparatively BVI takes up a much smaller minority of research (-41.5%). Directly commenting on these results, pre-existing high-tech AAC research has perhaps lacked specificity regarding its targeted community of focus, as the majority of contributions within our dataset are towards Other [112]. Investigating further, we find that the Other community of focus is frequently represented in papers as diluted communities such as “AAC users”, e.g., [56], or “Children”, e.g., [144]. This lack of specificity has perhaps resulted in high-tech AAC that lacks focus or curated design for specific target groups, disabilities, communities and their unique sets of needs [112]. Therefore, a clear direction for research is that we implore future high-tech AAC development to directly contribute towards a specific community of focus with CCNs. Elsewhere, we find that Other cognitive (13.2%), ASD (12.6%) and IDD (11.9%) could receive more high-tech AAC contributions and are under-represented versus Motor impairments (32.4%) – indeed, Other cognitive [146], Autism [37] and IDD [115] are prominent communities within society. Lastly, Older adults have a very small contribution (0.7%) – nonetheless, high-tech AAC could feasibly contribute towards people living with dementia and other CCNs that will become more prominent with an ageing global population [116].

With regards to participant groups, people with disabilities feature most prominently (73.5%), followed by people without disabilities (31.3%). Although our high-tech AAC dataset has fewer people with disabilities (-11.2%) and more people without (+8.2%) versus the Mack et al., we believe that this is because our dataset occurs over a broader period (1978-2021), meaning that this change is perhaps symptomatic of evolving standards of practice for accessibility research – in particular, a focus on increased engagement with participants with disabilities as a consequence of the social model of disability [45]. Therefore, we echo Mack et al.’s encouragement of more direct engagement with people with disabilities [112]. Furthermore, high-tech AAC research has shown a propensity to engage with caregivers (18.1%), specialists (14.9%) and peers – in the right circumstances this should be encouraged. Indeed, caregivers, specialists (i.e., SLTs) and peers can often work as good communication partners in user studies with people with CCNs – supporting the high-tech AAC users’ expression when engaging with the study [29] or for dyads [103]. Equally, if the study involves children, parents or guardians can also help support the child with CCNs’ reflection and potentially provide feedback on the underlying effectiveness of the intervention [43]. Indeed, Delarosa et al. rightly note that the successful integration of a high-tech AAC system requires strong commitment from parents and other family members [43].

Non-disabled participants are regularly used within our dataset (31.3%), typically for either proxy usage (20.7%) or ability-based comparisons (6%). Despite difficulties designing and recruiting participants with CCNs [160], we advocate not using just proxies, as Mack et al. note these high-tech AAC user studies run the risk of reinforcing normalist beliefs [112]. Furthermore, proxy users can often not accurately replicate a disabled user for testing the usability of a high-tech AAC device – particularly for users with motor impairments [113]. Instead, proxies can be used jointly to perform initial testing [28], triangulate findings [58] and perhaps learn of the differences in social perceptions versus a disabled user with high-tech AAC [13]. Although infrequent within our dataset, ability-based comparisons are an imperfect heuristic for determining the performance of a high-tech AAC device – other qualitative findings concerning preference and satisfaction from disabled high-tech AAC users are more useful for determining if an intervention is successful [11, 87]. Indeed, even if the high-tech AAC successfully offers pathways for communication, that is not definitively a pre-requisite for long-term adoption [20].

6 LIMITATIONS

As in all SRs, we had to limit our scope. Despite a robust PRISMA-guided methodology with snowballing, our approach would not have covered all AAC papers. Indeed, we only used two scientific databases – the ACM DL and Scopus – for constructing our SR dataset; there are other sources (e.g., IEEE, Elsevier, Thomson Reuters, etc.) which will likely reveal more papers on the topic of high-

21 Bircanin et al. have even suggested the formation of AAC publics
to promote greater
discourse on high-tech AAC to ensure more empowering high-tech AAC solutions are tech AAC. Additionally, for manual coding of such a large dataset
innovated [20]. there are consequently areas of potential inconsistency, subjectivity
ASSETS ’22, October 23–26, 2022, Athens, Greece Curtis et al.
and errors. Consequently, some researchers have argued in favour of using quantitative and automated analyses to mitigate against this concern [49, 69, 102, 104]. Our comparison of findings with the Mack et al. [112] dataset also has limitations – we are a different research team and therefore have introduced our own subjective biases into our review, and our dataset covers a wider breadth of literature. Indeed, future systematic reviews of AAC could look to be more focused – for instance, explicitly focusing on commercial devices, or on AAC developed within a smaller period of inclusion. Much like other research [31, 112], we also encountered highly varied language use across papers, which impacted our classification of codes.

7 CONCLUSION

AAC is an essential support for many, yet it is under-adopted and frequently abandoned. To provide deeper insight and directions for future research, we provide a taxonomic overview of the current state of high-tech AAC devices and their methodologies, contributions and communities of focus – compiling a paper count and an inventory of the key devices used. Our results suggest three future research directions. Firstly, more research is needed to explore high-tech AAC inputs/outputs, the role of high-tech AAC in supporting non-verbal communication, and high-tech AAC communication with groups, fellow AAC users and virtual scenarios. Secondly, in terms specifically of methods, future high-tech AAC research could use PD methods and RCTs. Thirdly and lastly, with respect to communities of focus, limited high-tech AAC research focuses on Other Cognitive, Autism and IDD, with low median participant counts for studies incorporating people with disabilities (N=5). We hope that our contribution will foster new and more optimally directed AAC research, towards promulgating greater adoption and more successful long-term intervention outcomes.

ACKNOWLEDGMENTS

We would like to thank Dr Rita Borgo for her support, comments and advice throughout the establishment and writing of this paper. Many thanks to Dr Claudia Daudén Roquet for their helpful critique of an earlier draft of this paper. This work was supported in part by a UKRI EPSRC Studentship.

REFERENCES
[1] Kim Adams and Al Cook. 2016. Using robots in “hands-on” academic activities: a case study examining speech-generating device use and required skills. Disability and Rehabilitation: Assistive Technology 11, 5 (2016), 433–443.
[2] Kim D Adams and Albert M Cook. 2013. Programming and controlling robots using scanning on a speech generating communication device: A case study. Technology and Disability 25, 4 (2013), 275–286.
[3] Abdullah Al Mahmud, Rikkert Gerits, and Jean-Bernard Martens. 2010. XTag: designing an experience capturing and sharing tool for persons with aphasia. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries. 325–334.
[4] DynaVox. 2014. DynaVox T10/T15 User's Guide. (2014).
[5] Kate Anderson, Susan Balandin, and Sally Clendon. 2011. “He cares about me and I care about him.” Children's experiences of friendship with peers who use AAC. Augmentative and Alternative Communication 27, 2 (2011), 77–90.
[6] Rúbia EO Schultz Ascari, Roberto Pereira, and Luciano Silva. 2020. Computer vision-based methodology to improve interaction for people with motor and speech impairment. ACM Transactions on Accessible Computing (TACCESS) 13, 4 (2020), 1–33.
[7] Rúbia EO Schultz Ascari, Luciano Silva, and Roberto Pereira. 2019. Personalized interactive gesture recognition assistive technology. In Proceedings of the 18th Brazilian Symposium on Human Factors in Computing Systems. 1–12.
[8] S Aswin, Ayush Ranjan, and KV Prashanth. 2019. Smart Wearable Speaking Aid for Aphonic Personnel. In International Conference on Computational Vision and Bio Inspired Computing. Springer, 179–186.
[9] Jon Baio. 2014. Prevalence of autism spectrum disorder among children aged 8 years – autism and developmental disabilities monitoring network, 11 sites, United States, 2010. (2014).
[10] Dean C Barnlund. 2017. A transactional model of communication. In Communication Theory. Routledge, 47–57.
[11] Susan Baxter, Pam Enderby, Philippa Evans, and Simon Judge. 2012. Barriers and facilitators to the use of high-technology augmentative and alternative communication devices: a systematic review and qualitative synthesis. International Journal of Language & Communication Disorders 47, 2 (2012), 115–129.
[12] Susan Baxter, Pam Enderby, Philippa Evans, and Simon Judge. 2012. Interventions using high-technology communication devices: a state of the art review. Folia Phoniatrica et Logopaedica 64, 3 (2012), 137–144.
[13] Ann Beck, Stacey Bock, James Thompson, and Kullaya Kosuwan. 2002. Influence of communicative competence and augmentative and alternative communication technique on children's attitudes toward a peer who uses AAC. Augmentative and Alternative Communication 18, 4 (2002), 217–227.
[14] Hrvoje Belani. 2012. Towards a usability requirements taxonomy for mobile AAC services. In 2012 First International Workshop on Usability and Accessibility Focused Requirements Engineering (UsARE). IEEE, 36–39.
[15] Cynthia L Bennett, Erin Brady, and Stacy M Branham. 2018. Interdependence as a frame for assistive technology research and design. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 161–173.
[16] Cynthia L Bennett, Burren Peil, and Daniela K Rosner. 2019. Biographical prototypes: Reimagining recognition and disability in design. In Proceedings of the 2019 on Designing Interactive Systems Conference. 35–47.
[17] David R Beukelman, Susan Fager, Laura Ball, and Aimee Dietz. 2007. AAC for adults with acquired neurological conditions: A review. Augmentative and Alternative Communication 23, 3 (2007), 230–242.
[18] David R Beukelman and Pat Mirenda. 2013. Augmentative & alternative communication: Supporting children and adults with complex communication needs. Paul H. Brookes Pub.
[19] Elizabeth E Biggs, Erik W Carter, and Carly B Gilson. 2018. Systematic review of interventions involving aided AAC modeling for children with complex communication needs. American Journal on Intellectual and Developmental Disabilities 123, 5 (2018), 443–473.
[20] Filip Bircanin, Bernd Ploderer, Laurianne Sitbon, Andrew A Bayor, and Margot Brereton. 2019. Challenges and opportunities in using augmentative and alternative communication (AAC) technologies: Design considerations for adults with severe disabilities. In Proceedings of the 31st Australian Conference on Human-Computer-Interaction. 184–196.
[21] Alexandre Luís Cardoso Bissoli, Yves Luduvico Coelho, and Teodiano Freire Bastos-Filho. 2016. A system for multimodal assistive domotics and augmentative and alternative communication. In Proceedings of the 9th ACM International Conference on PErvasive Technologies Related to Assistive Environments. 1–8.
[22] Rolf Black, Per Ola Kristensson, Jianguo Zhang, Annalu Waller, Sophia Bano, Zulqarnain Rashid, and Christopher Norrie. 2016. ACE-LP: Augmenting communication using environmental data to drive language prediction. In Communication Matters-CM2016 National Conference.
[23] Rolf Black, Annalu Waller, Ross Turner, and Ehud Reiter. 2012. Supporting personal narrative for children with complex communication needs. ACM Transactions on Computer-Human Interaction (TOCHI) 19, 2 (2012), 1–35.
[24] Sarah W Blackstone, Michael B Williams, and Mick Joyce. 2002. Future AAC technology needs: consumer perspectives. Assistive Technology 14, 1 (2002), 3–16.
[25] Jamie B Boster and John W McCarthy. 2018. Designing augmentative and alternative communication applications: The results of focus groups with speech-language pathologists and parents of children with autism spectrum disorder. Disability and Rehabilitation: Assistive Technology 13, 4 (2018), 353–365.
[26] Aikaterini Bourazeri and Simone Stumpf. 2018. Co-Designing Smart Home Technology with People with Dementia or Parkinson's Disease. In Proceedings of the 10th Nordic Conference on Human-Computer Interaction (Oslo, Norway) (NordiCHI '18). Association for Computing Machinery, New York, NY, USA, 609–621. https://doi.org/10.1145/3240167.3240197
[27] LouAnne E Boyd, Alejandro Rangel, Helen Tomimbang, Andrea Conejo-Toledo, Kanika Patel, Monica Tentori, and Gillian R Hayes. 2016. SayWAT: Augmenting face-to-face conversations for adults with autism. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. 4872–4883.
[28] Jordan L Boyd-Graber, Sonya S Nikolova, Karyn A Moffatt, Kenrick C Kin, Joshua Y Lee, Lester W Mackey, Marilyn M Tremaine, and Maria M Klawe. 2006. Participatory design with proxies: developing a desktop-PDA system to support people with aphasia. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 151–160.
[29] Alisa Brownlee and Lisa M Bruening. 2012. Methods of communication at end of life for the person with amyotrophic lateral sclerosis. Topics in Language
State of the Art in AAC ASSETS ’22, October 23–26, 2022, Athens, Greece
Disorders 32, 2 (2012), 168–185.
[30] Frederik Brudy, Christian Holz, Roman Rädle, Chi-Jui Wu, Steven Houben, Clemens Nylandsted Klokmose, and Nicolai Marquardt. 2019. Cross-device taxonomy: Survey, opportunities and challenges of interactions spanning across multiple devices. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–28.
[31] Emeline Brulé, Brianna J Tomlinson, Oussama Metatla, Christophe Jouffrais, and Marcos Serrano. 2020. Review of Quantitative Empirical Evaluations of Technology for People with Visual Impairments. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–14.
[32] Joan Bruno and David Trembath. 2006. Use of aided language stimulation to improve syntactic performance during a weeklong intervention program. Augmentative and Alternative Communication 22, 4 (2006), 300–313.
[33] Jessica Caron, Janice Light, Beth E Davidoff, and Kathryn DR Drager. 2017. Comparison of the effects of mobile technology AAC apps on programming visual scene displays. Augmentative and Alternative Communication 33, 4 (2017), 239–248.
[34] Jessica Caron, Janice Light, and Kathryn Drager. 2016. Operational demands of AAC mobile technology applications on programming vocabulary and engagement during professional and child interactions. Augmentative and Alternative Communication 32, 1 (2016), 12–24.
[35] Richard Cave and Steven Bloch. 2021. Voice banking for people living with motor neurone disease: Views and expectations. International Journal of Language & Communication Disorders 56, 1 (2021), 116–129.
[36] Rosanna Yuen-Yan Chan, Eri Sato-Shimokawara, Xue Bai, Motohashi Yukiharu, Sze-Wing Kuo, and Anson Chung. 2019. A context-aware augmentative and alternative communication system for school children with intellectual disabilities. IEEE Systems Journal 14, 1 (2019), 208–219.
[37] Flavia Chiarotti and Aldina Venerosi. 2020. Epidemiology of autism spectrum disorders: a review of worldwide prevalence estimates since 2014. Brain Sciences 10, 5 (2020), 274.
[38] Simone Ciccia, Alberto Scionti, Giacomo Vitali, and Olivier Terzo. 2020. QuadCOINS-Network: A Deep Learning Approach to Sound Source Localization. In Conference on Complex, Intelligent, and Software Intensive Systems. Springer, 130–141.
[39] Lauren Cooper, Susan Balandin, and David Trembath. 2009. The loneliness experiences of young adults with cerebral palsy who use alternative and augmentative communication. Augmentative and Alternative Communication 25, 3 (2009), 154–164.
[40] Elke Daemen, Pavan Dadlani, Jia Du, Ying Li, Pinar Erik-Paker, Jean-Bernard Martens, and Boris de Ruyter. 2007. Designing a free style, indirect, and interactive storytelling application for people with aphasia. In IFIP Conference on Human-Computer Interaction. Springer, 221–234.
[41] Luciana Correia Lima de Faria Borges, Lucia Vilela Leite Filgueiras, Cristiano Maciel, and Vinicius Carvalho Pereira. 2012. Customizing a communication device for a child with cerebral palsy using participatory design practices: contributions towards the PD4CAT method. In Proceedings of the 11th Brazilian Symposium on Human Factors in Computing Systems. 57–66.
[42] Denise C DeCoste. 1997. The handbook of augmentative and alternative communication. Cengage Learning.
[43] Elizabeth Delarosa, Stephanie Horner, Casey Eisenberg, Laura Ball, Anne Marie Renzoni, and Stephen E Ryan. 2012. Family impact of assistive technology scale: Development of a measurement scale for parents of children with complex communication needs. Augmentative and Alternative Communication 28, 3 (2012), 171–180.
[44] L Scott Doss, Peggy Ann Locke, Susan Johnston, Joe Reichle, Jeff Sigafoos, Paul Charpentier, and Dulce Foster. 1991. Initial comparison of the efficiency of a variety of AAC systems for ordering meals in fast food restaurants. Augmentative and Alternative Communication 7, 4 (1991), 256–265.
[45] Elizabeth Ellcessor. 2010. Bridging disability divides: A critical history of web content accessibility through 2001. Information, Communication & Society 13, 3 (2010), 289–308.
[46] Yasmin Elsahar, Sijung Hu, Kaddour Bouazza-Marouf, David Kerr, and Annysa Mansor. 2019. Augmentative and alternative communication (AAC) advances: A review of configurations for individuals with a speech disability. Sensors 19, 8 (2019), 1911.
[47] P Enderby, S Judge, S Creer, and A John. 2013. Examining the need for, and provision of, AAC in the United Kingdom. (2013).
[48] Annette Estes, Vanessa Rivera, Matthew Bryan, Philip Cali, and Geraldine Dawson. 2011. Discrepancies between academic achievement and intellectual ability in higher-functioning school-aged children with autism spectrum disorder. Journal of Autism and Developmental Disorders 41, 8 (2011), 1044–1052.
[49] Hans J Eysenck. 1994. Systematic reviews: Meta-analysis and its problems. BMJ 309, 6957 (1994), 789–792.
[50] Torsten Felzer and Rainer Nordmann. 2008. Using intentional muscle contractions as input signals for various hands-free control applications. In Proceedings of the 2nd International Convention on Rehabilitation Engineering & Assistive Technology. 87–91.
[51] Alexander Fiannaca, Ann Paradiso, Mira Shah, and Meredith Ringel Morris. 2017. AACrobat: Using mobile devices to lower communication barriers and provide autonomy with gaze-based AAC. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 683–695.
[52] Alexander J Fiannaca, Ann Paradiso, Jon Campbell, and Meredith Ringel Morris. 2018. Voicesetting: voice authoring UIs for improved expressivity in augmentative communication. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–12.
[53] Joseph L Fleiss and Jacob Cohen. 1973. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement 33, 3 (1973), 613–619.
[54] Amanda Fleury, Gloria Wu, and Tom Chau. 2019. A wearable fabric-based speech-generating device: system design and case demonstration. Disability and Rehabilitation: Assistive Technology 14, 5 (2019), 434–444.
[55] Margaret Flores, Kate Musgrove, Scott Renner, Vanessa Hinton, Shaunita Strozier, Susan Franklin, and Doris Hill. 2012. A comparison of communication using the Apple iPad and a picture-based system. Augmentative and Alternative Communication 28, 2 (2012), 74–84.
[56] Richard Foulds, Mathijs Soede, and Hans van Balkom. 1987. Statistical disambiguation of multi-character keys applied to reduce motor requirements for augmentative and alternative communication. Augmentative and Alternative Communication 3, 4 (1987), 192–195.
[57] Christopher Frauenberger, Julia Makhaeva, and Katta Spiel. 2017. Blending methods: Developing participatory design sessions for autistic children. In Proceedings of the 2017 Conference on Interaction Design and Children. 39–49.
[58] Melanie Fried-Oken, Lynn Fox, Marie T Rau, Jill Tullman, Glory Baker, Mary Hindal, Nancy Wile, and Jau-Shin Lou. 2006. Purposes of AAC device use for persons with ALS as reported by caregivers. Augmentative and Alternative Communication 22, 3 (2006), 209–221.
[59] Melanie Fried-Oken, Aimee Mooney, Betts Peters, and Barry Oken. 2015. A clinical screening protocol for the RSVP keyboard brain–computer interface. Disability and Rehabilitation: Assistive Technology 10, 1 (2015), 11–18.
[60] Julia Galliers, Stephanie Wilson, Jane Marshall, Richard Talbot, Niamh Devane, Tracey Booth, Celia Woolf, and Helen Greenwood. 2017. Experiencing EVA Park, a multi-user virtual world for people with aphasia. ACM Transactions on Accessible Computing (TACCESS) 10, 4 (2017), 1–24.
[61] Nestor Garay-Vitoria and Julio Abascal. 2004. A comparison of prediction techniques to enhance the communication rate. In ERCIM Workshop on User Interfaces for All. Springer, 400–417.
[62] Luís Filipe Garcia, Luís Caldas De Oliveira, and David Martins De Matos. 2015. Measuring the performance of a location-aware text prediction system. ACM Transactions on Accessible Computing (TACCESS) 7, 1 (2015), 1–29.
[63] Paola García, Eduardo Lleida, Diego Castán, José Manuel Marcos, and David Romero. 2015. Context-aware communicator for all. In International Conference on Universal Access in Human-Computer Interaction. Springer, 426–437.
[64] Miguel Gea-Megías, Nuria Medina-Medina, María Luisa Rodríguez-Almendros, and María José Rodríguez-Fórtiz. 2004. Sc@ut: Platform for communication in ubiquitous and adaptive environments applied for children with autism. In ERCIM Workshop on User Interfaces for All. Springer, 50–67.
[65] Amy Goldman. 2008. Funding AAC. Perspectives on Augmentative and Alternative Communication 17, 1 (2008), 33–35.
[66] Isabel Gómez, Pablo Anaya, Rafael Cabrera, Alberto Molina, Octavio Rivera, and Manuel Merino. 2010. Augmented and alternative communication system based on Dasher application and an accelerometer. In International Conference on Computers for Handicapped Persons. Springer, 98–103.
[67] Carol Goossens'. 1989. Aided communication intervention before assessment: A case study of a child with cerebral palsy. Augmentative and Alternative Communication 5, 1 (1989), 14–26.
[68] Daniel Gorenflo and Carole Gorenflo. 1997. Effects of synthetic speech, gender, and perceived similarity on attitudes toward the augmented communicator. Augmentative and Alternative Communication 13, 2 (1997), 87–91.
[69] Sebastian Götz. 2018. Supporting systematic literature reviews in computer science: the systematic literature review toolkit. In Proceedings of the 21st ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings. 22–26.
[70] Zahid Halim and Ghulam Abbas. 2015. A Kinect-based sign language hand gesture recognition system for hearing- and speech-impaired: a pilot study of Pakistani sign language. Assistive Technology 27, 1 (2015), 34–43.
[71] Foad Hamidi, Melanie Baljko, Toni Kunic, and Ray Feraday. 2014. Do-It-Yourself (DIY) assistive technology: a communication board case study. In International Conference on Computers for Handicapped Persons. Springer, 287–294.
[72] Mark S Hawley, Pam Enderby, Phil Green, Stuart Cunningham, and Rebecca Palmer. 2006. Development of a voice-input voice-output communication aid (VIVOCA) for people with severe dysarthria. In International Conference on Computers for Handicapped Persons. Springer, 882–885.
[73] D Jeffery Higginbotham, Gregory W Lesher, Bryan J Moulton, and Brian Roark. 2012. The application of natural language processing to augmentative and alternative communication. Assistive Technology 24, 1 (2012), 14–24.
[74] D Jeffery Higginbotham, Howard Shane, Susanne Russell, and Kevin Caves. 2007. Access to AAC: Present, past, and future. Augmentative and Alternative Communication 23, 3 (2007), 243–257.
[75] Robert A Hinde and Robert Aubrey Hinde. 1972. Non-verbal communication. Cambridge University Press.
[76] Anthony Hogan, Megan Shipley, Lyndall Strazdins, Alison Purcell, and Elise Baker. 2011. Communication and behavioural disorders among children with hearing loss increases risk of mental health disorders. Australian and New Zealand Journal of Public Health 35, 4 (2011), 377–383.
[77] Christine Holyfield, Kathryn DR Drager, Jennifer MD Kremkow, and Janice Light. 2017. Systematic review of AAC intervention research for adolescents and adults with autism spectrum disorder. Augmentative and Alternative Communication 33, 4 (2017), 201–212.
[78] Ming-Che Hsieh and Ching-Hsing Luo. 1999. Morse code typing training of an adolescent with cerebral palsy using microcomputer technology: case study. Augmentative and Alternative Communication 15, 4 (1999), 216–221.
[79] Li Huang, Szu-Han Kay Chen, Shutian Xu, Yongli Wang, Xing Jin, Ping Wan, Jikang Sun, Jiming Tao, Sicong Zhang, Guohui Zhang, et al. 2021. Augmentative and alternative communication intervention for in-patient individuals with post-stroke aphasia: study protocol of a parallel-group, pragmatic randomized controlled trial. Trials 22, 1 (2021), 1–9.
[80] Seray Ibrahim, Asimina Vasalou, and Michael Clarke. 2017. Rethinking technology design for and with children who have severe speech & physical disabilities. (2017).
[81] Seray B Ibrahim, Asimina Vasalou, and Michael Clarke. 2018. Design opportunities for AAC and children with severe speech and physical impairments. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–13.
[82] Rabia Jafri, Ameera Masoud Almasoud, Reema Mohammed Taj Alshammari, Shahad Eid Mohammed Alosaimi, Raghad Talal Mohammed Alhamad, and Amzan Abdullah Saleh Aldowighri. 2020. A Low-Cost Gaze-Based Arabic Augmentative and Alternative Communication System for People with Severe Speech and Motor Impairments. In International Conference on Human-Computer Interaction. Springer, 279–290.
[83] Vinoth Jagaroo and Krista Wilkinson. 2008. Further considerations of visual cognitive neuroscience in aided AAC: The potential role of motion perception systems in maximizing design display. Augmentative and Alternative Communication 24, 1 (2008), 29–42.
[84] Roman Jakobson. 1972. Verbal communication. Scientific American 227, 3 (1972), 72–81.
[85] Kyung Hea Jeon, Seok Jeong Yeon, Young Tae Kim, Seokwoo Song, and John Kim. 2014. Robot-based augmentative and alternative communication for nonverbal children with communication disorders. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing. 853–859.
[86] S Jirayucharoensak, A Hemakom, W Chonnaparamutt, and P Israsena. 2011. Design and evaluation of a picture-based P300 AAC system. In Proceedings of the 5th International Conference on Rehabilitation Engineering & Assistive Technology. 1–4.
[87] Jeanne M Johnson, Ella Inglebret, Carla Jones, and Jayanti Ray. 2006. Perspectives of speech language pathologists regarding success versus abandonment of AAC. Augmentative and Alternative Communication 22, 2 (2006), 85–99.
[88] Rachel Kay Johnson, Monica Strauss Hough, Kristin Ann King, Paul Vos, and Tara Jeffs. 2008. Functional communication in individuals with chronic severe aphasia using augmentative communication. Augmentative and Alternative Communication 24, 4 (2008), 269–280.
[89] Xin-Xing Ju, Jie Yang, and Xiao-Xin Liu. 2021. A systematic review on voiceless patients' willingness to adopt high-technology augmentative and alternative communication in intensive care units. Intensive and Critical Care Nursing 63 (2021), 102948.
[90] Shaun K Kane, Barbara Linam-Church, Kyle Althoff, and Denise McCall. 2012. What we talk about: designing a context-aware communication tool for people with aphasia. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility. 49–56.
[91] Shaun K Kane and Meredith Ringel Morris. 2017. Let's Talk about X: Combining image recognition and eye gaze to support conversation for people with ALS. In Proceedings of the 2017 Conference on Designing Interactive Systems. 129–134.
[92] Shaun K Kane, Meredith Ringel Morris, Ann Paradiso, and Jon Campbell. 2017. "At times avuncular and cantankerous, with the reflexes of a mongoose": Understanding Self-Expression through Augmentative and Alternative Communication Devices. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1166–1179.
[93] Jennifer Kent-Walsh and Cathy Binger. 2018. Methodological advances, opportunities, and challenges in AAC research. Augmentative and Alternative Communication 34, 2 (2018), 93–103.
[94] Chutisant Kerdvibulvech and Chih-Chien Wang. 2016. A new 3D augmented reality application for educational games to help children in communication interactively. In International Conference on Computational Science and Its Applications. Springer, 465–473.
[95] Aggelos Kiayias. 2011. On the effects of pirate evolution on the design of digital content distribution systems. In International Conference on Coding and Cryptology. Springer, 223–237.
[96] Wooseok Kim and Sangsu Lee. 2021. "I Can't Talk Now": Speaking with Voice Output Communication Aid Using Text-to-Speech Synthesis During Multiparty Video Conference. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. 1–6.
[97] Susan Koch Fager, Melanie Fried-Oken, Tom Jakobs, and David R Beukelman. 2019. New and emerging access technologies for adults with complex communication needs and severe motor impairments: State of the science. Augmentative and Alternative Communication 35, 1 (2019), 13–25.
[98] Arlene W Kraat. 1987. Communication interaction between aided and natural speakers: A state of the art report. (1987).
[99] Per Ola Kristensson, James Lilley, Rolf Black, and Annalu Waller. 2020. A design engineering approach for quantitatively exploring context-aware sentence retrieval for nonspeaking individuals with motor disabilities. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–11.
[100] Saili S Kulkarni and Jessica Parmar. 2017. Culturally and linguistically diverse student and family perspectives of AAC. Augmentative and Alternative Communication 33, 3 (2017), 170–180.
[101] Joanne Lasker and David R Beukelman. 1999. Peers' perceptions of storytelling by an adult with aphasia. Aphasiology 13, 9-11 (1999), 857–869.
[102] Joseph Lau, John PA Ioannidis, and Christopher H Schmid. 1997. Quantitative synthesis in systematic reviews. Annals of Internal Medicine 127, 9 (1997), 820–826.
[103] Emily Laubscher, Janice Light, and David McNaughton. 2019. Effect of an application with video visual scene displays on communication during play: Pilot study of a child with autism spectrum disorder and a peer. Augmentative and Alternative Communication 35, 4 (2019), 299–308.
[104] Hang Li, Harrisen Scells, and Guido Zuccon. 2020. Systematic review automation tools for end-to-end query formulation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2141–2144.
[105] Janice Light. 1997. "Communication is the essence of human life": Reflections on communicative competence. Augmentative and Alternative Communication 13, 2 (1997), 61–70.
[106] Janice Light and Kathryn Drager. 2007. AAC technologies for young children with complex communication needs: State of the science and future research directions. Augmentative and Alternative Communication 23, 3 (2007), 204–216.
[107] Janice Light and David McNaughton. 2012. The changing face of augmentative and alternative communication: Past, present, and future challenges. 197–204 pages.
[108] Janice C Light, Kathryn DR Drager, and Jessica G Nemser. 2004. Enhancing the appeal of AAC technologies for young children: Lessons from the toy manufacturers. Augmentative and Alternative Communication 20, 3 (2004), 137–149.
[109] Stephen Lindsay, Katie Brittain, Daniel Jackson, Cassim Ladha, Karim Ladha, and Patrick Olivier. 2012. Empathy, participatory design and people with dementia. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 521–530.
[110] Stephen Lindsay, Daniel Jackson, Guy Schofield, and Patrick Olivier. 2012. Engaging older people using participatory design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 1199–1208.
[111] Kristy Logan, Teresa Iacono, and David Trembath. 2017. A systematic review of research into aided AAC to increase social-communication functions in children with autism spectrum disorder. Augmentative and Alternative Communication 33, 1 (2017), 51–64.
[112] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–18.
[113] Jennifer Mankoff, Holly Fait, and Ray Juang. 2005. Evaluating accessibility by simulating the experiences of users with vision or motor impairments. IBM Systems Journal 44, 3 (2005), 505–517.
[114] Jennifer Mankoff, Gillian R Hayes, and Devva Kasnitz. 2010. Disability studies as a source of critical inquiry for the field of assistive technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility. 3–10.
[115] Pallab K Maulik, Maya N Mascarenhas, Colin D Mathers, Tarun Dua, and Shekhar Saxena. 2011. Prevalence of intellectual disability: a meta-analysis of population-based studies. Research in Developmental Disabilities 32, 2 (2011), 419–436.
[116] Auriel A May, Shakila Dada, and Janice Murray. 2019. Review of AAC interventions in persons with dementia. International Journal of Language & Communication Disorders 54, 6 (2019), 857–874.
[117] M Shannon McCord and Gloria Soto. 2004. Perceptions of AAC: An ethnographic investigation of Mexican-American families. Augmentative and Alternative Communication 20, 4 (2004), 209–227.
State of the Art in AAC ASSETS ’22, October 23–26, 2022, Athens, Greece

[118] Miechelle Mckelvey, David L Evans, Norimune Kawai, and David Beukelman. [139] Zeyun Niu, Wenbing Yao, Qiang Ni, and Yonghua Song. 2007. Dereq: a qos
2012. Communication styles of persons with ALS as recounted by surviving routing algorithm for multimedia communications in vehicular ad hoc networks.
partners. Augmentative and Alternative Communication 28, 4 (2012), 232–242. In Proceedings of the 2007 international conference on Wireless communications
[119] Deirdre McLaughlin, Betts Peters, Kendra McInturf, Brandon Eddy, Michelle and mobile computing. 393–398.
Kinsella, Aimee Mooney, Trinity Deibert, Kerry Montgomery, and Melanie [140] Christopher S Norrie, Annalu Waller, and Elizabeth FS Hannah. 2021. Establish-
Fried-Oken. 2021. Decision-making for access to AAC technologies in late stage ing context: AAC device adoption and support in a special-education setting.
ALS. Augmentative and Alternative Communication: Challenges and Solutions ACM Transactions on Computer-Human Interaction (TOCHI) 28, 2 (2021), 1–30.
(2021), 169–199. [141] Mmachi G Obiorah, Anne Marie Piper, and Michael Horn. 2017. Independent
[120] Sharynne McLeod. 2018. Communication rights: Fundamental human rights Word Discovery for People with Aphasia. In Proceedings of the 19th International
for all. International Journal of Speech-Language Pathology 20, 1 (2018), 3–11. ACM SIGACCESS Conference on Computers and Accessibility. 325–326.
[121] David McNaughton, David Beukelman, and Patricia Dowden. 1999. Tools [142] University of Wisconsin Hospital and Clinics. 2010. AAC Glossary of Terms.
to support international and intercommunity collaboration in AAC research. [143] Bernard O’Keefe, Lina Brown, and Reinhard Schuller. 1998. Identifcation and
Augmentative and Alternative Communication 15, 4 (1999), 280–288. rankings of communication aid features by fve groups. Augmentative and
[122] David McNaughton, Diane Bryen, Sarah Blackstone, Michael Williams, and Alternative Communication 14, 1 (1998), 37–50.
Pamela Kennedy. 2012. Young adults with complex communication needs: Re- [144] Judith Oxley and Janet Norris. 2000. Children’s use of memory strategies:
search and development in AAC for a “diverse” population. Assistive Technology Relevance to voice output communication aid use. Augmentative and Alternative
24, 1 (2012), 45–53. Communication 16, 2 (2000), 79–94.
[123] David Mcnaughton and Diane Nelson Bryen. 2007. AAC technologies to enhance [145] Amanda M O’Brien, Ralf W Schlosser, Howard Shane, Oliver Wendt, Christina
participation and access to meaningful societal roles for adolescents and adults Yu, Anna A Allen, Jacqueline Cullen, Andrea Benz, and Lindsay O’Neill. 2020.
with developmental disabilities who require AAC. Augmentative and alternative Providing visual directives via a smart watch to a student with Autism Spectrum
communication 23, 3 (2007), 217–229. Disorder: an intervention note. Augmentative and Alternative Communication
[124] David McNaughton and Janice Light. 2013. The iPad and mobile technology 36, 4 (2020), 249–257.
revolution: Benefts and challenges for individuals who require augmentative [146] Ricardo Pais, Luís Ruano, Ofélia P Carvalho, and Henrique Barros. 2020. Global
and alternative communication. , 107–116 pages. cognitive impairment prevalence and incidence in community dwelling older
[125] David McNaughton, Tracy Rackensperger, Elizabeth Benedek-Wood, Carole adults—a systematic review. Geriatrics 5, 4 (2020), 84.
Krezman, Michael B Williams, and Janice Light. 2008. “A child needs to be given [147] Joon Sung Park, Danielle Bragg, Ece Kamar, and Meredith Ringel Morris. 2021.
a chance to succeed”: Parents of individuals who use AAC describe the benefts Designing an online infrastructure for collecting AI data from people with
and challenges of learning AAC technologies. Augmentative and alternative disabilities. In Proceedings of the 2021 ACM Conference on Fairness, Accountability,
communication 24, 1 (2008), 43–55. and Transparency. 52–63.
[126] Sally Millar, Janet Scott, et al. 1998. What is augmentative and alternative [148] Shailaja Arjun Patil. 2009. Brain gate as an assistive and solution providing
communication? An introduction. Augmentative Communication in Practice 2 technology for disabled people. In 13th International Conference on Biomedical
(1998). Engineering. Springer, 1232–1235.
[127] Pamela Mitchell and Carolyn Atkins. 1989. A comparison of the single word [149] Kevin M Pitt and Jonathan S Brumberg. 2021. Evaluating the perspectives
intelligibility of two voice output communication aids. Augmentative and of those with severe physical impairments while learning BCI control of a
Alternative Communication 5, 2 (1989), 84–88. commercial augmentative and alternative communication paradigm. Assistive
[128] Karyn Mofatt, Golnoosh Pourshahid, and Ronald M Baecker. 2017. Augmenta- Technology (2021), 1–9.
tive and alternative communication devices for aphasia: The emerging role of [150] Tracy Rackensperger, Carole Krezman, David Mcnaughton, Michael B Williams,
“smart” mobile devices. Universal Access in the Information Society 16, 1 (2017), and Karen D’silva. 2005. “When I frst got it, I wanted to throw it of a clif”: The
115–128. challenges and benefts of learning AAC technologies as described by adults who
[129] Aimee Mooney, Steven Bedrick, Glory Noethe, Scott Spaulding, and Melanie use AAC. Augmentative and alternative communication 21, 3 (2005), 165–186.
Fried-Oken. 2018. Mobile technology to support lexical retrieval during activity [151] Joseph Reddington and Nava Tintarev. 2011. Automatically generating stories
retell in primary progressive aphasia. Aphasiology 32, 6 (2018), 666–692. from sensor data. In Proceedings of the 16th international conference on Intelligent
[130] A Moorcroft, N Scarinci, and C Meyer. 2019. A systematic review of the barriers user interfaces. 407–410.
and facilitators to the provision and use of low-tech and unaided AAC systems [152] Ehud Reiter, Ross Turner, Norman Alm, Rolf Black, Martin Dempster, and
for people with complex communication needs and their families. Disability Annalu Waller. 2009. Using NLG to help language-impaired users tell stories
and Rehabilitation: Assistive Technology 14, 7 (2019), 710–731. and participate in social dialogues. In Proceedings of the 12th European Workshop
[131] Kristi L Morin, Jennifer B Ganz, Emily V Gregori, Margaret J Foster, Stephanie L on Natural Language Generation (ENLG 2009). 1–8.
Gerow, Derya Genç-Tosun, and Ee Rea Hong. 2018. A systematic quality review [153] Melissa L Rethlefsen, Shona Kirtley, Siw Wafenschmidt, Ana Patricia Ayala,
of high-tech AAC interventions as an evidence-based practice. Augmentative David Moher, Matthew J Page, and Jonathan B Kofel. 2021. PRISMA-S: an ex-
and Alternative Communication 34, 2 (2018), 104–117. tension to the PRISMA statement for reporting literature searches in systematic
[132] Robert R Morris, Connor R Kirschbaum, and Rosalind W Picard. 2010. Broad- reviews. Systematic reviews 10, 1 (2021), 1–19.
ening accessibility through special interests: a new approach for software cus- [154] Laura Roche, Jef Sigafoos, Giulio E Lancioni, Mark F O’Reilly, and Vanessa A
tomization. In Proceedings of the 12th international ACM SIGACCESS conference Green. 2015. Microswitch technology for enabling self-determined respond-
on Computers and accessibility. 171–178. ing in children with profound and multiple disabilities: A systematic review.
[133] Joan Murphy, Ivana Marková, Eleanor Moodie, Janet Scott, and Sally Boa. 1995. Augmentative and Alternative Communication 31, 3 (2015), 246–258.
Augmentative and alternative communication systems used by people with [155] MaryAnn Romski, Rose A Sevcik, Andrea Barton-Hulsey, and Ani S Whitmore.
cerebral palsy in Scotland: Demographic survey. Augmentative and Alternative 2015. Early intervention and AAC: What a diference 30 years makes. Augmen-
Communication 11, 1 (1995), 26–36. tative and Alternative Communication 31, 3 (2015), 181–202.
[134] Elizabeth Murray, Patricia McCabe, and Kirrie J Ballard. 2012. A comparison [156] Robert J Ruben. 2000. Redefning the survival of the fttest: communication
of two treatments for childhood apraxia of speech: Methods and treatment disorders in the 21st century. The Laryngoscope 110, 2 (2000), 241–241.
protocol for a parallel group randomised control trial. BMC pediatrics 12, 1 [157] Anna-Liisa Salminen, Helen Petrie, and Susan Ryan. 2004. Impact of computer
(2012), 1–9. augmented communication on the daily lives of speech-impaired children. Part
[135] NCBI. 2017. Augmentative and Alternative Communication and Voice Products I: Daily communication and activities. Technology and Disability 16, 3 (2004),
and Technologies. The Promise of Assistive Technology to Enhance Activity and 157–167.
Work Participation; The National Academies Press: Washington, DC, USA (2017), [158] Igor Schadle. 2004. Sibyl: AAC system using NLP techniques. In International
209–310. Conference on Computers for Handicapped Persons. Springer, 1009–1015.
[136] Timothy Neate, Aikaterini Bourazeri, Abi Roper, Simone Stumpf, and Stephanie [159] Jennifer M Seale, Ann M Bisantz, and Jef Higginbotham. 2020. Interaction
Wilson. 2019. Co-created personas: Engaging and empowering users with symmetry: Assessing augmented speaker and oral speaker performances across
diverse needs within the design process. In Proceedings of the 2019 CHI conference four tasks. Augmentative and Alternative Communication 36, 2 (2020), 82–94.
on human factors in computing systems. 1–12. [160] Andrew Sears and Vicki Hanson. 2011. Representing users in accessibility
[137] Timothy Neate, Vasiliki Kladouchou, Stephanie Wilson, and Shehzmani Shams. research. In Proceedings of the SIGCHI conference on Human factors in computing
2021. Just Not Together”: The Experience of Videoconferencing for People with systems. 2235–2238.
Aphasia during the Covid-19 Pandemic. In Just Not Together”: The Experience of [161] Claude Elwood Shannon. 2001. A mathematical theory of communication. ACM
Videoconferencing for People with Aphasia during the Covid-19 Pandemic. ACM. SIGMOBILE mobile computing and communications review 5, 1 (2001), 3–55.
[138] Timothy Neate, Abi Roper, Stephanie Wilson, Jane Marshall, and Madeline [162] Andy P Siddaway, Alex M Wood, and Larry V Hedges. 2019. How to do a
Cruice. 2020. CreaTable content and tangible interaction in Aphasia. In Pro- systematic review: a best practice guide for conducting and reporting narrative
ceedings of the 2020 CHI Conference on Human Factors in Computing Systems. reviews, meta-analyses, and meta-syntheses. Annual review of psychology 70
1–14. (2019), 747–770.
ASSETS ’22, October 23–26, 2022, Athens, Greece Curtis et al.

[163] Jef Sigafoos, Robert Didden, and MARK O’REILLY. 2003. Efects of speech output sources. In Proceedings of the 2011 Conference on Empirical Methods in Natural
on maintenance of requesting and frequency of vocalizations in three children Language Processing. 700–711.
with developmental disabilities. Augmentative and Alternative Communication [185] Annalu Waller. 2019. Telling tales: unlocking the potential of AAC technologies.
19, 1 (2003), 37–47. International journal of language & communication disorders 54, 2 (2019), 159–
[164] Rodrigo Silva and Fran Neiva. 2016. Systematic Literature Review in Computer 169.
Science - A Practical Guide. (11 2016). https://doi.org/10.13140/RG.2.2.35453. [186] Annalu Waller, Rolf Black, David A O’Mara, Helen Pain, Graeme Ritchie, and Ruli
87524 Manurung. 2009. Evaluating the standup pun generating software with children
[165] Jessica Simacek, Brittany Pennington, Joe Reichle, and Quannah Parker- with cerebral palsy. ACM Transactions on Accessible Computing (TACCESS) 1, 3
McGowan. 2018. Aided AAC for people with severe to profound and mul- (2009), 1–27.
tiple disabilities: A systematic review of interventions and treatment intensity. [187] Annalu Waller and Alan F Newell. 1997. Towards a narrative-based augmenta-
Advances in Neurodevelopmental Disorders 2, 1 (2018), 100–115. tive communication system. International Journal of Language & Communication
[166] Kiley Sobel, Alexander Fiannaca, Jon Campbell, Harish Kulkarni, Ann Paradiso, Disorders 32, S3 (1997), 289–306.
Ed Cutrell, and Meredith Ringel Morris. 2017. Exploring the Design Space of [188] Shunfang Wang, Zicheng Cao, Mingyuan Li, and Yaoting Yue. 2019. G-DipC: an
AAC Awareness Displays. In Proceedings of the 2017 CHI Conference on Human improved feature representation method for short sequences to predict the type
Factors in Computing Systems. 2890–2903. of cargo in cell-penetrating peptides. IEEE/ACM Transactions on Computational
[167] Katta Spiel, Laura Malinverni, Judith Good, and Christopher Frauenberger. 2017. Biology and Bioinformatics 17, 3 (2019), 739–747.
Participatory evaluation with autistic children. In Proceedings of the 2017 CHI [189] Janet Webster, Julie Morris, Carli Connor, Rachel Horner, Ciara McCormac, and
Conference on Human Factors in Computing Systems. 5755–5766. Amy Potts. 2013. Text level reading comprehension in aphasia: What do we
[168] Roger J Stanclife, Sheryl Larson, Karen Auerbach, Joshua Engler, Sarah Taub, know about therapy and what do we need to know? Aphasiology 27, 11 (2013),
and K Charlie Lakin. 2010. Individuals with intellectual disabilities and augmen- 1362–1380.
tative and alternative communication: Analysis of survey data on uptake of aided [190] Bruce H Westley and Malcolm S MacLean Jr. 1957. A conceptual model for
AAC, and loneliness experiences. Augmentative and alternative communication communications research. Journalism Quarterly 34, 1 (1957), 31–38.
26, 2 (2010), 87–96. [191] Mary Wickenden. 2011. Talking to teenagers: Using anthropological methods
[169] Stephen Steward. 2009. Designing AAC interfaces for commercial brain- to explore identity and the lifeworlds of young people who use AAC. Commu-
computer interaction gaming hardware. In Proceedings of the 11th international nication Disorders Quarterly 32, 3 (2011), 151–163.
ACM SIGACCESS conference on Computers and accessibility. 265–266. [192] Krista M Wilkinson and Janice Light. 2014. Preliminary study of gaze toward
[170] Katharine Still, Ruth Anne Rehfeldt, Robert Whelan, Richard May, and Simon humans in photographs by individuals with autism, Down syndrome, or other
Dymond. 2014. Facilitating requesting skills using high-tech augmentative intellectual disabilities: Implications for design of visual scene displays. Aug-
and alternative communication devices with individuals with autism spectrum mentative and Alternative Communication 30, 2 (2014), 130–146.
disorders: A systematic review. Research in Autism Spectrum Disorders 8, 9 [193] Kristin Williams, Karyn Mofatt, Denise McCall, and Leah Findlater. 2015. De-
(2014), 1184–1199. signing conversation cues on a head-worn display to support persons with
[171] Arthur Theil, Lea Buchweitz, James Gay, Eva Lindell, Li Guo, Nils-Krister Pers- aphasia. In Proceedings of the 33rd Annual ACM Conference on Human Factors in
son, and Oliver Korn. 2020. Tactile board: a multimodal augmentative and Computing Systems. 231–240.
alternative communication device for individuals with Deafblindness. In 19th [194] Stephanie Wilson, Abi Roper, Jane Marshall, Julia Galliers, Niamh Devane,
International Conference on Mobile and Ubiquitous Multimedia. 223–228. Tracey Booth, and Celia Woolf. 2015. Codesign for people with aphasia through
[172] Lida Theodorou, Daniela Massiceti, Luisa Zintgraf, Simone Stumpf, Cecily Morri- tangible design languages. CoDesign 11, 1 (2015), 21–34.
son, Ed Cutrell, Matthew Tobias Harris, and Katja Hofmann. 2021. Disability-frst [195] Bruce Wisenburn and D Jefery Higginbotham. 2008. An AAC application using
Dataset Creation: Lessons from Constructing a Dataset for Teachable Object speaking partner speech recognition to automatically produce contextually rele-
Recognition with Blind and Low Vision Data Collectors. In International ACM vant utterances: Objective results. Augmentative and Alternative Communication
SIGACCESS Conference on Computers and Accessibility (ASSETS). ACM. https: 24, 2 (2008), 100–109.
//www.microsoft.com/en-us/research/publication/disability-frst-datasets/ [196] Claes Wohlin. 2014. Guidelines for snowballing in systematic literature studies
[173] John Todman. 2000. Rate and quality of conversations using a text-storage AAC and a replication in software engineering. In Proceedings of the 18th international
system: Single-case training study. Augmentative and Alternative Communica- conference on evaluation and assessment in software engineering. 1–10.
tion 16, 3 (2000), 164–179. [197] Xiaoyi Zhang, Harish Kulkarni, and Meredith Ringel Morris. 2017. Smartphone-
[174] John Todman, Norman Alm, Jef Higginbotham, and Portia File. 2008. Whole based gaze gesture communication for people with motor disabilities. In Pro-
utterance approaches in AAC. Augmentative and alternative communication 24, ceedings of the 2017 CHI Conference on Human Factors in Computing Systems.
3 (2008), 235–254. 2878–2889.
[175] John Todman and Pat Dugard. 1999. Accessible randomization tests for single-
case and small-n experimental designs in AAC research. Augmentative and
alternative communication 15, 1 (1999), 69–82.
[176] John Todman, Leona Elder, and Norman Alm. 1995. Evaluation of the content of
computer-aided conversations. Augmentative and Alternative Communication
11, 4 (1995), 229–235.
[177] Bálint Tóth, Géza Németh, and Géza Kiss. 2004. Mobile devices converted into
a speaking communication aid. In International Conference on Computers for
Handicapped Persons. Springer, 1016–1023.
[178] Kathryn Tringale, Daniel Bacher, and Leigh Hochberg. 2012. Towards the
optimal design of an assistive communication interface with neural input. In
2012 38th Annual Northeast Bioengineering Conference (NEBEC). IEEE, 197–198.
[179] Stephanie Valencia, Michal Luria, Amy Pavel, Jefrey P Bigham, and Henny
Admoni. 2021. Co-designing Socially Assistive Sidekicks for Motion-based AAC.
In Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot
Interaction. 24–33.
[180] Stephanie Valencia, Mark Steidl, Michael Rivera, Cynthia Bennett, Jefrey
Bigham, and Henny Admoni. 2021. Aided Nonverbal Communication through
Physical Expressive Objects. In The 23rd International ACM SIGACCESS Confer-
ence on Computers and Accessibility. 1–11.
[181] Mieke van de Sandt-Koenderman. 2004. High-tech AAC and aphasia: Widening
horizons? Aphasiology 18, 3 (2004), 245–263.
[182] Larah van der Meer, Jef Sigafoos, Mark F O’Reilly, and Giulio E Lancioni. 2011.
Assessing preferences for AAC options in communication interventions for
individuals with developmental disabilities: A review of the literature. Research
in Developmental Disabilities 32, 5 (2011), 1422–1431.
[183] Gregg C Vanderheiden. 2003. A journey through early augmentative communi-
cation and computer access. Journal of rehabilitation research and development
39, 6; SUPP (2003), 39–53.
[184] Keith Vertanen and Per Ola Kristensson. 2011. The imagination of crowds:
conversational AAC language modeling using crowdsourcing and large data
AAC with Automated Vocabulary from Photographs:
Insights from School and Speech-Language Therapy Settings

Mauricio Fontana de Vargas, Jiamin Dai, and Karyn Moffatt
mauricio.fontanadevargas@mail.mcgill.ca, jiamin.dai@mail.mcgill.ca, karyn.moffatt@mcgill.ca
School of Information Studies, McGill University, Montréal, Canada
ABSTRACT

Traditional symbol-based AAC devices impose meta-linguistic and memory demands on individuals with complex communication needs and hinder conversation partners from stimulating symbolic language in meaningful moments. This work presents a prototype application that generates situation-specific communication boards formed by a combination of descriptive, narrative, and semantically related words and phrases inferred automatically from photographs. Through semi-structured interviews with AAC professionals, we investigate how this prototype was used to support communication and language learning in naturalistic school and therapy settings. We find that the immediacy of vocabulary reduces conversation partners' workload, opens up opportunities for AAC stimulation, and facilitates symbolic understanding and sentence construction. We contribute a nuanced understanding of how vocabularies generated automatically from photographs can support individuals with complex communication needs in using and learning symbolic AAC, offering insights into the design of automatic vocabulary generation methods and interfaces to better support various scenarios of use and goals.

KEYWORDS

Augmentative and Alternative Communication, autism, just-in-time, automatic

ACM Reference Format:
Mauricio Fontana de Vargas, Jiamin Dai, and Karyn Moffatt. 2022. AAC with Automated Vocabulary from Photographs: Insights from School and Speech-Language Therapy Settings. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3517428.3544805

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10…$15.00
https://doi.org/10.1145/3517428.3544805

1 INTRODUCTION

Symbol-based Augmentative and Alternative Communication (AAC) leverages the relative strengths in visual processing of individuals with complex communication needs, such as children with autism spectrum disorder. As with other forms of language acquisition, learning symbolic AAC demands a linguistically rich environment, with frequent opportunities for receiving and producing language through the symbolic modality. Conversation partners have a crucial role in this process: they need to ensure that the AAC tool is programmed with relevant symbols and then model language use with the tool as conversation opportunities naturally arise [24].

However, the traditional hierarchical organization of symbol-based AAC tools imposes substantial meta-linguistic and memory demands on users searching to find desired words [35, 46], and requires a great amount of time and effort from conversation partners to select and pre-program relevant vocabulary [2]. Consequently, tools are often programmed with only a small set of words that cannot scale to unplanned situations, drastically limiting the opportunities for symbolic language use and acquisition.

One promising approach to alleviating the navigation and pre-programming demands of traditional symbol-based AAC is combining Visual Scene Displays (VSDs) with "just-in-time" (JIT) programming [4, 8, 37, 46]. This approach associates language concepts with a photograph or image of a naturally occurring scene. Conversation partners can program these concepts with the participation of AAC users while the interaction takes place, i.e., "on the fly." For example, while at an amusement park, a family member can take a photograph of the roller coaster and program concepts such as "high," "scared," and "scream" on a page displaying the photograph. This enables the conversation partner to model those concepts quickly, and the individual with complex communication needs to interact with the concepts simultaneously with the relevant real-world referents. While this approach can capitalize on teachable moments [36] and increase symbolic communication turns [21], it still requires effort to manually select and program appropriate vocabularies, which is difficult to accomplish in unexpected or emergent situations.

Automated vocabulary generation techniques have been proposed for constructing JIT vocabularies without human assistance, using different types of contextual information as seeds [46]. Researchers have explored the use of geographical locations [19, 44], identification of conversation partners' speech [55, 56], and a combination of different sensor data [7, 45, 53] for generating or retrieving contextually relevant vocabularies. Photographs have also been explored for supporting people with aphasia in ordering dinner [43] and retelling past activities [39]. By applying image captioning and optical character recognition (OCR), Obiorah et al.'s prototypes [43] were able to translate photographs of food items and menus of local restaurants into interactive symbols during laboratory experiments. However, their approach is limited to labeling items directly depicted in the photograph and cannot be used to generate additional related concepts. To generate a set of ten words related to a scene photographed, the system by Mooney et al. [39] processed user-generated comments from a fictitious social media site. Although their approach showed promising results for supporting people with aphasia in retelling past activities, it cannot provide instantaneous support, as it depends on other people first commenting on the photo.

To date, there has been no research on the creation of AAC tools that automatically generate vocabulary from photographs for use in a broad variety of communication contexts. The design of such tools, both in terms of generation methods and interactive interfaces, remains unexplored, as do the dynamics between individuals with complex communication needs, their conversation partners, and automated language support. Consequently, the exact kind of support such tools could provide, and how they could be integrated into real-life settings, is unknown.

In this work, we present Click AAC, a prototype tool that generates situation-specific communication boards organized in a VSD-like layout and formed by a combination of descriptive, narrative, and semantically related words and phrases inferred automatically from photographs, based on the technique proposed by de Vargas and Moffatt [18]. Through our analysis of semi-structured interviews with AAC professionals, we investigate how these professionals and their clients with complex communication needs used Click AAC during their routine therapy and school activities. We contribute a deep understanding of how vocabularies generated automatically from photographs can support individuals with complex communication needs in using and learning symbolic AAC. We offer additional insights into the design of automatic vocabulary generation methods and interactive interfaces to provide adequate support across scenarios of use and goals.

2 BACKGROUND AND RELATED WORK

2.1 AAC interventions by communicator profiles

Individuals with complex communication needs have an extensive range of expressive communication abilities. Professionals such as speech-language pathologists (SLPs) and assistive technology evaluators are responsible for selecting tools that can adequately attend to the specific communicator's evolving needs. The mapping between available tools and users can be described according to the three broad profiles of communicators, as classified by the speech-hearing community: independent, context-dependent, and emergent communicators [9].¹

¹ This classification is also used in the Dynamic AAC Goals Grid-2 (DAGG-2), a tool for the assessment and measurement of an individual's current level of communication that is popular in the clinical community.

Independent communicators have literacy skills on par with same-age peers and are able to generate completely spontaneous messages about any topics or contexts while interacting with familiar and unfamiliar partners, usually through text-based or robust AAC systems—those containing a very large symbol set (e.g., 2000+) organized hierarchically and with a consistent arrangement for supporting motor planning, in addition to allowing morphological changes (e.g., past and plural forms), programming of full sentences, and access to a keyboard. In contrast, emergent and context-dependent communicators focus on gaining symbolic communication skills, and are the groups most relevant to the focus of our work.

Context-dependent communicators can use symbolic communication reliably but are still limited to certain contexts. Individuals with this profile often use dynamic displays containing larger vocabularies organized hierarchically, but still do not take advantage of all capabilities of a robust AAC system (e.g., comprehensive vocabulary and syntax modifiers). They are starting to compose messages of two or more symbols, but their interactions are still dependent on familiar partners, who must facilitate communication, selecting and programming words and messages for them, or helping them navigate vocabulary [34]. Intervention goals for this group include increasing access to vocabulary, building literacy skills, and expanding the communicator's ability to interact with more partners and contexts.

Emergent communicators use mostly body language, such as gestures, and non-symbolic modalities that are often not easily understandable by unfamiliar partners, and communicate primarily about the current context. AAC interventions for such individuals focus on establishing more reliable communication through symbolic expression and increasing opportunities for communication interactions. To support emergent communicators in learning the associations between real-world objects or actions and their symbolic representations, professionals often pair these individuals with single-button communicators or static communication boards composed of a few symbols (e.g., 4–20 on the GoTalk series) representing very common words (i.e., core words). Recently, VSDs have been proposed as an alternative support for this group. VSD tools associate language concepts with photographs taken or uploaded, either as embedded "hot-spots" in the photograph that reveal the concept when selected, or as a dedicated panel attached to the photograph [4, 8].

2.2 Challenges learning and using symbolic AAC

As with spoken language acquisition, learning symbolic communication requires regular exposure to a rich linguistic environment and frequent opportunities for language use. SLPs and family members have a crucial role in this process [49]. They must immerse learners in environments rich in AAC language, ensuring the availability of relevant vocabulary and actively performing aided language stimulation (also known as aided language modeling or aided language input) during meaningful and motivating opportunities [24]. In this technique, a conversation partner models language on the learner's device while they speak. This includes describing their own actions while they engage in parallel play with the learner, describing the learner's actions, providing an example of target production, and repeating the learner's utterances with additional words to create more semantically or syntactically complete sentences [5]. Aided language stimulation has proved effective in increasing learners' semantic understanding of symbols [14, 15], number of communication turns, and understanding of syntactic complexity [47], and therefore researchers recommend
AAC with Automated Vocabulary from Photographs ASSETS ’22, October 23–26, 2022, Athens, Greece

that conversation partners should perform it in at least 70% of interaction opportunities [15].

However, the design of traditional symbol-based AAC devices hinders such frequent exposure to relevant symbolic communication. The main challenge with such tools is the difficulty of organizing the large number of symbols needed for the spontaneous creation of sentences in a manner that allows individuals with complex communication needs and their conversation partners to easily access desired language concepts when needed. Traditional tools display symbols out of context, arranged in grid-based displays organized following linguistic (e.g., nouns and verbs) or hierarchical (i.e., superordinate → subordinate, like “food” → “dessert”) categories, imposing significant meta-linguistic and memory demands [35, 46].

To facilitate the availability of relevant vocabulary and reduce navigational demands, conversation partners can create topic-specific communication boards by selecting words related to a topic they deem useful and grouping them on a single page. Nonetheless, this strategy does not scale to unexpected situations and imposes a heavy workload on conversation partners, who must anticipate learners’ vocabulary needs and dedicate time to programming that vocabulary into the devices. Consequently, vocabulary availability tends to be restricted to a small set of topics or a series of frequent words that can be used across most contexts (e.g., want, go). Conversation partners are thus unable to capitalize on naturally occurring opportunities for language learning, further hindering symbolic communication learning and learners’ independent use of AAC.

2.3 Automated JIT support for AAC

The concept of “just-in-time” (JIT) support, as used in the AAC field, refers to the programming and availability of language concepts at the moment they are needed, through technologies that allow the easy creation of VSDs or other AAC content within interactions [46]. This includes both mentor-generated JITs, such as the creation of a hotspot on a VSD when a certain activity is happening, and automated JITs, which do not require additional human assistance, such as playing a video that demonstrates how to wash hands when a learner enters the bathroom. The benefits of JIT support are hypothesized based on conceptual underpinnings related to working memory demands, situated cognition, and teachable moments [46].

Context-aware computing has demonstrated value as an enabling approach for automated JIT vocabulary support. Through participatory design involving people with aphasia, researchers [29] explored the concept design of an AAC system that would adapt the vocabulary presented according to the user’s location or conversation partner to facilitate word finding. In a Wizard of Oz study in a local aphasia center, vocabulary was manually pre-assigned to different contexts (e.g., doctor’s office) and presented to participants while they imagined using the device in that location. Although their study showed the usefulness of providing vocabulary tailored to the user’s context, the technical challenges of building algorithms capable of generating those context-related words were not addressed.

In an attempt to generate vocabularies that attend to emergent user needs in various locations, researchers [19] have applied information retrieval and natural language generation (NLG) techniques to internet-accessible corpora such as websites, dictionaries, and Wikipedia pages related to the user’s current location or conversation topic. Although this approach was useful for augmenting a base vocabulary with context-specific terms during a laboratory experiment simulating two locations and two topics, it is unclear how it would behave in naturalistic settings or how to extend it to personal situations (e.g., telling someone about last weekend’s trip) for which internet-accessible corpora are unlikely to exist.

Storytelling vocabulary has been successfully generated to support children with complex communication needs in recounting “how was school today” to their families [53]. This method clusters unstructured sensor data (e.g., RFID tags determining the user’s location within the school) and transforms it into narrative sentences using a knowledge base containing the school’s timetable and RFID mapping information.

More recently, researchers have started exploring photographs as the contextual information input. Obiorah et al. [43] designed prototypes aimed at supporting people with aphasia in ordering meals in restaurants by providing automated captioning of scenes using images from the internet and making text-based information and menus interactive through OCR. Mooney et al. [39] used comments from a simulated social network to generate context-related words for personally relevant events, supporting people with primary progressive aphasia in retelling past events. Although participants in their study demonstrated increased lexical retrieval during controlled experiments, this approach depends on the availability of user comments and thus cannot be used immediately after a photograph is taken.

In this work, we build off of recent research by de Vargas and Moffatt [18], who proposed a novel method for generating storytelling vocabulary automatically from photographs for use in AAC. Their method generates a ranked list of key words and short narrative phrases from a single input photo by matching it against the Visual Storytelling Dataset (VIST) [27] and then expanding this initial word list to include related ideas using the SWOW model of the human lexicon [17]. Their performance evaluation, using a subset of VIST as ground-truth vocabulary (1,946 photos and 9,730 narrative sentences), showed that this method can provide relevant vocabulary for creating narrative sentences. However, it is unclear how well the technique performs when integrated into an interactive application and evaluated by users in real-world contexts.

3 INTERACTIVE APP DESIGN

To explore the usefulness of automatic JIT vocabulary from photographs in supporting symbol-based AAC, we designed Click AAC, an interactive mobile application that integrates different techniques for generating context-related words and phrases. Click AAC runs on Android and Apple smartphones and tablets.

The design of Click AAC is rooted in evidence-based recommendations from the HCI and AAC literature, including the design of well-established AAC tools. Throughout the design process, the first author volunteered for eight months in a local aphasia center, and integrated lessons from first-hand communications with SLPs
and people with aphasia into the design of Click AAC. Before the launch of this user study, hundreds of AAC professionals working directly with individuals with complex communication needs informally checked the design and overall concept of the prototype through a post on specialized social media groups. They confirmed that the prototype design was suitable to be tested with end users during therapy and school activities.

We detail the design rationale and important facets of Click AAC’s vocabulary generation and user interface below.

3.1 Vocabulary Generation

AAC tools must support users in a variety of communication functions across different contexts, such as commenting, describing, asking and answering questions, and engaging socially [23, 33]. Therefore, Click AAC employs a combination of three generation methods (descriptive, related, narrative) that provide vocabulary spanning the main parts of speech for symbolic AAC (i.e., pronouns, nouns, verbs, and adjectives).

The first step for all methods consists of creating a set of candidate description tags and a human-like description sentence (i.e., caption) for the input photograph using the computer vision technique from Fang et al. [22]³, as done in the work of de Vargas and Moffatt [18]. By applying captioning rather than pure object detection and labelling, Click AAC obtains abstract concepts representing the interactions between the objects, people, and environment depicted in the photograph (e.g., “playing”, “angry”).

This initial vocabulary is then used with distinct goals in each method:

(1) Descriptive: A simple description of the scene. It includes lemmas of all description tags, as well as the description phrase.
(2) Related (Expanded): Words semantically related to the elements in the scene. It includes lemmas of all description tags, plus, for each description word⁴, lemmas of the three words most strongly connected to it in SWOW, a model of the human mental lexicon constructed from word-association experiment data.
(3) Narrative: Words and phrases used for creating narratives about the photographed scene, obtained through the technique proposed by de Vargas and Moffatt [18]. This technique selects vocabulary associated with similar photographs (i.e., those having semantically similar captions) from the visual storytelling dataset VIST [27], which contains 16,168 stories about 65,394 photos created by 1,907 Mechanical Turk workers.

By default, the final set of vocabulary presented to the user is the combination of all methods, limited to 20 verbs, 20 nouns, 15 adjectives, and 6 phrases, plus fixed pronouns (I, you, he, she, they, and we). We chose this combination of values to maximize the number of vocabulary items displayed while keeping symbol sizes similar to current tools and minimizing scrolling. Vocabulary from the Descriptive method has the highest priority, followed by the Related (Expanded) method.

The app categorizes words by their parts of speech using the NLTK library’s tagger. Users can enable and disable each method in the settings menu. Finally, symbols representing the vocabulary are retrieved from ARASAAC, a repository containing more than 11,000 AAC symbols⁵. If the language set in the application is different from English, generated words and phrases are translated to the target language through the Google Translate API.

3.2 Interface

Our mobile application is composed of three main screens designed to provide direct access to its main features, as shown in Fig. 1: (1) a home screen from which the user can import existing photos from the device’s gallery, take a new photo, or view their album; (2) an album screen from which the user can navigate through all their previously imported photos and open the associated communication boards; and (3) the vocabulary page screen, which presents the vocabulary generated for an individual photo. The smartphone version consists of the same three screens, with minor differences in the Vocabulary Page, as detailed next.

3.2.1 Vocabulary Page. Click AAC borrows its overall layout concept and key features from VSDs, a state-of-the-art AAC support for early symbolic communicators and individuals with cognitive and linguistic limitations [4, 8, 12, 36, 37]: vocabulary is organized in communication boards around a central topic represented by the main photograph (e.g., “eating quesadillas”), rather than in hierarchical categories representing abstract concepts (e.g., “actions” or “foods”).

We follow the evidence-based guidelines for the design of VSDs and grid displays for children with developmental disabilities and adults with acquired conditions given by Light et al. [37].

First, the set of words is displayed in a grid layout with symbols grouped and colored according to their part of speech, following the Modified Fitzgerald Key [38] color coding, as in popular AAC tools. We chose this configuration over embedding vocabulary in the photograph itself through “hotspots” to allow a larger number of symbols to be displayed without navigation to other pages, and to facilitate the transition between Click AAC and other popular, grid-based tools. Each generated sentence is displayed as a single button containing the symbols of its content words. Users can trigger synthesized audio output by tapping on the vocabulary buttons. Scrolling up or down reveals any items hidden due to lack of available space on the screen.

Second, tablet users can navigate to other vocabulary pages by selecting thumbnails of the signature photos, available via the navigation bar to the left of the currently open communication board, a strategy demonstrated to be beneficial by clinical researchers [4, 37, 54]. Selected words are displayed in a message bar at the top of the screen, allowing users to compose sentences by combining individual symbols, as in typical AAC devices. On smartphones, due to the restricted screen size, the message bar and navigation bar are not displayed. The main photograph is displayed at the top of the screen, with the associated vocabulary at the bottom. Thus, smartphone users must swipe left or right, or tap on arrows located on the sides of the photo, to navigate to other vocabulary pages.

³ Microsoft Azure API implementation.
⁴ The words human, person, man, men, woman, and women are not expanded with SWOW vocabulary.
⁵ ARASAAC is maintained by the Department of Culture, Sports and Education of the Government of Aragon (Spain): https://arasaac.org/
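The Narrative method’s retrieval step (selecting vocabulary from dataset stories whose photos have semantically similar captions) can be illustrated with a minimal sketch. The toy dataset, the bag-of-words cosine similarity, and the function names below are our own illustrative assumptions, not the implementation of de Vargas and Moffatt [18], which matches captions against the full VIST dataset:

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_story_words(photo_caption, dataset, k=2):
    """Return words from the k stories whose captions are most similar
    to the input photo's caption (toy stand-in for VIST matching)."""
    query = Counter(photo_caption.lower().split())
    scored = sorted(dataset,
                    key=lambda item: cosine(query, Counter(item[0].lower().split())),
                    reverse=True)
    words = []
    for _, story in scored[:k]:
        words.extend(w for w in story.lower().split() if w not in words)
    return words

# Toy "dataset" of (caption, story sentence) pairs -- purely illustrative.
mini_vist = [
    ("a child eating cake at a party", "we celebrated her birthday with cake"),
    ("a dog running on the beach", "the dog chased waves all afternoon"),
]
print(rank_story_words("a girl eating cake", mini_vist, k=1))
# → ['we', 'celebrated', 'her', 'birthday', 'with', 'cake']
```

A production system would use a stronger semantic-similarity measure (e.g., sentence embeddings) rather than raw token overlap, but the retrieval structure is the same.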
Figure 1: Click AAC’s Home Screen, Album, and Vocabulary Page containing words and phrases generated automatically from a photograph. Within a Vocabulary Page, users can navigate to other photos through the vertical panel on the left, or interact with symbols. Tapping on a symbol activates text-to-speech and adds the concept to the message bar at the top. Users can reorder and remove items, edit the symbol associated with a word, and add new words and sentences. The size of all elements and the number of vocabulary items generated are customizable.
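The default combination logic described in §3.1 (per-category caps of 20 verbs, 20 nouns, 15 adjectives, and 6 phrases; fixed pronouns; Descriptive over Related over Narrative priority; SWOW top-3 expansion) can be sketched as follows. The data structures and function names are our own illustrative assumptions, not Click AAC’s source code:

```python
# Illustrative sketch of the vocabulary combination strategy from Section 3.1.
PRONOUNS = ["I", "you", "he", "she", "they", "we"]   # fixed pronoun set
CAPS = {"verb": 20, "noun": 20, "adjective": 15, "phrase": 6}

def expand_with_swow(tags, associations, n=3):
    """Add, for each description tag, its n most strongly associated words.
    `associations` is a toy stand-in for the SWOW word-association network."""
    expanded = list(tags)
    for tag in tags:
        neighbors = sorted(associations.get(tag, {}).items(), key=lambda kv: -kv[1])
        expanded += [w for w, _ in neighbors[:n] if w not in expanded]
    return expanded

def combine_methods(descriptive, related, narrative, caps=CAPS):
    """Merge per-part-of-speech word lists from the three generation methods,
    keeping Descriptive > Related (Expanded) > Narrative priority, dropping
    duplicates, and capping each category."""
    final = {"pronoun": list(PRONOUNS)}
    for pos, cap in caps.items():
        merged = []
        for source in (descriptive, related, narrative):   # priority order
            for word in source.get(pos, []):
                if word not in merged:
                    merged.append(word)
        final[pos] = merged[:cap]
    return final
```

For example, a noun appearing in both the Descriptive and Narrative outputs is kept once, in the position its highest-priority source gives it; lower-priority words are dropped first when a category exceeds its cap.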
3.2.2 Editing Vocabulary Generated. To support users’ agency during communication [52], Click AAC allows editing of the automatically generated initial vocabulary set to correct errors and enter missing items. To edit generated vocabulary, users must enter the edit mode by tapping on the main photograph and holding it for at least 500 ms. In line with the design of other AAC apps⁶, we chose a non-obvious interaction for editing to prevent unintentional activation. The editing mode displays a new menu bar next to the photo with options for reordering and removing words and phrases, adding new words, and editing the symbols associated with words. To perform one of these actions (e.g., remove a word), users must select the option (remove) and then hold the item that will receive the action for at least 500 ms.

⁶ Popular tools employ a three-finger swipe or a small button in the corners of the screen.

3.3 Personalization settings

To maximally support a range of different user profiles, Click AAC allows personalization of several aspects of its vocabulary generation and interface: the type of vocabulary (generation method), the maximum number of words generated for each part of speech, the number of phrases generated, the language, the number of columns used for each part of speech (automatically adjusting the vocabulary buttons to fit the available space), and the size of the interface components (main photo in the vocabulary page, phrases panel, words grid, and menu bar). Users can also modify the font size (or remove fonts completely, as suggested by Light et al. [37]), the spacing between vocabulary buttons, the colours attributed to each part of speech, the number of columns in the photo album, and the voice type, rate, and pitch. The interface in Fig. 1 shows the default configuration.

4 METHODS

We conducted a user study involving AAC professionals and their clients with complex communication needs, who used Click AAC in their routine practices of therapy sessions or school activities. Through questionnaires and semi-structured one-on-one online interviews with the AAC professionals, we investigated an overarching question:

    How can situation-specific vocabularies automatically generated from photographs support communication and language learning for individuals with complex communication needs?

We explore this research question in terms of professionals’ reflections on their experiences with our prototype, as well as broader factors and concepts envisioned through their experiences. This approach allowed us to understand the broad application of automatic generation of vocabularies from photographs, without limiting use scenarios or introducing artificial ones. As a long-established practice, AAC interventions must consider not just the needs of individuals who require AAC, but also those of their conversation partners [5, 34, 36]. These professionals regularly try novel AAC technologies and combine multiple tools to accommodate emerging needs dependent on the situation and client profile, in addition to
practicing symbolic communication with clients and instructing family members on how to support AAC at home. Therefore, their expertise can provide unique, higher-level perspectives beyond those of individual users. This broad perspective was particularly pertinent to this stage of our longer-term research. HCI researchers working in similar contexts, e.g., designing technologies for dementia care [20], have also noted the value of working with clinicians.

In this exploratory study, our goal was not to specifically evaluate Click AAC, but rather to understand its use and expand the design space regarding automatic JIT vocabulary from photographs in AAC. Engaging directly with users and observing them using the application on defined tasks might bring valuable insights for designing an application, but would offer little support for such exploration. Nonetheless, during all interviews, care was taken to ensure that the participants not only shared their own perspectives but also relayed the experiences of their clients. The virtual format of the interviews also enabled us to reach a broad set of use cases across learning, therapy, and cultural contexts, without geographic constraints.

As a secondary investigation, we looked into user experiences with Click AAC to understand its overall usability in naturalistic settings. This investigation was not intended to obtain performance metrics for Click AAC in comparison to existing approaches through controlled experiments, but rather to ensure that the app had reasonable usability and to provide evidence to help us interpret findings that are directly influenced by our particular implementation. This analysis could also shed light on how to improve interactive vocabulary support to inform future designs of such tools.

4.1 Participants

We made Click AAC publicly available through mainstream app store platforms, and recruited AAC professionals through a message displayed on its initial screen. This message prompted SLPs who were trying, or expecting to try, Click AAC with one or more individuals with complex communication needs, as well as AAC consultants or evaluators who assessed the app independently based on their professional expertise, to enter their contact information if they were interested in participating in the study. Eighty-four (84) individuals agreed to participate, and 53 answered a preliminary questionnaire regarding their experience with AAC, their professional setting of use, and their expected timeline for trying the app with individuals with complex communication needs.

Within this group, 14 SLPs used Click AAC with their clients in private therapy sessions or with their students in special education for at least four weeks, and an additional 6 consultants/evaluators tested the app by themselves. This study includes the data from these 20 professionals. Through them, we reached a variety of settings and user profiles (detailed in Table 1). We refer to these clients and students as AAC learners throughout the paper.

4.2 Procedure

Since this study aimed to understand use in naturalistic settings, participants were not instructed on how or where to use the app, but rather asked to use it, or continue to use it, in their routine practices in the ways they judged to be most appropriate. Accordingly, professionals used their own expertise and judgment in selecting which clients to try the application with.

For participants who tested with AAC learners, frequency of use ranged from a few sessions spread over four weeks to continuous use during approximately two months. The time spent using the app within their routines also varied greatly, because most of the usage occurred as needs and opportunities arose, rather than during time slots dedicated to testing the app. Consultants/evaluators who tested the app by themselves used it less extensively, given that their evaluation mostly consisted of uploading several photographs and investigating the generated vocabulary without engaging in specific activities with AAC learners. Most participants used only the tablet version, with the exception of P20 and P7, who also used the smartphone one.

We interviewed the professionals through online video meetings once they deemed their evaluation complete and were ready to provide feedback. Each interview took approximately 20–50 minutes. The semi-structured interviews were guided by eight questions, covering scenarios of use, profiles of users, comparison against current AAC tools, adequacy of the tool within professionals’ and learners’ routines, and strengths and weaknesses of the current prototype.

Professionals who used the application with their clients or students also responded to a 5-point Likert scale questionnaire containing 16 questions about interaction, vocabulary, and usage factors (Fig. 2). Four participants (P16–P19) were uncomfortable communicating in English and were instead interviewed in their preferred language (i.e., French (1), Spanish (2), Italian (1)) by email. Interview questions and participant answers were translated with third-party services and checked by the first author, who has basic knowledge of those languages. Each participant received a $10 honorarium.

4.3 Data Analysis

We conducted a reflexive thematic analysis [10, 11] on the interview transcripts within MAXQDA2022⁷. The first author performed inductive open coding, guided by our overarching research question. The open codes were iteratively developed into themes and sub-themes through axial coding, followed by selective coding. All authors discussed the inductive codes as they evolved, until reaching agreement on the themes and their interpretations.

5 OVERALL USABILITY

To help with the interpretation of the thematic analysis, we first present the results regarding overall usability. These illustrate how users perceived the quality of different elements of the app, such as the vocabulary generated and the overall interaction style. They also revealed necessary improvements, new features, and possible avenues for improved interactive language support that can help inform the design of future applications with automatic vocabulary from photographs.

Fig. 2 shows the post-questionnaire answers as a diverging stacked bar chart, with the count of participants’ answers⁸ represented on the x-axis. Horizontal bars are aligned by the center

⁷ https://www.maxqda.com/new-maxqda-2022
⁸ P5 and P13 did not answer the post-questionnaire.
ID | Profession | Years Exp. | Caseload (AAC Users) | Setting | Current AAC Users in this Study | User Profiles
P1 | SLP | 6 | 250 | PT | 4 | non-verbal and min-verbal children; 3 with ASD, 1 with cerebral palsy
P2 | SLP | 7 | 20 | SES | 5 | non-verbal and min-verbal children; severe sensory dysregulation
P3 | SLP | 20 | 1 | PT | 1 | 5 yo with ASD and apraxia; non-verbal
P4 | SLP | 6 | many | PT | 1 | child with Down syndrome; literate; dysarthric speech
P5 | SLP | 43 | 25 | PT | 3 | 2 teenagers and 1 adult, all non-verbal
P6 | SLP | 10 | 4 | SES | 1 | non-verbal child; fine-motor skill difficulties
P7 | SLP | 30 | 50 | EC | 20–25 | 3–22 yo non-verbals and min-verbals; intellectual disabilities; some with fine-motor skill difficulties
P8 | SLP | 5 | 39 | SES | 3 | non-verbal child with ASD; 2 young adults min-verbal
P9 | SLP | 20 | 2 | SES | 2 | 6 yo with ASD, min-verbal, sensory needs; teen min-verbal, apraxia
P11 | SLP, AAC consultant | 20 | 3 | PT | 2 | 8 and 16 yo with significant cognitive and social issues, dependent on conversation partner
P13b | SLP | 16 | 50 | PT, SES | 2 | min-verbal child; literate min-verbal 17 yo
P17ac | SLP | 20 | 16 | PT | 16 | verbal and non-verbal children with intellectual disabilities
P18ac | SLP | 15 | 22 | PT | 6 | verbal and non-verbal children with ASD
P19ae | SLP, AAC specialist | 25 | 180 | PHI | 1 | non-verbal child with ASD
P10 | SLP, AAC specialist | 11 | 80–150 | PS | 0 | diverse intellectual disabilities
P12 | SLP, AT consultant | 20 | 40 | PT | 0 | diverse disabilities
P14 | AT specialist | 28 | 300 classrooms | PS | 0 | diverse disabilities
P15 | SLP, AAC researcher | 25 | 3 | PT | 0 | illiterate individuals with language impairment or intellectual disabilities
P16ad | SLP, AAC advisor | 20 | 2800 subscribed | CATI | 0 | emerging communicators and AAC experts; aphasia, diverse cognitive impairments
P20ac | SLP, AAC advisor | 24 | anyone in the country | PHI | 0 | people with difficulty naming objects and navigating vocabulary

SES: Special ed. school   PT: Private therapy   PS: Public school   CATI: Center for AT innovation   PHI: Public health institute
a Email interview   b Hebrew app   c Spanish app   d French app   e Italian app

Table 1: Participants in our user study.
[Figure 2 is a diverging stacked bar chart. Response scale: Strongly disagree, Disagree, Neutral, Agree, Strongly agree; x-axis: count of participants’ answers. Items:]
1. The symbol set used was appropriate
2. The voice output quality was appropriate
3. Users could easily select desired vocabulary
4. Users could easily remove undesired vocabulary
5. Users could easily find a desired vocabulary page
6. Users could easily create a new vocabulary page
7. Users tended to access old vocabulary pages
8. Users tended to access new vocabulary pages
9. Vocabulary generated included desired words
10. Vocabulary generated included undesired words
11. Vocabulary order was adequate
12. Users enjoyed using the app
13. Users demonstrated willingness to use the app
14. Users operated the app independently
15. Users were more communicative using the app
16. Users would benefit from an app based on our prototype

Figure 2: Post-questionnaire scores for 12 professionals who used Click AAC with AAC learners during their practices. Original questions are provided as supplementary material.
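The layout of Fig. 2 (horizontal stacked bars aligned at the center of the “Neutral” category) can be reproduced by shifting each bar left by the count of negative responses plus half the neutral count. A small sketch of that offset computation, with made-up counts (the actual data is shown in Fig. 2):

```python
def diverging_offsets(counts):
    """Given Likert counts ordered [SD, D, N, A, SA], return the x position
    where each segment starts so that all bars align at the middle of Neutral."""
    sd, d, n, a, sa = counts
    start = -(sd + d + n / 2)      # left edge of the whole bar
    offsets = []
    for c in counts:
        offsets.append(start)
        start += c
    return offsets

# Example: 1 strongly disagree, 2 disagree, 3 neutral, 4 agree, 2 strongly agree
print(diverging_offsets([1, 2, 3, 4, 2]))  # → [-4.5, -3.5, -1.5, 1.5, 5.5]
```

With these offsets, plotting each segment at its start position (e.g., via the `left` argument of a horizontal bar plot) centers every Neutral segment on x = 0.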
of the “Neutral” category. In general, professionals were satisfied with the design of Click AAC and the support it provided. Most users could operate the application without major issues (Q1–8). Some participants commented that certain learners’ motor abilities required additional methods of access, such as external switches or scanning options, that were not supported by Click AAC, and these learners thus could not easily select desired items on a vocabulary page (Q3). Three participants also disagreed that removing undesired vocabulary was easy to perform. During the interviews, they explained that they had not figured out how to access the editing options, but once they were instructed, it became straightforward to remove undesired items, highlighting the importance of more evident instructions.

Regarding the perceived quality of the vocabulary generated (Q9–11), seven out of the nine participants who used the app in English agreed that the generated vocabulary included desired words. However, most participants also agreed that the vocabulary set included non-relevant words. Participants who had more issues with the quality of the generated vocabulary commented during interviews that Click AAC was not recognizing the photographs they wanted to talk about, and thus the generated vocabulary was mostly irrelevant. Participants who used translated vocabulary were the least satisfied with its quality, with two disagreeing that the generated options were relevant and one being neutral. This was not unexpected, given the simplistic way translation was handled.

The answers covering app usage (Q12–16) indicate that, although independent use of the app occurred, professionals mostly operated the app together with the learner, or by themselves, to accommodate learners’ needs (e.g., physical access, level of proficiency with symbolic communication). Also, participants strongly agreed that learners demonstrated willingness to use and enjoyed using the app, and that learners would benefit if there were a complete, commercially ready application based on our prototype.

5.0.1 Importance of personalization. Besides vocabulary editing, participants often had to perform other personalization actions to accommodate learners’ needs (e.g., “I made some reduced choice boards for students that couldn’t handle as many choices” (P14)). The importance of limiting the number of vocabulary items for each part-of-speech category, and of modifying the layout to enlarge the symbol buttons or the main photograph, was highlighted among participants. P3 attributed the observed improvement in her learner’s communication to the ability to personalize the tool to match the learner’s profile:

    P3: I think your board was what was able to start her on more modalities of communication, in more areas of her day, then what I had been previously doing with her ... because I was able to make it more specific for her needs and for her level of communication ... and I like how easy it is to add and take off a lot of the icons, so that it’s not so overwhelming

5.0.2 Improving interactive language support. Participants identified two main improvements required for providing better interactive language support. First, they noted that, although the app was highly customizable, they also needed to be able to add familiar people as pronoun symbols, and that automatic identification of the names of the people photographed “would be an amazing feature” (P5), speeding up the process. The second relates to the availability of frequent words, i.e., “core” vocabulary, for all photographs. Participants highlighted that it would be “extremely useful” (P3) to have a personalized core vocabulary present on all pages, displayed in the same location on the screen to leverage motor planning.
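The fixed core-vocabulary placement that participants requested (the same high-frequency words in the same screen positions on every page, to leverage motor planning) could be sketched as follows; this is our illustration of the request, not an implemented Click AAC feature:

```python
def with_core_row(core_words, page_rows, slots=6):
    """Prepend a fixed core-vocabulary row, padded to a constant width,
    so each core word keeps the same slot on every generated page."""
    row = (core_words + [""] * slots)[:slots]
    return [row] + page_rows

# The same core row appears first on every page, in the same positions,
# regardless of the photo-specific vocabulary below it.
page = with_core_row(["I", "want", "go", "more"], [["cake", "party", "eat"]])
```

Padding to a constant number of slots is what keeps each word’s position stable across pages, which is the motor-planning benefit participants described.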
6 FINDINGS

Our thematic analysis revealed three main themes that together answer our overarching research question. The first theme describes the situations and ways in which Click AAC was incorporated in school and therapy activities. It details the kinds of support provided for different learner profiles, in addition to presenting envisioned use cases. The second theme interprets how people benefited from the immediacy of vocabulary provided by Click AAC during those activities. The third theme explores the dynamics between AI and users, weighing the benefits and issues introduced by automation and revealing the importance of keeping humans in the loop.

6.1 Click AAC offers a flexible, complementary AAC tool for a wide range of user profiles

6.1.1 A wide range of learner profiles can benefit. Professionals selected learners for trying Click AAC through the feature-matching process [25, 48], the gold standard for AAC evaluations, in which they consider the “learner profile, the environment they are in, and the tasks they need to do” (P14) to select appropriate tools from their “toolbox” for communication support or language instruction. Overall, professionals felt that a wide range of learner profiles may benefit from technology similar to Click AAC. Selected learners were, in the majority, emerging or context-dependent communicators who were non- or minimally-verbal children with diverse developmental disabilities (e.g., ASD, cerebral palsy). Professionals also used Click AAC with a smaller number of children and young adults with functional communication and some literacy skills. Professionals described how a wide range of user profiles could benefit from auto JIT from photographs depending on how the professional incorporates it in their practice. For example, P2 explained their optimism about Click AAC potentially benefiting more profiles, as it extended the expressive capabilities for kids along the verbal and nonverbal spectrum in terms of visual cues and structures, working in tandem with existing AAC devices.

P2: I don’t know if there’s one profile that [Click AAC is] better for. I think it’s pretty open to whatever profile you’re working with and tweaking . . . based on what the kid’s goals are . . . [For kids who] are verbal . . . , this really helps them visually . . . build those sentences . . . to expand their expressive language and hopefully generalize outside of just the app. [For] completely nonverbal kids . . . this is really awesome because . . . you’re limited to the amount . . . you can talk about so it gives them more structure. So, hopefully then [they’ll] be able to build their language in whatever communication app that they’re using, that has the 3000+ words . . . not just whatever picture they’ve taken.

Professionals also envisioned the use of such technology for other populations, such as adults with aphasia and dementia. P2, for example, described how people with Broca’s aphasia⁹, who have “telegraphic speech” and often “want to be talking about their favorite things”, such as a visit to the “museum or zoo over the weekend”, could “take pictures of” such an event and “bring it to, like, a family party and talk about it”. P15, who has experience with people with aphasia, also envisioned such use cases, but warned that the user would need to have “pretty good reading comprehension” to discern whether the vocabulary generated was adequate or not. Otherwise, the app would get the user “into trouble because [users] would be selecting messages that were maybe not appropriate or relevant to the photo and not realize it.”

6.1.2 A complementary tool to talk about past and present contexts, “giving them a voice” and “facilitating language development”. Professionals viewed Click AAC as a complementary tool to facilitate language development and enable communication about specific topics. They reported a variety of main goals for using the app, including to “expand expressive language” (P2), “augment and facilitate speech language development” (P4), and reduce prompt dependency (P8), “build[ing] up towards using it as an alternative form of communication” (P4).

They did not see the tool as a substitute for existing robust systems because of the uncertainty of the vocabulary that will be available and because it does not give access to all language concepts the user may need at all times—limitations inherent to the concepts of automatic JIT vocabulary and topic-specific communication boards, respectively. In addition to those limitations, some aspects of our specific design further hindered the adoption of the tool as a substitute AAC device, such as the lack of a fixed core vocabulary set across all pages and a variable arrangement of symbols that does not promote motor planning.

On the other hand, the ability to easily access language in situations where they “want[ed] to talk about something that’s unique or [to] tell a story” (P12) supported learners in different ways towards the intervention goals. While learners used the app independently in some instances (as reported by P2, P9, and P17), on most occasions professionals operated the app in conjunction with learners due to the learners’ cognitive and motor difficulties, in line with their regular practice involving other AAC tools. They used the app for: i) taking photos of the current context (e.g., “desk space”, toys, “table top activity”, and doing horticulture); ii) uploading generic photos obtained from the internet about topics personally relevant to learners (e.g., favourite toys and cartoon characters); iii) uploading photos of past activities or events (e.g., “past vacation”, “last weekend”, and “cooking”).

Then, professionals used the vocabulary generated to work on different activities that encouraged symbolic AAC, such as performing aided language stimulation (i.e., modeling language) with emerging communicators while describing, asking questions, or making comments about the scene photographed. With context-dependent communicators, professionals extended the activity by instructing learners to construct their own sentences. For example, P20 explained how she guided her student to compose sentences by using the part-of-speech categorization, and P8 commented on how Click AAC enabled her to model language faster and supported one learner in constructing his own sentences about an activity previously performed in the school, as he was telling a story:

P20: [I] used it for sentence construction. At first it was me who used it to teach my students. Then they

⁹ Individuals with Broca’s aphasia have trouble speaking fluently, but their comprehension can be relatively preserved. This type of aphasia is also known as non-fluent or expressive aphasia. The National Aphasia Association. https://www.aphasia.org/aphasia-resources/brocas-aphasia/
learned how to do it and they are the ones who choose the picture and the pictures to compose the sentence. I guided them to put the subject first, then the verb and finally the complements.

P8: The speaker is able to [model] quicker. It’s like an easier application to be able to have [language] modeled and then have them either replicate it or have them generate their own sentence from it. [later] everyone was cooking . . . for the week, . . . making quesadillas. And so, . . . we just generated sentences based on that and . . . one of them . . . could structure it to where it was almost like he was telling a story. Like, he could say: first, I added the cheese, and then we used the cheese, . . . kind of a sequencing story

P11 further illustrated how she used Click AAC in therapy as a bridge between the auditory-verbal realm and the symbolic concepts for emerging communicators, and discussed the potential of auto JIT vocabularies from photographs for enabling communication about past events to family after therapy:

P11: It can be used flexibly either way. . . . So these are not individuals with cognitive ability that you can just hand [Click AAC] to them, and they can start talking and modify it . . . They’re dependent on me to present something that’s relevant to them, and they may not have the physical capabilities to access it . . . [Click AAC] is a great bridge for me to use as a therapy tool so that I’m not just existing in the auditory-verbal realm with them, because then I can become over narrating everything, and I’m not anchoring them to any concepts. So when I have something that’s concrete, I can use that as my anchor to bridge what I’m saying, and . . . the concepts I’m wanting to teach, and they’re learning. . . . If you have somebody that can . . . take a picture of . . . this event you’re seeing, and when you go home, you have the picture, and you can have some way of communicating that one time event you saw, I see that as being very, very powerful.

In fact, P6 and her learner used the app for a scenario similar to the one envisioned by P11. They relied on the generated vocabulary to start talking about a past visit to the dentist when the learner’s main AAC device could not provide support:

P6: So I used [Click AAC] to download a photo of a doctor or dentist, and I asked [a learner] where she had been . . . and when we pulled up the dentist picture, we got a few more words about it. So, we used it as a supplement to an AAC that she’s already using . . . it did help us kind of open up a conversation about the dentist which we didn’t have easy access to use in her standard device.

6.1.3 Theme summary. This theme revealed that a range of user profiles can benefit from automated generation of vocabulary from photographs, and that Click AAC was used as a complementary tool to facilitate symbolic language learning and to enable communication about specific things, addressing some previously unmet needs. These findings signal potential directions to expand the existing design to better attend to users’ needs across the naturalistic activities within therapy and school settings. The next theme explains how the immediacy of vocabulary benefited and can benefit users during those activities.

6.2 Immediacy of vocabulary facilitates communication and language learning “on the spot” with reduced workload

Our findings in this theme reveal how learners and professionals benefited and may benefit from the immediate availability of situation-specific vocabulary from photographs across the different activities and intervention goals described in the first theme.

6.2.1 Reduced workload opening up opportunities for AAC stimulation. Professionals stressed the importance of selecting and programming appropriate fringe vocabulary to support learners in the various situations encountered in their routines. They further described how this is typically an arduous task, but that vocabulary generated automatically from photographs can alleviate it. Not surprisingly, conversation partners were not only overloaded by the need to select and program vocabulary on current tools, but also unable to plan and perform these tasks for all situations encountered by learners. The instant availability of relevant vocabulary allowed participants to increase the frequency of moments in which they could model language or engage learners with AAC in general (which is fundamental for successful AAC interventions, as introduced in Section 2.2). For example, P7 pointed out the challenges of helping multiple students with different tasks concurrently in teaching routines. She then commented how she was currently able to provide only core vocabulary throughout the school day, and how auto JIT from photographs encouraged communication on the spot by providing easy access to relevant fringe vocabulary:

P7: One of the big drawbacks in teaching in that kind of environment . . . is you don’t know what everybody’s doing . . . I could be outside doing pruning and snipping and lopping, or I could be in working with somebody on hand washing, or somebody with feeding . . . [Later] You always want a child to have the ability to communicate, but the time for teachers to do that is very limited. . . . I just I can’t keep up with fringe [vocabulary]. With something like [Click AAC], a teacher could take a picture and could encourage that communication and they could do it quickly and they could do it easily . . . So this is brilliant.

Continuing, P6 commented that Click AAC offered “more specific words” to talk about what was happening in their environment (school room) than the learner’s main AAC device, and P7 described one episode in which the instant generation of vocabulary supported an unexpected situation for which she did not have vocabulary material prepared in advance, enabling her to engage the learner with AAC:

P7: The other day in horticulture . . . I had some fringe vocabulary, but I had not taken pictures of pruners, and loppers and snipers [for low-tech AAC] . . . So, I
was able to take a picture of those three tools [with Click AAC] [and] start talking about those tools.

They also postulated that auto JIT from photographs might particularly benefit families, who are less experienced with AAC technologies and vocabulary selection than professionals, and therefore face challenges in creating adequate on-topic communication boards.

P1: [It] takes me probably like a few minutes to be able to create a new page in someone’s communication application, but that’s because I do that . . . five days a week [for] seven years . . . but imagine a family who either is not tech savvy [or] just trying to keep up with their child who has special needs. It takes them like hours . . . they just don’t have that time. . . . the convenience and quickness of being able to program information into it is the most impressive thing that I’ve seen.

The significantly reduced programming workload required by Click AAC could “engage more families” to implement AAC at home, as well as teachers in classrooms. P14, who works on selecting and recommending assistive technologies, commented that she shared the app with families and teachers in her county as an attempt to encourage them to use AAC more often at home and in the classroom:

P14: We have a weekly that we share to families with free apps on it where we give information and [Click AAC is] one of the tools for a student that’s starting with communication where we want to . . . get the family engaged with, this would be nice because it’s not a lot of heavy programming . . . We have a lot of families that are not super comfortable with technology. So, if it’s easier, then we’re more likely to see implementation . . .

In addition, P7 discussed how the ease of having immediate relevant vocabulary can help parents offer opportunities for immersing learners in AAC, and thus keep learners engaged during a variety of scenarios where it is not possible to have other AAC tools to support communication:

P7: So, for the parent who wants to communicate with their child, but they’re not going to carry around 15 core boards with the possibility of what might be there, this is terrific! I mean, you could be at Sea World and take a picture of Shamu and you’re going to get great stuff to talk to your kid about. You could be at a restaurant and take a picture of the food and be able to talk about what you’re doing or while you’re standing there waiting in line. Endlessly . . . to go to the Harry Potter world, you could take different pictures of things . . . keep them engaged.

6.2.2 Facilitated symbolic understanding and sentence construction. Professionals discussed how Click AAC benefited learners and themselves during aided language stimulation and sentence construction activities. Having “something visual [to] anchor some concepts” was deemed particularly important for modelling language for emerging communicators because a picture taken was “live” and “connecting [the symbols with] something very physical” (P13). Professionals explained how the immediate creation of symbols from real-world concepts supports teaching learners “how to use symbolic communication to communicate something more specific” (P11). Because learners could see and use the symbols at the same time as they engaged with the associated object or concept, it was easier to “understand that a referential symbol replaced the presence of that object” when composing messages on the AAC tool, “such as words [do] in oral language,” as P20 described:

P20: The person can have a pictographic representation of an object or scene immediately, and therefore, have at his disposal the symbolic representation that he or she must learn to use communicatively to refer to similar objects and scenes.

Participants noticed that having a concise set of vocabulary displayed next to a photograph setting the communication context in “real-time” supported learners in engaging in the formulation of spontaneous sentences. P13 commented that this strategy gives “situational context cues”, “making it easier to compose sentences and to make it more of a conversation” with minimum navigation. As P14 mentioned, “sometimes the navigation between boards, as you’re learning to build sentences, it’s like a lot together”. P8 described how the layout particularly supported users with limited attention spans:

P8: [It’s] an easier application to . . . have them either replicate [sentences] or have them generate their own sentence from it. It’s already right there as opposed to needing to return back to the picture or go down to the choices. Everything is all right there, which keeps them focused . . . if they’re limited by their attention span [or] cognition . . . Everything is all right there in front of them.

Spontaneous sentence construction was also supported by the automatic classification of vocabulary into part-of-speech columns. Participants found the organization of vocabulary into columns following the “modified Fitzgerald coloring system” helpful because it was “easier to read” (P9), and “one of the most common ones, so the kids [were] familiar with that color coding” (P2), “match[ing with] another couple of AAC” (P6). P3 further illustrated the ease of use and the consistency of this organization:

P3: It was very easy . . . with the categories where you have them lined up, like, the pronouns are on the left . . . then the verbs . . . it makes it easier to follow through when you’re trying to formulate sentences or questions. I think it’s an easier flow, and it was consistent across all the boards.

6.2.3 Support for communicating personal interests. Professionals noted that instant situation-related vocabulary from photographs enabled children who relied on nonverbal communication and had major difficulties navigating traditional AAC systems to initiate communication about personally relevant topics, allowing professionals to “expand and build on whatever modality [learners were] using” (P2). P6 explains how nonverbal learners often want to initiate communication about topics interesting to them but are
hindered due to the lack of easy access to relevant vocabulary, comparing how auto JIT from photographs provided better support in relation to existing tools:

P6: My main purpose for it would be that “on the go,” when I have a student that needs to talk about something that is just too frustrating to find the words for on their device. . . . right now it’s snowing, we could take a picture of the snow and the playground, and then the language that comes up about that is concise and related. And, the Proloquo, the Snap Scene . . . we have to dig and dig and dig and find “go”, back to the page that has “playground”, and go forward to the page that has “weather”. And I go back to the page that has “clothes” . . .

The ability to choose a communication topic by selecting a photograph and having related vocabulary instantly available was also deemed important because it allows users to talk about things that were popular among other children, as P5 detailed:

P5: What my young people are screaming at me about . . . is that [they] can’t say what [they] want to say, and talk like the other kids that are out there (their peers without disabilities). And this application gives them the ability to do that, if the vocabulary that’s being generated is right, ’cause they can pull up the things that are popular, the things that are of interest to them.

6.2.4 Potential impact on motivation and confidence. Although our user study did not focus on measuring language outcomes, our findings provided preliminary evidence that such technology may improve motivation and confidence for some learners, particularly for those who had been least successful with current tools. Some professionals commented that learners were receptive to the technology and motivated to use it. P9 also observed improved communication in a particular moment for a student who was more confident to speak words thanks to the support provided by Click AAC.

P9: I’ve seen the most improvement with the one who’s minimally verbal . . . This actually happened just yesterday . . . We had used the app the whole session with a puzzle, . . . so I would push the thing to say my turn and take a turn and then I would prompt him to do the same and he started doing that. . . . But then, after just about a couple of minutes, he decided to be verbal, so I would say my turn and then he would just verbally say my turn instead of pushing the button. So I think that kind of gave him like a little bit of confidence . . . and an understanding of . . . I see what I need to do . . . I’m okay with being verbal with this part. So it was a really cool moment I thought for him to take that app in and start with it, . . . and he’s shown some emergence of that kind of skills a little bit through therapy.

P13 commented that her learner, who was working on literacy skills and had no interest in engaging in language learning activities with other tools, got motivated by trying Click AAC:

P13: [For] example, [a boy] had no interest... he [tried Click AAC] and it was very, very emotional ’cause . . . Once we started it till now, he’s like, I can’t read, I don’t know, I’ll never know . . . but when I started talking to him about the app . . . [he] started saying, yeah, I’m going to learn to read and learn to write. It . . . got him . . . motivated to even try, which is very new for him.

6.2.5 Theme summary. This theme detailed the impact of having vocabularies generated instantly from photographs, revealing four main benefits in terms of how such vocabularies: (1) reduced the workload of selecting and programming situation-specific vocabulary for professionals, which led to increased opportunities for AAC practice, (2) facilitated the immersion of learners in symbolic communication during language modeling and sentence construction activities, (3) supported the communication of personal interests, and (4) impacted motivation and confidence in engaging with symbolic AAC.

6.3 Biases introduced did not compromise support but highlighted the importance of AI-human cooperation

Automation of vocabulary selection proved helpful and led to positive outcomes, but participants’ experiences highlighted the importance of keeping humans in the loop and revealed new aspects and challenges intrinsic to human-AI cooperation for AAC. This theme first demonstrates how participants’ perceptions of vocabulary quality were related to the type of photographs they used as input and the context of use. Then, it shows common biases and errors caused by the algorithms powering Click AAC and reveals how participants cooperated with the AI not only to overcome those issues, but also to achieve improved support that would not be possible by the AI or themselves alone.

6.3.1 Quality of vocabulary was directly related to the photograph’s content, failing for some relevant situations. Our analysis revealed common patterns in the quality of vocabulary generated across different input photographs, signaling the system’s high dependency on the input photograph’s content.

Overall, professionals judged the individual words generated to be mostly relevant and to require only a few modifications when the scene photographed was correctly identified (which users could verify through the descriptive phrase displayed at the top option). Participants positively noted that words were “not limited,” “not too predictable,” and included not only the names of objects depicted in the photo but also a broader set of words related to the scene, “expand[ing] language.” However, they noticed that some words were unrelated and deemed the quality of the narrative phrases inappropriate to support communication in the naturalistic settings they experimented in, as in the instances described by P14 below:

P14: We took a picture of rainbow fish, and that did pull up a lot of really good vocabulary about stuffed animals that it . . . had all the colors, . . . things like draw. And things . . . that were good to go with the book. But then there were things that showed up that we were like, I’m not sure how this fits in. [Later] the sentences
often didn’t go with the activity that I was putting together.

When the scene was not properly identified, vocabulary was generally not useful and the application was left aside. Some participants got frustrated due to the misidentification, but then, by experimenting, learned the kinds of pictures the app was able to properly identify.

P14: [There was] this little learning period where I was . . . a little frustrated with the AI, because it was misreading the pictures, but once we got kind of figuring it out, I was very happy with it.

In general, participants reported that Click AAC was able to correctly identify the scene in the majority of photographs (“most of the time it picks up what you’re doing” (P7)). However, although identification of “cluttered pictures”, or of very specific elements or details, succeeded in some instances, such as specific gardening tools (e.g., pruners and loppers (P7)), TV characters (White, Rue, and Bea from The Golden Girls (P5)), facial expressions (“straight face” (P10)), and age-related attributes (“historic” (P7)), Click AAC often misinterpreted photographs relevant to learners’ common activities.

Participants who encountered the most difficulties cited input photographs containing “two-dimensional” “images that are not real”, such as “cartoon characters”, “specific toys”, a “door knob”, a “smiley” face (to talk about emotions), “super heroes”, an “ax throwing place”, “play-doh”, “bubbles”, a “holiday tree Tu BiShvat”, “body parts for Mr. Potato Head”, and “Peppa Pig”. P2, for example, discussed how learners wanted to take photos of cartoon characters or games, but the incorrect identification of the specific toy resulted in unusable vocabulary:

P2: A lot of the pictures they’re wanting to take are flat pictures . . . of Spiderman or, of Dora the Explorer . . . one of the car mats, so it’s like the carpets and it just has the cars that you can drive on it . . . But then the vocabulary that came up, I couldn’t really talk about it at all cause [Click AAC] thought it was a box of cards . . . so those are the kind of pictures that we’re ending up taking. It’s not of your normal three dimensional objects.

Besides the lack of specificity in the identification of “uncommon” scenes, participants noticed some items being consistently identified as similar, but totally unrelated, objects. For example, P10 discussed how “random cylinder objects” were being recognized as soda containers, and P7 experienced “laundry soaps . . . and some softeners . . .” being identified as food. Some participants also acknowledged how tricky it is to correctly identify some photos, given potential similarities. For example, P15 described an instance where Click AAC identified a goat as a dog, expressing “but to be fair, he does kind of look like a dog in this picture”.

Part of the vocabulary that participants judged inappropriate reflected gender or language style-related biases introduced by the datasets used when training the machine learning models adopted in our generation methods. Identifying people with long hair as a woman was a common comment among participants. Another common issue reported was that the language was not adequately matched to the user’s age group. For example, P6 noted: “I would want to make sure that when the water bottle came up, [Click AAC] doesn’t show me ‘wine’.” Professionals—especially those using Click AAC in languages other than English—also noticed differences between the vocabulary style of our generated options and their community norms, describing these as “not adapted to our country” (P18). P10 further expressed the need to recognize subtle differences between apparent synonyms, especially in language and cultural contexts:

P10: I don’t think that my community [would] really use the word “filth” in that way. We would use the word “dirty”. So, just being able to tweak that would be helpful.

6.3.2 Errors and biases introduced did not compromise support, and effort correcting and complementing the AI “was worth it”. Despite the aforementioned errors and biases introduced by AI, participants noted that the automation still facilitated performing language stimulation with the learners during meaningful activities. In addition, the great majority of participants found the effort of filtering, complementing, and correcting the AI worthwhile in comparison with the amount of work needed to program the current tools (“It takes less time to create a few boxes than to recreate a complete page” (P6)).

Participants found it easy to edit and add individual words once they had learned how to perform those actions, either through the app’s embedded tutorial or by asking for researcher instructions: “It was easy for me to move it around, take off what I didn’t want and add what I did want” (P3). They highlighted the importance of being able to quickly edit vocabulary for professionals and learners, as P5 elaborated below:

P5: When you’re busy, when you’re programming . . . and also for users who are doing their own programming, [editing] needs to be streamlined to be as easy as they can go in and do it . . . So, the least amount of effort is the best thing.

In instances where professionals were mostly working on aided language stimulation, they simply ignored irrelevant symbols mentally and focused on relevant ones to maximize the immediacy of the symbolic representation, as P9 discussed: “I don’t delete [anything]. I can . . . go through and determine which ones I like best”.

In most situations, though, participants edited the vocabulary prior to engaging with learners, as preparation for a specific activity, or in conjunction with the learner while the communication was taking place. P9 described how she and her learners worked together to add missing words:

P9: I think it’s really easy to add words, . . . I can do it in real time during the session [when] we really need this word. So, “let me put it in real quick.” . . . and my other student who can already program the words on his own . . . if he comes up with a word, I’ll say, “oh, that’s a great idea. Why don’t you add that in there?”

6.3.3 Cooperation led to extended support. In most cases, once Click AAC displayed a new vocabulary page, participants checked if the overall scene identification was correct, and scanned (with learners in some instances) through the items to remove undesired
items and/or add missing words. Professionals reported that during this scanning process, the initial set of words generated by the AI often "served the role of a prime," stimulating them to think of new relevant words that they would not have thought of if they were selecting the vocabulary by themselves, as P15 discussed:

P15: I might see something that was generated by the app that makes me think: "Oh, that's a good idea." ... this would also be appropriate and I might not have thought of that before. ... when it comes to ... vocabulary development, it's kind of the difference between a blank slate, where you're thinking, okay where do I start? What do I? How do I come up with something that's relevant that ... and having the app generate some stuff for you, based on a relevant picture, and then that triggers more ideas. So then you might think of other things that you would try programming to see if that would work for the client.

P11 discussed a similar effect, highlighting "the endless potential" of having vocabulary that is not strictly related to the input photograph, using "imaginative" sentences as a starting point for stimulating conversation through other forms of AAC:

P11: [When I tried with a photo of] my dog, it said "a dog standing on a wood floor," and then it came up with something imaginative like "He decided to dress up his dog." So, then I could take that and run with [the learner]. I wouldn't have thought of that myself, and I would be "What a great idea!" I could go to the markup tool on the iPhone or the iPad and start dressing my dog up in different things. So, it could be a springboard to something ... I see that as being endless potential.

P7 also illustrated that the mutual collaboration between users and AI led to novel levels of support. She discussed how she adapted her communication to incorporate words offered by Click AAC and expanded the interactions with the learner:

P7: [After uploading a photo of a dog,] if a child is not really scanning, but they touch "mammal", I can go ahead and talk about that and I can say: yeah, she's a mammal, let's think of some other mammals. Let's see ... animals that have "fur" (points to the vocab button). ... You can really expand just with a handful of vocabulary like that, that you would go. ... Why would I want the word "mammal" on a fringe board? ... that's exactly why! So, you can go ahead and expand on language so that no matter what they touch, I can go further with them ...

6.3.4 Theme summary. This theme demonstrated how the perceived quality of vocabulary was directly related to the photograph content, informing the future selection of machine learning models and training datasets for improved scene recognition. It also explained how participants cooperated with the AI to overcome the errors and biases introduced, providing insights into how this cooperation can be leveraged to reach improved support.

7 DISCUSSION
Our findings revealed insights into the potential for automatic generation of context-related vocabulary from photographs to support AAC, as well as on aspects specific to our implementation in school and speech-language therapy settings. We now discuss how the observed and envisioned benefits of such technology relate to the conceptual underpinning of JIT support introduced by Schlosser et al. [46], moving to the implications for the design of such tools for the variety of user profiles and contexts of use identified in our analysis. We conclude with reflections on the study design employed and directions for future research.

7.1 Conceptual underpinning for the benefits from immediacy of vocabulary
The benefits of being able to immediately generate vocabulary as needs arise, as revealed in the second theme, included reduced workload leading to increased opportunities for AAC stimulation, facilitated symbolic understanding and sentence construction, support for communicating personal interests, and potential impact on motivation and confidence. These benefits are all tightly related to the conceptual foundations of JIT support: working memory demands, situated cognition, and teachable moments [46].

When communicating with the aid of a traditional dynamic grid display, learners must keep the desired concept in mind while simultaneously remembering the page where that symbol is located, how to navigate to that page, and the location of the desired symbol on the target page, while avoiding distractions that may arise during this process. When forming sentences, users must go through this process several times [51]. With the combination of automated generation of vocabulary from photographs and a VSD-like interface, users do not need to hold in memory the symbols previously selected nor to remember how to navigate to a desired symbol while constructing sentences, reducing memory demands. Our participants emphasized how this was particularly useful for constructing sentences to model language because learners can focus on the language concepts rather than being burdened with the navigation task.

Our approach enabled users to have symbols representing the real-world concepts they were engaging with readily available, which can not only alleviate working memory demands but also facilitate situated cognition. Cognition and learning are inherently dependent on the social and cultural contexts in which they occur, and this is no different for language learning and comprehension [13]. Associating language elements with perceived referents while a situation takes place is crucial for learners to comprehend and use language. The immediacy of symbolic representation helps to clarify the relation between objects, symbols, events, and agents participating in that situation [1]. By providing related vocabulary instantly without requiring users to anticipate the situation, our approach can increase the frequency of moments for which the learning of symbolic representation through aided language stimulation is possible.

This relates to the third conceptual underpinning of JIT support, teachable moments. According to the education literature [28], teachable moments are those opportunities that emerge when students are excited, engaged, and primed to learn. Adults must provide

activities to children according to their level of development, allowing them to "learn what they want and when they are ready to learn" [28]. The provision of automatic JIT vocabulary can support conversation partners in capitalizing on those teachable moments by being able to adapt the offered support to emerging and unforeseen situations quickly, and to engage in topics of interest to the learner, which can activate background knowledge about those contexts, consequently promoting comprehension [26]. Our findings indeed demonstrated how the automatic generation of vocabulary in Click AAC enabled or facilitated communication in those teachable moments, even when the generated vocabulary had missing or irrelevant words. Participants explained how the app could provide relevant vocabulary during unplanned, very specific activities (e.g., horticulture), or when finding the words on the main device was too frustrating for the learner (e.g., a visit to the dentist).

Bringing these three concepts together, we can see that automatic JIT vocabulary from photographs not only reduced the workload of AAC professionals, but also enabled them to take advantage of teachable moments that arose during school or therapy activities, facilitating the use of situated cognition in stimulating symbolic AAC. These benefits also speak to the need to consider modeling of AAC use and the creation of AAC user-friendly environments when designing new technologies, as highlighted by Bircanin et al. [6], as well as some of the key facilitators (increased availability of technical solutions, motivated community of stakeholders) and barriers (complex technologies, resource restrictions) for AAC adoption and support in special education settings revealed in the ethnographic study by Norrie et al. [42]. Looking beyond our results, the impact of reduced workload on families, as hypothesized by our participants, could be even more remarkable. Learners naturally spend most of their time with family, but family members often lack the expertise and time for selecting and programming the vocabulary needed to perform aided language stimulation sufficiently throughout their daily routines [36]. To underscore the magnitude of this workload, note that it has been recommended that at least 70% of interaction opportunities should be modeled through the aid tool [15]. With automatic vocabulary from photographs, families have the opportunity to greatly increase the frequency of teachable moments without adding workload to their routines. Future research can investigate the use of such tools with family members, potentially revealing new facets of the cooperation between AI and users, and new naturalistic use cases unexplored in therapy or school settings.

7.2 Expanding auto-generation of vocabulary from photographs for other populations
The fact that automatic JIT vocabulary from photographs can support a wide range of profiles opens up opportunities for broader user groups. As envisioned by our participants, for example, people with aphasia or dementia could use such an app independently as an alternative mode of communication, which would likely result in very different dynamics of cooperation between AI and users.

First, independent use of the tool would require individuals with complex communication needs themselves to judge the relevance of generated vocabulary. In our study, professionals acted as a filter, either by editing the vocabulary or ignoring unrelated words. That dynamic would be impacted during independent use of the tool, and arguably would be very different across individuals with autism, aphasia, dementia, or other disabilities. In those cases, a generation method with much higher precision may be required to minimize the frustration and confusion that the provision of unrelated or loosely related vocabulary may cause. For example, as emerged from our findings, some narrative sentences generated were very imaginative, not exactly fitting the scene, but nonetheless were useful as a springboard to initiate communication. Such serendipitous prompts can potentially better support agency for people with aphasia in creative activities [40, 50] and people with dementia in art therapy [32] and social sharing [16, 31].

Second, although professionals judged the editing interactions as easy to perform once they had learned how, and the smartphone form factor facilitated one-handed operation, it is unclear how our design would support individuals with complex communication needs operating the app to edit vocabulary and access it when needed. More research is needed to investigate the design of interaction gestures and layout configurations for supporting independent editing of vocabulary. For example, people with aphasia often have motor deficits that would dictate different layout or interaction capabilities. Another interesting avenue for investigation is the trade-off between vocabulary precision and editing effort across individuals. Some people may find it easier to have a larger set of vocabulary generated automatically, and then scan through it deleting undesired items; others may find this task too difficult and require a smaller, but more precise, set of words.

Third, it is unclear exactly how other populations would make use of the vocabulary generated. AAC use in aphasia ranges from prompting to compensation [3]; for example, in the study by Obiorah et al. [43], involving the use of AI for helping people with aphasia order dinner, a participant used the support provided by a prototype to rehearse what he wanted to say rather than having the system speak out loud or automatically place the restaurant order for him. Literate users proficient in AAC, who do not need the symbolic representation of vocabulary, may use the generated words as a supplement to the next-word prediction mechanism running in their current devices, potentially increasing the communication rate. Future research is needed to understand the impact of removing the conversation partner and increasing the participation of the individual with complex communication needs in the cooperation with the AI in tools that provide automatically generated vocabulary from photographs.

7.3 Designing for specific use cases
Our findings from the first theme provide insights into the scenarios in which automatic JIT vocabulary from photographs can provide support, as well as how people used the support offered across these situations. This enables future research to narrow down the design of tools such as Click AAC. Since our study was exploratory in nature, we designed Click AAC as a generic tool aimed at supporting a wide set of contexts. Future research can now explore how to leverage the capabilities of automatic JIT vocabulary from photographs to facilitate the specific activities identified, including language modeling, sentence construction, language expansion, and past event recount.
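One of these activities, sentence construction, could be prototyped as an automatically generated fill-the-gap exercise of the kind discussed in this section. The sketch below is illustrative only: `make_cloze` and the small distractor lexicon are hypothetical stand-ins for a real image-description model and semantic word database.

```python
# Illustrative sketch: build a fill-the-gap exercise from an image caption.
# CATEGORY_LEXICON and make_cloze are hypothetical stand-ins; a real system
# would draw the caption from an image-description model and the distractors
# from a semantic database.
import random

CATEGORY_LEXICON = {
    "soccer": ["baseball", "tennis"],  # same-category distractors
}

def make_cloze(caption: str) -> dict:
    """Blank the caption's final word and offer it among distractors."""
    words = caption.strip().rstrip(".").split()
    answer = words[-1]
    options = CATEGORY_LEXICON.get(answer, []) + [answer]
    random.shuffle(options)
    return {"prompt": " ".join(words[:-1]) + " ___",
            "options": options,
            "answer": answer}

exercise = make_cloze("the boy is playing soccer")
print(exercise["prompt"])  # the boy is playing ___
```

Here the learner would pick among baseball, tennis, and soccer, mirroring the example given in the text; a production tool would additionally need symbol rendering and speech output.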

For example, researchers can explore different interfaces for facilitating single-word modeling for emergent communicators, such as providing only the symbol of the main object identified in the scene, maximized in the display. Continuing, future work can look into the design of tools that facilitate the practice of sentence construction using language concepts extracted from photographs. This may include exercises for filling gaps in sentences related to the identified scene, in which sentences and available options are generated automatically. For example, taking a photograph of a boy playing soccer as input, the application could automatically generate the sentence "the boy is playing", and ask the user to complete it from an option list including baseball, tennis, and soccer. To support language expansion, future directions include probing new interactive interfaces and organization strategies that allow easy exploration of semantically related words. For example, words semantically related to the concepts appearing in the photographs, generated by the related-expanded method, could be displayed in a secondary level that would appear only when the user selects the main concepts in the photograph. Finally, we propose studying how to generate more meaningful sentences to retell a past event, in addition to facilitating the presentation and editing of such phrases for maximum personalization. The exploration of how to combine multiple photos of the same event for providing support is another possibility, given that people often capture different moments and angles of personally relevant events.

Another avenue for future research is to study how to create a robust AAC system that integrates automatic vocabulary from photographs. Our findings pointed to some design opportunities, such as the use of a customizable core vocabulary board across all pages, consistent spatial arrangement of items to support motor planning, access to a keyboard, and the possibility of applying morphological inflections (e.g., plural and past tense).

The understanding of the usefulness of such a tool in school settings also raises questions about the other form factors such an application could be created for. The use of tabletop displays and smart boards, for example, may provide new opportunities for providing a "shared communication space" for the entire classroom, potentially increasing the participation of peers in the interaction.

7.4 Improving quality of vocabulary generated
Our study did not focus on evaluating the quality of vocabulary generated through controlled experiments. Nonetheless, our findings provide insights into some common, general patterns in the quality of vocabulary generated in relation to the photograph content, in addition to the use cases for such technology, informing i) the future selection of machine learning models and training datasets for improved scene recognition, ii) context-related vocabulary generation methods, and iii) the selection of adequate datasets for evaluating generation methods during early stages of system design.

Future research can integrate existing techniques for identifying cartoon characters [41, 58] and person re-identification [57], for example, and study whether these models are able to meet the needs of AAC professionals and learners during their routine activities. Another thread of research can study forms of cooperation between AAC users, professionals, and AI to achieve enhanced support. This includes, for example, new techniques that incorporate the corrections made by users to the generated image descriptions and vocabulary sets, for retraining or reinforcing the image identification model and/or vocabulary generation method over time, aiming to improve their overall accuracy and precision.

The findings that emerged in the third theme also inform how novel methods are needed for expanding the image description into a set of contextually related terms following the user's own style. The narrative method used by Click AAC used corpora from adults in the USA. This was insufficient, leading to a mismatch between users' language styles and the support offered. Future research should investigate generation methods for AAC that accommodate regional styles, and more importantly, that provide children and teenagers with language that sounds like their peers'. One possible avenue is to reproduce the user's language style by applying the lexicon terms manually associated with a certain photograph to new photographs containing similar elements (as judged by the AI) during the generation process. Another strategy could be to reinforce the generation method with vocabulary selected during communication.

The necessity of running performance evaluations of AAC systems on datasets has been discussed in the field [18, 30]. The main reasons are obtaining quantitative findings that are statistically significant and can inform the fine-tuning of internal components for optimizing the system, and anticipating flaws before testing the system with end users. In the initial evaluation of the storytelling generation method by de Vargas and Moffatt [18], the authors found that the method was robust to variations in the input photograph. However, our findings revealed that the technique for identifying the scene failed for several AAC use cases, leading to unrelated vocabulary and lack of support. Our findings on what kinds of photographs professionals and learners want to use the technology with inform the construction of new datasets for this first stage of system evaluation that better represent AAC use. A possible next step would be to extend the VIST dataset with photos and vocabulary for cartoon characters, popular people, school objects, and toys.

7.5 Limitations
Our approach of recruiting professionals interested in the concept uncovered perspectives essential for exploring the broad use of automated vocabulary from photographs for AAC. Professionals' expertise allowed us to understand the unique needs of users when learning symbolic AAC, and how automatic JIT vocabulary can be integrated into their existing practices to support the learning process. However, it hindered direct investigation into how AAC learners interact with the tool.

Future work could perform on-site observations during therapy and school sessions to better understand the interactions between professionals, learners, and the tool, as well as assess the level of language support provided for different situations through more controlled experiments. A possible approach is to employ a single-subject treatment design to measure the difference in individuals' lexical retrieval skills when using Click AAC and other tools to support the person retelling past activities, such as in the study by Mooney et al. [39].
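The personalization strategy proposed in Section 7.4, applying lexicon terms a user manually associated with one photograph to new photographs containing similar elements, can be sketched as a simple mapping from detected scene labels to user-added words. The class and label names below are hypothetical illustrations, not Click AAC's actual implementation.

```python
# Hypothetical sketch: words a user manually adds for one photograph are
# re-offered for future photographs that share scene labels (as judged by
# an image-recognition model). Names and data structures are illustrative.
from collections import defaultdict

class PersonalLexicon:
    def __init__(self):
        # scene label -> words the user manually associated with it
        self._by_label = defaultdict(set)

    def record_edit(self, scene_labels, added_words):
        """Remember user-added words under each detected scene label."""
        for label in scene_labels:
            self._by_label[label].update(added_words)

    def suggest(self, scene_labels):
        """Merge previously added words for any overlapping label."""
        suggestions = set()
        for label in scene_labels:
            suggestions |= self._by_label[label]
        return suggestions

lexicon = PersonalLexicon()
# The user added "fetch" and "leash" while editing a board for a dog photo.
lexicon.record_edit({"dog", "park"}, {"fetch", "leash"})
# A later photo also recognized as containing a dog reuses those words.
print(sorted(lexicon.suggest({"dog", "floor"})))  # ['fetch', 'leash']
```

Merging the suggested words into the generated vocabulary set would nudge future boards toward the user's own style, one concrete form of the reinforcement-over-time idea discussed above.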

8 CONCLUSION
The immense potential of the "iPad and mobile technology revolution" for benefiting AAC users has been discussed for more than a decade, but current symbol-based tools still have not realized the advantages brought by recent advancements in artificial intelligence and context-aware computing. In this work, we integrated computer vision and machine learning techniques proposed by de Vargas and Moffatt [18] to create Click AAC, a mobile application that generates situation-specific communication boards automatically from photographs. We conducted a user study with AAC professionals and their clients with complex communication needs who used the application in their routine practices for therapy sessions or school activities. We contribute a nuanced understanding of how situation-specific vocabularies automatically generated from photographs can support communication and language learning for individuals with complex communication needs, offering new insights into the design of automatic vocabulary generation methods and interactive interfaces to provide adequate support across naturalistic scenarios of use and goals.

ACKNOWLEDGMENTS
We would like to thank all participants, who generously shared their time and expertise in the development of this work. This research was funded by the Fonds de Recherche du Québec - Nature et Technologies (FRQNT), the Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN-2018-06130], the Canada Research Chairs Program (CRC), and by AGE-WELL NCE, Canada's technology and aging network.

REFERENCES
[1] Lawrence W Barsalou. 1999. Language comprehension: Archival memory or preparation for situated action? (1999).
[2] David Beukelman, Jackie McGinnis, and Deanna Morrow. 1991. Vocabulary selection in augmentative and alternative communication. Augmentative and Alternative Communication 7, 3 (1991), 171–185.
[3] David R Beukelman, Susan Fager, Laura Ball, and Aimee Dietz. 2007. AAC for adults with acquired neurological conditions: A review. Augmentative and Alternative Communication 23, 3 (2007), 230–242.
[4] David R Beukelman, Karen Hux, Aimee Dietz, Miechelle McKelvey, and Kristy Weissling. 2015. Using visual scene displays as communication support options for people with chronic, severe aphasia: A summary of AAC research and future research directions. Augmentative and Alternative Communication 31, 3 (2015), 234–245.
[5] David R Beukelman, Pat Mirenda, et al. 1998. Augmentative and alternative communication. Paul H. Brookes, Baltimore.
[6] Filip Bircanin, Bernd Ploderer, Laurianne Sitbon, Andrew A Bayor, and Margot Brereton. 2019. Challenges and opportunities in using Augmentative and Alternative Communication (AAC) technologies: Design considerations for adults with severe disabilities. In Proceedings of the 31st Australian Conference on Human-Computer-Interaction. 184–196.
[7] Rolf Black, Joseph Reddington, Ehud Reiter, Nava Tintarev, and Annalu Waller. 2010. Using NLG and sensors to support personal narrative for children with complex communication needs. In Proceedings of the NAACL HLT 2010 Workshop on Speech and Language Processing for Assistive Technologies. 1–9.
[8] Sarah Blackstone, J Light, D Beukelman, and H Shane. 2004. Visual scene displays. Augmentative Communication News 16, 2 (2004), 1–16.
[9] Nancy C Brady, Kandace Fleming, Kathy Thiemann-Bourque, Lesley Olswang, Patricia Dowden, Muriel D Saunders, and Janet Marquis. 2012. Development of the communication complexity scale. (2012).
[10] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[11] Virginia Braun and Victoria Clarke. 2019. Reflecting on reflexive thematic analysis. Qualitative Research in Sport, Exercise and Health 11, 4 (2019), 589–597.
[12] Kris Brock, Rajinder Koul, Melinda Corwin, and Ralf Schlosser. 2017. A comparison of visual scene and grid displays for people with chronic aphasia: A pilot study to improve communication using AAC. Aphasiology 31, 11 (2017), 1282–1306.
[13] John Seely Brown, Allan Collins, and Paul Duguid. 1989. Situated cognition and the culture of learning. Educational Researcher 18, 1 (1989), 32–42.
[14] Joan Bruno and David Trembath. 2006. Use of aided language stimulation to improve syntactic performance during a weeklong intervention program. Augmentative and Alternative Communication 22, 4 (2006), 300–313.
[15] Shakila Dada and Erna Alant. 2009. The effect of aided language stimulation on vocabulary acquisition in children with little or no functional speech. (2009).
[16] Jiamin Dai and Karyn Moffatt. 2020. Making Space for Social Sharing: Insights from a Community-Based Social Group for People with Dementia. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, 1–13.
[17] Simon De Deyne, Steven Verheyen, Amy Perfors, and Daniel J Navarro. 2015. Evidence for widespread thematic structure in the mental lexicon. In CogSci.
[18] Mauricio Fontana de Vargas and Karyn Moffatt. 2021. Automated Generation of Storytelling Vocabulary from Photographs for use in AAC. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 1353–1364.
[19] Carrie Demmans Epp, Justin Djordjevic, Shimu Wu, Karyn Moffatt, and Ronald M Baecker. 2012. Towards providing just-in-time vocabulary support for assistive and augmentative communication. In Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces. 33–36.
[20] Emma Dixon and Amanda Lazar. 2020. Approach Matters: Linking Practitioner Approaches to Technology Design for People with Dementia. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, 1–15.
[21] Kathryn DR Drager, Janice Light, Jessica Currall, Nimisha Muttiah, Vanessa Smith, Danielle Kreis, Alyssa Nilam-Hall, Daniel Parratt, Kaitlin Schuessler, Kaitlin Shermetta, et al. 2019. AAC technologies with visual scene displays and "just in time" programming and symbolic communication turns expressed by students with severe disability. Journal of Intellectual & Developmental Disability 44, 3 (2019), 321–336.
[22] Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh K Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C Platt, et al. 2015. From captions to visual concepts and back. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1473–1482.
[23] Brenda Fossett and Pat Mirenda. 2007. Augmentative and alternative communication. Handbook on Developmental Disabilities (2007), 330–348.
[24] Carol Goossens'. 1989. Aided communication intervention before assessment: A case study of a child with cerebral palsy. Augmentative and Alternative Communication 5, 1 (1989), 14–26.
[25] Jessica Gosnell, John Costello, and Howard Shane. 2011. Using a clinical approach to answer "What communication apps should we use?". Perspectives on Augmentative and Alternative Communication 20, 3 (2011), 87–96.
[26] Audrey L Holland. 1975. Language therapy for children: Some thoughts on context and content. Journal of Speech and Hearing Disorders 40, 4 (1975), 514–523.
[27] Ting-Hao Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Aishwarya Agrawal, Jacob Devlin, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, et al. 2016. Visual storytelling. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1233–1239.
[28] Eunsook Hyun and J Dan Marshall. 2003. Teachable-moment-oriented curriculum practice in early childhood education. Journal of Curriculum Studies 35, 1 (2003), 111–127.
[29] Shaun K Kane, Barbara Linam-Church, Kyle Althoff, and Denise McCall. 2012. What we talk about: Designing a context-aware communication tool for people with aphasia. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility. 49–56.
[30] Per Ola Kristensson, James Lilley, Rolf Black, and Annalu Waller. 2020. A Design Engineering Approach for Quantitatively Exploring Context-Aware Sentence Retrieval for Nonspeaking Individuals with Motor Disabilities. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–11.
[31] Amanda Lazar, Caroline Edasis, and Anne Marie Piper. 2017. Supporting people with dementia in digital social sharing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2149–2162.
[32] Amanda Lazar, Jessica L. Feuston, Caroline Edasis, and Anne Marie Piper. 2018. Making as Expression: Informing Design with People with Complex Communication Needs through Art Therapy. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 1–16.
[33] Janice Light. 1988. Interaction involving individuals using augmentative and alternative communication systems: State of the art and future directions. Augmentative and Alternative Communication 4, 2 (1988), 66–82.
[34] Janice Light. 1989. Toward a definition of communicative competence for individuals using augmentative and alternative communication systems. Augmentative and Alternative Communication 5, 2 (1989), 137–144.
[35] Janice Light, Kathryn Drager, John McCarthy, Suzanne Mellott, Diane Millar, Craig Parrish, Arielle Parsons, Stacy Rhoads, Maricka Ward, and Michelle Welliver.

2004. Performance of typically developing four- and five-year-old children with AAC systems using different language organization techniques. Augmentative and Alternative Communication 20, 2 (2004), 63–88.
[36] Janice Light, David McNaughton, and Jessica Caron. 2019. New and emerging AAC technology supports for children with complex communication needs and their communication partners: State of the science and future research directions. Augmentative and Alternative Communication 35, 1 (2019), 26–41.
[37] Janice Light, Krista M Wilkinson, Amber Thiessen, David R Beukelman, and Susan Koch Fager. 2019. Designing effective AAC displays for individuals with developmental or acquired disabilities: State of the science and future research directions. Augmentative and Alternative Communication 35, 1 (2019), 42–55.
[38] Eugene T McDonald and Adeline R Schultz. 1973. Communication boards for cerebral-palsied children. Journal of Speech and Hearing Disorders 38, 1 (1973), 73–88.
[39] Aimee Mooney, Steven Bedrick, Glory Noethe, Scott Spaulding, and Melanie Fried-Oken. 2018. Mobile technology to support lexical retrieval during activity retell in primary progressive aphasia. Aphasiology 32, 6 (2018), 666–692.
[40] Timothy Neate, Abi Roper, Stephanie Wilson, and Jane Marshall. 2019. Empowering expression for users with aphasia through constrained creativity. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
[41] Nhu-Van Nguyen, Christophe Rigaud, and Jean-Christophe Burie. 2017. Comic characters detection using deep learning. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Vol. 3. IEEE, 41–46.
[42] Christopher S Norrie, Annalu Waller, and Elizabeth FS Hannah. 2021. Establishing Context: AAC Device Adoption and Support in a Special-Education Setting. ACM Transactions on Computer-Human Interaction (TOCHI) 28, 2 (2021), 1–30.
[43] Mmachi God'sglory Obiorah, Anne Marie Piper, and Michael Horn. 2021. Designing AACs for People with Aphasia Dining in Restaurants. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–14.
[44] Rupal Patel and Rajiv Radhakrishnan. 2007. Enhancing Access to Situational Vocabulary by Leveraging Geographic Context. Assistive Technology Outcomes and Benefits 4, 1 (2007), 99–114.
[45] Ehud Reiter. 2007. An architecture for data-to-text systems. In Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07). 97–104.
[46] Ralf W Schlosser, Howard C Shane, Anna A Allen, Jennifer Abramson, Emily Laubscher, and Katherine Dimery. 2016. Just-in-time supports in augmentative and alternative communication. Journal of Developmental and Physical Disabilities 28, 1 (2016), 177–193.
[47] Samuel C Sennott, Janice C Light, and David McNaughton. 2016. AAC modeling intervention research review. Research and Practice for Persons with Severe Disabilities 41, 2 (2016), 101–115.
[48] H Shane and J Costello. 1994. Augmentative communication assessment and the feature matching process. In Mini-seminar presented at the annual convention of the American Speech-Language-Hearing Association, New Orleans, LA.
[49] Martine M Smith. 2015. Language development of individuals who require aided communication: Reflections on state of the science and future research directions. Augmentative and Alternative Communication 31, 3 (2015), 215–233.
[50] Carla Tamburro, Timothy Neate, Abi Roper, and Stephanie Wilson. 2020. Accessible Creativity with a Comic Spin. In ASSETS 2020 - 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–11.
[51] Jennifer J Thistle and Krista M Wilkinson. 2013. Working memory demands of aided augmentative and alternative communication for individuals with developmental disabilities. Augmentative and Alternative Communication 29, 3 (2013), 235–245.
[52] Nava Tintarev, Ehud Reiter, Rolf Black, and Annalu Waller. 2014. Natural language generation for augmentative and assistive technologies. In Natural Language Generation in Interactive Systems. Cambridge University Press, 252–277.
[53] Nava Tintarev, Ehud Reiter, Rolf Black, Annalu Waller, and Joe Reddington. 2016. Personal storytelling: Using Natural Language Generation for children with complex communication needs, in the wild. International Journal of Human-Computer Studies 92 (2016), 1–16.
[54] Sarah E Wallace and Karen Hux. 2014. Effect of two layouts on high technology AAC navigation and content location by people with aphasia. Disability and Rehabilitation: Assistive Technology 9, 2 (2014), 173–182.
[55] Bruce Wisenburn and D Jeffery Higginbotham. 2008. An AAC application using speaking partner speech recognition to automatically produce contextually relevant utterances: Objective results. Augmentative and Alternative Communication 24, 2 (2008), 100–109.
[56] Bruce Wisenburn and D Jeffery Higginbotham. 2009. Participant evaluations of rate and communication efficacy of an AAC application using natural language processing. Augmentative and Alternative Communication 25, 2 (2009), 78–89.
[57] Di Wu, Si-Jia Zheng, Xiao-Ping Zhang, Chang-An Yuan, Fei Cheng, Yang Zhao, Yong-Jun Lin, Zhong-Qiu Zhao, Yong-Li Jiang, and De-Shuang Huang. 2019. Deep learning-based methods for person re-identification: A comprehensive review. Neurocomputing 337 (2019), 354–371.
[58] Yi Zheng, Yifan Zhao, Mengyuan Ren, He Yan, Xiangju Lu, Junhui Liu, and Jia Li. 2020. Cartoon face recognition: A benchmark dataset. In Proceedings of the 28th ACM International Conference on Multimedia. 2264–2272.
LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia

Steven M. Goodman (smgoodmn@uw.edu), University of Washington and Google Research, Seattle, Washington, USA
Erin Buehler (ebuehler@google.com), Patrick Clary (pclary@google.com), Andy Coenen (andycoenen@google.com), Aaron Donsbach (donsbach@google.com), Tiffanie N. Horne (tifanieh@google.com), Michal Lahav (mlahav@google.com), Robert MacDonald (bmacdonald@google.com), and Rain Breaw Michaels (rainb@google.com), Google, Mountain View, California, USA
Ajit Narayanan (ajitnarayanan@google.com), Mahima Pushkarna (mahimap@google.com), Joel Riley (joel.c.riley@gmail.com), Alex Santana (alexsantana@google.com), Lei Shi (leileilei@google.com), Rachel Sweeney (rachelsweeney@google.com), Phil Weaver (pweaver@google.com), and Ann Yuan (annyuan@google.com), Google, Mountain View, California, USA
Meredith Ringel Morris (merrie@google.com), Google Research, Seattle, Washington, USA
Figure 1: The LaMPost interface. The system augments a typical browser-based email editor with three AI-powered features:
(left) users can generate an outline of the email’s main ideas (a) with the option for a related subject line (b); (center) users can
generate suggestions for possible changes to a selected passage, and (right) users can generate rewritten text for a selected
passage based on a human- or machine-written instruction.
This work is licensed under a Creative Commons Attribution International 4.0 License.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544819

ABSTRACT
Prior work has explored the writing challenges experienced by people with dyslexia, and the potential for new spelling, grammar, and word retrieval technologies to address these challenges. However, the capabilities for natural language generation demonstrated by the latest class of large language models (LLMs) highlight an opportunity to explore new forms of human-AI writing support tools. In this paper, we introduce LaMPost, a prototype email-writing interface that explores the potential for LLMs to power writing support tools that address the varied needs of people with dyslexia. LaMPost draws from our understanding of these needs and introduces novel AI-powered features for email-writing, including: outlining main ideas, generating a subject line, suggesting changes, and rewriting a selection. We evaluated LaMPost with 19 adults with dyslexia,
identifying many promising routes for further exploration (including the popularity of the "rewrite" and "subject line" features), but also finding that the current generation of LLMs may not surpass the accuracy and quality thresholds required to meet the needs of writers with dyslexia. Surprisingly, we found that participants' awareness of the AI had no effect on their perception of the system, nor on their feelings of autonomy, expression, and self-efficacy when writing emails. Our findings yield further insight into the benefits and drawbacks of using LLMs as writing support for adults with dyslexia and provide a foundation to build upon in future research.

CCS CONCEPTS
• Human-centered computing → Accessibility technologies; Empirical studies in accessibility.

KEYWORDS
large language models, dyslexia, writing

ACM Reference Format:
Steven M. Goodman, Erin Buehler, Patrick Clary, Andy Coenen, Aaron Donsbach, Tiffanie N. Horne, Michal Lahav, Robert MacDonald, Rain Breaw Michaels, Ajit Narayanan, Mahima Pushkarna, Joel Riley, Alex Santana, Lei Shi, Rachel Sweeney, Phil Weaver, Ann Yuan, and Meredith Ringel Morris. 2022. LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3517428.3544819

1 INTRODUCTION
Dyslexia refers to a cluster of symptoms that result in challenges with word recognition, reading fluency, spelling, and writing; it impacts up to 20% of the population [1, 32, 59]. While some adults with dyslexia may learn and adopt compensatory strategies for reading difficulties over time [39, 42, 44], the combination of reading, comprehension, and planning skills required to carry out writing tasks may lead to ongoing difficulties [42, 55]. In addition to low-level obstacles such as spelling and grammar, writers with dyslexia report a variety of high-level challenges (e.g., [13, 42, 47]), such as ordering and expressing their ideas, choosing language to match their desired tone, and writing with clarity and precision. To overcome these obstacles, they report using a variety of strategies—such as speech-to-text tools to dictate ideas, templates to match style, and revising feedback from friends and family—but these can add further complexity and time to their writing process [13, 46].

Prior work in accessibility has explored a number of approaches to overcome the reading challenges associated with dyslexia, such as experimenting with various forms of text presentation [19, 50] and synonym substitution for complex words [51]. However, work targeting dyslexia's associated writing challenges has primarily focused on low-level interventions, including automatic suggestions to support word retrieval [37, 41, 49] and specialized spellcheck tools (e.g., [37, 45, 52, 66]). AI-based efforts, when present, have continued this thread; for example, Wu et al. evaluated a dyslexia-tuned Neural Machine Translation model for spelling and grammar support on social media posts [66]. However, tools that can lend support to people with dyslexia for important high-level aspects of writing—such as organization, expression, and voice—are absent from the accessibility literature. This gap highlights an opportunity to explore the potential for AI-powered writing support tools that use state-of-the-art neural language models.

Neural language models are neural networks that are trained to predict the next word in a sequence given the previous words. We use "large language models," or LLMs, to refer to the recent class of neural language models (e.g., GPT-3 [7]) that have been trained using the Transformer neural architecture [64] and are capable of generating long passages of text that human evaluators perceive as human-written [15]. With few-shot learning to enable controllable text generation, LLMs hold potential to drive new technologies that bolster written expression [69]. This functionality may provide significant value to writers with dyslexia by alleviating common difficulties and simplifying their existing workflow, but questions arise over the correct approach for their implementation. For example, although automatic text generation could help some writers with dyslexia to conquer their "fear of the blank page" [47], machine-powered writing may raise concerns over the author's control and autonomy in the writing process [27].

In this paper, we introduce LaMPost, an LLM-based prototype to support adults with dyslexia in writing emails. LaMPost implements LaMDA [60], an LLM for dialog applications, to augment a standard email editor with AI-powered outlining, subject generation, suggestion, and rewriting features. We evaluated LaMPost with 19 adult participants with dyslexia. Our findings indicate enthusiasm among this demographic for high-level writing support features, including rewriting passages in a particular tone or style (e.g., "more formal", "more concise") and generating summative content such as subject lines based on an email's body. However, we also found that accuracy and quality issues in the current generation of LLMs present obstacles to a reliable and trustworthy writing-support experience. Further, effectively utilizing LLMs for writers with dyslexia may require HCI innovations to manage trade-offs, such as autonomy vs. cognitive load and personalization vs. privacy. Knowledge that our writing-support tool contained AI did not have a significant effect on participants' perception of the system and written work. Our findings highlight opportunities and challenges of AI-assisted writing support for people with dyslexia and provide a foundation for future work as the capabilities of generative language models—and our understanding of their risks and trade-offs—mature.

2 RELATED WORK
Our research is informed by and builds upon work on dyslexia and its associated writing challenges, prior accessibility research with this population, and AI-assisted writing tools.

2.1 Writers with Dyslexia
Dyslexia is a multifaceted condition characterized by difficulties with word recognition, reading fluency, spelling, and/or writing [59]. According to van Schaik [63], the complete definition of dyslexia varies according to the lens studying it. Through a medical lens, dyslexia is defined as a cognitive deficiency that is associated with persistent difficulties with reading, spelling, short-term/working memory, and day-to-day organization (e.g., [13, 24]). Through a lens
of neurodiversity, however, dyslexia is defined by heightened spatial and perceptual abilities, interconnected and dynamic reasoning, and narrative and holistic thinking—alongside commonly defined deficits [4]. Critical scholars define dyslexia as a person's failure to meet the socially constructed expectations of timelines, literacy, and communication that are embedded in one's broader social and cultural context [17, 29]. While dyslexia impacts up to 20% of the population [1], structural disparities including gender, class, and race [42, 63] cause many cases to go undiagnosed—leading many to be unaware of the cause of their reading and writing difficulties. While many dyslexia-related challenges remain into adulthood [8], individuals with dyslexia may learn and adopt compensatory strategies for reading difficulties over time [39, 42, 44]. Although reading may still prove challenging, some adults with dyslexia report that writing tasks tend to provide their greatest difficulties [42, 55].

Writing challenges are wide-ranging, but some are commonly reported [13, 18, 42, 47, 61]: at a high level (overall plan and structure), these include organizing and expressing one's thoughts, structuring and ordering ideas, and overcoming a "fear of the blank page" [47]. At a lower level (sentence and word), challenges can include word retrieval, sentence composition, appropriate tone and concision, grammar, spelling, punctuation, and proofreading. As with reading, writers with dyslexia may adopt strategies to assist in their writing—such as preferred spell-checkers, text-to-speech and dictation software, and support from friends and family—but these can add complexity and time to their writing process [13, 46].

In this paper, we contribute findings regarding the needs and challenges experienced by adults with dyslexia in email-writing, and we explore how these might be addressed with AI interventions.

2.2 Dyslexia & Accessible Technology Design
Because text readability is impacted by the visual display of text, researchers have explored how to alleviate reading challenges through text presentation, such as typography choices [50], word segmentation [3], background colors [53], and increased font size and margin space [19, 54]. To support reading comprehension, one promising approach involves text simplification [49]. Rello et al. [51] found promising results among readers with dyslexia when displaying basic synonyms alongside complex words, although readers struggled when a simpler word was substituted automatically. In a study of web searchers with dyslexia [41], participants sought pages utilizing multimedia with minimal visual clutter, and avoided large text blocks (preferring headings and bullets instead). In the design of our email-writing prototype, we draw from elements of this body of work to address usability challenges—including guidelines for text presentation and visual clutter—and explore text simplification as a form of writing support.

Work in accessibility targeting writers with dyslexia has primarily focused on low-level interventions, such as specialized spellcheck tools (e.g., [37, 45, 52, 66]). While common spellcheck tools provide value to this population, specialized versions are motivated by the high occurrence of "real word" errors (e.g., "hear" and "here") among people with dyslexia that most common tools cannot recognize [45, 52]. Text suggestions can also provide value when writing to overcome word retrieval difficulties [37, 41, 49]. PoliSpell [37] was an early attempt to design a spellcheck and autocomplete tool for people with dyslexia, but it was never evaluated. Wu et al. examined the experience of writers with dyslexia on social media, finding challenges associated with not only the writing task, but concerns over social self-presentation [55]—and used their results to build a dyslexia-tuned Neural Machine Translation model for spelling and grammar support on Facebook posts [66].

So far, accessibility researchers have not explored the high-level challenges associated with dyslexia during text construction, nor have they begun exploring AI-powered solutions to these challenges—a gap our work aims to address.

2.3 Large Language Models
The most recent class of large language models, such as GPT-3 [7], demonstrates significant advances in natural language generation. At their core, these models have a simple API: given a string of text, known as a prompt [55], they return plausible continuations for that string. For example:

prompt: A healthy lunch includes
language model: fruits, vegetables, protein, and whole grains.

Prompts can also be written with exemplars of a desired response, such that the model ends up performing a specific task by continuing the text. In this way, the models are capable of few-shot learning [7], which is shown to be more accurate than the zero-shot example above [71]. This example prompts the language model for a piece of clothing and an accessory to plan for the weather:

prompt: When it's sunny, I need:
shorts and sunscreen
When it's raining, I need:
rain boots and an umbrella
When it's snowing, I need:
language model: mittens and a shovel

Researchers have begun exploring many end-user facing applications powered by LLMs, including chatbots and conversational agents [68], code generation [2, 14, 70], creative writing [16, 28, 69], and even accessibility applications such as keystroke-saving abbreviation expansions to accelerate eye-gaze typing by users with motor disabilities [10].

While LLMs exhibit impressive performance on many tasks and have many potential applications, these models also have drawbacks. Of particular note is that such models risk generating factually incorrect, offensive, or stereotyped text since they are trained on content from the internet [5, 65]. "Memorization" (e.g., regurgitating existing text rather than producing novel content) is also a risk of current LLMs [12]. The risks of erroneous or inappropriate output from LLMs carry additional ethical challenges when embedded in systems used by vulnerable audiences, such as users with dyslexia, who may experience challenges in interpreting the output's quality [25, 40]. Mitigating the risks associated with LLMs is an active area of research; for example, LaMDA (the LLM underpinning LaMPost) uses fine-tuning to improve model safety [60].

2.4 AI-Assisted Writing
Human-AI co-creation in the writing domain (for general audiences) has been widely studied, and applications such as Gmail's Smart Reply feature [35] have already been deployed to massive audiences.
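The prompt-continuation pattern described in Section 2.3 can be sketched in code. The following is a minimal sketch, not part of LaMPost or any real LLM API: the helper function and its name are ours, and the exemplars reuse the paper's own weather example. The function only assembles the prompt string; in a real system that string would be sent to a language model for completion.

```python
def build_few_shot_prompt(exemplars, query):
    """Concatenate (input, output) exemplars followed by a new input,
    leaving the final output for the language model to complete."""
    parts = []
    for condition, items in exemplars:
        parts.append(f"When it's {condition}, I need:\n{items}")
    # The final line has no answer: the model continues the pattern.
    parts.append(f"When it's {query}, I need:")
    return "\n".join(parts)

exemplars = [
    ("sunny", "shorts and sunscreen"),
    ("raining", "rain boots and an umbrella"),
]
prompt = build_few_shot_prompt(exemplars, "snowing")
print(prompt)
```

A zero-shot prompt would omit the exemplars entirely; the few-shot version makes the desired response format explicit, which the paper notes tends to be more accurate [71].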
Buschek et al. [9] explored the impact of multiple suggested text continuations when writing emails, finding benefits to ideation at the cost of writing speed. Gero et al. [26, 27] studied automatic synonym and metaphor generation and found both features enabled greater expression during the writer's process, but questions arose over autonomy and ownership of the produced text [27]. Further questions arise over how the algorithms powering these systems should be presented to the user, as users' perceptions toward an AI system and desire to use it can be impacted by this choice of presentation [33, 36]. We explore these questions further in this work.

Wordcraft [16, 69] explored LLMs incorporated into the writing process: users collaborate with the model to write a story through a variety of operations—including infilling, elaboration, and rewriting—as well as open-ended dialog. The system augments a traditional text editor with a set of integrated LLM-powered controls driven by novel prompting techniques that enabled users to build their own custom controls, such as "rewrite the text to be more melodramatic". In this paper, we adapt Wordcraft's approach to provide LLM-powered controls for writers with dyslexia, building an email editor with features for automatic outlining, subject generation, rewriting, and suggestions. We also focus specifically on how AI-assisted writing can support the needs of adults with dyslexia (rather than general audiences).

3 LAMPOST: AN LLM-POWERED PROTOTYPE FOR EMAIL WRITING SUPPORT
Prior to developing the LaMPost system, our interdisciplinary research team engaged in over a year of participatory research with the dyslexia community. This included participatory design sessions and workshops with partner organizations with expertise in reading disabilities¹, and culminated in a brainstorming workshop on AI-assisted writing support with experts in accessibility and dyslexia (including team members with lived experience of dyslexia and other visual processing and reading disabilities).

As a key step in our formative work, we conducted a 90-minute formative study to motivate the design of an AI-powered writing system. We recruited seven adults with dyslexia to join the two-part study for (1) individual interviews on writing practices and challenges, and (2) a group assessment of possible ideas for AI writing support. Individual interviews highlighted several challenges, such as: planning how to order ideas, expressing ideas in clear and concise wording, writing with appropriate tone, and finding proofreading help. Further, the group interviews highlighted an overall interest in AI writing support: revising feedback could help with clarity, verbosity, and tone; summarization could validate an intended meaning; and visual organization could help to order and structure ideas. However, concerns arose over users' capability to address AI feedback, maintaining autonomy and control over their work, and their privacy.

Informed by our formative inquiries and prior work [42, 46, 47], we built LaMPost, a web application and LLM-powered prototype for email writing support for writers with dyslexia (Figure 1). We chose email writing as a constrained—yet highly practical—use case to demonstrate and compare different approaches for building a text editor infused with generative writing support.

In the following sections, we explain how LaMPost works and describe its design (including accessibility considerations to support users with dyslexia) and functionality (including the key motivations from our formative work for LaMPost's LLM-based features).

3.1 Email-Writing Support Through Few-Shot Learning
LaMPost is powered by LaMDA, a neural language model [60], and adapts the few-shot prompting methods introduced by the Wordcraft system [16, 69] (a LaMDA-powered tool for creative story-writing by general audiences). When the user selects one of LaMPost's operations, the system constructs a custom few-shot learning prompt and sends it to the language model.

Prompt performance is highly sensitive to word choice, formatting, and the content of the exemplars [71]. Writing effective prompts requires rigorous testing and iteration to achieve reliable and accurate responses from the model. When building LaMPost, we experimented with different prompting methods for several possible LLM-based features before settling on the three in our final system. We describe our iterative development process for LaMPost and reflect on lessons learned in Section 5.3.

3.2 LaMPost's Design and Functionality
LaMPost's interface consists of a main panel that resembles a standard email editor, including sections for the email's recipients, subject, and body, and "undo", "redo", and "clear" buttons near the bottom. A secondary panel on the right is reserved for three LLM-powered features: identifying main ideas (with the option to generate an email subject), rewriting a selection, and suggesting how to rewrite a selection. These three features were inspired by findings from our formative work with organizations and participants with dyslexia, as we describe in more detail below.

3.2.1 LLM Feature 1: Identify Main Ideas. Users can generate a visual outline of their email with the main idea from each paragraph. Additionally, they can choose to generate a new subject line from this outline (Figure 2). This feature was motivated by feedback from our formative study with dyslexic adults, in which participants noted that to overcome difficulties with organizing ideas and making them understandable to readers, visual organization and automatic summarization were desired technological supports. The visual outline can make it easier to parse sections of a long text, while simplified content can make it easier to understand that text [49]. The option to generate a subject line allows users to ask the AI to simplify the content for them. By displaying the AI's interpretation of the email's salient points, we imagined this feature could also show users how the email's main ideas might be extracted by another reader.

3.2.2 LLM Feature 2: Suggest Possible Changes. Users can select a word, phrase, or paragraph and ask the AI for suggestions on how to rewrite it (Figure 3). Participants in our formative study described feeling unsure about the kinds of adjustments needed for their writing, and were interested in automatic suggestions for high-level language characteristics like tone and clarity. Results from

¹ Partner organizations included the British Dyslexia Association, Understood.org, Madras Dyslexia Association, and the Landmark School in Prides Crossing, MA.
Figure 2: The Identify Main Ideas feature. Users can click ‘identify main ideas’ (1) and ‘include a new subject’ (2) to generate an
outline of their email based on the main ideas of each paragraph (4) with a subject line generated on top (3). Hovering over
each item of the outline highlights that respective paragraph in the email body.
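Section 3.1 notes that each LaMPost operation assembles its own few-shot prompt. As a rough illustration of how an outline prompt for this feature might be assembled, here is a sketch in which the exemplar format, field labels, helper name, and example text are all our assumptions (the paper does not publish its actual prompt):

```python
def build_outline_prompt(exemplars, email_paragraphs):
    """Each exemplar pairs an email paragraph with a one-line main idea;
    the user's paragraphs are appended for the model to summarize."""
    parts = []
    for paragraph, main_idea in exemplars:
        parts.append(f"Paragraph: {paragraph}\nMain idea: {main_idea}")
    for paragraph in email_paragraphs:
        # The trailing "Main idea:" is left open for the model to complete.
        parts.append(f"Paragraph: {paragraph}\nMain idea:")
    return "\n\n".join(parts)

exemplars = [
    ("I wanted to check in about the report deadline. "
     "Could we push it to Friday?",
     "Request to move the deadline to Friday"),
]
prompt = build_outline_prompt(
    exemplars, ["Thanks for hosting the workshop last week."])
```

One main idea per paragraph, collected together, would then form the visual outline shown in Figure 2; the subject line could be generated from that outline in a similar follow-up prompt.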
Figure 3: The Suggest Possible Changes feature. A user can select a passage of text (1) and click 'suggest how to rewrite' (2). Several suggestions from the AI for changing the passage will populate in the right-hand panel (3). Users can choose to exit the operation and rewrite the text themselves (4), or have individual suggestions 'read aloud', discard them, or use one as a prompt for the 'Rewrite My Selection' feature (5).
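The prompt for this feature, as described in Section 3.2.2, pairs exemplars of (full email, selected passage, ideal suggestion) with the user's own email and selection appended at the end. The sketch below is ours: the field labels, helper name, and example text are assumptions, not LaMPost's actual prompt wording.

```python
def build_suggestion_prompt(exemplars, email, selection):
    """Assemble a few-shot prompt: each exemplar shows a selection in
    context plus an ideal suggestion; the user's case comes last."""
    parts = []
    for ex_email, ex_selection, ex_suggestion in exemplars:
        parts.append(f"Email: {ex_email}\nSelection: {ex_selection}\n"
                     f"Suggestion: {ex_suggestion}")
    # The final "Suggestion:" is left open for the model to complete.
    parts.append(f"Email: {email}\nSelection: {selection}\nSuggestion:")
    return "\n\n".join(parts)

exemplars = [(
    "Hi all, I will not be able to attend the meeting tomorrow.",
    "I will not be able to attend",
    "Rewrite this sentence to be more apologetic",
)]
prompt = build_suggestion_prompt(
    exemplars,
    "Dear team, we must talk about the budget as soon as humanly possible.",
    "as soon as humanly possible",
)
```

In LaMPost, the model's continuations of this string become the suggestion list in the right-hand panel; sampling several continuations yields several suggestions.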
the LLM appear as several suggestions for changing the selected passage; for example, "Rewrite this sentence to be less business-like". Users can take these suggestions into consideration to guide their own revisions, or use a preferred suggestion as a prompt to generate rewritten passages in a follow-up operation (described in the following section).

To implement this feature, the few-shot prompt included several examples containing: a passage from an email (i.e., the user's selection), the full email to provide context, and a suggestion for improving the passage (the ideal response from the model). When users press the 'suggest how to rewrite' button, their current selection and full email are appended to the end of the few-shot prompt and sent to the model; the model responds with several suggested changes to the passage for users to consider. We adapted the meta-prompting method from [69] to allow users to optionally use a suggestion as a precursor for the stand-alone rewriting feature described below.

3.2.3 LLM Feature 3: Rewrite My Selection. Users can select a word, phrase, or paragraph and provide an instruction to the AI to rewrite the text in an arbitrary way (Figure 4); for example, 'rewrite this: to be shorter'. Participants in our formative study shared common difficulties with appropriate tone and style, while prior work shows that people with dyslexia will often rely on a thesaurus [47] or
Figure 4: The Rewrite My Selection feature. A user can select a piece of text (1), provide a custom instruction for changing it (2), and click 'rewrite my selection' (3). Several rewritten choices from the AI will populate in the right-hand panel (4). Highlighting a choice will show a preview in the editor (5). For each choice, users can have it 'read aloud', discard it, or apply it to replace the original passage (6).
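Section 3.2.3 describes this feature's "related example" prompting: rather than covering every possible instruction, the prompt holds exemplars for a handful of instruction types (conciseness, tone, audience, precision), each with the selected text in context, and then the user's passage and custom instruction. This sketch is ours; the labels and exemplar text are assumptions, not LaMPost's actual prompt.

```python
def build_rewrite_prompt(exemplars, before, selection, after, instruction):
    """Assemble a related-example prompt: a few rewrite exemplars,
    then the user's selection (with surrounding context) and instruction."""
    parts = []
    for ex in exemplars:
        parts.append(
            f"Text: {ex['text']}\n"
            f"Instruction: rewrite this {ex['instruction']}\n"
            f"Rewritten: {ex['rewritten']}")
    # The final "Rewritten:" is left open for the model to complete.
    parts.append(
        f"Text: {before} {selection} {after}\n"
        f"Instruction: rewrite this {instruction}\nRewritten:")
    return "\n\n".join(parts)

exemplars = [
    {"text": "I am writing to inquire as to whether we might meet.",
     "instruction": "to be simpler",
     "rewritten": "Could we meet?"},
    {"text": "Send me the file now.",
     "instruction": "to be more polite",
     "rewritten": "Could you please send me the file when you have a moment?"},
]
prompt = build_rewrite_prompt(
    exemplars,
    "Hi Sam,", "I need the slides back immediately.", "Thanks!",
    "to be more polite",
)
```

Because the exemplars are merely related to (not identical to) the user's instruction, the model can generalize to unseen, user-written instructions [69]; sampling several continuations produces the multiple rewritten choices shown in the panel.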
templates [13, 46] to achieve desired phrasing. Through custom instructions, users with dyslexia can specify their intentions for a passage and call upon the AI to select wording that meets that intention. Rewritten passages from the LLM were returned as several choices in an effort to maintain users' autonomy over the final passage.

To implement this feature, we adapted the related example prompting method from [69]: rather than anticipating every possible user instruction and including these in the few-shot learning prompt, our prompt only contained a collection of examples related to instructions that we anticipated would be relevant to users with dyslexia for emails. Related examples are generally able to steer the model to completing an unseen, user-generated task [69]. Our examples included instructions for conciseness ('to be simpler'), tone ('to be more polite'), audience ('to be more formal'), and precision ('to be more clear'). Each example also contained the user's selected text paired with some text before and after the passage to provide context, and an ideal way to rewrite that passage according to the given instruction. When users press the 'rewrite my selection' button, their current selection and surrounding text are appended to the end of the few-shot prompt and sent to the model; the model responds with several rewritten passages, shown to the user as choices.

3.3 Accessibility Considerations
Building a text-editing tool for users who find it difficult to parse and manipulate text presents an inherent design challenge. Although our primary goal for the LaMPost system was to demonstrate the functionality of LLMs for writing to users with dyslexia, we recognized that usability issues may impact their ability to fully evaluate the system. To mitigate this, we made several design choices to maximize usability for this population, and tested iterations of our design among members of our team who identify as having dyslexia. We used a sans serif font throughout the system because these have been shown to be the most readable and preferred by users with dyslexia [50]. To further improve readability, we incorporated sizing recommendations from prior work [54] suggesting a large font (18 points or more) with line spacing near the default value (1.0 units, or 120% of the font size); based on feedback from our team, we chose an 18pt font size with 140% line spacing for the main editor panel. To support visual referencing, we paired most buttons with icons and added highlighting to the sentence surrounding the insertion point cursor (visible in Figure 4-1). Finally, for users who felt more comfortable listening to on-screen text than parsing it visually, we used the Web Speech API² to include a "read aloud" feature for the email's body, the generated outline, and each choice returned by the LLM.

4 LAMPOST EVALUATION
We evaluated the LaMPost prototype in a hands-on demonstration and practical email writing exercise. Our primary goals were to explore the potential ways that LLMs can be incorporated into the email-writing process of writers with dyslexia, and to assess users' perceptions of each of LaMPost's writing support features. In addition, we had secondary goals of understanding users' feelings of satisfaction, self-expression, self-efficacy, autonomy, and control while writing with LLMs, and of assessing how exposure to AI terminology may impact these feelings.

4.1 Method
4.1.1 Participants. We recruited 32 participants via a survey shared with a large sampling pool maintained by our institution; 19 completed the study. All were based in the U.S. (N=16) or Canada (3); all said English was their preferred language to write in. The recruiting survey asked about their dyslexia diagnosis, their emailing habits (Figure 5), and a series of demographic questions. We

² https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API
Figure 5: Email-writing habits for the evaluation’s 19 participants, showing frequency and duration when writing (a) new
emails and (b) replies. Most participants reported writing new emails and replies multiple times per day and spending 20
minutes or less on each one.
screened for experience writing emails (at least one per year) and self-reported challenges associated with dyslexia, but we did not require a dyslexia diagnosis to accommodate individuals without access to formal screening procedures. Fourteen participants reported having a formal dyslexia diagnosis and four reported discovering their dyslexia on their own; one participant did not specify. We aimed for balanced representation across gender and age categories, but we attained neither due to cancellations. Four participants identified as female, 14 as male, and one as non-binary. One participant was 18-24 years old, seven were 25-34 years old, and 11 were 35-54 years old. Participants were compensated with a $100 gift card for their time.

4.1.2 Procedure Overview. The evaluation procedure was split into three parts during a 75-minute period and conducted remotely due to the ongoing COVID-19 pandemic. First, we asked a few background questions to understand each participant’s email-writing workflow, then provided a hands-on demonstration of the LaMPost system.3 Second, we conducted an informal writing exercise in which participants freely used the LaMPost system to write at least one realistic email. Third, we asked semi-structured follow-up questions about the experience, then asked them to fill out rating scales evaluating the system’s usefulness, consistency, and their own feelings while using the system—including satisfaction, self-expression, self-efficacy, autonomy, and control.

Maintaining feelings of autonomy during AI-assisted writing was a key design goal for LaMPost, following user concerns expressed during our formative study and in prior work [27]. However, as illustrated by emerging work showing the effect that metaphor choice can have on user perceptions [33, 36], our choices for LaMPost’s presentation may influence the expectations, evaluations, and attitudes among users. Overt AI presence might be viewed as reducing autonomy, while obscured AI could resemble traditional computer-aided writing (e.g., spelling/grammar check). To better understand how framing and the presence of AI metaphors can impact perceptions of an LLM-powered writing tool among adults with dyslexia, we segmented the system evaluation into two between-subjects conditions:

(1) With AI metaphors (N=9): We introduced LaMPost as an “AI-powered” email editor and used language throughout the session to present LaMPost’s LLM as a personified AI agent providing writing assistance. For example, “You can get a few suggestions from the AI for how it thinks you should write this differently”, “Hang on, the AI is thinking...”, and “This is feedback from the AI on how to improve your selection.”

(2) Without AI metaphors (10): We introduced LaMPost as an “enhanced” email editor and used language throughout the session to obscure the presence of an AI/LLM. The examples above in this condition: “You can get a few suggestions for how to write this differently”, “Hang on while the system loads...”, and “This is feedback on how to improve your selection.”

The interface did not reference the system’s underlying LLM mechanism, and we left it unchanged for both conditions.

3 Two participants (P12, P18) were unable to access the system due to an unknown technical issue. These participants dictated the content of the email and directions for using each feature to the researcher via a shared screen.
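To make the manipulation concrete, the parallel phrasings quoted in the two conditions can be organized as a simple condition-keyed lookup. This is an illustrative sketch only; the paper does not describe its session materials as code, and the structure and function name below are hypothetical.

```python
# Hypothetical sketch of the AI-framing manipulation: the same three
# moments in the session script, phrased with or without AI metaphors.
# The strings are taken from the examples quoted in the conditions above.
FRAMING_COPY = {
    "with_ai_metaphors": {
        "suggest": "You can get a few suggestions from the AI for how it thinks you should write this differently.",
        "loading": "Hang on, the AI is thinking...",
        "feedback": "This is feedback from the AI on how to improve your selection.",
    },
    "without_ai_metaphors": {
        "suggest": "You can get a few suggestions for how to write this differently.",
        "loading": "Hang on while the system loads...",
        "feedback": "This is feedback on how to improve your selection.",
    },
}

def script_line(condition: str, moment: str) -> str:
    """Look up the phrasing for a given condition and script moment."""
    return FRAMING_COPY[condition][moment]
```

Only the wording varies between conditions; as the text above notes, the editor interface itself was left unchanged.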
ASSETS ’22, October 23–26, 2022, Athens, Greece Goodman, et al.

Before the session, we asked participants to prepare two ideas for emails that they intended to write and felt comfortable sharing with us. To ensure email ideas were of sufficient length to conduct a substantive test of the system, we included an example: “family newsletter on key events from 2021”. We provided assurance that the writing exercises were only meant to provide a realistic experience of using the system, and that their performance would not be evaluated during the session. With participants’ permission, all evaluation sessions were recorded for later viewing and analysis.

4.1.3 Part 1: Background and Demo (25 min). The first part of the study was used to learn more about each participant’s current approach to email writing through a small set of interview questions and rating scales, and to demonstrate the LaMPost system’s functionality. To begin the interview, we asked participants why they write emails, to share successes and challenges experienced when emailing, to recall a past instance of confusion among an email recipient, and to walk us through their email-writing process. Next, we asked participants to rate their confidence in their emailing ability (relates to self-efficacy [57]), ability to express themselves and their ideas (self-expression; e.g., [38]), and their overall satisfaction with their emails. The three ratings were predicated by a positive statement (e.g., “I am confident in the emails that I write.”) and followed by a 7-point scale from “Completely disagree” to “Completely agree”. For privacy, we shared the scales with participants via a linked form and offered to read each statement aloud if they desired: two participants opted for rating statements read aloud throughout the study; one read the statement aloud themselves; 16 read silently.

After participants had completed the rating scales, the researcher introduced the functionality of the system according to their assigned AI-metaphor condition in a hands-on demonstration. Participants opened the LaMPost system in a new tab and shared their screen with the researcher, who began by walking them through each element of the main editor panel: input for the email’s recipient and subject, space for the body text, and buttons for “undo”, “redo”, “read aloud”, and “clear”. To make sure participants were following along, we asked participants to move their mouse to each element before it was introduced. To aid in demonstrating the LLM features, we included an additional “insert sample text” button that added a two-paragraph sample email into the body of the editor.

Next, we introduced each of the three LLM features (Identify Main Ideas, Rewrite My Selection, Suggest Possible Changes): we explained the feature’s intended function, asked participants to try using it on the sample email, and explained any follow-up functionality associated with that feature. For the Main Ideas feature, this functionality included hovering over an idea in the structure to highlight its associated paragraph, clicking on an idea to hear it read aloud, and using the checkbox to generate a new subject line. For the Rewrite feature, this included selecting different options to see previews in the editor, hearing an option read aloud, deleting undesired options, and choosing an option to replace the selected passage. We gave a similar explanation for the options generated by the Suggest feature, with an additional note that a chosen suggestion could be sent as an instruction for the Rewrite feature if desired. After introducing each feature, we asked participants to share their immediate thoughts about it, any concerns with it, and if they thought they might use it when writing emails.

4.1.4 Part 2: Writing Exercises (25 min). After participants had been introduced to LaMPost, they used the system to write emails based on the ideas that we had requested for the session. To minimize stress associated with time constraints among writers with dyslexia [11, 30, 46] and challenges associated with learning an unfamiliar system, we did not require participants to complete their email during the 25 minutes allotted. Instead, we told participants to write as much as they were able in the time provided and to freely use the LLM features as desired; if they finished with one email, they could try writing another.

We asked participants to “think aloud” throughout the writing exercise, and to freely voice any questions, observations, suggestions, or concerns about the system. If their actions were unclear for any reason, the researcher prompted them for an explanation. The researcher also provided limited answers to questions about the system’s functionality with the appropriate language for each participant’s AI-framing condition. To ensure that the participants had experience with each LLM feature (Main Ideas, Rewrite, Suggest), the researcher allowed them to write and use the system for at least five minutes before prompting them to try an unused feature, repeating until all features had been used at least once. We logged participants’ complete use of the system, including: all typed additions, changes, and deletions; all buttons and LLM features used; all responses from the LLM; and all accepted LLM responses added to the document.

4.1.5 Part 3: Follow-up Interview and Rating Scales (25 min). Following the email writing exercises, we discussed the experience via semi-structured interview questions and rating scales. We began with questions to learn about participants’ overall experience, asking them to compare their use of the system to their typical experience writing emails, and to share anything they found easier or more difficult than usual. Next, we asked for their opinions about the LaMPost system, including what they liked most about it, what needed improvement, and if they had any ideas for additional features that could assist them with writing. We used post-use rating scales to assess their overall impressions of the system and measure the impact of the AI-framing manipulation. The ratings targeted several concepts, including: usefulness and consistency of each LLM feature and the system overall; satisfaction with the system and the emails produced with it; and personal feelings of self-efficacy [57], self-expression (e.g., [38]), autonomy, and control (e.g., [22, 62]) while using the system. Each rating was predicated by a positive statement and followed by a 7-point scale from “Completely disagree” to “Completely agree”. After they had privately filled in the scales related to each concept, we asked a follow-up question related to the concept to capture the reason for each rating.

4.1.6 Analysis. We analyzed our qualitative data following the thematic coding process of Braun and Clarke [6] using a combined inductive and deductive approach. Prior to the study, we produced a set of deductive codes to categorize: existing email-writing practices; positives, negatives, and desired changes for each feature and the overall system; and feelings of self-efficacy, self-expression,

autonomy, and control during and after system use. During data collection, three researchers produced session notes and observations and generated a set of inductive codes through analytic connection across participants. We used both sets to produce a final codebook containing a 3-level hierarchy: level-1 included high-level codes for writing practices, the overall system, and each feature; level-2 included positives, negatives, and concerns for each; and level-3 included low-level codes based on our expectations following the formative study (deductive) and unexpected themes emerging from system use (inductive). One researcher used the final codebook to independently code transcripts for each of the 19 sessions, and resulting themes were organized into subsections and constructed to form our narrative. Our final codebook is provided as Supplementary Material. For quantitative data, we used pre-use ratings to characterize participants’ existing feelings about writing emails. To compare how our between-subjects manipulation of AI framing shaped participants’ perception of the system, we used a Mann-Whitney U test (two-tailed) to test for significance in post-use ratings for usefulness, consistency, satisfaction, self-expression, self-efficacy, autonomy, and control.

4.2 Findings
The core goal of our study was to understand the potential ways that LLMs can be incorporated into the email writing process of writers with dyslexia. In the following sections, we describe their current experience writing emails, reactions to our system and each feature, and the (lack of an) effect with the AI-metaphors condition.

4.2.1 Current Email-Writing Experience. We briefly explored participants’ thoughts on writing emails to understand opportunities and challenges specific to this genre of writing. Figure 5 shows participants’ email-writing habits; most participants said they wrote both new emails and replies daily (P5: “In my busy season, up to 100 a day”), though a few wrote less often (P19: “once or twice a month”). Participants said they primarily wrote emails for work communication; some also wrote personal emails to connect with family and friends or conduct service inquiries (e.g., P8: “doctor’s appointments”). In rating their level of agreement towards statements about writing emails (Figure 6), participants generally felt a strong sense of self-efficacy when emailing: the majority of participants (N=15) felt confident in their ability to write emails (avg.=5.05, SD=1.22). Participants also generally felt satisfied with their emails (avg.=4.89, SD=1.10) and that they could express themselves when emailing (avg.=4.79, SD=1.36), although the overall agreement towards each of these statements was more mixed.

When discussing their experiences writing emails, participants described similar challenges and mitigation strategies as we had heard about in our formative work and focus group, as well as those discussed in prior work (e.g., [13, 42, 47]). Over half of participants (N=10) said they liked to draft emails outside of an emailing platform (e.g., Microsoft Word, pen and paper) due to “habit” (P7), “personal preference” (P6), or “to separate it from the anxiety of having to respond” (P2). Within their preferred platform, participants described a common frustration of trying to convert information from their mind into writing—or, “Putting what I’m trying to say in my brain into words” (P13). Verbal communication was said to be easier than writing, and six participants relied on speech-to-text tools to dictate their thoughts; others had abandoned it due to errors: “Softwares that I’ve used have had a slightly higher [error] percentage than my own [typed] inaccuracies” (P5). For typing, nine participants mentioned predictive text, such as Gmail’s Smart Compose [35], as being particularly helpful to find their desired wording: “I like the feature when it finishes what you are thinking; when you don’t have to type it all out” (P7). Common typed approaches included the “word faucet” strategy (N=5) identified during our formative study—“I get all of my thoughts out so [...] I’ve got this blob of text” (P10)—or bullet points (3) to map out high-level details. The remaining participants described unique initial drafting processes; for example, P6 described a linear, spontaneous method: “I wing it. [...] I start the first sentence, then it’s like, ‘Okay, now I know what the second sentence will be.’” In contrast, P8 preferred to write in a non-linear fashion: “I start with just drafting the middle paragraph because I know that’s the one that has the most information.”

After participants had moved their ideas to writing, additional challenges emerged during revising and proofreading. Spelling and grammar were the most common issues, and ten participants said they relied on a trusted spell checker, such as Grammarly, over their email platform to catch “real word” errors (P10: “‘there’ and ‘their’”) and other mistakes. Cutting overly verbose, or “wordy” (P16), drafts to a succinct email was another frequent struggle (N=7). Participants recalled instances when they had used language that was misinterpreted by the email’s recipient, most often due to a lack of clarity (N=7). For example, P13 recalled a recipient “calling me on the phone that evening [because] he had no idea of what I was asking.” Tone was another common source of misunderstanding (N=6): “They thought that I was just writing to them all mad and pissed off, when in reality, I was just explaining myself” (P8). To check the email before sending, some participants (4) said they asked someone else to read the email, while others (3) said they listened to the message via text-to-speech. Several participants (7) mentioned they struggled to find enough time to adequately revise and proofread, especially when responding to urgent emails.

4.2.2 Reactions to the LaMPost Prototype. In the following section, we describe participants’ responses to LaMPost’s three LLM-powered features and the overall system based on usefulness ratings and feedback provided throughout the evaluation.

We asked participants to rate their level of agreement towards the usefulness of LaMPost’s three LLM-powered features and the overall system. To begin, we used a two-tailed Mann-Whitney U test to compare usefulness ratings between the study’s two AI-framing conditions—i.e., with (N=9) and without (10) AI-related metaphors—but we did not find a significant difference for any of the ratings (p>0.05). Figure 7 shows results as a whole as well as results for each framing condition; we return to implications of the AI-framing experiment in the Discussion (Section 5.4).

Of LaMPost’s three features, the Rewrite My Selection feature was rated highest for usefulness on average (avg.=5.26, SD=1.26); 13 participants agreed that it was useful for writing emails4 and nine selected it as LaMPost’s most useful feature. Participants said the primary benefit of the Rewrite feature was its capability to find satisfying and appropriate wording for an intended idea: “You’re able to get a start on what you’re going to say and you can tweak your

4 Ratings ≥ 4 on 7-point agreement scale.

Figure 6: Results of rating scales for feelings of self-efficacy, self-expression, and satisfaction during participants’ existing email-writing process. Generally, most participants felt confident about email-writing, but had slightly more mixed feelings about self-expression and satisfaction.
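The rating summaries in these figures reduce each 7-point scale to an average and a sample standard deviation (e.g., avg.=5.05, SD=1.22 for confidence), with ratings of 4 or above counted as agreement. A minimal sketch of that aggregation, using made-up ratings rather than the study’s data:

```python
# Sketch of the 7-point Likert aggregation behind summaries such as
# "avg.=5.05, SD=1.22". The ratings below are illustrative stand-ins.
from statistics import mean, stdev

def summarize(ratings):
    """Average and sample standard deviation of 7-point agreement ratings."""
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("ratings must be on the 1-7 scale")
    return round(mean(ratings), 2), round(stdev(ratings), 2)

def agreement_count(ratings):
    """Participants counted as agreeing (ratings >= 4 on the 7-point scale)."""
    return sum(r >= 4 for r in ratings)

confidence = [5, 6, 4, 7, 5, 6, 3, 5, 6, 4, 5, 6, 5, 4, 6, 5, 7, 5, 3]  # hypothetical
avg, sd = summarize(confidence)
n_agree = agreement_count(confidence)
```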

Figure 7: Post-use rating scales among all participants, and across each of the AI framing conditions. We measured the usefulness
(left) and consistency (center) of the system and each feature, and several of their feelings about using the system (right). In
general, most found the LaMPost system useful while writing emails, appreciating the “Rewrite my selection” feature most of
all.

writing from there” (P18). P6 liked the feature because the synonyms and alternative wording helped them to understand the meaning of their own writing—a similar benefit has been shown in prior work [49, 51]. While testing the Rewrite feature, participants saw further value for mitigating problems with language precision (N=7) and tone (4). For example, P1 said the feature would help to make his ideas more compact—“I end up typing like three or four sentences to explain something simply”—while P13 liked that he could “sculpt the email to the audience” via instructions to rewrite the text to be more “business-like” or “laid-back”. However, several participants took issue with the feature’s implementation and functionality: the most common concerns were inaccuracy and noise from the model

(N=13) and its overly numerous choices (11); four participants selected Rewrite as LaMPost’s least useful feature as a result. We outline these and other concerns in greater detail in the following sections.

The Identify Main Ideas feature was rated the second highest on average in terms of its usefulness (avg.=5.11, SD=1.82); 13 participants agreed that it was useful to some extent, but only four participants selected it as the most useful of LaMPost’s three core features. These participants identified the primary value of the feature’s visual outline as validating that their writing contained the intended meaning for the email. For example, P4 said the feature was the system’s “biggest selling point” because it would allow “the ability to see, and make an independent verification, that I’m hitting the points that I want to hit.” While the feature’s automatic summarization was sufficient to capture the “gist” (P4, P16) of each paragraph, four participants desired key details added to the outline: “It definitely helped me recognize what I was talking about, other than the fact that it missed time-sensitive information” (P2). Some participants (N=8) also voiced concerns over the summarizations feeling “sterile” (P17), or missing the emotion contained in the text. For example, P9 felt dissatisfied when a three-sentence paragraph—written to thank his project collaborators and share a draft of their production—was reduced to an outline item that stated, “We have a video.” Nine participants selected the Main Ideas feature as the least useful overall, including four who said that they could not see a practical use for it in their writing process: “It seems very obvious” (P6).

Notably, the option to generate a subject line received a very positive response. Because LaMPost created the subject line from the Main Ideas outline returned by the LLM, we chose to pair the subject as an optional checkbox attached to the feature and did not ask participants to rate its usefulness separately. However, several participants—who otherwise felt tepid about the visual Main Ideas outline—saw value in a separate automatic subject feature: “I always leave the subject line blank. [...] Nothing fits what I would be thinking” (P19). Most participants thought LaMPost provided accurate subject lines for their emails, but a few (N=3) found issues with the framing of the subject. For example, P5 wondered why LaMPost added the subject line “Invitation to [Name]” for his email informing the individual that they would not be receiving an invitation: “It makes sense, but given the context, that’s not quite right. The idea I’m trying to get across is quite the opposite: something about bad news, or ‘Information about Upcoming Event’.”

Reactions to the Suggest Possible Changes feature were mixed in terms of its usefulness (avg.=5.05, SD=2.03); six participants said it was the most useful feature, and six said it was the least useful feature. Twelve participants at least partially agreed that the feature was useful, and they identified the feature’s primary benefit as support for fixing a detected, but unclear issue—i.e., “Times where I’m like, ‘Something doesn’t sound right’” (P15). For example, in an email to his wife, P19 used the Suggest feature on a sentence that “didn’t feel natural”, yielding a positive result: “Here we go, ‘Rewrite this phrase to be more romantic’. That’s kind of what I was getting at.” A few participants further identified the Suggest feature’s value in tandem with the Rewrite feature to provide possible boundaries for the latter’s custom instructions: “It helped narrow down the options that [Rewrite] could do” (P12). However, not all participants agreed that the Suggest feature was helpful. Two participants were unsatisfied with the feature’s limited scope, saying it, “Raises more questions than answers” (P16), and desired further explanations of the suggested changes: “It’s saying, ‘Rewrite this sentence to be less wordy’. Is it telling me that the sentence is wordy? And why is it at the top of the list? [...] And is it really wordy? There’s two sentences in the paragraph.” (P10). Other frequent issues identified for the Suggest feature were similar to those for the Rewrite feature: several participants (N=9) were concerned about inaccuracy and noise in the results, while others (5) mentioned the “overwhelming” (P19) quantity of choices. We describe these issues further in the following sections.

Finally, while responses to individual features were varied, most participants identified at least one feature that was useful to them. As a result, the usefulness of the LaMPost system overall was rated fairly high following the email-writing exercise (avg.=5.53, SD=1.38). P13 summed up the system’s utility as, “It allowed me to validate for myself that the point I’m trying to make is actually getting across, and it gives me the opportunity to rewrite it if it’s not.” In general, participants were fond of LaMPost’s capabilities for automatic, content-specific support (P19: “like it’s an extra person helping you”) and being able to direct that support to the scale of their choosing: “I could do sentence-by-sentence, or paragraph-by-paragraph” (P10). However, participants were split on how the system’s current capabilities would help during day-to-day emailing. For example, P14 explained the contexts where he saw it having value: “Where I’m having an informal dialogue with a couple teammates—I don’t need to spend this amount of time rewriting three sentences. [...] [But] I can see the benefit of using something like this when writing emails to a VIP, or if I’m trying to convey a complex topic.” In total, five participants mentioned LaMPost as potentially increasing the time required to write emails, but eight participants saw an overall time-saving benefit: “I don’t have to rewrite it, and rewrite it, and rewrite it. [...] I get five minutes back of my life” (P4). Notably, one participant disagreed that the overall system was useful for his needs: “It’s more about potentiality versus reality. If you mean in its current state, I would say not very useful. If you mean if I see how it could be extremely useful, then it’s on the complete opposite end of the spectrum” (P5). In the following sections, we outline participants’ concerns with the system’s current state in greater detail—highlighting their concerns over accuracy and choices in particular—and discuss further improvements they requested to better assist with their email-writing process.

4.2.3 Concerns Over Accuracy and Noise. One of the most common issues highlighted by participants throughout the evaluation was unhelpful, inaccurate, or “nonsensical” (P8, P11) results they received from the model. Anticipating the potential for instability in our few-shot learning prompts when given unseen tasks [71], we included an additional consistency rating for each feature and the overall system.5 A two-tailed Mann-Whitney U test to compare each consistency rating between the evaluation’s AI-framing conditions did not yield a significant difference (p>0.05) for any rating; Figure 7 shows results for each consistency rating as a whole, as well as results for each framing condition.

5 A 7-point rating scale measuring level of agreement toward “The [feature / system] worked how I expected it would.”
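The condition comparisons reported here and in the usefulness analysis are two-tailed Mann-Whitney U tests on 7-point ratings. As a sketch of the computation, here is a pure-Python version using the tie-corrected normal approximation; the paper does not specify its implementation (in practice, scipy.stats.mannwhitneyu is the usual choice), and the ratings shown are illustrative, not the study’s data.

```python
import math

def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test via the tie-corrected normal
    approximation. Returns (U, p)."""
    n1, n2, n = len(x), len(y), len(x) + len(y)
    pooled = sorted(x + y)
    # Average ranks for tied values, plus tie-group sizes for the
    # variance correction.
    rank_of, ties, i = {}, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        ties += (j - i) ** 3 - (j - i)
        i = j
    r1 = sum(rank_of[v] for v in x)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    if sigma == 0:  # degenerate case: every observation tied
        return u, 1.0
    z = (u - n1 * n2 / 2 + 0.5) / sigma  # continuity correction
    p = 1 + math.erf(z / math.sqrt(2))  # = 2 * Phi(z) for the lower tail
    return u, min(p, 1.0)

# Hypothetical 7-point ratings: n=9 "with AI metaphors", n=10 "without".
with_ai = [5, 6, 4, 7, 5, 6, 3, 5, 6]
without_ai = [6, 5, 5, 4, 7, 6, 5, 4, 6, 5]
u_stat, p_value = mann_whitney_u(with_ai, without_ai)
```

A p-value above 0.05 is read as no significant difference between the framing conditions, matching how the results are reported throughout this section.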

Poor results were most apparent from the Rewrite and Suggest features, where our few-shot prompts tasked the LLM with providing new suggestions for rewriting or changing the text; generative tasks that are more unpredictable than the relatively constrained Main Ideas summarization. This unpredictability sometimes led to striking “hallucinations”—factually incorrect or non-existent content generated by the LLM [20, 48]. For the Suggest feature, the hallucinations amounted to suggested changes that were irrelevant for the selected passage, such as “Rewrite this sentence to not use so many commas” given to P11: “There’s not a comma in it.” Some participants were able to tolerate a few irrelevant suggestions (P5: “It sparks other ideas for how to rewrite things”), but others could not: “If this is consistent, I’m going to think, ‘Why am I bothering to use [this feature] in the first place?’” (P10)

In contrast, hallucinations within the Rewrite feature amounted to seemingly relevant yet imaginary details added to the rewritten passage. For example, when evaluating a choice containing the phrase, “Maybe that nice patio you were telling me about,” P12 noted, “The original text doesn’t mention a patio.” Although he removed the option from his remaining choices, he wondered where the system had found this information, why it had chosen to include it in the email, and whether or not he would be able to catch more “wrong info” in the future. In one instance, the hallucinated details carried a deeper personal implication: P6 described feeling unsettled by an option containing the phrase, “I’ve also posted on local Facebook,” for an email informing a friend about schools in the area: “That’s kind of one of those ‘gaslight’-y inaccuracies that unnerve me because I would never be on Facebook. Although now, I’m like, ‘Wow, wait a minute. Duh. Of course that would be a place to find out about this stuff.’ [...] It’s almost like a suggestion or an assistant telling me to go do that. That’s not necessary.”

Hallucinations weren’t the only source of inaccuracy for the Rewrite feature. Four participants commented on choices that did not satisfy the instruction that they had given; for example, after the instruction “Rewrite this text: to sound more detailed” gave a few concise results, P2 noted, “To me, this [instruction] implies more text than I wrote.” Although the system removed repeated suggestions in the final choices displayed to each user, three participants mentioned seeing overly similar results: “This one is just the same sentence that I’ve typed” (P3). Two participants noticed that some rewritten choices had removed important details contained in the original text. Although many participants were ultimately able to use the Rewrite and Suggest features to find their desired wording or query for helpful changes, the process of sifting through inaccurate or irrelevant results was both “time wasteful” (P7) and cognitively demanding (P14: “I need to put in a lot of focus”).

4.2.4 “The Paradox of Choice”. The LaMPost system included two extremes with regard to choice: the Rewrite and Suggest features displayed numerous choices to users (i.e., 15 responses returned from the model, minus duplicates), while the Main Ideas and subject line returned the model’s first response each time. In this section, we discuss participants’ concerns with each extreme, and their suggestions for improvement.

Twelve participants across both the Rewrite and Suggest features described this as “the paradox of choice” (P5, P6), referring to the psychological concept that large sets of choices can feel overwhelming and lead to poor or unsatisfying final selections [43, 56]. “My curiosity would just lead me into spiraling and reading hundreds of these possible choices. And maybe there’d be more self-doubt from that spiraling” (P6). Yet others found numerous options were helpful—even necessary—to find a correct option amidst undesirable or inaccurate results from the model. “One that was suggested was pretty much on point. The rest of them are either a little off, or would require a bit of rewriting” (P4). Still, most participants agreed that the number of options displayed at a time should be considerably reduced, suggesting around “three to four options” (P19), or “four to five; nothing I need to scroll through” (P2). One promising idea for filtering through a reduced number of options came from P6: “The choices should be sorted by word count, just five or six choices, tops. And if I ‘X’ out one of those, it gives me a new one.”

Additional tensions around user choice emerged from the open-ended possibilities for the Rewrite feature’s free-form instructions. Six participants spoke of the immense value in being able to write their own instructions, such as P10: “I don’t tend to think in the way that commonly is spoke, so being able to put down my thoughts and have an AI [respond]— [...] It’s like having a thesaurus for someone’s thought.” Others voiced concerns about this capability, including doubts towards complex inputs (N=3), its handling of misspellings (P8, P13), and potential dangers with misguided instructions: “If it’s two in the morning and I’m emailing back and forth with support, I might want to type something like, ‘Make it sound mad’” (P4). These concerns led to six participants requesting pre-written instructions for general changes included alongside the open-ended input—and distinct from the context-specific instructions that could be queried from the Suggest feature. For example, P13 said it seemed tedious to type certain instructions: “I had to go to my spell checker on my phone to figure out how to spell the word ‘formal’ before I could even use it. [I want] a drop-down of some of the most basic ones: [...] ‘make it shorter’, ‘make it longer’, ‘more professional’, ‘more casual’.”

The other choice extreme—the single result of the Main Ideas and subject line feature—also raised concerns, although less frequently. While pointing out that key details were missing from the visual Main Ideas outline, two participants wondered why they were unable to choose the main ideas themselves. For example, P1 criticized the feature for “missing the forest for the trees”—concentrating on one thing while ignoring the rest—and offered a solution: “An option where I could highlight a section and then tell it, ‘This is what I want you to work your identification around.’” Three participants questioned why the generated subject line was returned as a single result, such as P8 after several repeated attempts: “It would be helpful to see a list of options.”

4.2.5 Feelings About Email-Writing with LLMs. The secondary goal of our evaluation was to understand how writing with LLMs can impact personal feelings of satisfaction, self-expression, autonomy, and control among writers with dyslexia. We asked participants to rate their level of agreement towards these feelings within the context of the LaMPost system (Figure 7); a two-tailed Mann-Whitney U test to compare each consistency rating between the evaluation’s
voiced concerns over the sheer volume of choices to parse. Two AI-framing conditions did not yield a signifcant diference (p>0.05)
LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia ASSETS ’22, October 23–26, 2022, Athens, Greece

for any rating. In the following section, we briefly discuss each feeling based on the results of the rating scales and relevant comments provided throughout the study.

Participants generally felt satisfied with the system (avg.=5.26, SD=1.14), but they voiced several concerns over the LLM’s accuracy and our implementation of choices (described in the sections above). Most participants also felt satisfied with what they wrote using the system (avg.=5.16, SD=1.52), but this rating varied depending on the complexity of their chosen emailing task: the majority (N=12) were able to finish a full email within the 25-minute writing period (P8: “I would actually send this out to one of my friends”), while the rest (7) were unable to complete their task due to limited time. This variation was also apparent in their ratings of personal autonomy (avg.=5.26, SD=1.88), where participants had varying levels of agreement about whether or not they had achieved their emailing goals. Autonomy ratings generally aligned with email completion; e.g., P11: “I got my point across” vs. P6: “I still had a few more ideas to express.” However, two participants who did not finish writing their emails said the system still helped them to achieve other personal goals: “staying on task” (P16) and “keeping focus” (P2).

Participants’ sense of self-efficacy using LaMPost was fairly high (avg.=5.26, SD=1.41), and 14 participants at least partially felt confident writing emails with the system: “It would be like having a proof-reader along with me” (P3). Those who felt less confident expressed doubts towards the system’s accuracy: “I think the AI would start breaking down if it really had to compute more and more” (P1). Participants generally rated their sense of self-expression with LaMPost high (avg.=5.53, SD=1.34), particularly due to the suggestions provided by the Rewrite feature: “I was able to look and say, ‘Well, this is more of how I would speak,’ or, ‘This is more of how I would want the email to sound’” (P18). However, four participants did not find LaMPost helpful for expressing their ideas, such as P8, who gave a neutral rating: “The AI is limited on what it receives from the user, and it cannot explain much of the email if I don’t give it the [information]. [...] And I don’t always know what to write.”

AI-assisted writing systems can introduce complex questions around whether the user or the agent is ultimately in control of the produced work [27]. Participants largely felt in control over their email content while using LaMPost (avg.=5.58, SD=1.56), and positive responses were closely associated with the ability to filter through the model’s results and make final decisions over changes to the text. For example, P15 felt a strong command over his work: “Even though the system gave suggestions, in the end, I’m the one that’s deciding what it’s going to say.” However, three participants disagreed with the statement, citing limited control over the generated subject line (P5), the opaque source of the writing suggestions (P17), and the inability to troubleshoot after undesirable results: “It wasn’t acting on its own or anything, [...] but there was no actual, refined control over each process once I put it into motion” (P1).

While not included in our rating scales, our evaluation also elicited feelings around privacy and trust. For privacy, seven participants voiced concerns about the system reading and storing their personal data: “What is it doing with the information that I put in there? Where does that go?” (P17). To protect their sensitive information, P9 desired “the option to not have it read it” while P4 requested a clear explanation of the system’s data storage policy. With regard to trust, a few participants highlighted the importance of building their trust in the system, especially if they were relying on it for support in challenge areas. P13 used his own difficulty with determining writing’s tone to demonstrate the issue: “If you wrote the same statement three different ways—one professional, one casual and one romantic— [...] I would read them and they’d all look exactly the same to me. [...] So if [the system is] saying, ‘I can take this sentence and make it more businesslike’, I’m going to accept whatever you’re offering me.”

4.2.6 Additional Features and System Improvements. Participants suggested several changes for the LaMPost system. In addition to overall improvements to the system’s accuracy, six participants thought the system should include personalization—either “learning through prior conversations” (P6) or “narrow[ing] down what words a user would use” (P19) by learning from choices over time. Some element of personalization was needed to capture the writer’s voice, according to P17, “Rather than it being a canned response from the computer.” For the LaMPost interface, the most common requests related to how each of the LLM-based features presented results to the user. Participants desired fewer choices for the Rewrite and Suggest features (N=12), and more choices for the Main Ideas feature (3). Five participants requested more features to track their progress and the system’s results during each writing session, including a “changelog” (P6), the ability to save and favorite individual choices (P18), and creating action items from the Suggest results (P2). Further requests for the Suggest feature included an “explanation” button for each result (P10, P16) and suggestions generated in real time (P4).

Drawing from their preferred approach of writing emails from bullet points, four participants wondered if the system could generate the body of an email from a given outline (i.e., implementing the Main Ideas feature in reverse): “I write the three main topics that I want and the system writes [an email] around them” (P9). Three participants also thought the visual Main Ideas outline would be useful for reading other people’s emails, such as, “If I’m late to a meeting and they’re referring to an email that I have not read” (P8). Six participants appreciated the “Read aloud” feature (P12: “I wish more things could just be clicked on and read back to me”), but they requested options to change the speed (P14, P15) and improvements to make the “mechanical” (P6, P12) voice sound more natural. Finally, four participants requested improvements to spelling and grammar detection, preferring to use existing “autocorrect” implementations on mobile operating systems and native email clients.

4.2.7 Summary. All participants identified one of LaMPost’s features as being potentially useful to them when writing emails, and the Rewrite My Selection feature was chosen as the most useful overall. However, accuracy and quality concerns limited the practical usefulness of many features, despite LaMPost using today’s state-of-the-art models and prompting techniques. Quality concerns (and a desire to support writer autonomy) led us to present generated text as many choices for users to select from; however, a top-N approach rather than a top-1 approach can be particularly challenging for end-users with dyslexia due to the additional reading and cognitive load challenges.
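The rating analysis described in 4.2.5, a two-tailed Mann-Whitney U test comparing each rating between the two AI-framing conditions (N=9 vs. 10), can be sketched as follows. This is an illustrative reconstruction, not the authors’ analysis code: it uses a tie-corrected normal approximation, and the agreement ratings below are invented placeholders (a 7-point scale is assumed), not study data.

```python
import math

def rank_with_ties(values):
    """Assign 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def mann_whitney_u_two_tailed(a, b):
    """Two-tailed Mann-Whitney U test via a tie-corrected normal
    approximation (rough for very small samples). Returns (U, p)."""
    n1, n2, n = len(a), len(b), len(a) + len(b)
    ranks = rank_with_ties(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # U statistic for sample a
    u = min(u1, n1 * n2 - u1)
    # Tie-corrected variance of U under the null hypothesis.
    counts = {}
    for v in list(a) + list(b):
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 / 12 * (n + 1 - tie_term / (n * (n - 1))))
    if sd_u == 0:
        return u, 1.0
    z = (u - mean_u + 0.5) / sd_u  # 0.5 = continuity correction
    p = 1 + math.erf(z / math.sqrt(2))  # equals 2 * Phi(z), the two-tailed p
    return u, min(1.0, p)

# Hypothetical agreement ratings for the two framing conditions (N = 9 vs. 10).
with_ai = [5, 6, 7, 5, 6, 4, 7, 5, 6]
without_ai = [6, 5, 7, 6, 4, 5, 6, 7, 5, 6]
u_stat, p_value = mann_whitney_u_two_tailed(with_ai, without_ai)
print(f"U = {u_stat}, p = {p_value:.3f}")  # U = 44.0, p = 0.966 (not significant)
```

In practice, `scipy.stats.mannwhitneyu(a, b, alternative='two-sided')` performs the same test and additionally offers an exact method better suited to samples this small.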
ASSETS ’22, October 23–26, 2022, Athens, Greece Goodman, et al.

5 DISCUSSION

The LaMPost system provided a testbed to explore the feasibility and potential of LLMs as AI-powered writing support tools through three features to meet high-level writing needs: Identify Main Ideas (+ subject line generation), Rewrite My Selection, and Suggest Possible Changes. Our results indicate that LLMs with optimal output hold potential to assist with email-writing tasks, and our evaluation highlighted several promising routes for future explorations of AI-assisted writing—including the popularity of the controllable Rewrite and subject line features. However, our features as-is did not surpass participants’ accuracy and quality thresholds, and as a result, we conclude that state-of-the-art LLMs (as of early 2022) are not yet ready to fulfill the real-world needs of writers with dyslexia. Surprisingly, we found no effect of the use (or non-use) of AI metaphors on perceptions of the system, nor on feelings of autonomy, expression, and self-efficacy when writing emails. As a whole, our findings yield insights on the benefits and drawbacks of using LLMs as writing support for adults with dyslexia. Below, we discuss implications of our findings and opportunities for future work.

5.1 Implications for Designing Dyslexia Support with LLMs

When evaluating LaMPost, we found that some users with dyslexia desire personalized writing support, where the system would learn the user’s preferred diction and tone from past writing samples and apply it when producing rewrites and suggestions for their current work. A personalized system may enable writers with dyslexia to express themselves more naturally, and results filled with familiar vocabulary and phrasing may be easier to parse. However, since personalization requires access to the data of individual users, it may also conflict with their privacy choices [31]—a concern that was expressed by participants in our work. What degree of personalization is preferred, and how much writing data is needed to implement it? Should the system collect this data automatically (e.g., from an email platform’s “Sent” folder), or are users willing to identify and share specific writing examples themselves? Can complex privacy policies be made more accessible through built-in text-to-speech or writing with simple phrasing that meets visual preferences for reading? There are opportunities for future work to incorporate personalization in AI-powered writing tools to support the needs of writers with dyslexia, but researchers must consider the potential trade-offs between these systems and the privacy preferences of users.

Confirming and extending prior work on ownership in AI-assisted writing [27], we found that it was important for users with dyslexia to feel in control of AI-produced writing. To make the final decision over changes to the text, participants with dyslexia in our formative study requested that writing suggestions be delivered as several options, and we incorporated these findings in the design of the LaMPost system’s Rewrite and Suggest features (but overlooked choices for the subject line). Our evaluation reiterated the importance of user choice, but also highlighted usability issues with LaMPost’s large list of options: making a choice—especially from similar and/or lengthy options—was time-consuming, cognitively demanding, and overwhelming to some users. Our findings suggest these issues can be reduced by displaying fewer options, adding more variation, sorting among displayed options (e.g., by length), and allowing users to save and return to options later on. Although the Rewrite feature’s open-ended instruction was identified as a valuable mechanism for expressing users’ intention, it led to further usability issues: some users had difficulty finding the words to express their desired changes while others encountered spelling and grammatical errors. Additional scaffolding to overcome these issues while instilling a sense of control could include pre-written instructions, dictation options for users who prefer speaking, or probing questions to help users narrow down their revising goals. Our work highlights possible usability pitfalls with control-enabling mechanisms, and future work should seek to balance each user’s sense of control with their ability to fully leverage the system’s features.

Assessing the results of an AI-enabled assistive technology can present obstacles for primary users when the data is inherently not accessible [25]; likewise, assessing the quality of text-based results from an AI-enabled writing system presents challenges for users with dyslexia. Additional assessment mechanisms may be needed to support this population: LLMs often produce factually incorrect “hallucinations” [20, 48], and have been shown to inherit biases present in their training data (e.g., internet posts) [21]. We did not encounter offensive results in our work, but certain “nonsensical” results prompted LaMPost users to request explanations (e.g., question mark buttons) to assist with determining the suitability of results for their writing. User feedback options to flag incorrect or unacceptable language may offer a promising solution for developers to make targeted fixes. Increasing the transparency of an automated system can foster trust among users and drive continued use [23], and further trust can be gained from performance improvements. However, our work suggests that with sufficient trust and confidence in the system, users with dyslexia may feel less inclined to invest their time and energy towards analyzing each result—increasing the risk for harm. To address this, future work should explore potential safeguarding methods; for example, before sending an email, a system could perform a final check of all machine-written text, and ask the user to verify that this text seems accurate.

5.2 New Datasets May Improve Support

Our LaMPost evaluation highlighted users’ limited tolerance for inaccurate or unhelpful LLM results when writing emails. We chose to use a pre-trained model as a generalized base for the LaMPost prototype, and used few-shot prompting [16, 69] to demonstrate each task from several exemplars of the task performed on generic email text. Although our exemplars contained text pulled from real emails that covered a range of topics, these were completed emails that had been produced by writers of unknown lexical ability—examples that may not have reflected the characteristics of early drafts produced by participants in our study. Further work is needed to understand if state-of-the-art LLMs show the same limitations when employed as email-writing support tools among the general population, and whether or not LaMPost’s outlining, rewriting, and suggesting features (or similar) can benefit this audience.

If we had constructed prompts with a more representative baseline of participants’ writing, the quality of LaMPost’s results may

have improved during our study. To our knowledge, however, a public corpus of writing samples produced by adults with dyslexia does not exist. To maximize the potential of AI-assisted writing tools for this demographic, future work could collect samples of writing from users with dyslexia. A small dataset could be added to few-shot learning prompts to improve results from pre-trained LLMs, while a sufficiently large corpus could be used to train smaller, specialized models for high performance on constrained tasks (e.g., summarizing main ideas). An ideal dataset should include snapshots from different times during the writing process, and give consideration to the varied writing approaches described by participants in our study. As an example, an early email draft from a user who prefers to start writing by outlining key points is very different from one dictated via a speech-to-text tool, and each draft will develop differently over time—yet users desire a tool that can support their needs across all stages of writing. Constructing this dataset would be a complex task, but it may be achievable through usage logging during future prototype evaluations with a diverse population of participants.

5.3 Lessons Learned in Few-Shot Prompting for LLMs

Before settling on the core features in LaMPost, we experimented with other high-level writing features identified during formative research as having potential benefit to writers with dyslexia; we present them here to show the limits of few-shot prompting methods and potential design opportunities for future work.

Following positive responses toward personified “digital writing companions” in our formative study, we explored how LLMs could be used for a collaborative, conversational writing experience. We developed a prototype for emails drafted through an instant messaging interface that could leverage the LaMDA model’s dialog-based prompting format. After users supplied a short statement explaining the purpose of the email, the LLM-as-chatbot would generate probing follow-up questions to capture each detail to be included in the message; in a separate panel, users could watch as the LLM gradually constructed an email from each piece of relayed information. However, chatting with the LLM exposed its tendency to propose operations that it could not functionally perform (e.g., “Do you want me to remove that information from the email?”), and we could not determine a reliable method to limit these misleading responses. We faced further issues with transforming the question-and-answer chat stream into a meaningful email.

Inspired by an individual in our formative study who preferred to write by expanding an initial bulleted list of key points (a practice also described in our evaluation), we explored a feature that would automatically draft an email from a given outline. We created several exemplars containing an outline of 3–7 bulleted ideas, and a complete email containing the outline’s ideas reordered and expanded into full sentences. While the exemplary email added transition language and some light rephrasing, it did not include any new information; despite this, the model tended to “hallucinate” additional information that did not exist in the original bulleted list (e.g., making up names of individuals when none were given; adding unrelated information from the prompt itself). We attempted to constrain the language of each exemplary email to more closely match that of the bulleted ideas, but this reached an extreme when the expanded “emails” were word-for-word reproductions of the given outline; while it successfully constrained the LLM’s imagination, we decided this functionality would be of little value to users. Future efforts could try adding empty spaces throughout the exemplary email in place of any new language: the LLM will likely copy this stylistic choice and users may feel comfortable filling in specific details themselves.

The third attempt drew from the “word faucet” approach described in our formative study by exploring automatic ordering and structuring for a long, disorderly text produced by, e.g., speech-to-text dictation. We intended for the feature to improve a given “block” of text by arranging ideas into logical order and separating the writing into discrete paragraphs. Here, we ran into issues with the model’s limited context window: because it accepts prompts with a finite length, we could include just one or two exemplars to demonstrate the structuring task—leading to inconsistent results. To overcome this, we tried breaking up the task via a prompt-chaining process [67]: first, summarizing the key ideas contained in the “block” (creating a paragraph structure); next, connecting each sentence with one of the key ideas (building each paragraph’s content); and finally, arranging each paragraph’s sentences into a logical order. We decided this feature was not practical for our user study: the chaining process required several minutes to complete, and deconstructing the input in this way stripped each sentence of meaningful surrounding context. However, as LLM context windows increase in size and chaining paradigms mature, this feature may be achievable in the future.

5.4 Limitations of Framing with and without AI Metaphors

Our evaluation explored whether or not the presence of AI metaphors impacted users’ perception of the tool, but we did not find statistical significance for any rating. Based on prior work showing the effect of different conceptual metaphors on perceptions of automatic systems [33, 36] and concerns over autonomy in human-AI writing [27], we had hypothesized that a user’s knowledge that LaMPost was AI-powered could reduce their sense of autonomy during use. A lack of a significant result for any of the subjective ratings in our AI-metaphors conditions may be taken as a positive outcome for public attitudes towards AI writing support tools; knowledge of the AI proved neutral for users’ sense of ownership over the text, and for each of our other measured feelings. However, this result may have also been caused by systematic error in the form of small sample sizes for each condition (N=9 vs. 10) or study fatigue [58] (most rating scales were administered at the end of a 75-minute video conferencing session). We did not attempt to measure participants’ prior experience with AI; per our institution’s recruiting guidelines, participants were not affiliated with the technology industry and we did not expect them to have deep familiarity with AI before the study. Yet a few participants in the without-AI group assumed the system was powered by AI and described it as such without our mentioning it; this could further indicate that our manipulation was not designed correctly (e.g., AI was implied despite non-specific vocabulary), or it may simply reflect an increasing awareness among the public toward the capabilities of AI and the likelihood that AI

is powering many new products and experiences—despite many lay users not understanding which specific technologies are AI-infused [34]. Future research should attempt a deeper exploration of the effects of presenting writing tools as AI-forward vs. obscured to better characterize the possible impacts of each on user attitudes—particularly end-users with dyslexia—and provide design guidelines for off-the-shelf systems that begin to incorporate this technology.

5.5 Limitations

We identified several issues as a result of conducting a remote lab evaluation amidst the ongoing COVID-19 pandemic. We recruited 32 participants, but only 19 completed the evaluation; this caused an unbalanced demographic representation in our data and a smaller-than-planned sample size for each condition of our between-subjects AI-framing experiment. The remote nature of the study also limited our ability to control the testing environment. Unknown technical issues prevented two participants from accessing the system, and they had to dictate the email’s content to the researcher via screenshare. Other participants required support with setup and troubleshooting, which reduced their time using the system during the writing exercise, and added variation to the average duration spent writing with LaMPost. While all participants tested each of LaMPost’s features during the exercise, seven participants were unable to complete their writing in the time provided. Finally, participants were unable to experiment with LaMPost for different email topics and audiences (e.g., work vs. personal), potentially skewing their responses according to the complexity of their chosen email-writing task.

6 CONCLUSION

In this paper, we introduced LaMPost, an email-writing interface that explored the potential for large language models to power writing support tools that address the varied needs of people with dyslexia. LaMPost introduced AI-assisted writing features inspired by the needs of adults with dyslexia, including rewrite my selection, identify main ideas (with subject line generation), and suggest possible changes. Additionally, we contributed insights from an evaluation of LaMPost with 19 adults with dyslexia. Our study identified many promising routes for further exploration—including the popularity of the “rewrite” and “subject line” features—but also found that state-of-the-art LLMs (as of early 2022) may not yet have sufficient accuracy and quality to meet the needs of writers with dyslexia. Surprisingly, we found no effect of the use (or non-use) of AI metaphors on users’ perceptions of the system, nor on feelings of autonomy, expression, and self-efficacy when writing emails. Our findings yield further insight into the benefits and drawbacks of using LLMs as writing support for adults with dyslexia and provide a foundation to build upon in future research.

REFERENCES
[1] 2017. Dyslexia FAQ - Yale Dyslexia. https://dyslexia.yale.edu/dyslexia/dyslexia-faq/.
[2] 2021. GitHub Copilot: Your AI pair programmer. https://copilot.github.com/.
[3] Damien Appert and Philippe Truillet. 2016. Impact of Word Presentation for Dyslexia. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (Reno, Nevada, USA) (ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 265–266.
[4] Thomas Armstrong. 2011. The Power of Neurodiversity: Unleashing the Advantages of Your Differently Wired Brain (published in Hardcover as Neurodiversity). Hachette Books.
[5] Emily M Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event, Canada) (FAccT ’21). Association for Computing Machinery, New York, NY, USA, 610–623.
[6] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[7] Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. (May 2020). arXiv:2005.14165 [cs.CL]
[8] Maggie Bruck. 1990. Word-recognition skills of adults with childhood diagnoses of dyslexia. Dev. Psychol. 26, 3 (May 1990), 439–454.
[9] Daniel Buschek, Martin Zürn, and Malin Eiband. 2021. The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers. (Jan. 2021). arXiv:2101.09157 [cs.HC]
[10] S Cai, S Venugopalan, K Tomanek, A Narayanan, M R Morris, and M Brenner. 2022. Context-Aware Abbreviation Expansion Using Large Language Models. In Proceedings of NAACL.
[11] H E Cameron. 2016. Beyond cognitive deficit: the everyday lived experience of dyslexic students at university. Disabil. Soc. 31, 2 (2016), 223–239.
[12] Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. 2020. Extracting Training Data from Large Language Models. (Dec. 2020). arXiv:2012.07805 [cs.CR]
[13] Christine Carter and Edward Sellman. 2013. A view of dyslexia in context: implications for understanding differences in essay writing experience amongst higher education students identified as dyslexic. Dyslexia 19, 3 (Aug. 2013), 149–164.
[14] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. (July 2021). arXiv:2107.03374 [cs.LG]
[15] Elizabeth Clark, Tal August, Sofia Serrano, Nikita Haduong, Suchin Gururangan, and Noah A Smith. 2021. All That’s ’Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 7282–7296.
[16] Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, and Ann Yuan. 2021. Wordcraft: a Human-AI Collaborative Editor for Story Writing. (July 2021). arXiv:2107.07430 [cs.CL]
[17] Craig Collinson and Claire Penketh. 2010. ‘Sit in the corner and don’t eat the crayons’: postgraduates with dyslexia and the dominant ‘lexic’ discourse. Disabil. Soc. 25, 1 (Jan. 2010), 7–19.
[18] V Connelly, E J Sumner, and A Barnett. 2014. Dyslexia and writing: Poor spelling can interfere with good quality composition. Brookes eJournal of Learning and Teaching 6, 2 (Dec. 2014).
[19] Vagner Figueredo de Santana, Rosimeire de Oliveira, Leonelo Dell Anhol Almeida, and Maria Cecília Calani Baranauskas. 2012. Web accessibility and people with dyslexia: a survey on techniques and guidelines. In Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (Lyon, France) (W4A ’12, Article 35). Association for Computing Machinery, New York, NY, USA, 1–9.
[20] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (Oct. 2018). arXiv:1810.04805 [cs.CL]
[21] Jwala Dhamala, Tony Sun, Varun Kumar, Satyapriya Krishna, Yada Pruksachatkun, Kai-Wei Chang, and Rahul Gupta. 2021. BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation. (Jan. 2021). arXiv:2101.11718 [cs.CL]
[22] Steven P Dow, Alana Glassco, Jonathan Kass, Melissa Schwarz, Daniel L Schwartz, and Scott R Klemmer. 2011. Parallel prototyping leads to better design results,
LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia ASSETS ’22, October 23–26, 2022, Athens, Greece

Exploring Smart Speaker User Experience for People Who Stammer

Anna Bleakley, University College Dublin, Ireland (anna.bleakley@ucdconnect.ie)
Daniel Rough, University of Dundee, Scotland, UK (drough001@dundee.ac.uk)
Abi Roper, City, University of London, England, UK (Abi.Roper.1@city.ac.uk)
Stephen Lindsay, University of Glasgow, Scotland, UK (Stephen.Lindsay@glasgow.ac.uk)
Martin Porcheron, Swansea University, Wales, UK (m.a.w.porcheron@swansea.ac.uk)
Minha Lee, Eindhoven University of Technology, Netherlands (M.Lee@tue.nl)
Stuart Nicholson, Dyson Institute of Engineering and Technology, England, UK
Benjamin Cowan, University College Dublin, Ireland (benjamin.cowan@ucd.ie)
Leigh Clark, Swansea University, Wales, UK (l.m.h.clark@swansea.ac.uk)
ABSTRACT
Speech-enabled smart speakers are common devices used for numerous tasks in everyday life. While speech-enabled technologies are widespread, using one's voice as a computing modality introduces new accessibility challenges for people with speech disfluencies such as stammering (also known as stuttering). This paper investigates the smart speaker user experiences of people who stammer over three weeks. We conducted diary studies and semi-structured interviews with 11 individuals to identify their daily routines, difficulties with successful interactions, and strategies to overcome these barriers. Our analysis demonstrates that key factors such as device location, its affordances, and the structure of commands had a strong impact on user experience. Participants highlighted different linguistic strategies to try and overcome interaction difficulties and discussed the potential of using smart speakers for speech and language therapy. We emphasise the need to further understand the experiences of people who stammer in smart speaker design to increase their accessibility.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility; User studies; Empirical studies in HCI.

KEYWORDS
stammer, stutter, inclusivity, accessibility, speech interface, voice interface, conversational user interface

ACM Reference Format:
Anna Bleakley, Daniel Rough, Abi Roper, Stephen Lindsay, Martin Porcheron, Minha Lee, Stuart Nicholson, Benjamin Cowan, and Leigh Clark. 2022. Exploring Smart Speaker User Experience for People Who Stammer. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3517428.3544823

1 INTRODUCTION
The rapid evolution of speech technology has led to increasing use of speech as a modality to engage with applications through devices such as smart speakers, smartphone voice assistants, and wearables [45]. Smart speakers in particular have become a popular way of accessing applications and services, being owned by over 95 million people in the US alone, with 50% of device owners stating that they use these devices on a daily basis [28]. There are concerns about how users with diverse speech patterns like stammering may successfully interact with speech-enabled devices [11]. These users may find interaction with voice-only systems such as smart speakers challenging in their current form, and at worst they may be potentially excluded from smart speaker use altogether. Recent research efforts have highlighted that these technologies create accessibility challenges for others, including older adult users [43], those who are deaf or hard of hearing [7], and people who are blind or have visual impairments [2, 3].

Stammering (also known as stuttering) is classified as a neurological condition that impacts the rhythmic flow of speech [39]. Disruptions to speech can include repetition, prolongation or blocking of specific sounds and words. Blocking describes audible or silent moments when people are unable to produce a specific sound or word despite intending to produce them [34]. Recent estimations suggest 8% of children and 2% of adults experience some form of stammer [53]. Given the global increase of smart speaker interactions and speech interface use more generally, there is a risk that people who stammer may not have successful interactions with these devices, and their experiences are not commonly included in the design and testing of such interactions [36]. While technical
ASSETS ’22, October 23–26, 2022, Athens, Greece Bleakley et al.

developments in improving speech recognition are ongoing (e.g. [31, 49]), an understanding of how people who stammer engage with these devices is still limited [11].

To address this research gap, this paper explores how 11 people who stammer interacted with commercially available smart speakers over a three-week period in their homes. Through diary studies and semi-structured interviews, we investigated: 1) how smart speakers were used and implemented into daily lives and routines, and 2) the opportunities and challenges that emerged in these interactions. Our findings show common themes in participants regularly using devices as part of their routines, yet these were hindered by the presence of other people during interactions, leading some participants to prefer private smart speaker use. The study also highlights the potential of using smart speakers to practice difficult speech patterns in a controlled environment and the possibility of using these devices to access speech and language therapy. However, our findings also showed a number of participants had interaction difficulties with specific sounds and words (including device wake words), and a lack of appropriate error recovery measures. These were compounded by fatigue, anxiety, and past negative experiences. Based on these findings, we emphasise the importance of examining speech interface experiences for people who stammer, and including them in the design process, to complement technical developments.

2 RELATED WORK

2.1 Speech as an Accessible Modality
Prior research on speech as a modality in human-computer interaction (HCI) has demonstrated its benefits in improving access to interactive systems [12]. Through speech, people with limited hand dexterity can gain hands-free access to mobile devices [13], and users with motor difficulties can employ voice as an alternative to mice or touch-pad cursor control [22]. Speech interfaces, in particular those that produce synthetic speech output, can also support people who cannot speak or have temporarily lost the ability to speak, affording them the opportunity to engage in conversation with others through assistive dialogue interfaces [4]. Additionally, speech interfaces have been shown to offer support to deaf users to privately engage with medical professionals when in the presence of an interpreter [40].

Speech is a critical access modality for blind and low vision people. A review of speech technologies demonstrated the wide array of speech-enabled devices that can be used to improve accessibility for healthcare, educational tools, communication, and daily living [17]. Speech-enabled screen readers such as Job Access With Speech (JAWS) [51] are publicly available for people including those who are blind or have low vision, although there is current debate about whether these meet necessary standards for web content accessibility [58]. Speech commands have also been used to help improve the accessibility of using visual content like images and emojis [61].

While speech as a modality allows access to technology previously constrained to motor input devices, improving the accessibility of speech interfaces themselves is a burgeoning research area. Human-centered design (HCD) approaches [16] have been implemented to examine how people who are deaf or hard of hearing can better access and use smart speakers [7]. Additional visual modalities, such as the use of light and customisation of smart assistant voices to match individual needs, can improve accessibility. HCD approaches have also been used by Glasser et al. [19] and Mande et al. [35] to understand the challenges and preferences of smart speaker use by those who are deaf and hard-of-hearing (DHH). Their work highlights preferences for waking up devices, as well as the potential for sign language interaction, but also notes the very limited regular use, or even attempts to use, smart speakers by DHH users.

Work on older adults' use of speech interfaces also identifies ongoing challenges and barriers to interaction, how the systems should speak, and the potential benefits and drawbacks of anthropomorphism [50]. While the spoken nature of interactions may initially lower the barrier to using technology more generally for older adults [43], there remain concerns about the reliability of both the device and the information it provides. Despite the ongoing challenges, existing work highlights the importance of understanding people's needs and wishes as a means of improving their interactions with speech interfaces.

2.2 Stammering and Speech Interfaces
While work focusing on the wider user experience is scarce, current work on speech impairments and speech interfaces, such as stammering or dysarthria [37], has presented technical improvements in the use of automatic speech recognition (ASR). Progress has been made through the development of appropriate datasets. For instance, the SEP-28k dataset [31] has been shown to improve stammering event detection performance by tagging speech events of blocks, prolongations, sound repetitions, word repetitions, and interjections. Other approaches include University College London's Archive of Stuttered Speech (UCLASS) [25], the LibriStutter dataset of synthesised stuttered speech [29], and FluencyBank [44]. While improvements to speech recognition are observed when using these datasets, they are still relatively small [52], and do not include those systems that are not made publicly or commercially available [31].

Adaptations for diverse speech patterns like stammering have started to emerge in commercial smart speakers. For example, the Amazon Alexa application has recently introduced a feature to enable longer waiting times for Alexa when people are speaking [6]. Google have also introduced a communication tool to improve speech comprehension for a range of abilities [10], albeit in a beta testing stage.

While advancements in stammering datasets and speech interface design are promising, they are often restricted to closed-source systems that limit the wider coverage of accessibility they can provide [11]. More significantly, there is a lack of research where people who stammer are included in the design and testing process of technology that can support them [36]. As such, there is an opportunity to draw inspiration from existing HCD approaches in exploring how people who stammer use speech interfaces through devices such as smart speakers and how they may be improved to better accommodate such users.
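As a rough illustration of the kind of per-clip labelling the SEP-28k work describes (blocks, prolongations, sound repetitions, word repetitions, and interjections), the sketch below models one annotated clip in Python. The class and field names (ClipAnnotation, dominant_event) are hypothetical and do not reflect SEP-28k's actual file format; they only show how a detector's training labels for these five event types might be structured.

```python
from dataclasses import dataclass

# Hypothetical per-clip annotation record for the five stammering
# event types named above. The counts could represent, for example,
# how many annotators marked each event type in the clip.
@dataclass
class ClipAnnotation:
    clip_id: str
    block: int = 0
    prolongation: int = 0
    sound_repetition: int = 0
    word_repetition: int = 0
    interjection: int = 0

    def dominant_event(self) -> str:
        """Return the most frequently annotated event type for this clip."""
        counts = {
            "block": self.block,
            "prolongation": self.prolongation,
            "sound_repetition": self.sound_repetition,
            "word_repetition": self.word_repetition,
            "interjection": self.interjection,
        }
        return max(counts, key=counts.get)

clip = ClipAnnotation("episode1_clip042", block=2, prolongation=1)
print(clip.dominant_event())  # block
```

A detection model evaluated on such data would predict one or more of these event types per clip; collapsing annotator counts to a single majority label, as here, is just one simple aggregation choice.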
2.3 Research Questions
Following similar research on home-based smart speaker deployments [42], this study deployed smart speakers in the homes of people who stammer, with an aim to answer two questions:

(1) How do people who stammer interact with smart speakers in their daily routines?
(2) What opportunities and challenges do smart speakers provide for people who stammer?

In doing so, our goal was to create a foundation of knowledge in an under-explored demographic and present actionable insights for the wider research and design community. We achieved this through qualitative analysis using a combination of diary studies and semi-structured interviews.

3 METHOD

3.1 Recruitment and Participants
12 participants were recruited to participate in the study through the STAMMA charity based in the United Kingdom, via social media channels (Facebook, Twitter), emails and the STAMMA charity website. After an introductory session, one participant dropped out of the study, leaving 11 participants (N=11, M=7, W=4; Mean age=33 yrs; SD=10.3 yrs). Of these, 5 were daily users of smart speakers, 1 used them multiple times a week, 2 used them once a month, 1 had only used them once or twice, and the remaining 2 had no prior usage. 64% (N=7) of participants in the study owned a smart speaker, with most (N=5) owning an Amazon Echo device. When asked about previous experiences using smart speakers, almost all participants (N=9) mentioned difficulties interacting with these devices. Stammering severity was not collected, as this is typically done using clinical instruments [47].

3.2 Diary & Optional Interview
For our home-based deployment, we asked participants to undertake a diary study in two phases, lasting 11 days and 10 days respectively. The diary study is an established method in HCI research, being a data-gathering practice that prioritises daily logs from participants' perspectives [46].

In the first phase of the diary study (D1; Mean entries: 8; SD: 2.23 entries), participants were asked the following questions to reflect on their barriers to interaction:

1) What experiences did you have interacting with your smart speaker (Google Nest) today?
2) What difficulties have you had interacting with your smart speaker?
3) How did you overcome these difficulties?

For the second phase (D2; Mean entries: 6.8; SD: 5 entries), questions were tailored to understand how these devices were now integrated into participants' daily lives:

1) How were your interactions with your Google Nest today? What did you find easy and what did you find difficult?
2) If you had difficulties, what aspects of your speech might have affected this?
3) Have you used new strategies to overcome difficulties interacting with the Google Nest? If you have, please elaborate on this below.

For both phases, participants were encouraged to log at least one diary entry a day in a Microsoft Form (https://forms.office.com/) that was sent to them via email between 8:00 AM and 10:00 AM. Midway through the diary study (day 11), a set of unmoderated tasks (UT; N=10) was also introduced as a probe to delineate differences between different task interactions. Participants were asked to complete three tasks: 1) add an event to their calendar; 2) participate in an interactive game of their choice (e.g. a quiz); and 3) add items to a shopping list.

At the end of phase two of the diary study, an optional reflective interview invited participants to expand on their experiences. In these interviews, participants were asked about their overall experiences, difficulties and strategies when interacting with the smart speaker, and any improvements they believed could be implemented in future smart speaker design. Questions were tailored to the activities individual participants had logged during the diary studies. Those who participated (either in written form or verbally) in this Reflective Interview (RI; N=9) demonstrated a mix of cumulative negative and positive responses, and they additionally received a €25 honorarium.

3.3 Procedure
The study received ethical clearance from Swansea University's ethics board. As mentioned, participants were recruited through a charity based in the United Kingdom as well as via social media channels (Facebook, Twitter), emails and the charity website. The study was conducted between July and September 2021.

Upon signing up for the study, and before the diary study phases commenced, participants received a Google Nest Mini [21] to install in their home. One participant expressed extreme difficulties in speaking, and so received a Google Nest Hub [20], which allows for tactile and gestural input along with visual output via a display. The researchers made this decision to ensure a more inclusive and diverse set of experiences could be captured.

Upon commencing phase one of the diary study, participants were given a 20-minute introductory information session via Zoom, giving them the opportunity to ask questions and allowing the researchers to further explain any aspects of the study structure and tasks that were unclear. In this introductory session, participants were provided with any help and advice they required to set up their smart speakers and to complete the diary studies. Beyond confirming the smart speakers were set up, actual interactions with the device were left to participants after this session. As audio recordings were not collected as part of this study, participants were also sent material detailing how to delete any audio recordings the smart speakers stored online, if they wished to do so. During this session, participants also filled out a demographic questionnaire and answered questions relating to their past experiences with smart speakers. Participants had the option to answer these questions verbally or in written form. Only one participant declined to answer these questions but did participate in the diary studies (II; N=10). At the end of the study, participants were thanked for taking part and were debriefed as to the aims of the work. They
ASSETS ’22, October 23–26, 2022, Athens, Greece Bleakley et al.

were then paid an honorarium for participation and were also given the Google Nest devices permanently as a thank you for taking part in the study. This brand was selected due to its ongoing popularity and its expected largest smart speaker market share by 2025 [27].

3.4 Data Analysis
After data collection was completed and optional interviews were transcribed by a member of the research team, the same researcher conducted inductive thematic analysis on the data [8] using NVivo 12.6.0 [32]. Codes were then exported to the Miro platform, where an affinity diagram [24] was used to outline the initial themes along with their definitions. These themes were then discussed among members of the project team in relation to overlap and divergences between them and adjusted following reflection.

4 FINDINGS
We found three distinct themes in our collected study data which highlight the experience of smart speaker usage among people who stammer: 1) "Using Devices in Daily Routines & Speech Therapy Practice"; 2) "Interplay Between Contextual and Technological Barriers"; 3) "Adapting to Device Limitations". These themes are discussed in the following subsections.

4.1 Using Devices in Daily Routines & Speech Therapy Practice
In this section, we discuss how our participants integrated the Google Nest devices in their day-to-day lives, including how participants envisioned the use of these devices to improve their stammer in professional speech therapy contexts. The different activities undertaken by participants and their frequency can be seen in Table 1. Such usage is closely in line with recent work on typical behaviour of smart speaker users, both generally [5], with older users [26] and with DHH users [7].

Table 1: Participant activities with their devices and the frequency with which these activities were conducted.

Activity                      Frequency
Information Queries           13%
Music                         15%
Weather                       11%
Time                          8%
News                          8%
Home Appliances               4%
Travel & Commuting Queries    4%
Miscellaneous                 37%

4.1.1 Device Activity & Routines. Participants' Google Nest devices quickly integrated into their daily lives, assisting them in morning routines by allowing them to ask for the weather, listen to the news, and interact with home appliances.
“It got integrated into not so much my routine but my life in the sense that it became one of the main devices I would use to play music.” [P10 - RI]
The routines participants developed were influenced by the location of the device in their home. Participants that placed the device in their home office [P4, P12, P11, P2] integrated the Nest into their work routines - asking the device to assist with work-related queries, play music during breaks, and turn on the kettle to begin their morning routine:
“I used it through the day so like to play music while I'm working or on my lunch break or something. I also have to do like some maths calculation type things.” [P12 - RI]
“And we have a plug in the kitchen and the kettle on with working from home at the moment. So I generally would do that three times in the morning to make a cup of coffee because that's my general routine.” [P2 - RI]
Some participants used the device to set alarms or prepare for their commute by checking the weather; these behaviors were again work related.
“I use it to set the forecast in the morning and through the day, particularly if I was going into work that day because I work in [a capital city] so it's good to know, you know if I need a coat.” [P12 - RI]
“Helping me find the right time to leave with traffic, or whether I should take a coat or umbrella.” [P1 - RI]
Unsurprisingly, opportunities to interact with the smart speaker were determined by its proximity, which thus determined the routines it became part of. Interactions with devices placed in office environments complemented work routines in the home, whereas those in communal spaces such as kitchens assisted their users during the "leaving for work" routine.

4.1.2 Leveraging Device Affordances to Support Speech Therapy. During the introductory interview, two participants mentioned that they intended to use the Google Nest in a "therapeutic way" [P10 - II] and "to essentially practice and essentially encouraging [...] communicative experiences" [P8 - II].
“going to be about just putting myself out there in terms of difficult sounds.” [P8 - II]
Exploring this theme in the reflective interviews, we found that the majority of participants considered smart speakers to be potentially beneficial to their speech therapy. P2 suggested their place in building speech confidence:
“if you were at a point where you were at your stammering journey where you wanted to get some confidence [...] it could ask you to say certain words that you know you find are difficult.” [P2 - RI]
P11 and P9 further suggested how smart speaker applications, informed by speech therapists and people who stammer, could be used to complement certain techniques taught in therapy clinics and reduce reliance on in-person sessions:
“I would actually love that. Especially in the [local medical service] it's really hard to have a speech language therapist.” [P11 - RI]
“there are techniques they'll teach us and it wouldn't be difficult to code that and a device to assist the person

2 https://miro.com/
Exploring Smart Speaker User Experience for People Who Stammer ASSETS ’22, October 23–26, 2022, Athens, Greece

and because these devices are cheap as well it's good because anyone could get one.” [P9 - RI]
Indeed, interacting with the device provided opportunities for participants to practise in the home, a "liminal space between talking to yourself and talking to a human being" [P10 - RI]. Participants described themselves as “more anxious” [P11 - RI] when talking to a person, citing that the device “wouldn't judge” [P10 - RI], which in turn made them less likely to stammer. A further advantage of interacting in a controlled environment was avoiding eye contact [P9 - RI, P8 - RI] - a pressing concern during in-person communication:
“active effort to maintain eye contact when I'm having a conversation in person, over the telephone that doesn't matter and obviously speaking to the device.” [P8 - RI]
Overall, participants found benefits in interacting with their devices, citing the opportunities for such interactions to improve aspects of their speech and practise techniques learned in therapy. The controlled space the device afforded in their homes allowed them to interact in contexts where certain barriers to in-person interaction could be lowered.

4.2 Interplay Between Contextual and Technological Barriers
Users' varying speech patterns meant that difficulties varied in segments of the interaction. They had to overcome barriers in their speech as well as learn the unwritten rules of interaction to create successful utterances for the device.

4.2.1 Social Pressure, Privacy & Device Locations. Although the Nest allowed participants to easily engage with the device in a controlled environment, the presence of other people determined whether users could interact without pressure. Difficulties in interaction were attributed to other people in the room, as users stammered more on commands that were said fluently when others were not around. These events occurred in communal areas such as the kitchen or living rooms.
“like since before the study, I find it oddly easy to talk to assistant devices, especially when there aren't other people around.” [P10 - D2]
“with the speaker because it was in my room, I stammer less than when I'm by myself [...] we actually have a Home Pod in our kitchen, so every time I try to use that, I stammer a lot more even if I were to say the same command.” [P6 - RI]
“I don't stammer because of a device, it's because there's added pressure sort of urgency from the other people.” [P9 - RI]
This has resulted in some users avoiding using the device in communal spaces in the home, stating that it was “less of a problem” [P6 - RI] if the command was unsuccessful. Although privacy was related to the presence of other people, one participant expressed concerns about how the device collected data, and the potential impact on their stammer.
“I am a little apprehensive that the anxiety that I have around privacy is going to affect whether I stammer with the Google Nest.” [P10 - II]

4.2.2 Mood: Stress, Fatigue & Anxiety. Participants reported that moods such as stress [P12-D2, P8-RI, P5-D1, P4-D2], anxiety [P12-RI, P1-D2, P6-RI], and fatigue [P12-RI, P10-D1] negatively impacted their fluency when interacting with the device.
“when I was very tired, I definitely did stammer more with it.” [P10 - RI]
These factors often changed on a day-to-day basis as participants reported "good speech days" [P12-D1, P2-D1] where they are less likely to stammer frequently and "bad or worse speech days" [P12-RI] where their stammer (e.g. a block) would be more prominent. These were often clearly defined external factors, e.g., “Stress at work making me block more.” [P4 - D2], but sometimes there was no discernible reason:
“So it can be specific things like if I'm stressed or if I'm tired or if I'm quite anxious or nervous maybe, but sometimes there isn't a particular reason.” [P12 - RI]

4.2.3 Interaction Length & Interactivity. Participants reported increased likelihood of stammering as utterance length increased; "short and simple" [P3 - D1] interactions were more successful. We further explored this in our diary study probe, where participants were assigned preset tasks such as a quiz to complete.
“That was easier because I could just answer A, B, and C. So I would just answer one word.” [P6 - RI]
Upon reflection, some participants reported that they stammered more when completing tasks in the diary study probe [P6 - RI, P10 - RI], mentioning that they often had to resort to repetition due to the "longer phrases" [P6 - RI] that were required. Others stated that shorter phrases and the specificity of words added additional "pressure" [P8-RI, P9-RI] that was not present in the other daily interactions they had with the device.
“It does add an element of pressure because you have to say certain words as [opposed to] on a day-to-day basis.” [P9 - RI]

4.2.4 Difficult Sounds & Words. Participants exhibited varied difficulties with different sounds and words, particularly the wake word that initiates interaction with the smart speakers. For example, one participant reported less difficulty saying "Hey Google" in comparison to "Alexa" due to the starting sound of each wake word [P2]. Conversely, another participant had increased difficulties due to the starting sounds for "h" and "g" [P12]. This was particularly problematic, as all commands began using the wake word.
“Wake word - getting the initial sound out to get Google to interact with me.” [P1 - D1]
These patterns also persisted in other words in a command. Differences between soft, hard sounds [P6 - RI], hard stop letters [P9 - D1], and harsh [P12 - RI] sounds were used to explain these difficulties.
“The letters O, K and G are hard starts and can be difficult to say.” [P9 - D1]
Difficulties relating to sounds also revolved around delays and prolongations of certain letters. Although devices would sometimes recognise users' requests in these contexts, this experience was inconsistent.
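A recurring pattern in these accounts is that prolongations and part-word repetitions defeat literal wake-word matching. As a purely illustrative sketch (not a feature of the Google Nest or of any software from this study; the function names, phrase list, and regular expressions are our own invention), a recogniser could normalise such dysfluencies in an intermediate transcript before comparing it against the wake phrase:

```python
import re

WAKE_PHRASES = ("ok google", "hey google")

def normalise(transcript: str) -> str:
    """Collapse prolongations ("OOOOK" -> "OK") and single-letter false
    starts ("g-g-google", "o ok") in a recognised transcript."""
    text = transcript.lower().strip()
    # Runs of three or more identical letters collapse to one letter,
    # preserving legitimate doubles such as the "oo" in "google".
    text = re.sub(r"([a-z])\1{2,}", r"\1", text)
    # A lone letter followed by space/hyphen and a word starting with the
    # same letter is treated as a false start and dropped.
    text = re.sub(r"\b(\w)[\s-]+(?=\1)", "", text)
    return text

def matches_wake_word(transcript: str) -> bool:
    """True if the normalised transcript begins with a wake phrase."""
    return any(normalise(transcript).startswith(p) for p in WAKE_PHRASES)
```

This is only a text-level approximation - real endpointing operates on audio, not transcripts - but it illustrates how tolerance for elongated onsets could be layered onto existing matching rather than requiring users to produce the wake word fluently.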
“It seems to recognise certain elongation of words e.g if I elongated “s” at the start of a word it mostly seems to understand the word. Whereas if I elongated “m” at the start of word, it wouldn't recognise the word.” [P12 - D1]
“I think I asked it again and I could say the ‘b’, but it did actually wait a while for me to do that.” [P6 - RI]
When commands to the device were not registered, users became frustrated, which created negative interaction experiences, in turn increasing the likelihood of the device not recognising them. Additional pressure was added as users repeated commands, analogous to interaction experiences with other people.
“Whatever when you're out and about and you actually think you've said that really well and you're really happy and you're like “yeah”, but then the person will say like “sorry, I didn't hear that” [...] that kind of pressure when you have to say it again and it intensifies making it worse” [P2 - RI]
Being forced to give up after such an increasingly negative series of interactions was described as a “debilitating” [P8-RI] experience.

4.2.5 Past Negative Experiences. Prior experiences with the Nest influenced users' propensity to stammer. As participants began their day with the device in the diary study, their first communicative experience often influenced subsequent interactions throughout the day. Moreover, past negative experiences with the Nest determined users' speech patterns, with some mentioning a loop wherein thoughts about past stammering-related interactions with the Google Nest would create difficulties on words they did not previously stammer on [P2-RI]. This happened as they prepared themselves for the lack of recognition from the device, expecting to be in a "difficult situation" [P10-RI].
“I think the main motivation was the awareness that I have a fairly severe speech problem and that I must do something to counteract it. On occasion this led to a feedback loop of enhanced awareness causing a more severe speech block.” [P3 - RI]
“If you're having those negative thoughts then it would lead to more block even on words that you wouldn't necessarily expect on an average day to block on.” [P2 - RI]

4.2.6 Timing Out. The device timing out was a key issue, with many participants mentioning that longer waiting times would have improved their interactions with the device. However, a few expressed that waiting times with the Nest were longer compared to other devices [P12 - D1, P9 - RI], creating positive interaction experiences. During unsuccessful interactions, users often struggled to continue due to device interruption as they felt additional pressure to produce utterances. Participants that reported blocking also mentioned the device timing out before the following utterance or the device ignoring a repetitive command after a block. Similarities were reported among those that elongated sounds (also mentioned in 4.2.4).
“When blocking, either no sound or a repetitive sound comes out so the nest doesn't always identify what you are saying or thinks you have finished speaking.” [P12 - D2]
“I get stuck saying OK Google as I end up saying "OOO OOOOOOOOOOOOOOOOOK Google" but it stops listening during my prolonged "O".” [P1 - RI]
Users stated that one of the main disadvantages of communicating with the devices is the lack of awareness it had of their stammer. Unlike human-human interactions where interlocutors would adjust their turns to accommodate their speaking patterns, devices would cut users off. Moreover, participants mentioned not being able to disclose their stammer, which provides a level of ease and comfort in human-human interactions not available with smart speakers.
“People don't stop listening when I can't get a word out, so just taking my time with people is fine.” [P1 - RI]
“if there's a way to detect that the user stammers [...] Then maybe it can recognize that or maybe give the user more time to actually finish their command to say people that don't stammer.” [P6 - RI]

4.2.7 Limited Scaffolding for Error Recovery. Recognition was one of the biggest barriers in interacting with the Google Nest. Often, the device did not assist with users' error recovery, leaving them unsure how to correct utterances - "it just takes a long time to realize or know what the Nest wants" [P10 - RI] and that they often "wasn't/not sure why" [P12 - D2, P9 - D1, P2 - D1] their initial command was not recognized:
“Mostly frustrating - Didn't recognise a request I made - had to ask it 3 times. Think fluency was the same each time so not sure why!” [P2 - D1]
“It was just all of a sudden hit a wall and it no longer recognizing and I thought we were friends – I thought we were close.” [P8 - RI]
One participant [P10] explicated error recovery in detail, mentioning that as the device repeated the perceived utterance back to them, they would be able to understand strategies to repair segments of the phrase. This is in contrast to situations where the device would execute a command based on what it perceived the user said.
“if you asked it to - where's the best place to buy chips and it said “here's the best place to buy sheep”, you would know that the word it didn't understand correctly was chips, you would make [...] some kind of effort to make the Nest understand that word the next time” [P10 - RI]

4.3 Adapting to Device Limitations
As users tried to overcome these barriers, they deployed strategies both old and new. However, issues such as error recovery limited the strategies deployed in addition to defining the qualities of interaction among users who stammer.

4.3.1 Repetition. To overcome errors they encountered with the Nest devices, users would often repeat their commands, due to the lack of alternative error recovery methods available. Although
at times repetition allowed them to eventually execute their commands successfully, when unsuccessful, users would not be able to delineate the cause of the error, due to lack of device feedback (see 4.2.7).
“Over time, I found ways to overcome my initial challenge of not being able to say the wake word, mainly through repetition...” [P1 - RI]
“I suppose that's what happened when I put something in the calendar. I had to repeat it even though I originally thought it went well...” [P2 - RI]
In these situations, participants either persisted until they were successful, or simply gave up using these difficult commands. Indeed, most users mentioned that their patterns of use remained unchanged, often sticking to the same commands for the duration of the diary study to avoid negative stammering experiences.
“That's the command I know I won't stammer and I know I can say.” [P6 - RI]
“I am trying to be more aware of the sounds I struggle with as I do avoid then habitually now.” [P8 - D1]
Repetition became characteristic of participants' interactions with the device. One participant explained how other household members had "fewer interactions" [P3 - RI] with the device, whereas they had difficulties "being understood the first or even second time" [P3 - RI]. Other participants described "persistence" [P1 - D1, P5 - D1] and having "persevered" [P2 - D1] in relation to how they overcame these challenges. As users repeated commands to the device, they also employed methods to improve certain aspects of their speech. These are discussed in more detail in the following subsections.

4.3.2 Planning Utterances. As users reformulated utterances, they became more aware of the limitations in recognition that the Nest had. They often tried to opt for other words to overcome blocking or stammering, paralleling strategies used in in-person interaction.
“I think once I did actually I did have a way around that command. Saying “what events are there?” or something. Because one of these words are much easier for me to say, because events begins with ‘e’.” [P6 - RI]
However, at times, this paradoxically caused them to stammer, due to previously formed expectations.
“I thought about day 1 more, and I definitely had difficulty with thinking that I was about to stutter and planning for it in the way that meant I actually meant I ended up messing up the sentence and confusing the device....” [P10]

4.3.3 Controlled Speaking. The most commonly reported strategy [P1, P12, P3, P9] was speaking in a "conscious" [P3], or "contrived" [P1] manner. This meant that users were sometimes less relaxed speaking to the device than when speaking to another person, yet this strategy did not always result in successful interactions.
“Again, I tried to talk in a controlled and not very relaxed way. However it didn't always work...” [P1 - D1]
Speaking slowly allowed participants to fluently produce words that previously resulted in errors, and flow into words they previously stammered with [P12]. Additionally, controlling breathing or taking "deep breaths" [P1-D2] avoided issues such as timing out or device interruption.
“I just found that speaking at a much more measured pace never really gave it the opportunity to stop you [...] Once I kind of, not mastered that but once I implemented that it was much easier, much more successful” [P8-RI]
At times, participants also mentioned increasing the volume in which they spoke to ensure the device could pick up their commands, while also moving closer to the device to increase recognition [P10 - D1 & D2, P9 - D1 & D2]. This became difficult in less private and noisier environments in the home.
“It helps if I'm on my own when speaking to the device the quieter the room the better as noise can cause a direction and make speaking different.....” [P9 - D1]
“repeating, or going closer to the speaker....” [P10 - D1]

5 DISCUSSION
Smart speakers and other intelligent personal assistants (IPAs) are becoming more ubiquitous with a strong presence in the home and other environments. Yet, users that stammer have not been explicitly accounted for in the design of these interfaces. In this study, we highlight three distinct themes that reflect the experiences of users that stammer. We build on previous work [11] highlighting the lack of accessibility in IPA use through smart speakers for users with diverse speech patterns.

5.1 Navigating Difficult Interactions
While participants had some success in interacting with the smart speakers, and even embedded them into their daily routines, there were frequent indications that the devices' design was insufficient to accommodate their stammering. This included an inability to initiate the wake word required to start interactions, difficulties with specific sounds or phrases, and the device timing out for participants, for example when blocking occurred.
Participants engaged in strategies in attempts to overcome difficulties with the devices understanding and correctly interpreting their speech. These included repetition, planning utterances, speaking in a more controlled manner, and adjusting pitch. However, participants encountered mixed results. Repeating commands, for example, would sometimes create additional pressure on producing speech, which mirrored some participants' experiences of talking to other people. Unsuccessful use of strategies that already attempted to navigate system recognition errors could leave participants wanting to avoid interacting with the devices again. Such challenges to regular adoption are evident in related work with DHH users [19, 35] who feel excluded from the speech-centred design process.
First impressions are vital in the subsequent adoption of a technology, and we recommend that some future ASR efforts focus on maximising the success of those commands that are most frequently used in this and previous work.
While using and possibly promoting these strategies may be useful in developing more successful interactions, relying on them rather than being able to use one's own speech may be detrimental.
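One way to shift some of this burden from speaker to system: the word-substitution workaround P6 described in 4.3.2 (choosing a phrasing whose words begin with easier sounds) is mechanical enough that a device or companion app could offer it. The sketch below is an illustrative toy only - the candidate phrasings and the notion of a per-user "difficult onset" set are our own invention, not study materials or an existing assistant feature:

```python
# Alternative phrasings with the same intent. P6's example: "what events
# are there" was easier than a calendar command because "events" begins
# with "e".
CALENDAR_PHRASINGS = [
    "check my calendar",
    "what events are there",
]

def easiest_phrasing(candidates, difficult_onsets):
    """Return the first phrasing in which no word starts with a sound the
    speaker finds difficult; fall back to the first candidate."""
    for phrase in candidates:
        onsets = {word[0] for word in phrase.lower().split()}
        if not onsets & difficult_onsets:
            return phrase
    return candidates[0]
```

Surfacing such alternatives, rather than forcing repetition of one fixed command, would keep the substitution strategy available without requiring the speaker to plan it themselves.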
Prior work has identified that openly stammering and being comfortable in speaking is a critical value that people who stammer would like to see recognised in the tools they use [9], and it is likely that smart speakers are no different.
Additionally, the variable success of the above strategies can be compounded by how people's stammers were manifesting at the point of interaction. Daily life stress, anxiety, and fatigue had an impact on people's speech, which in turn could make successful interactions with the smart speakers more problematic. Additionally, participants felt further pressure from the presence of others, resulting in less successful device interactions. Prior work has demonstrated the reluctance of people to engage with speech interfaces in public settings [14, 33], due to a sense of social embarrassment or awkwardness [1, 14]. Although groups such as families may engage collaboratively with smart speakers in home settings [41], the increased potential of stammering because of others being co-located with smart speakers may impact on this embedding of devices in collaborative interactions and daily routines.

5.2 Fostering Error Recovery
A lack of error recovery support in the design of smart speakers was a major issue for participants. One suggestion for improving error recovery was to provide feedback on what the device had interpreted from people's speech, which does not currently occur transparently. Similar findings were observed in studies with L2 English speakers and IPAs, in which using visual feedback (e.g. through smartphones) allowed people to understand the cause of communication breakdowns [60]. While this may be one solution, it is not always possible in hands-busy/eyes-busy scenarios where people do not have the ability to access a screen [11]. While device logs are available that indicate what a device has interpreted, these are not readily accessible because they require people to actively look up this information online via their personal accounts.
Similarly, error recovery may need to more clearly indicate why an error has occurred. It was not always clear to participants why difficulties were emerging. Participants were sometimes left wondering if something about their speech pattern at the time of interaction was creating difficulties, or whether it was the nature of the query itself. This echoes findings from [7] with DHH users who were unable to understand why an error had occurred. In addition to the visual ‘training wheels’ recommended in this previous work, we recommend a “what went wrong?” command that would allow any user to know why their previous command caused an error. This would further help to bridge the “gulf of expectation” [11] that is especially pertinent to users who stammer, who may consider their stammer a “speech error” [36] and incorrectly attribute the error to themselves, rather than the limitations of smart speakers. While this lack of clarity is true irrespective of fluency [60], the participants in this study were more inclined to blame their own speech rather than the device's performance, resulting in negative self-perception. There is advice on how to talk to people who stammer (e.g. [54]), although it is unclear how this may feasibly be transferred to smart speaker interactions [11]. While improvements may need to draw on existing research on how to make smart speaker interactions progress [15], this may not address the inability to conduct the initial interactions for people who stammer. Improvements could also draw on web accessibility guidance, including creating additional time to speak (as introduced with Amazon Alexa [6]), capturing and re-using smaller parts of speech, and the use of additional modalities when appropriate [48].

5.3 The potential for smart speakers to support speech and language therapy
Our study highlights the opportunities that smart speakers afford people who stammer. Over the course of the study, some participants utilised their devices to practice speaking techniques and sounds they know they have difficulty with, to build confidence for future social interactions. The lack of perceived judgement and the nature of the controlled environment of interaction reduced speaking anxiety for some participants. Additionally, there was potential seen in using smart speakers as a means of delivering speech and language therapy. This echoes the prior use of smart speakers to deliver other forms of therapy or coaching in the past [23] and applied to help build confidence for public speaking [55]. Additionally, this resonates with speech development services like withVR [56] that provide customisable virtual reality speaking environments, where the speaker is in full control of the situation. Similar approaches may be adoptable in commercial smart speakers to provide cheaper and more readily available speaking practice.
The potential for using voice assistant technologies is shared by recent work on how speech and language therapists (SLTs) use or want to use these devices as part of their practice [30]. The authors indicated some SLTs were already using commercial devices to help their clients with daily tasks and improving and practising speech. Smart speaker integration into daily routine and their use of voice aligns them with the emerging discipline of healthcare coaching, which often uses regular, short conversations to educate and motivate to deliver its interventions [38]. However, as discussed in [31], existing algorithms that “smooth over” stammering support ASR, but not detection of stammering events. Explicating such events is necessary for both meaningful error recovery and potential speech therapy uses (discussed below). We recommend a focus on improving the accuracy of event detection and development of smart speaker-specific datasets, collected through user-centred studies such as this, or building on the work of [29] by synthesising common stammering events in existing smart speaker datasets.
Further work should also focus on exploring the experiences and expectations of both SLTs and people who use their services with IPAs in this context. Such work could then inform professional standards on their use, as well as ways to improve the design of the technology to support speech and language therapy. By learning from the experiences of people who stammer, along with healthcare practitioners, we can develop more inclusive design guidelines for usage of smart speakers by people with specific communicative needs, and further explore their use in SLT practice. Indeed, work by authors of the StammerApp [36] shows support for real-time feedback on speech from users who stammer. While the use of augmented reality (AR) technologies such as Google Glass were suggested for this, we see clear potential for such feedback in smart speakers. We thus urge the third-party development of ‘Skills’ (Alexa) or ‘Actions’ (Google Assistant) that provide such
feedback to assist in identifying areas for further practice that are not currently available.

5.4 Limitations and Future Work
This paper identifies some of the opportunities and challenges that people who stammer face when interacting with commercially available smart speakers in their homes over a three-week period. We emphasise that our findings are primarily limited to experiences of speech-only smart speakers. Future research should seek to examine how people's experiences differ when using other speech-capable devices in and outside of the home environment, as well as how multi-modal forms of smart speakers impact the findings presented. Additionally, further work would benefit from exploring how people who stammer in other languages interact with smart speakers, and how stammering may also interact with non-L1 English speakers' experiences [57, 59]. While the participants in our study were all adults, a higher number of children experience some form of stammer [53]. Further research in the developing area of younger users of smart speakers [18] may be required to create a more comprehensive insight into the experiences of people who stammer, alongside other diverse speech patterns. Further quantitative research would also create additional insights by understanding how many people who stammer own and use smart speakers, and how frequently recognition errors emerge when interacting with them.

6 CONCLUSION
In this paper, we examined the use of smart speakers in the home by people who stammer through three-week diary studies and semi-structured interviews. We undertook our work to focus on how we

[2] Ali Abdolrahmani, Kevin M Storer, Antony Rishin Mukkath Roy, Ravi Kuber, and Stacy M Branham. 2020. Blind leading the sighted: drawing design insights from blind users towards more productivity-oriented voice interfaces. ACM Transactions on Accessible Computing (TACCESS) 12, 4 (2020), 1–35.
[3] Dragan Ahmetovic, Cole Gleason, Chengxiong Ruan, Kris Kitani, Hironobu Takagi, and Chieko Asakawa. 2016. NavCog: a navigational cognitive assistant for the blind. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services. 90–99.
[4] Norman Alm, John Todman, Leona Elder, and Alan F Newell. 1993. Computer aided conversation for severely physically impaired non-speaking people. In Proceedings of the INTERACT'93 and CHI'93 Conference on Human Factors in Computing Systems. 236–241.
[5] Tawfiq Ammari, Jofish Kaye, Janice Y Tsai, and Frank Bentley. 2019. Music, Search, and IoT: How People (Really) Use Voice Assistants. ACM Trans. Comput. Hum. Interact. 26, 3 (2019), 17–1.
[6] Steven Aquino. 2021. Exclusive: Amazon Adds New Speech Setting To Alexa App To Help Stutterers Finish Commands, Queries. https://www.forbes.com/sites/stevenaquino/2021/10/05/exclusive-amazon-adds-new-speech-setting-to-alexa-app-to-help-stutterers-finish-commands-queries/ Accessed 2nd April 2022.
[7] Johnna Blair and Saeed Abdullah. 2020. It Didn't Sound Good with My Cochlear Implants: Understanding the Challenges of Using Smart Assistants for Deaf and Hard of Hearing Users. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4, 4 (2020), 1–27.
[8] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative research in psychology 3, 2 (2006), 77–101.
[9] Robin N Brewer and Vaishnav Kameswaran. 2018. Understanding the power of control in autonomous vehicles for people with vision impairment. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 185–197.
[10] Julie Cattiau. 2021. A communication tool for people with speech impairments. https://blog.google/outreach-initiatives/accessibility/project-relate/ Accessed 2nd April 2022.
[11] Leigh Clark, Benjamin R Cowan, Abi Roper, Stephen Lindsay, and Owen Sheers. 2020. Speech diversity and speech interfaces: Considering an inclusive future through stammering. In Proceedings of the 2nd Conference on Conversational User Interfaces. 1–3.
[12] Leigh Clark, Philip Doyle, Diego Garaialde, Emer Gilmartin, Stephan Schlögl, Jens Edlund, Matthew Aylett, João Cabral, Cosmin Munteanu, Justin Edwards, and Benjamin R Cowan. 2019. The State of Speech in HCI: Trends, Themes and Challenges. Interacting with Computers 31, 4 (09 2019), 349–371.
could explore the user experience of these technologies and make https://doi.org/10.1093/iwc/iwz016 arXiv:https://academic.oup.com/iwc/article-
pdf/31/4/349/33525046/iwz016.pdf
them more inclusive. Our participants used smart speakers for a [13] Eric Corbett and Astrid Weber. 2016. What can I say? addressing user experience
range of features, placing the devices in diferent places within the challenges of a mobile voice user interface for accessibility. In Proceedings of the
home and accordingly using them as part of diferent day-to-day 18th international conference on human-computer interaction with mobile devices
and services. 72–82.
routines. Although our participants found benefts of using their [14] Benjamin R. Cowan, Nadia Pantidi, David Coyle, Kellie Morrissey, Peter Clarke,
devices and highlighted the potential of using them for speech and Sara Al-Shehri, David Earley, and Natasha Bandeira. 2017. "What Can i Help
You with?": Infrequent Users’ Experiences of Intelligent Personal Assistants. In
language therapy, external social factors such as the co-presence of Proceedings of the 19th International Conference on Human-Computer Interaction
others negatively impacted their experiences. Crucial for consider- with Mobile Devices and Services (Vienna, Austria) (MobileHCI ’17). Association
ing smart speaker design, our fndings identifed how the length for Computing Machinery, New York, NY, USA, Article 43, 12 pages. https:
//doi.org/10.1145/3098279.3098539
and characteristics of interaction (including specifc sounds), time- [15] Joel E. Fischer, Stuart Reeves, Martin Porcheron, and Rein Ove Sikveland. 2019.
outs, and insufcient scafolding for error recovery, impeded device Progressivity for Voice Interface Design. In Proceedings of the 1st International
Conference on Conversational User Interfaces (Dublin, Ireland) (CUI ’19). Asso-
use which our participants overcame through greater planning,
ciation for Computing Machinery, New York, NY, USA, Article 26, 8 pages.
repetition, and consciously controlling their speaking. Stammering https://doi.org/10.1145/3342775.3342788
is a diverse phenomenon that impacts people diferently, thus while [16] International Organization for Standardization. 2010. Ergonomics of Human-
system Interaction: Part 210: Human-centred Design for Interactive Systems. ISO.
we propose improvements for smart speaker devices, these must https://www.iso.org/standard/77520.html
be applied carefully to avoid exclusion or negative side efects. [17] Diamantino Freitas and Georgios Kouroupetroglou. 2008. Speech technologies
for blind and low vision persons. Technology and Disability 20, 2 (2008), 135–156.
[18] Radhika Garg, Hua Cui, Bo Zhang, Spencer Selingson, Martin Porcheron, Leigh
ACKNOWLEDGMENTS Clark, Benjamin R Cowan, and Erin Beneteau. 2022. The Last Decade of HCI
Research on Children and Voice-based Conversational Agents. In CHI Conference
We would like to thank STAMMA for their help in developing this on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22).
study. This work was supported by the Engineering and Physical Association for Computing Machinery, New York, NY, USA, 19 pages. https:
//doi.org/10.1145/3491102.3502016
Sciences Research Council grant EP/M022722/1. [19] Abraham Glasser, Vaishnavi Mande, and Matt Huenerfauth. 2021. Understand-
ing deaf and hard-of-hearing users’ interest in sign-language interaction with
personal-assistant devices. In Proceedings of the 18th International Web for All
REFERENCES Conference. 1–11.
[1] Ali Abdolrahmani, Ravi Kuber, and Stacy M Branham. 2018. " Siri Talks at You" [20] Google. 2022. Google Nest Hub (2nd Gen). https://store.google.com/gb/product/
An Empirical Investigation of Voice-Activated Personal Assistant (VAPA) Usage nest_hub_2nd_gen?hl=en-GB Accessed 4th April 2022.
by Individuals Who Are Blind. In Proceedings of the 20th International ACM [21] Google. 2022. Google Nest Mini - A Smart Speaker for Any Room. https:
SIGACCESS Conference on Computers and Accessibility. 249–258. //store.google.com/gb/product/google_nest_mini?pli=1&hl=en-GB Accessed 4th
ASSETS ’22, October 23–26, 2022, Athens, Greece Bleakley et al.

April 2022.
[22] Susumu Harada, Jacob O Wobbrock, Jonathan Malkin, Jeff A Bilmes, and James A Landay. 2009. Longitudinal study of people learning to use continuous voice-based cursor control. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 347–356.
[23] Ahmed Hassoon, Jennifer Schrack, Daniel Naiman, Dina Lansey, Yasmin Baig, Vered Stearns, David Celentano, Seth Martin, Lawrence Appel, et al. 2018. Increasing physical activity amongst overweight and obese cancer survivors using an Alexa-based intelligent agent for patient coaching: protocol for the physical activity by technology help (PATH) trial. JMIR Research Protocols 7, 2 (2018), e9096.
[24] Karen Holtzblatt and Hugh Beyer. 1997. Contextual Design: Defining Customer-Centered Systems. Elsevier.
[25] Peter Howell, Stephen Davis, and Jon Bartrip. 2009. The University College London Archive of Stuttered Speech (UCLASS). Journal of Speech, Language, and Hearing Research 52, 2 (2009), 556.
[26] Sunyoung Kim and Abhishek Choudhury. 2021. Exploring older adults' perception and use of smart speaker-based voice assistants: A longitudinal study. Computers in Human Behavior 124 (2021), 106914.
[27] Brett Kinsella. 2019. Loup Ventures Says 75% of U.S. Households Will Have Smart Speakers by 2025, Google to Surpass Amazon in Market Share. https://voicebot.ai/2019/06/18/loup-ventures-says-75-of-u-s-households-will-have-smart-speakers-by-2025-google-to-surpass-amazon-in-market-share/
[28] Brett Kinsella. 2022. The Rise and Stall of the U.S. Smart Speaker Market – New Report. https://voicebot.ai/2022/03/02/the-rise-and-stall-of-the-u-s-smart-speaker-market-new-report/
[29] Tedd Kourkounakis, Amirhossein Hajavi, and Ali Etemad. 2020. FluentNet: end-to-end detection of speech disfluency with deep learning. arXiv preprint arXiv:2009.11394 (2020).
[30] Pranav Kulkarni, Orla Duffy, Jonathan Synnott, W George Kernohan, Roisin McNaney, et al. 2022. Speech and Language Practitioners' Experiences of Commercially Available Voice-Assisted Technology: Web-Based Survey Study. JMIR Rehabilitation and Assistive Technologies 9, 1 (2022), e29249.
[31] Colin Lea, Vikramjit Mitra, Aparna Joshi, Sachin Kajarekar, and Jeffrey P Bigham. 2021. SEP-28k: A dataset for stuttering event detection from podcasts with people who stutter. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6798–6802.
[32] QSR International Pty Ltd. 2018. NVivo (Version 12). https://www.qsrinternational.com/nvivo-qualitative-data-analysis-software/try-nvivo
[33] Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA": The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 5286–5297. https://doi.org/10.1145/2858036.2858288
[34] Gerald A Maguire, Christopher Y Yeh, and Brandon S Ito. 2012. Overview of the diagnosis and treatment of stuttering. Journal of Experimental & Clinical Medicine 4, 2 (2012), 92–97.
[35] Vaishnavi Mande, Abraham Glasser, Becca Dingman, and Matt Huenerfauth. 2021. Deaf Users' Preferences Among Wake-Up Approaches during Sign-Language Interaction with Personal Assistant Devices. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. 1–6.
[36] Roisin McNaney, Christopher Bull, Lynne Mackie, Floriane Dahman, Helen Stringer, Dan Richardson, and Daniel Welsh. 2018. StammerApp: Designing a Mobile Application to Support Self-Reflection and Goal Setting for People Who Stammer. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–12.
[37] NHS. 2019. Dysarthria (difficulty speaking). https://www.nhs.uk/conditions/dysarthria/ Accessed 2nd April 2022.
[38] Jeanette M Olsen and Bonnie J Nesbitt. 2010. Health coaching to improve healthy lifestyle behaviors: an integrative review. American Journal of Health Promotion 25, 1 (2010), e1–e12.
[39] World Health Organization. 2010. ICD-10 Version:2010. http://apps.who.int/classifications/icd10/browse/2010/en#/F98.5 Accessed 23rd Feb 2020.
[40] Anne Marie Piper and James D Hollan. 2008. Supporting medical conversations between deaf and hearing individuals with tabletop displays. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work. 147–156.
[41] Martin Porcheron, Joel E Fischer, Stuart Reeves, and Sarah Sharples. 2018. Voice Interfaces in Everyday Life. In Proceedings of the 2018 ACM Conference on Human Factors in Computing Systems (CHI '18). Association for Computing Machinery, New York, NY, USA, Article 640, 12 pages. https://doi.org/10.1145/3173574.3174214
[42] Alisha Pradhan, Leah Findlater, and Amanda Lazar. 2019. "Phantom Friend" or "Just a Box with Information": Personification and Ontological Categorization of Smart Speaker-based Voice Assistants by Older Adults. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–21.
[43] Alisha Pradhan, Amanda Lazar, and Leah Findlater. 2020. Use of intelligent voice assistants by older adults with low technology use. ACM Transactions on Computer-Human Interaction (TOCHI) 27, 4 (2020), 1–27.
[44] Nan Bernstein Ratner and Brian MacWhinney. 2018. Fluency Bank: A new resource for fluency research and practice. Journal of Fluency Disorders 56 (2018), 69–80.
[45] Juniper Research. 2018. Digital Voice Assistants in Use to Triple to 8 Billion by 2023, Driven by Smart Home Devices. shorturl.at/dgoGL. Accessed 22nd Feb 2020.
[46] John Rieman. 1993. The diary study: a workplace-oriented research tool to guide laboratory efforts. In Proceedings of the INTERACT'93 and CHI'93 Conference on Human Factors in Computing Systems. 321–326.
[47] Glyndon Riley and Klaas Bakker. 2009. SSI-4: Stuttering Severity Instrument. PRO-ED, an International Publisher.
[48] Abi Roper, Stephanie Wilson, Timothy Neate, and Jane Marshall. 2019. Speech and Language. In Web Accessibility. Springer, 121–131.
[49] Frank Rudzicz, Aravind Kumar Namasivayam, and Talya Wolff. 2012. The TORGO database of acoustic and articulatory speech from speakers with dysarthria. Language Resources and Evaluation 46, 4 (2012), 523–541.
[50] Sergio Sayago, Barbara Barbosa Neves, and Benjamin R Cowan. 2019. Voice assistants and older people: some open issues. In Proceedings of the 1st International Conference on Conversational User Interfaces. 1–3.
[51] Freedom Scientific. 2022. JAWS - Freedom Scientific. https://www.freedomscientific.com/products/software/jaws Accessed 23rd Feb 2022.
[52] Shakeel Ahmad Sheikh, Md Sahidullah, Fabrice Hirsch, and Slim Ouni. 2021. Machine Learning for Stuttering Identification: Review, Challenges & Future Directions. arXiv preprint arXiv:2107.04057 (2021).
[53] Stamma. 2020. Stammer in the Population. https://stamma.org/news-features/stammering-population Accessed 23rd Feb 2020.
[54] Stamma. 2020. Talking With Someone Who Stammers. https://stamma.org/about-stammering/talking-someone-who-stammers Accessed 23rd Feb 2020.
[55] Jinping Wang, Hyun Yang, Ruosi Shao, Saeed Abdullah, and S Shyam Sundar. 2020. Alexa as coach: Leveraging smart speakers to build social agents that reduce public speaking anxiety. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.
[56] withVR. 2022. withVR - A safe space to speak. https://withvr.app Accessed 2nd April 2022.
[57] Yunhan Wu, Justin Edwards, Orla Cooney, Anna Bleakley, Philip R Doyle, Leigh Clark, Daniel Rough, and Benjamin R Cowan. 2020. Mental workload and language production in non-native speaker IPA interaction. In Proceedings of the 2nd Conference on Conversational User Interfaces. 1–8.
[58] Yan Wu, Stephen Lindsay, Leighton Evans, Martin Porcheron, Leigh Clark, and Rhys Jones. 2021. Enabling Digital First: A Case Study of Sight-Impaired Users in Wales. Technical Report. Swansea University, Swansea, UK. https://cronfa.swan.ac.uk/Record/cronfa58304/
[59] Yunhan Wu, Martin Porcheron, Philip Doyle, Justin Edwards, Daniel Rough, Orla Cooney, Anna Bleakley, Leigh Clark, and Benjamin R. Cowan. 2022. Comparing Command Construction in Native and Non-Native Speaker IPA Interaction through Conversation Analysis. In Proceedings of the 4th International Conference on Conversational User Interfaces.
[60] Yunhan Wu, Daniel Rough, Anna Bleakley, Justin Edwards, Orla Cooney, Philip R Doyle, Leigh Clark, and Benjamin R Cowan. 2020. See what I'm saying? Comparing intelligent personal assistant use for native and non-native language speakers. In 22nd International Conference on Human-Computer Interaction with Mobile Devices and Services. 1–9.
[61] Mingrui Ray Zhang, Ruolin Wang, Xuhai Xu, Qisheng Li, Ather Sharif, and Jacob O Wobbrock. 2021. Voicemoji: Emoji Entry Using Voice for Visually Impaired People. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–18.
Beyond Subtitles: Captioning and Visualizing Non-speech Sounds to Improve Accessibility of User-Generated Videos

Oliver Alonzo∗ (oa7652@rit.edu), Rochester Institute of Technology, Rochester, NY, USA
Hijung Valentina Shin (vshin@adobe.com), Adobe Research, Cambridge, MA, USA
Dingzeyu Li (dinli@adobe.com), Adobe Research, Seattle, WA, USA
ABSTRACT
Captioning provides access to sounds in audio-visual content for people who are Deaf or Hard-of-hearing (DHH). As user-generated content in online videos grows in prevalence, researchers have explored using automatic speech recognition (ASR) to automate captioning. However, definitions of captions (as compared to subtitles) include non-speech sounds, which ASR typically does not capture as it focuses on speech. Thus, we explore DHH viewers' and hearing video creators' perspectives on captioning non-speech sounds in user-generated online videos using text or graphics. Formative interviews with 11 DHH participants informed the design and implementation of a prototype interface for authoring text-based and graphic captions using automatic sound event detection, which was then evaluated with 10 hearing video creators. Our findings include identifying DHH viewers' interests in having important non-speech sounds included in captions, as well as various criteria for sound selection and the appropriateness of text-based versus graphic captions of non-speech sounds. Our findings also include hearing creators' requirements for automatic tools to assist them in captioning non-speech sounds.

Figure 1: The non-speech sound [TRAIN BEEPS] is captioned to provide access to this sound for DHH viewers in a frame from the TV show Castle (obtained from [35]).

CCS CONCEPTS
• Human-centered computing → Empirical studies in HCI; Empirical studies in accessibility.

KEYWORDS
accessibility, audio tagging, automatic captions, non-speech sounds

ACM Reference Format:
Oliver Alonzo, Hijung Valentina Shin, and Dingzeyu Li. 2022. Beyond Subtitles: Captioning and Visualizing Non-speech Sounds to Improve Accessibility of User-Generated Videos. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3517428.3544808

1 INTRODUCTION
Closed captioning provides access to the audio of audio-visual content for Deaf or Hard-of-hearing (DHH) people. Using automatic speech recognition (ASR) to support automatic captioning of online videos has been increasingly explored, with several online platforms supporting its use (e.g. YouTube [34]). However, auditory content includes a richer array of sounds beyond speech such as music, background noises, or other non-speech sounds like laughter. Thus, automatic speech recognition alone is not enough to create complete closed captions that include such non-speech sounds [35]. To the best of our knowledge, the use of automatic sound event detection when captioning user-generated videos has not been explored (except for a blog post from YouTube [12]). Guidelines for manual or professional captioning include suggestions for including non-speech sounds (e.g. guidelines provided by 3Play Media1, Web Accessibility Initiative2, Described and Captioned Media Program3 and BBC4). Thus, professionally-produced captions often include non-speech sounds. However, published research on automatic captioning for user-generated content mostly focuses on spoken content [35]. Thus, in this work, we explore the interests and perspectives of DHH adults on the inclusion of non-speech sounds in the context of user-generated videos. We also investigate how to support content creators in captioning non-speech information and sounds by using automatic sound event detection tools.

∗ This work was conducted during a summer internship at Adobe Research.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544808

1 https://www.3playmedia.com/blog/captioning-sound-effects-in-tv-and-movies/
2 https://www.w3.org/WAI/media/av/captions/
3 https://dcmp.org/learn/602-captioning-key---sound-effects-and-music
4 https://bbc.github.io/subtitle-guidelines/#Sound-effects
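The gap described above — ASR transcribing speech while detected sound events go uncaptioned — can be sketched as a simple merging step. The code below is an illustrative assumption, not the paper's published pipeline: the tuples stand in for hypothetical ASR and sound-event-detector outputs, and non-speech labels are rendered in brackets following the guideline style shown in Figure 1.

```python
# Hypothetical sketch (not the authors' system): merge ASR speech segments
# with detected non-speech sound events into one chronological caption track.

def merge_captions(speech_segments, sound_events):
    """Interleave speech and non-speech cues by start time.

    speech_segments: list of (start_sec, end_sec, text), standing in for
        ASR output.
    sound_events: list of (start_sec, end_sec, label), standing in for a
        sound event detector's output; labels are rendered in brackets,
        following captioning guidelines for non-speech sounds.
    """
    cues = list(speech_segments)
    # Bracket and uppercase non-speech labels, e.g. "[TRAIN BEEPS]".
    cues += [(s, e, f"[{label.upper()}]") for (s, e, label) in sound_events]
    return sorted(cues, key=lambda cue: cue[0])

# Example with invented timings:
speech = [(0.0, 2.1, "It's good to see the smile, isn't it?")]
events = [(2.3, 3.0, "laughter"), (5.0, 5.4, "train beeps")]
for start, end, text in merge_captions(speech, events):
    print(f"{start:5.1f}-{end:5.1f}  {text}")
```

This linear merge is only the mechanical half of the problem; as the studies below show, deciding *which* detected sounds deserve a cue is the harder, audience-dependent part.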
ASSETS ’22, October 23–26, 2022, Athens, Greece Alonzo et al.

Guidelines for captioning non-speech sounds typically suggest the use of verbal descriptions enclosed in brackets (Figure 1). However, considering that non-speech sounds can also be visualized graphically, we also explore the use of graphic captions.
To this end, we conducted formative interviews with 11 DHH participants about their experiences with online videos, captions, and with non-speech sounds in online videos. We asked questions about their interests in having non-speech sounds captioned in online videos, including which sounds would be of interest, and how those sounds should be captioned (e.g. through text-based or graphic captions). Our results suggest interest in having important non-speech sounds captioned in online videos. How to caption those sounds may vary based on the type of video content, the sound type, and the intended audience. We also identified trade-offs between text-based and graphic captions.
Our formative study informed the design and implementation of a high-fidelity prototype to caption or visualize non-speech sounds using automatic sound event detection tools. We then conducted a study with 10 hearing video creators, asking them about their experiences creating online videos, adding or not adding captions, and about their thoughts on captioning or visualizing non-speech sounds, or how automatic tools can support them in this process. Then, they interacted with our prototype to caption and visualize non-speech sounds in three sample videos. Hearing video creators wanted automatic systems to be selective about the sounds identified and suggested that the appropriateness of graphic captions may vary based on the video content. Accurate time stamps and general descriptions were highlighted as important for automatic systems to provide.
The contributions of our work include:
(1) Empirical evidence of DHH viewers' preferences for what non-speech sounds to caption and how to caption them, and hearing video creators' perspectives about what kind of support they want for captioning non-speech sounds.
(2) Guidance for designers of captioning technologies and researchers in audio-visual analysis fields investigating technologies that may support the captioning of non-speech sounds in user-generated videos.
(3) A high-fidelity prototype for captioning or visualizing non-speech sounds using automatic sound event detection tools.

2 BACKGROUND AND RELATED WORK
Captioning provides access to audio content as text and is often used to provide access to auditory content for DHH people [35]. While the terms captions and subtitles are often used interchangeably (e.g. [10]), their purposes may be different: subtitles display the language of the audio-visual content for people who do not know that language (e.g., a non-English speaker watching an English movie with subtitles in their language), while captions display the audio for people who do not have access to it (e.g., some people who are DHH). Thus, definitions of and guidelines for captioning include non-speech information and sounds such as speaker information, environmental noises and sounds, sound effects, music, etc. Captioning is legally required for content streamed on live TV in countries such as the U.S. [16], but no such requirements exist for online videos. Thus there is a vast difference in the availability and quality of captions across online platforms.
Research has explored various aspects of the user experience (UX) and personalization of subtitles (e.g. [10, 11, 14, 15, 19]), as well as the needs of diverse users (e.g. [1]). However, little work has focused specifically on the perspectives of DHH viewers and hearing creators on the inclusion of non-speech sounds using text-based or graphic captions.

2.1 ASR and Automatic Captions
Video authoring and communication tools, as well as online and social media platforms, are increasingly adopting ASR to support automatic captions (e.g. Premiere Pro5, Zoom6, YouTube7, Instagram8, TikTok9). However, research on the use of ASR for automatic captions is still on-going.

5 https://helpx.adobe.com/premiere-pro/using/speech-to-text.html
6 https://blog.zoom.us/update-on-live-transcription-for-free-accounts/
7 https://googleblog.blogspot.com/2009/11/automatic-captions-in-youtube.html
8 https://about.fb.com/news/2020/09/new-automated-captions-powered-by-ai/
9 https://newsroom.tiktok.com/en-us/introducing-auto-captions

Prior work has examined the preferred appearances of captions among DHH adults, finding great diversity in preferences towards various visual characteristics (e.g. font, background color) [4], and preferences towards the use of punctuation in automatically generated captions [20]. Prior work has also examined how to evaluate ASR systems among DHH adults, finding that the literacy levels of participants (which are diverse among DHH adults [29, 30, 32]) affect the effectiveness of metrics typically employed for captioning evaluation [5]. More recent work explored which genres DHH adults prioritize for accurate captioning, finding that news and politics, education, tech and science, and film and animation were among the top priorities [6]. Finally, semi-automatic captioning approaches using crowd-sourcing have also been explored [31].
However, prior work on the use of ASR for automatic captions has mostly focused on speech. To the best of our knowledge, the only exception is a blog post from YouTube [12] which describes the inclusion of three non-speech sounds (music, applause, and laughter), but does not provide details about the user study supporting that decision. Thus, in this work, we explore the perspectives of DHH adults on the inclusion of non-speech sounds when using automatic tools for captioning user-generated videos.

2.2 Non-Speech Sounds in Real Life and VR
Research has explored using automatic sound event detection to create sound-awareness applications for DHH users in physical environments. Prior work includes investigations of which sounds are of importance, what aspects of those sounds are of importance, where sound awareness is more important, approaches to visualize sounds, and the appropriateness of various devices for visualizing sounds in physical environments (e.g. [7, 18, 22]). Findings from prior work reveal urgent and safety-related sounds as sounds of general interest for sound awareness [18]. However, participants' hearing ability may affect those interests [18]. Some characteristics of sounds, such as a sound's source, identity and location, may be more important than other characteristics such as its volume [18].
Recent work has also explored non-speech sounds in virtual reality (VR). Jain et al. developed a taxonomy of sounds as a starting
Figure 2: Examples from prior work on different visualization techniques, including: a) dynamic text that varies in size to indicate the volume of non-speech sounds [33], b) a physics-based approach that indicates how sound would move through physical materials [27], and c) using colors and icons for visualizing emotion in spoken content [25].

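As a rough illustration of the dynamic-text technique in Figure 2a — not code from the cited work [33] — one could scale a caption's font size with the sound's volume. The reference levels and point sizes below are invented for this sketch.

```python
# Sketch (assumed parameters, not the cited system's implementation):
# map a sound's RMS amplitude to a caption font size, linearly in decibels
# between an assumed "quiet" and "loud" reference level.
import math

def font_size_for_volume(rms, quiet_rms=0.01, loud_rms=0.5,
                         min_pt=12, max_pt=36):
    """Return a font size in points for an RMS amplitude, clamped."""
    db = 20 * math.log10(max(rms, 1e-9))     # amplitude -> decibels
    lo = 20 * math.log10(quiet_rms)
    hi = 20 * math.log10(loud_rms)
    t = (db - lo) / (hi - lo)                # 0.0 at quiet, 1.0 at loud
    t = min(max(t, 0.0), 1.0)                # clamp out-of-range volumes
    return round(min_pt + t * (max_pt - min_pt))

print(font_size_for_volume(0.01))  # quiet reference -> 12
print(font_size_for_volume(0.5))   # loud reference -> 36
```

Working in decibels rather than raw amplitude keeps perceptually similar loudness steps mapping to similar size steps.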
point to explore sound awareness in VR based on two dimensions of sounds: their source and intent [21]. As the need for separate exploration of non-speech sound awareness in VR highlights, the findings from one domain (e.g., physical spaces) may not necessarily translate to others. However, to the best of our knowledge, the inclusion of non-speech sounds in the context of authoring captions for user-generated online videos has not been explored.

2.3 Visualizing Sounds
Sounds can be visualized in several ways that vary in how they relate to properties of the sounds or their level of semantic meaning. As shown in Figure 2a, text with dynamic size has been explored to indicate volume when captioning non-speech sounds [33]. Prior work has also explored visualizing other non-speech information, such as emotion, using icons and colors (Figure 2c) [17, 25]. Researchers in [17] explored the use of icons and colors to augment existing speech-oriented text-based captions, which was described as potentially childish by participants. Furthermore, researchers have explored visualizations as a way to supplement tactile information when exploring the use of tactile feedback to provide non-speech information to DHH viewers, finding benefits when using both combined such as improved recall of non-speech information [24]. However, in that work, researchers did not explore participants' preferences for visualizations alone, as they were only explored alongside tactile feedback.
Research has also explored physics-based approaches, such as cymatics (Figure 2b), which imitate the movement of physical matter among the waves produced by sounds [27]. These approaches can be useful in settings such as audio editing or visual explorations of characteristics of sounds. However, they are not semantically meaningful. Thus, prior research on applications of sound awareness [26] or visualization of non-speech sounds (e.g., in video games [13]) typically includes more semantically meaningful approaches such as icons to represent a sound's source.
GIFs and animated stickers, which can be overlaid on videos, have been growing in popularity on social media and may provide new ways for visualizing sounds in semantically meaningful ways that are separate from text-based captions (see Figure 3), but to the best of our knowledge, no work has explored their use as graphic captions for non-speech sounds. Thus, in this work, we also explore the use of such graphic captions as a potential alternative to text-based captions in the context of user-generated videos.

3 RESEARCH QUESTIONS
Based on the gaps identified above, in this study we investigate the following research questions:
First, focusing on DHH viewers, we investigate:
• RQ1. What are the experiences of DHH viewers with online videos (e.g. what type of content do they watch, what do they like and dislike about it), and with closed captions?
• RQ2. What are DHH viewers' perspectives on the inclusion of text-based and graphic captions for non-speech sounds in online videos?
Then, focusing on hearing video creators, we investigate:
• RQ3. What are the current practices of hearing creators with captioning online videos?
• RQ4. How do hearing creators perceive the use of a prototype tool based on automatic sound event detection for captioning non-speech sounds using text-based and graphic captions in online videos?

4 INTERVIEWS WITH DHH PARTICIPANTS
To answer RQs 1 and 2, and inform the design of a prototype used for RQs 3 and 4 (described in section 6.1), we conducted a formative study with DHH participants. This section describes the study's method, participants, and results.

4.1 Method
We conducted semi-structured interviews with DHH participants, and our electronic appendix includes our full questionnaire. We began by asking about their experiences watching online videos, their experiences with captions in online videos, and their thoughts about having non-speech sounds captioned or visualized.
Then, we prompted participants with three videos illustrating text-based and graphic captions for non-speech sounds. We picked three videos that varied in: their format (i.e. landscape or portrait); their genre, which we selected from prior work identifying genre priority for accurate captioning in online videos (more specifically, we selected one genre from each priority level identified: sports, news and entertainment) [6]; and their source, which included YouTube (Figure 3a), BBC (Figure 3b), and TikTok (Figure 3c). Our
captions for non-speech sounds. Thus, in this work, we also explore YouTube (Figure 3a), BBC (Figure 3b), and TikTok (Figure 3c). Our

[Figure 3 layout: three video stimuli — Sports (YouTube, 35 seconds), News (BBC, 22 seconds), and Entertainment (TikTok, 11 seconds) — each shown in three conditions: Baseline (without non-speech sounds), Text-based non-speech captions (e.g. "[laughter]", "[trumpet playing]", "[beep]"), and Graphic non-speech captions; the news frames carry the subtitle "It's good to see the smile, isn't it?"]
Figure 3: Frames from the video stimuli used in our studies. Each column illustrates the three videos, while each row illustrates
the conditions the videos were shown in. The text-based captions have been enlarged for readability.

video selection was not meant to illustrate all possible combinations of those three aspects. Instead, we wanted some diverse combinations to prompt participants to think about these different aspects. Thus, our three videos consisted of a YouTube video of a sports scene in landscape format, a BBC news video in landscape format, and a TikTok entertainment video in portrait format. More details about these videos are provided as part of our electronic appendix.

We showed each video to participants in three conditions (illustrated in Figure 3): a) without captioned non-speech sounds as a baseline, b) using text-based captions for non-speech sounds, and c) using graphic captions for non-speech sounds. The demonstrations for these conditions were created manually using Premiere Pro, using GIPHY10 stickers for the graphic captions in condition c. We always started with our baseline (condition a), and then we rotated the order of conditions b and c across participants. We also rotated the order of the videos using a Latin Square schedule. After watching all three conditions for each video, we asked participants what they liked or disliked about each version, about their perspectives on text-based and graphic captions for non-speech sounds, and whom they think these technologies may benefit most.

4.2 Participants

We recruited 11 participants through online advertisements posted on social media groups on Reddit and Facebook, including groups targeted for DHH people in the general population, and one group from a large university for DHH people in the U.S. Participants' mean age was 30 (range = 18 to 47, SD = 9.68), self-identifying as male (N = 4), female (N = 6) and non-binary (N = 1). Five self-identified as Deaf (Deaf, with a capital D, is usually employed to refer to members of Deaf culture [28]) and six as Hard-of-hearing.

10 https://giphy.com

4.3 Procedure and Data Analysis

Participants received a consent form via e-mail before the study and met with a researcher via Zoom. The appointments lasted 55 minutes on average. In the end, participants filled out a demographics form and were compensated with a USD$30 Amazon gift card.

We conducted the interviews in English, accommodating participants' self-selected communication preferences, which included using American Sign Language (ASL) interpreters (N = 3), professional captioners (N = 1), automatic captions (N = 3), text-based chat (N = 3) or spoken English alone (N = 1).

The transcripts, obtained using Zoom's automatic transcription, contained 4770 words on average. In the interviews supported by ASL interpreters, their voicing of participants' signing was transcribed. The first author then conducted a thematic analysis using an inductive approach to identify codes which were then grouped into themes, as described by Braun and Clarke in [8]. We also follow the best practices for reporting our findings as discussed in [9].
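The counterbalancing described above (baseline always first, conditions b and c rotated across participants, video order following a Latin Square schedule) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function names and the choice of a cyclic square are our assumptions.

```javascript
// Illustrative sketch of the counterbalancing in Sec. 4.1 (not the study's
// actual code). The baseline (condition a) is always shown first, the order
// of the text-based (b) and graphic (c) conditions alternates across
// participants, and video order follows a cyclic 3x3 Latin Square.

const VIDEOS = ["sports", "news", "entertainment"];

// Row p of a cyclic Latin Square: across any three consecutive participants,
// every video appears exactly once in each presentation position.
function videoOrder(p) {
  return VIDEOS.map((_, i) => VIDEOS[(p + i) % VIDEOS.length]);
}

// Condition a (baseline) first; conditions b and c rotate across participants.
function conditionOrder(p) {
  return p % 2 === 0
    ? ["a: baseline", "b: text-based", "c: graphic"]
    : ["a: baseline", "c: graphic", "b: text-based"];
}
```

A cyclic square is the simplest schedule that balances presentation position for three items; the paper does not specify which Latin Square was used.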
Beyond Subtitles ASSETS ’22, October 23–26, 2022, Athens, Greece

5 FORMATIVE INTERVIEW RESULTS

5.1 Online Videos and Captions

The most commonly mentioned video platform was YouTube. Others also mentioned watching videos on social media platforms, including Instagram, Facebook, and TikTok, as well as streaming platforms, such as Netflix, Amazon Prime, Hulu, HBO Max, and Disney+. Participants cited three main purposes for watching videos online: entertainment, educational, and informational purposes.

5.1.1 Participants like the control they have in online videos, but there is a lack of high-quality captions. Control was an aspect participants liked about online videos, including control over what to watch, the playback, volume, and the ability to turn captions on or off. For instance, P11 said "When I'm watching videos online, I can pause it and if I miss something, I can rewind it. I also have control of the volume." Some also highlighted having visual information as a benefit, and how captioning may be beneficial in learning vocabulary. Participants also highlighted how captioning technologies for online videos, including automatic captioning, have improved overall.

However, captioning issues were prominent among what participants did not like about online videos, including inaccuracies in automatic captions and the difficulty of finding well-captioned content. P3 mentioned that well-captioned content may be expensive, citing subscription-based streaming platforms (which often caption non-speech sounds, according to P3) as examples. Finally, participants also mentioned the loss of captions when copying videos (e.g., videos reposted to social media), and the lack of support for captions in live videos (e.g., Twitch) or social media platforms.

5.1.2 Participants sought workarounds to understand uncaptioned content, but were mindful of others' experiences. Participants reported several workarounds for uncaptioned content such as asking someone, trying to understand by re-watching the videos, or finding the same content in writing. Some hard-of-hearing participants also indicated buying headphones or speakers to play videos louder, including P11 who said "I put up a lot of volume in the sound bar, like I spent extra money to get extra sound." Some Deaf participants also mentioned reading people's lips. However, many participants indicated often leaving videos if one of the methods above fails, especially if a person is clearly talking in the video. For instance, P6 said "If I encounter a video that I need to rewatch or go back several times, I'll probably just move on to another thing."

Notably, P4 and P11 mentioned not wanting to affect others' experiences by asking them to interpret the content or by making sounds louder. For uncaptioned non-speech sounds, in addition to the workarounds mentioned above, participants indicated recognizing missing sounds through the behavior of people in the videos, or by noticing a gap in the story. While some mentioned instinctively recognizing that something is missing, others also acknowledged that they may still inadvertently miss non-speech sounds, including P1 who said "I may miss out what happens beyond what people say without my knowledge."

5.2 Text-based vs. Graphic Captions.

5.2.1 Advantages and disadvantages of graphic captions. Many participants liked the ability of graphic captions to provide more details about a sound such as its source, location, tone of voice, volume or emotion. Participants found graphic captions easier to understand and see, which may make them more universal and benefit viewers who cannot read text-based captions (e.g., children), with some finding the visual nature of graphic captions more "ASL friendly." However, participants worried about their potential for distracting, blocking the content of the video, or taking away from the experience of watching online videos by impacting their emotional or visual feel. Some also worried that they may feel childish and annoy adults as it may feel like "talking down" to them. Participants sometimes found the graphic captions hard to distinguish from other graphics or visual effects in videos. Finally, most participants suggested that graphic captions should be provided on demand, giving viewers control over whether they see them.

5.2.2 Advantages and disadvantages of text-based captions. Participants liked the familiarity of text-based captions, including their location in standard places in the video, and the symbols used to indicate non-speech sounds (i.e., brackets). Participants liked that they "do the job" without disrupting the video. Participants saw the potential for text-based captions to include aspects of sounds that they liked that graphic captions included (e.g., changes, source, source location, and timing). For instance, P5 mentioned that text-based captions could use verbs to indicate changes (e.g. [applause fading]), P10 suggested that arrows could show where the sounds come from, and P3 suggested using additional symbols to layer these details.

However, participants mentioned that while text-based captions make videos accessible, they are often not interesting and lack context. Furthermore, an emergent theme was that it may be hard to verbally describe sounds for people who do not know those sounds. For example, P11, who identifies as hard-of-hearing, shared an anecdote of having to explain what "[roaring]" meant while watching a movie with Deaf friends. While P11 indicated people are now more familiar with brackets signifying non-speech sounds, they still worried about the difficulties of describing sounds with words, and suggested the creation of a glossary to support this.

5.2.3 How to choose between text-based vs. graphic captions. Various factors related to the videos, the sounds and the viewers were discussed as criteria to select between text-based and graphic captions. First, three aspects of a video were discussed, including its type, length, and how visually busy it is. Specifically, participants found graphic captions more appropriate for entertainment videos (e.g., in social media), whereas text-based captions appeared more fitting for more serious content. For instance, P6 said "If I'm on social media [a] balloon is totally appropriate, right? But if I saw that same balloon representation in like BBC, not so much." Graphic captions also seemed more appropriate for shorter videos, whereas text-based captions may be better for visually busy scenes as well as videos that have spoken content. Second, participants discussed specific aspects of the sounds, including the visibility of the sound's source (i.e., whether it is on screen) and dynamicity (i.e., whether a sound changes). Participants had mixed opinions about which was more appropriate for visible sounds, but graphic captions seemed better for dynamic sounds (e.g., an applause visualization decreasing in size as its volume decreases). Finally, participants brought up two aspects of viewers as criteria: their age and hearing abilities.

P11 suggested that graphic captions seemed more appropriate for younger audiences, while P9, for instance, suggested that text-based captions may be better for people with slight hearing loss who, as P11 put it, may already have an understanding of the sounds.

A few participants also suggested that graphic and text-based captions could be used simultaneously, with one complementing the other, which could have the benefit of standardizing the location of the graphic captions if they were placed around the text-based captions to provide complementary information, or as P6 said, it would all be "located in the area that I'm expecting to see that information". P5 also suggested using emojis for "textifying" the captions (i.e., making them closer to the informal language commonly used in text messaging aided by emojis to convey emotion), which could emphasize emotion. Other participants, in turn, expressed indifference for the format, as long as the information is provided, such as P4 who believed that "access is the key."

5.3 Sounds and Information to Include

5.3.1 Caption sounds that are important to the storyline of the video, and consider viewers' preferences. Participants expressed strong interest in having non-speech sounds included when captioning online videos. Most participants conditioned their preferences for what non-speech sounds to include on whether the sound is important to the video, as opposed to specific types of sounds. As P4 put it: "specifically if it's relevant to the story. So for example, if a guy is going into a house and he's not talking but he hears something in the house, then that's important. But if for example, a woman is drinking water and you can hear her swallow, that's probably not relevant." This selectivity was in part motivated by the fact that there can be too much going on, especially when a video also contains spoken content. Other criteria related to the content of the video such as the number of people talking and what they're doing, and the length of the video, are also important to consider when selecting which sounds to visualize. Specifically, participants mentioned that as the number of people talking in a video increases, or if the level of activity of people in videos increases, the need for being selective becomes more important. P9 also suggested being able to select only the sounds that are outside of a viewer's hearing range such as low frequencies (e.g., bass sounds) in the case of P9. Some participants also mentioned how some frequencies may interfere with their hearing aids and thus it may be useful to have them captioned (and perhaps silenced). Notably, many participants talked about the diversity within the DHH community, often referring to it as a "spectrum," and how this diversity may be reflected in their preferences for what sounds should be included. One participant, for instance, mentioned not liking music because they felt music is not compatible with Deaf culture. Thus, what counts as "important" sounds may have a high level of user-dependency.

5.3.2 Including details about the sounds is important. When talking about what aspects of sounds to include, participants highlighted details such as the source, its location, and temporal changes within the sounds. P4 also mentioned it is important to know how sounds interact with people in videos by detailing who can hear the sounds, for example. However, P11 also mentioned it is important to balance the level of detail and the length of the descriptions, especially for text-based captions, to consider slow readers and the fast-moving nature of captions. Finally, P2 highlighted the importance of details especially for members of Deaf culture, for whom details and descriptors are not only important, but can also be useful for learning vocabulary and learning about sounds:

    "Because in Deaf culture, we usually like descriptors. We like to understand what vocabulary is applied to certain sounds so that if it's explained we can understand what's happening in English. So, that helps us as individuals to understand what hearing people are hearing. The same can be said, same could be true about Deaf people explaining our experience to hearing people, they may not understand so we have to be more detailed."

5.4 Benefits of Captioning Non-Speech Sounds.

Participants commented that the inclusion of non-speech sounds would benefit DHH people, as well as people with auditory processing disorders. Some also highlighted the importance of considering Deaf-blind people when incorporating these into videos, especially with graphic captions. Participants also reflected on how captioning non-speech sounds can also improve the co-viewing experience of DHH and hearing people, since DHH people would not need to ask hearing people for clarification to understand content.

Ultimately, the benefits of captioning non-speech sounds included understanding or knowing "what's going on" in videos, knowing the whole story, or "being at the same level" as everyone else. Participants indicated how sounds can convey information, set up scenes or the tone for a scene, give more depth, and provide spatial awareness, with P6 describing sounds as "condiment" for videos. Some participants commented how these may be realized subconsciously and thus seem unimportant to hearing people, but these functions are important for fully understanding "what's going on." Finally, P2 highlighted that text-based captions of non-speech sounds may help distinguish captioned videos with no dialogue from uncaptioned videos. In other words, text-based captions of non-speech sounds in a video without spoken content can indicate that the video is captioned (whereas an uncaptioned video may or may not contain spoken content). However, graphic captions alone may not have this same function as it may be hard to distinguish graphic captions for sounds from general visual effects in videos.

5.4.1 Tools that support captioning non-speech sounds may benefit hearing and DHH creators, as well as DHH viewers. While many participants indicated that tools to support including non-speech sounds as text-based or graphic captions would support hearing creators in making their videos more accessible, some also saw potential utility for DHH creators who would like to accessibly include non-speech sounds in their videos, but cannot hear those sounds themselves. Some participants also envisioned how the inclusion of non-speech sounds in fully automatic captioning would be useful for viewers, as viewers often do not have the choice to make videos accessible themselves when creators choose not to do so.

6 PROTOTYPE STUDY

Based on the results from our formative interview study, we designed a second study to investigate how creators can be supported by automatic tools to caption or visualize non-speech sounds. To this end, we developed a prototype for authoring text-based and
graphic captions of non-speech sounds for videos, and a video demonstration of this prototype is included as part of our electronic appendix. We conducted semi-structured interviews where participants interacted with our prototype and answered questions about their experiences before and after their interactions. This section describes our prototype, followed by our interview method, participants and data analysis.

6.1 Prototype

While we drew inspiration from existing captioning interfaces for our general interface (e.g. YouTube and Adobe Premiere Pro), our design was mostly function-driven, guided by the need to author (section 6.1.1) and preview (section 6.1.2) the captions. Moreover, our formative interview results also informed specific design decisions as highlighted below.

To use the prototype, users start by loading a video into the authoring interface. Our system displays a list of non-speech sounds in the video as would be detected by an automatic algorithm (Figure 4a), as well as a preview of the video (Figure 4g). Each occurrence of a non-speech sound is represented by a card user-interface element, which includes a text description of the sound, a time-stamp of when the sound occurs, and optionally a graphic that can be used as a graphic caption for the sound (Figure 4a). Motivated by the findings in our formative study, if the sound event card is generated by an automatic algorithm, it also includes a label suggesting the user to consider adding more details to the suggested description.

6.1.1 Authoring non-speech sound captions. Each sound event card is added as a caption to the video. The text description is used for the text-based caption, the graphic for the graphic caption, and the time stamp indicates when the caption will appear in the video. Users can add, delete or edit the sound card to customize the captions. They can change the text description or the timestamp by simply typing over the corresponding fields. To change the visuals, a user can click on "Add visuals," which opens a modal search box. The search box, powered by the GIPHY API for developers11, supports searching for three types of visualizations: GIFs (animated images), Stickers (animated vector drawings or image cutouts) and Emojis (more specifically, stickers containing emojis). Users can specify the location, size and rotation of the visuals through direct manipulation in the video player while previewing the results. Finally, as participants in our formative study indicated that they were interested in important sounds, our prototype supports marking a subset of sounds as important by toggling the star icon on the top right corner of a sound event card.

6.1.2 Previewing the captions. Users can preview the captions in the video player by selecting the "Descriptions" option under the video player. Similarly, to preview graphic captions, the user selects the "Visuals" option (Figure 4b). By default, all sound event cards are included as captions in the preview. Users can select Starred, to only preview captions for the starred sound events.

6.1.3 Implementation. We built the front-end of the prototype using the React JavaScript library for creating the HTML components and functionality, and the Bootstrap library for the styling of the components. As noted above, the visualizations provided were obtained using the GIPHY API for developers, which provides options for requesting GIFs and Stickers. The Emoji search in our prototype was thus a search for stickers with the word "emoji" appended.

11 https://developers.giphy.com

6.2 Method

The study began by asking participants about their experiences creating online videos, including how often they create videos for posting online and what types of videos they create. We then asked participants about their experiences captioning videos, which included questions about how often they caption their videos, how they make the decision to caption or not caption their videos, as well as what tools they use and what content they consider when they do caption their videos. We then asked them questions related to captioning non-speech sounds, including their current considerations of including non-speech sounds in captions.

We then introduced the prototype to participants through a 3-minute demo that introduced the video player, the sound events tab, and all the functions of the prototype. Then, participants interacted with the prototype to caption or visualize the three videos we had used in the formative interview study, as described in section 4.1, under three conditions: 1) only adding sound events manually (i.e. without using the Wizard-of-Oz automatic sound detection system); 2) a Wizard-of-Oz condition using results that were created by a member of our team to produce error-free output; and 3) using the direct output from an actual automatic sound event detection system [23], which contained errors. This structure allowed us to prompt participants for comparisons between manual additions and the Wizard-of-Oz automatic system, but also between human-quality output and automatic results containing errors. The automatic results included errors such as missing sounds, mislabeling sounds and labeling a sound multiple times. However, the time stamps for the sounds that were detected were often close to the actual timing of the sounds. All the labels included in these conditions and their time stamps are provided in our electronic appendix.

All participants interacted with the prototype in all three conditions. We asked participants to do the manual-only condition first to ask them to imagine what they would want from an automatic system to support them in this task. We rotated the other two conditions using the Wizard-of-Oz automatic systems across participants. We also rotated the videos using a Latin Square schedule.

We asked participants to think out loud as they captioned or visualized non-speech sounds in the videos. Once participants were satisfied with the results, they were exported as JSON files containing metadata for the text-based and graphic captions participants added so that we can replicate their work. After finishing each video, we asked about their experiences captioning or visualizing the non-speech sounds for each video. After finishing all three videos, we asked participants to reflect on their overall experiences, and to compare the use of text-based versus graphic captions as well as manual captioning versus automatic systems.

6.3 Participants

Participants were recruited from two sources: internal communication channels at an industry research lab, as well as special-interest

[Figure 4 image: the authoring interface, with callouts (a) individual sound events, (b) timestamps, (c) text-based caption, (d) graphic caption, (e) delete, (f) star, (g) video player, (h) preview settings, (i) graphic caption preview.]

Figure 4: Our prototype provides an interface to manage individual sound events (a) with their respective timestamps (b), enabling authoring text-based captions (c) and graphic captions (d). The user can highlight (f) or remove (e) these sound events. A video player allows users (g) to preview the captions by selecting the appropriate setting (h) for text-based ("Descriptions") or graphic ("Visuals") captions. The figure shows "Visuals" selected and thus graphic captions are previewed (i).
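As an illustration of the data behind this interface, each card in Figure 4 can be modeled as a small record, the Emoji option reduced to a Sticker search with "emoji" appended (section 6.1.3), and the "Starred" preview setting expressed as a filter (section 6.1.2). This is our sketch only; the field and function names are ours, not the prototype's actual code.

```javascript
// Illustrative sketch only: hypothetical data model for a sound event card
// (Figure 4) and behaviors described in sections 6.1.2-6.1.3 and 6.2.
// Field and function names are our assumptions.

// One sound event card: description (text-based caption), timestamp,
// optional graphic caption, and the "important" star toggle.
function makeSoundEvent(description, timestampSec) {
  return { description, timestampSec, graphic: null, starred: false };
}

// Sec. 6.1.3: the Emoji option is a Sticker search with "emoji" appended.
function searchQuery(type, query) {
  return type === "emoji" ? `${query} emoji` : query;
}

// Sec. 6.1.2: preview all cards by default, or only the starred ones.
function previewEvents(events, starredOnly) {
  return starredOnly ? events.filter((e) => e.starred) : events;
}

// Sec. 6.2: participants' results were exported as JSON metadata.
function exportCaptions(events) {
  return JSON.stringify(events);
}
```

Serializing the cards directly is what lets the authors "replicate" a participant's captioning work: the exported JSON carries everything needed to re-render both caption types over the video.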

social media groups, focused on video creation and university groups local to one of our co-authors. Our recruitment criteria included having experience creating videos for posting online, but we did not specify specific levels of experience so that we could get diverse levels and thus a diverse set of perspectives.

We recruited ten participants, which included six who self-identified as male, three as female and one as non-binary. Participants' average age was 26 (range = 18 to 29, SD = 6.6). Participants' self-reported levels of experience included developing (N = 3), competent (N = 5), advanced (N = 1) and expert (N = 1). Participants reported captioning the videos they create rarely (N = 3), occasionally (N = 2), often (N = 4) and always (N = 1).

6.4 Procedure

Participants were contacted via e-mail to receive a consent form ahead of the study. Participants then met with a researcher via Zoom for a 60-minute appointment. At the end of the appointment, participants filled out a demographics form and were compensated with a $20 Amazon gift card for their participation.

6.5 Data Analysis

The video recordings of the interviews were an average of 55 minutes long. Zoom's automatic transcription system was used throughout all the interviews to obtain transcripts of these videos, obtaining an average length of 7680 words per transcript. As we only collected open-ended comments from participants, our analysis was similar to our formative user study. Namely, the data from this study was analyzed by the first author using a thematic analysis approach, first using an inductive approach to identify codes in the data which were then grouped into themes following the approach described in [8, 9].

7 PROTOTYPE STUDY RESULTS

In this section, we summarize the results from our prototype study, which address RQs 3 and 4.

7.1 Previous Captioning Experiences

We had a mix of participants who either caption their video content regularly or rarely. Participants discussed motivations for captioning content, which included personal experiences benefiting from captions for comprehension or for understanding people with different accents. While some indicated only captioning videos when it is required for work or academic assignments, others were motivated to do so for ensuring their videos are accessible, especially when making videos for classes with classmates with cognitive disabilities or who are DHH. P5 also mentioned adding captions for stylistic purposes to add personality (e.g. to emphasize something with specific fonts).

7.1.1 Participants tend to produce higher quality captions when the stakes are higher. Some participants indicated producing high-quality captions only for academic- or work-related videos. Participants indicated manually captioning videos, starting from an automatic system and editing the output, or paying an expert to do it. Most participants indicated relying on automatic systems to do some of the work, or an expert to do it for them, because of how tedious the manual captioning process is. For personal videos, however, many participants indicated using automatic captioning tools alone, even when they may be inaccurate. For instance, P1 said "At work, the stakes are higher [...] whereas for personal videos it's a little more of a laid back approach [...] I'm doing things on YouTube and if I'm only going to get 10 to 100 views, then the auto captions are more than enough."

7.1.2 Most participants only caption videos with spoken content. Many participants indicated only captioning videos that have spoken content. For example, P4 said "I'd say if there's spoken content, I'll do closed captions." Some participants mentioned not captioning videos when a video only contains music, text or pictures, or if there is no spoken content. Most participants indicated never having included non-speech sounds when captioning videos (which means they were creating, by definition, subtitles). However, 3 participants indicated having occasionally included them, including P5 who said: "I wouldn't say consistently, every single time I encounter non-speech sounds, but I have captioned them before." P9 also mentioned adding a song's title if a video only contains music, or information about non-speech sounds if they are the main point of a video. P9 has also uploaded videos to social media platforms that do not support manually adding captions, but found workarounds to caption
non-speech sounds such as commenting, replying or using a text overlay or a sticker.

7.2 Perspectives on Text-Based and Graphic Captions

The themes we identified related to hearing creators' perspectives on text-based and graphic captions included challenges for creating descriptions of non-speech sounds for text-based captions, as well as the appropriateness and benefits of graphic captions. The following subsections summarize these themes.

7.2.1 Challenges for creating text-based captions for non-speech sounds included trade-offs between completeness and concision. Many participants expressed doubts about the best way to describe sounds for text-based captions, the best wording, and what level of detail to include. Participants discussed trade-offs between length, complexity, accuracy, and how interesting they could make the descriptions. For example, P6 said "I'm trying to think of how to put that across in a short, interesting way." Considering these trade-offs, participants commented that on-screen sounds may need fewer details as the sound source is already visible. For instance, P5 said: "I think ['trumpet'] gets the point, especially if there's a guy playing the trumpet so I might leave that just for conciseness." Describing ambiguous sounds (i.e. sounds with unclear sources) was also challenging for some participants. For example, when describing a beeping noise from a human in one of the sample videos, P8 expressed confusion by saying: "I didn't understand how to describe it. So, it was a bit confusing but I thought car beeping noise would be the most accurate description because I think everybody knows about a beeping noise."

Considering these challenges, some participants indicated that guidance for creating text-based captions would be useful. P4 indicated that for the text-based captions, guidance could be structural (e.g. using verbs vs. nouns, or what number of words to use). P1 and P5, in turn, suggested that because coming up with good descriptions may require a wider English vocabulary, having some support for description alternatives may also be useful.

7.2.2 The appropriateness of graphic captions varies with the type of video and certain characteristics of scenes. Most participants suggested that the appropriateness of graphic captions varies depend-
could provide guidance such as when to add graphic captions for a sound, why someone might want to add a graphic caption, and which visualizations to include.

Participants talked about how it would be useful to break down the timestamps by scene for sounds that linger over scene changes when employing graphic captions, as their appropriateness may vary by scene. P5 suggested that it would be useful if the system could automatically detect scene changes and suggest time stamps for those scenes within a specific sound event. P4 also suggested that the system could automatically identify visual objects in the video to pin graphic captions to so that if the object moves, the graphic caption moves along with it.

7.2.3 Graphic captions may be beneficial, but also distracting. Many participants envisioned benefits from graphic captions, such as being able to add humor to entertainment or social media videos. For example, P7 said: "The social media crowd, you know people who are using Facebook, TikTok and Instagram, will definitely get a kick out of this." However, some participants were also wary of the potential for visuals to "take away" or distract from the main content. For example, P10 noted: "If I were to add any visuals, it would take away from the actual subject matter of the video".

7.3 Automatic Systems Should Identify Important Sounds

When asked about what participants wanted from an automatic system that could identify non-speech sounds for them, most indicated they would want a system to identify important sounds. Some even highlighted the prototype as doing "a good job" of filtering important sounds when asked if they noticed any missing sounds in the results. P3, for instance, said: "The car engine humming in the background, that was missing but I think that's good." Participants' inclinations to only include important sounds were also evident in their lack of interactions with the star function (a function described in Section 6.1). Participants found the purpose of the star unclear as they had only included important sounds already, and thus marking them with a star seemed redundant. P4, for example, said "when you were explaining the starring feature in the tutorial earlier, I thought 'I don't know if that's something I would use' because
ing on the type of content. For example, most suggested that the I would only include sounds that are important.” When discussing
visualizations available in our study were not appropriate for the what constitutes an important sound, participants indicated sounds
BBC video because those tend to be more “serious.” However, the that are “noticeable,” that “stood out,” or that provided context for
visualizations seemed appropriate for the TikTok video. Both P2 spoken content.
and P3 specifcally described the appropriateness depending on the Many participants were not sure about whether “background”
“place and purpose” of the video. There was disagreement, however, or “insignifcant” sounds should also be included. However, P2 ac-
about whether the formality of the visualizations themselves would knowledged the importance of including all sounds for someone
afect their appropriateness. P2, for example, suggested that if the who may not have the “privilege” to select which sounds to pay
BBC created their own “more formal looking” set of visualizations, attention to. Thus, after engaging with the prototype, P2 decided
graphic captions could potentially be more appropriate. However, to include all of the sounds in a video and star the important ones.
P5 indicated that “even if we were to have more formal visuals for Notably, when talking about what to do with “sounds that do not
something like the BBC, I think it’s just not in their guidelines to use convey information” (P9), participants drew comparisons to alterna-
[visual] efects.” Finally, participants discussed the utility of graphic tive text (i.e. text added to images for screen readers to read). Both
captions for sounds that are of screen, with many suggesting they P4 and P9 talked about how guidelines for alternative text suggest
would not add graphic captions for sounds that are already on marking images that do not convey information as “decorative.”
screen as those are already visually available. Considering these Thus, both participants considered sounds that are not important
variations of appropriateness, participants suggested that a system to be akin to decorative images.
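Taken together with the sensitivity adjustments participants propose later (Section 7.4.3), this preference for surfacing only important sounds suggests a simple post-processing step over a sound event detector's output: keep only events whose estimated importance clears a creator-adjustable threshold. The sketch below is purely illustrative and not part of the study's prototype; the event fields and the naive confidence-times-loudness importance score are assumptions.

```python
from dataclasses import dataclass

@dataclass
class SoundEvent:
    label: str         # e.g. "trumpet", "car engine humming"
    start: float       # seconds into the video
    end: float
    confidence: float  # detector confidence in [0, 1] (assumed field)
    loudness: float    # relative volume in [0, 1] (assumed field)

def filter_important(events, sensitivity):
    """Keep events whose importance clears a threshold derived from a
    creator-facing "sensitivity slider": higher sensitivity keeps more
    sounds, mirroring P7's definition of sensitivity as the number of
    sounds identified.

    Importance is naively approximated as confidence * loudness here;
    a learned importance estimator could be swapped in.
    """
    threshold = 1.0 - sensitivity
    return [e for e in events if e.confidence * e.loudness >= threshold]

events = [
    SoundEvent("trumpet", 1.0, 3.0, confidence=0.9, loudness=0.8),
    SoundEvent("car engine humming", 0.0, 10.0, confidence=0.95, loudness=0.2),
]
# A mid-range setting drops the quiet background hum that P3 was glad
# to see filtered out, while keeping the salient trumpet.
print([e.label for e in filter_important(events, sensitivity=0.5)])  # → ['trumpet']
```

Raising the sensitivity would re-admit the background hum, which matches P2's strategy of including every sound and starring the important ones.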
ASSETS ’22, October 23–26, 2022, Athens, Greece Alonzo et al.

Finally, many participants also talked about what information about a sound they would like an automatic system to identify. Participants mentioned wanting only general descriptions of what the sound may be, as too much specificity would be likely to introduce errors. On the other hand, some participants highlighted the importance of obtaining accurate time stamps of the sounds from the system, which we explore in more detail in the next subsection. Most participants, however, indicated wanting both the descriptions and the timestamps for the sounds.

7.4 Dealing with Errors from Automatic Systems

In the conditions using the automatic system, some scenarios included both description and timing errors. In some cases, participants suggested that the automatic system ended up being unhelpful when containing errors, with some deleting all the suggestions provided and starting from scratch instead. For instance, P10 commented: “I could argue that [the automatic system] actually made things a little bit slower than just putting that things in manually.” However, participants’ discussions of errors suggest that their expectations differed when talking about the labeling of the sounds versus the time stamps for the sounds.

7.4.1 The accuracy of timestamps may be the most important. Many participants highlighted accurate timestamps as an important feature of an automatic system. For example, P5 said “Having the timings already sorted for you beforehand is very convenient. It takes out one of the biggest time-consuming parts of a process.” Thus, participants seemed more sensitive to timing errors than to errors in describing the sounds. For instance, P9 attributed the unhelpfulness of errors specifically to timestamps: “Erroneous timestamps are not as useful.” P1 also mentioned that timing errors would be time-consuming in longer videos: “What if I had a 40-minute video? Am I gonna have to go through and look at every second?” However, many participants agreed that having a “template” or “framework” with accurate timestamps is useful even when the descriptions of the sounds are incorrect.

7.4.2 Ambiguous sounds may be difficult to identify and describe, but strategies can help. Ambiguous sounds were at the core of participants’ discussions of errors in the automatic labeling of the sounds. Many acknowledged that dealing with ambiguous sounds is difficult even for humans. For example, P2 said: “I can see how the computer that’s identifying the noise really has to be smart because [it is hard] even for me.” For instance, one of the errors introduced by the automatic system was labeling an off-screen whining child as a “horse.” Some participants actually trusted that this was a horse. P4 and P7 mentioned that even though they understood that a horse did not make sense in that context, the suggestion primed them and they could not think of the sound as something else. However, participants also discussed strategies to disambiguate certain sounds. For example, P8 found an error that labeled a trumpet as a mosquito understandable because visual information from the video was needed to disambiguate that sound: “To tell the difference that this is a trumpet, you need a little bit of visual along with the sound to know.” Others also talked about using spoken content in the video to disambiguate sounds. For example, P2 talked about correctly identifying that the sound labeled as a “horse” was a child using the spoken content from the video, which referred to a child. “I guess it is confirmation that it is a baby,” P2 said.

7.4.3 The ability to adjust the system’s sensitivity may help in dealing with errors. When discussing errors from the automatic system, a few participants believed those errors were an issue of sensitivity. P6, for example, suggested that the errors were caused because the system may have been “too sensitive.” Thus, these participants suggested adding a “reanalyze button” (P1) or a “sensitivity slider” (P7) that could help to reduce the errors from the system. P7 defined sensitivity in terms of the number of sounds identified, but also in terms of how “obvious” the sounds that the system identified are.

8 DISCUSSION AND TAKEAWAYS

This section summarizes the takeaways from our studies for those interested in captioning or visualizing non-speech sounds in user-generated content, including researchers investigating automatic captions. Considering that our findings include the perspectives of DHH viewers on new approaches to visualize non-speech sounds (i.e. graphic captions), some of our takeaways may also be insightful for industry professionals interested in visualizing non-speech sounds in professionally-produced content.

Be selective about non-speech sounds. The results from both of our studies suggest that both DHH and hearing users valued being selective about which non-speech sounds to include. DHH participants expressed interest in only having sounds that are important, as including every single detail may provide too much information. Hearing creators, in turn, described important sounds as what they would want an automatic system to identify, because captioning all sounds in a video may be too time-consuming. These findings align with guidelines for creating closed captions, which suggest only including non-speech sounds when necessary (e.g. the BBC’s captioning guidelines¹²). Our work further encourages researchers working on automatic sound event detection to consider importance estimations when detecting non-speech sounds in user-generated videos, and we provide insights about what constitutes an important sound. For instance, hearing creators discussed the criterion of whether a sound affects the spoken content of a video. Finally, estimations of sound importance or other sound qualities (e.g. volume) could be used to adjust the “sensitivity” of automatic systems and narrow down their results.

Include details, but balance with potential for distraction. Our results suggest that DHH participants are interested in having detailed information about non-speech sounds (e.g. the source, source location, and changes in sounds) in text-based or graphic captions, which aligns with prior work on sound visualization [18] and the use of visual-tactile feedback for non-speech information in captions [24]. However, DHH participants in our study worried that too many details in text-based captions, and graphic captions in general (which naturally included more details), could be distracting. Thus, those interested in captioning or visualizing non-speech sounds in their videos should consider including details, but be

¹² https://bbc.github.io/subtitle-guidelines/#Intonation-and-emotion
Beyond Subtitles ASSETS ’22, October 23–26, 2022, Athens, Greece

mindful of the potential for distracting viewers from the main content. A possible design intervention to support creators would be providing guidance within the user interface on how to structure the text-based captions (e.g. how many words to use, and which types of words) and how much detail to include.

Consider properties of the video, sound, and audience when choosing between text-based and graphic captions, and their respective benefits. Our findings reveal different factors to consider when choosing text-based or graphic captions. First, the type of video appeared to determine the appropriateness of graphic captions. The results from both studies suggested that graphic captions may be more appropriate for entertainment videos, while text-based captions may be more appropriate for “serious” videos. Future work can further explore whether varying the design of graphic captions to better match the content would affect viewers’ and creators’ preferences for the use of graphic captions. Other factors, such as the visibility of a sound on screen and demographic factors of the viewers (e.g. their age and hearing ability), are important too. While future work could explore each of these factors in more detail, our findings provide guidance for designers of these technologies to consider these factors about the videos, the sounds, and their audiences. Our findings also shed light on what text-based and graphic captions may signify. The latter may serve to illustrate recognizable sounds and indicate the location of the sources, the precise timing of the sounds, as well as changes in the sounds. Text-based captions, on the other hand, may better describe harder-to-visualize abstract sounds where a greater level of verbal detail may be required.

Graphic captions should be optional and may need standardization. DHH participants suggested that graphic captions should be optional for viewers instead of being embedded in videos, an approach analogous to closed captioning (as opposed to open captions, which are embedded in videos). Furthermore, DHH participants also suggested that graphic captions may be difficult to distinguish from other graphics in online videos. The ability to overlay them on demand may also support viewers in distinguishing which graphics are part of the video as opposed to graphic captions. However, current captioning technologies only support text (or animated text). Thus, new captioning or media formats, such as BBC’s proposal for Object-Based Media [2, 3, 19], may need to be developed to support the addition and standardization of optional graphic captions.

Text-based captions of non-speech sounds may help distinguish uncaptioned videos. DHH participants also suggested that using text-based captions for non-speech sounds can help viewers distinguish videos without spoken content from uncaptioned videos, as there are times when DHH viewers cannot tell if a video is not captioned or if it simply does not contain spoken content. If a video only contains non-speech sounds, and those sounds are captioned, viewers may conclude that the video does not contain spoken content (although there is still a possibility that a creator only captions non-speech sounds in a video with spoken content). It is also possible that a video may contain unimportant (or decorative) non-speech sounds. Hearing creators’ comparisons with alt-text, and the respective guidelines for decorative images that do not provide any information¹³, suggest that providing indications when videos only contain decorative sounds may help reduce the ambiguity of uncaptioned videos.

Providing accurate timestamps to creators is important, but general descriptions are helpful, too. Our results suggest that accurate time stamps are important when using an automatic system, as our hearing creators suggested that identifying these is the most time-consuming aspect of captioning in general. While descriptions were also important to many of our participants, general descriptions seemed more useful than specific ones, given that specificity may introduce errors, especially when dealing with ambiguous sounds. Our results also suggest ways in which ambiguity can be reduced in automatic systems, which was a source of difficulty highlighted by YouTube for including non-speech sounds in their automatic captions [12]. More specifically, participants highlighted the use of semantic information from both visual and spoken content to reduce ambiguity, which aligns with current trends in multi-modal analysis and understanding.

9 CONCLUSION, LIMITATIONS, AND FUTURE WORK

In this paper, we presented two studies with DHH and hearing participants to explore their perspectives on captioning non-speech sounds, using text-based or graphic captions, in user-generated videos. Our findings include DHH participants’ interest in having important non-speech sounds in these videos, while hearing creators also indicated an inclination toward only including important sounds when captioning non-speech sounds. Our findings also include trade-offs between text-based and graphic captions for captioning non-speech sounds, and potential factors for determining their appropriateness. Finally, we explored the use of automatic tools to support hearing creators when captioning non-speech sounds and identified guidance for future work in this area.

There were several limitations in our work, and avenues for future work. First, our work was qualitative, with a small sample size and a small selection of videos. We identified potential factors at play when determining the appropriateness of graphic captions (including the content and sound type, the sound location, viewers’ demographic factors, and the graphic captions’ style). However, future work should explore these factors using a larger selection of carefully controlled videos among a larger sample size of viewers. Similarly, in our prototype study, participants were using sample videos provided by us. A study with content creators editing their own videos could reveal further insights relevant to real use cases. Our participants also had diverse levels of skills and experience. Future work investigating the preferences of participants with specific levels of skills or experience may yield further insights.

Our formative study suggested that automatic sound event detection may be helpful for DHH creators to caption non-speech sounds in their videos, which may introduce different challenges. Thus, future work should also explore its use among DHH creators and its implications.

¹³ https://webaim.org/techniques/alttext/#decorative
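The "template" of accurate timestamps that hearing creators valued (Sections 7.4.1 and 8) could be handed to creators in a standard caption format, with deliberately general, editable descriptions. As a hedged sketch only (the (start, end, label) tuples and the bracketed placeholder wording are assumptions, not the study's system), detected events could be emitted as WebVTT cues:

```python
def events_to_webvtt(events):
    """Render (start_sec, end_sec, label) tuples as a WebVTT file.

    Descriptions stay general (just a bracketed label) so the creator
    can refine the wording later; the accurate timestamps carry the
    value participants emphasized.
    """
    def ts(seconds):
        # WebVTT cue timings use the form HH:MM:SS.mmm
        hours, rem = divmod(seconds, 3600)
        minutes, secs = divmod(rem, 60)
        return f"{int(hours):02d}:{int(minutes):02d}:{secs:06.3f}"

    lines = ["WEBVTT", ""]
    for start, end, label in events:
        lines.append(f"{ts(start)} --> {ts(end)}")
        lines.append(f"[{label}]")
        lines.append("")
    return "\n".join(lines)

template = events_to_webvtt([(1.5, 3.0, "trumpet"), (4.0, 6.25, "beeping")])
print(template)
```

Because WebVTT is a widely supported closed-caption format, such a template would also keep the resulting captions optional for viewers, in line with the standardization takeaway above.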
ACKNOWLEDGMENTS

We thank Justin Salamon for helping with sound event detection and the Center for Accessibility and Inclusion Research (CAIR) at RIT for feedback on an early draft of the paper. This material is based upon work supported by the National Science Foundation under awards No. 1822747 and 2125362.

REFERENCES

[1] Mike Armstrong, Andy Brown, Michael Crabb, Chris J. Hughes, Rhianne Jones, and James Sandford. 2016. Understanding the diverse needs of subtitle users in a rapidly evolving media landscape. SMPTE Motion Imaging Journal 125, 9 (2016), 33–41.
[2] Mike Armstrong and Michael Crabb. 2017. Exploring ways of meeting a wider range of access needs through object-based media (workshop). In Conference on Accessibility in Film, Television and Interactive Media, York, UK.
[3] BBC Research & Development. [n.d.]. Object-Based Media. https://www.bbc.co.uk/rd/object-based-media. Accessed: 2022-06-01.
[4] Larwan Berke, Khaled Albusays, Matthew Seita, and Matt Huenerfauth. 2019. Preferred Appearance of Captions Generated by Automatic Speech Recognition for Deaf and Hard-of-Hearing Viewers. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland UK) (CHI EA ’19). Association for Computing Machinery, New York, NY, USA, 1–6. https://doi.org/10.1145/3290607.3312921
[5] Larwan Berke, Sushant Kafle, and Matt Huenerfauth. 2018. Methods for Evaluation of Imperfect Captioning Tools by Deaf or Hard-of-Hearing Users at Different Reading Literacy Levels. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173665
[6] Larwan Berke, Matthew Seita, and Matt Huenerfauth. 2020. Deaf and Hard-of-Hearing Users’ Prioritization of Genres of Online Video Content Requiring Accurate Captions. In Proceedings of the 17th International Web for All Conference (Taipei, Taiwan) (W4A ’20). Association for Computing Machinery, New York, NY, USA, Article 3, 12 pages. https://doi.org/10.1145/3371300.3383337
[7] Danielle Bragg, Nicholas Huynh, and Richard E. Ladner. 2016. A Personalizable Mobile Sound Detector App Design for Deaf and Hard-of-Hearing Users. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (Reno, Nevada, USA) (ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 3–13. https://doi.org/10.1145/2982142.2982171
[8] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101. https://doi.org/10.1191/1478088706qp063oa
[9] Virginia Braun and Victoria Clarke. 2022. Thematic Analysis. https://www.thematicanalysis.net. Accessed: 2022-03-01.
[10] Andy Brown, Rhia Jones, Mike Crabb, James Sandford, Matthew Brooks, Mike Armstrong, and Caroline Jay. 2015. Dynamic Subtitles: The User Experience. In Proceedings of the ACM International Conference on Interactive Experiences for TV and Online Video (Brussels, Belgium) (TVX ’15). Association for Computing Machinery, New York, NY, USA, 103–112. https://doi.org/10.1145/2745197.2745204
[11] Andy Brown, Jayson Turner, Jake Patterson, Anastasia Schmitz, Mike Armstrong, and Maxine Glancy. 2017. Subtitles in 360-Degree Video. In Adjunct Publication of the 2017 ACM International Conference on Interactive Experiences for TV and Online Video (Hilversum, The Netherlands) (TVX ’17 Adjunct). Association for Computing Machinery, New York, NY, USA, 3–8. https://doi.org/10.1145/3084289.3089915
[12] Sourish Chaudhuri. 2017. Adding sound effect information to YouTube captions. https://ai.googleblog.com/2017/03/adding-sound-effect-information-to.html Accessed: 2021-06-01.
[13] Karen Collins and Peter J. Taillon. 2012. Visualized sound effect icons for improved multimedia accessibility: A pilot study. Entertainment Computing 3, 1 (2012), 11–17. https://doi.org/10.1016/j.entcom.2011.09.002
[14] Michael Crabb, Rhianne Jones, and Mike Armstrong. 2015. The Development of a Framework for Understanding the UX of Subtitles. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Machinery, New York, NY, USA, 347–348. https://doi.org/10.1145/2700648.2811372
[15] Michael Crabb, Rhianne Jones, Mike Armstrong, and Chris J. Hughes. 2015. Online News Videos: The UX of Subtitle Position. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Machinery, New York, NY, USA, 215–222. https://doi.org/10.1145/2700648.2809866
[16] Sofia Enamorado. 2018. CVAA & FCC Closed Captioning Requirements for Online Video. https://www.3playmedia.com/blog/final-cvaa-and-fcc-online-video-closed-captioning-rules/ Accessed: 2021-06-01.
[17] Deborah I. Fels, Daniel G. Lee, Carmen Branje, and Matthew Hornburg. 2005. Emotive Captioning and Access to Television. https://doi.org/10.1145/3173574.3173665
[18] Leah Findlater, Bonnie Chinh, Dhruv Jain, Jon Froehlich, Raja Kushalnagar, and Angela Carey Lin. 2019. Deaf and Hard-of-Hearing Individuals’ Preferences for Wearable and Mobile Sound Awareness Technologies. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300276
[19] Benjamin M. Gorman, Michael Crabb, and Michael Armstrong. 2021. Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 733, 11 pages. https://doi.org/10.1145/3411764.3445509
[20] Michael Gower, Brent Shiver, Charu Pandhi, and Shari Trewin. 2018. Leveraging Pauses to Improve Video Captions. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 414–416. https://doi.org/10.1145/3234695.3241023
[21] Dhruv Jain, Sasa Junuzovic, Eyal Ofek, Mike Sinclair, John Porter, Chris Yoon, Swetha Machanavajhala, and Meredith Ringel Morris. 2021. A Taxonomy of Sounds in Virtual Reality. In Designing Interactive Systems Conference 2021 (Virtual Event, USA) (DIS ’21). Association for Computing Machinery, New York, NY, USA, 160–170. https://doi.org/10.1145/3461778.3462106
[22] Dhruv Jain, Angela Lin, Rose Guttman, Marcus Amalachandran, Aileen Zeng, Leah Findlater, and Jon Froehlich. 2019. Exploring Sound Awareness in the Home for People Who Are Deaf or Hard of Hearing. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300324
[23] Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. 2020. PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020), 2880–2894. https://doi.org/10.1109/TASLP.2020.3030497
[24] Raja S. Kushalnagar, Gary W. Behm, Joseph S. Stanislow, and Vasu Gupta. 2014. Enhancing Caption Accessibility through Simultaneous Multimodal Information: Visual-Tactile Captions. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (Rochester, New York, USA) (ASSETS ’14). Association for Computing Machinery, New York, NY, USA, 185–192. https://doi.org/10.1145/2661334.2661381
[25] Daniel G. Lee, Deborah I. Fels, and John Patrick Udo. 2007. Emotive Captioning. Comput. Entertain. 5, 2, Article 11 (April 2007), 15 pages. https://doi.org/10.1145/1279540.1279551
[26] Tara Matthews, Janette Fong, F. Wai-Ling Ho-Ching, and Jennifer Mankoff. 2006. Evaluating non-speech sound visualizations for the deaf. Behaviour & Information Technology 25, 4 (2006), 333–351. https://doi.org/10.1080/01449290600636488
[27] John McGowan, Grégory Leplâtre, and Iain McGregor. 2017. CymaSense: A Real-Time 3D Cymatics-Based Sound Visualisation Tool. In Proceedings of the 2017 ACM Conference Companion Publication on Designing Interactive Systems (Edinburgh, United Kingdom) (DIS ’17 Companion). Association for Computing Machinery, New York, NY, USA, 270–274. https://doi.org/10.1145/3064857.3079159
[28] Carol Padden and Tom Humphries. 2005. Inside Deaf Culture. Harvard University Press. http://www.jstor.org/stable/j.ctvjz83v3
[29] S. J. Parault and H. M. Williams. 2010. Reading Motivation, Reading Amount, and Text Comprehension in Deaf and Hearing Adults. Journal of Deaf Studies and Deaf Education 15, 2 (2010), 120–135. https://doi.org/10.1093/deafed/enp031
[30] C. B. Traxler. 2000. The Stanford Achievement Test, 9th Edition: National Norming and Performance Standards for Deaf and Hard-of-Hearing Students. Journal of Deaf Studies and Deaf Education 5, 4 (Jan 2000), 337–348. https://doi.org/10.1093/deafed/5.4.337
[31] M. Wald. 2011. Crowdsourcing Correction of Speech Recognition Captioning Errors. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/1969289.1969318
[32] Dawn Walton, Georgianna Borgna, Marc Marschark, Kathryn Crowe, and Jessica Trussell. 2019. I am not unskilled and unaware: deaf and hearing learners’ self-assessments of linguistic and nonlinguistic skills. European Journal of Special Needs Education 34, 1 (2019), 20–34. https://doi.org/10.1080/08856257.2018.1435010
[33] Fangzhou Wang, Hidehisa Nagano, Kunio Kashino, and Takeo Igarashi. 2015. Visualizing video sounds with sound word animation. In 2015 IEEE International Conference on Multimedia and Expo (ICME). 1–6. https://doi.org/10.1109/ICME.2015.7177422
[34] Noah Wang. 2017. Visualizing sound effects. https://youtube-eng.googleblog.com/2017/03/visualizing-sound-effects.html Accessed: 2021-06-01.
[35] Sean Zdenek. 2015. Reading Sounds: Closed-Captioned Media and Popular Culture. University of Chicago Press.
Nothing Micro About It: Examining Ableist Microaggressions on Social Media

Sharon Heung, Cornell Tech, New York, New York, USA
Mahika Phutane, Cornell Tech, New York, New York, USA
Shiri Azenkot, Cornell Tech, New York, New York, USA
Megh Marathe∗, Michigan State University, East Lansing, Michigan, USA
Aditya Vashistha∗, Cornell University, Ithaca, New York, USA

ABSTRACT

Ableist microaggressions are subtle forms of discrimination that disabled people experience daily, perpetuating inequalities and maintaining their ongoing marginalization. Despite the importance of understanding such harms, little work has been done to examine how disabled people are discriminated against online. We address this gap by investigating how disabled people experience ableist microaggressions on social media and how they respond to and cope with these experiences. By conducting interviews with 20 participants with various disabilities, we uncover 12 archetypes of ableist microaggressions on social media, reveal participants’ coping mechanisms, and describe the long-term impact on their wellbeing and social media use. Lastly, we present design recommendations, re-evaluating how social media platforms can mitigate and prevent these harmful experiences.

KEYWORDS

microaggressions, ableism, social media, disability

ACM Reference Format:
Sharon Heung, Mahika Phutane, Shiri Azenkot, Megh Marathe, and Aditya Vashistha. 2022. Nothing Micro About It: Examining Ableist Microaggressions on Social Media. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544801

∗ Marathe and Vashistha are joint senior authors.
¹ We use the term disabled people, since identity-first language is preferred by disability rights groups and the disability justice movement, and mimics the language our participants used. We recognize that being a disabled person is a part of one’s identity and experiences, and that a person is disabled by society, technology, and the built environment.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544801

1 INTRODUCTION

Disabled people¹ routinely experience different forms of discrimination, despite increased awareness about disability and legislation to protect disabled people [1]. One form of discrimination is ableist microaggressions, which are defined as subtle remarks or insults that are fueled by negative stereotypes of disability [50]. These “micro” forms of discrimination perpetuate inequalities, ableism, and stereotypes against disabled people while maintaining their ongoing marginalization [50].

Several scholars have examined disabled people’s experiences with ableist microaggressions in everyday settings and documented the harms emerging from such experiences. For example, Keller and Galgay examined different types of microaggressions disabled people encountered in person and presented a framework to interpret such microaggressive experiences [29]. Xiong developed scales to measure different types of microaggressive behaviors [56]. Such microaggressive experiences cause negative health outcomes (e.g., a greater likelihood of depression and anxiety) and affect academic performance (e.g., reduced academic self-concept and lower grade satisfaction) [28, 32, 35]. While these research advances provide the necessary foundation to study ableist microaggressions towards disabled people, all the work focuses on microaggressive behaviors in offline settings.

In recent years, social media has witnessed an unprecedented growth in users worldwide. An increasing number of social interactions occur through social media for both disabled and non-disabled people alike. With more disabled people turning to social media as a means to socialize with others remotely and form online communities, it is important to understand the feelings of exclusion and discrimination that occur on these platforms [3, 11–13, 25, 36, 48]. To date, little is known about the types of microaggressions disabled people experience on social media, and how these experiences differ across online and offline settings. To fill this critical gap, we sought to answer two research questions:

• RQ1: How do disabled people experience microaggressions on social media?
• RQ2: How do disabled people respond to and cope with microaggressions?

To answer these questions, we conducted semi-structured interviews with 20 participants who had a variety of disabilities. During the interviews, we asked them about a time when they experienced subtle forms of discrimination and felt excluded on social media, probing around their perception of what had happened and how they responded. We gained insight on how these platforms mediate such interactions between the disabled person and the perpetrators, as well as how these microaggressive experiences affected the disabled person’s social media use.
Our findings revealed 12 microaggression archetypes that disabled people experience on social media. Participants reported routinely receiving ableist posts, messages, and comments that made them feel patronized and infantilized. Participants had to endure invasive, personal questions regarding their disability, relationships, and intimacy. They also felt excluded when people on social media ignored their posts or accessibility needs and when users and platforms censored content posted by them. While some microaggressive experiences bore similarity to those happening in offline settings (e.g., denial of disability identity), we uncovered new forms of microaggressive experiences that are unique to social media (e.g., being ignored or ghosted). Participants described microaggressions as harmful experiences that hampered their self-esteem and confidence, left a lasting impact, and forced them to change the way they use social media. They responded to and coped with microaggressive behaviors in different ways. While some participants responded to microaggressions to educate the perpetrator or retaliate, many used platform features to report the offending comment and block the perpetrator.

Drawing on these findings, we discuss how microaggressions manifest differently on social media and present design recommendations on how social media platforms could mediate and prevent these harmful experiences. We make several contributions to the accessibility community and the conversation around online inclusion:

• We provide a comprehensive understanding of the various types of microaggressive behaviors disabled people experience on social media, their coping mechanisms and response strategies, and the resulting impact on their wellbeing, self-worth, and social media use.
• We explicate the differences in ableist microaggressions in online and offline settings.
• We discuss design recommendations to reimagine social media that is more inclusive and welcoming.

2 RELATED WORK
The term microaggressions was originally coined by Chester Pierce, a psychiatrist who studied these “subtle, stunning, often automatic, and non-verbal exchanges” towards African Americans [42, 50]. Since then, microaggressions have been re-defined as “brief slights and insults targeting persons of oppressed identities” [4, 50]. The use of the term has expanded to other marginalized populations, including the disability community. Similar to other structural forms of oppression (e.g., racism and sexism), microaggressions experienced by disabled people stem from structural ableism; hence the term ableist microaggressions.

To date, scholarship on ableist microaggressions has primarily focused on their occurrences in offline and in-person settings. Keller and Galgay [29] were the first to systematize the types of microaggressions uniquely experienced by disabled people. Based on focus groups with twelve disabled people, they developed a framework to categorize ableist microaggressions into ten distinct patterns. They found microaggressions about “denial of personal identity” when people ignored all aspects of a disabled person’s identity other than disability. Disabled people also encountered “denial of disability experience” when microaggressions minimized or denied their disability itself or experiences related to their disability. Microaggressions also consisted of “denial of privacy” experiences when people solicited private information relating to a person’s disability, or “desexualization” when disabled people were assumed to be uninterested in or incapable of sex. Disabled people also encountered “helplessness” when non-disabled people insisted on helping them even when no help was necessary or tried to rescue them from their disability. Disabled people were often a target of “patronization” when people lauded their unremarkable everyday actions as achievements, “infantilization” when people treated them as immature and dependent, and “second-class citizen” when they were treated as inferior or more burdensome than non-disabled people. Keller and Galgay also found a “spread effect” when people made broader assumptions due to a person’s disability (e.g., assuming that a blind person’s senses of taste or smell are more powerful than those of sighted people).

The foundational work by Keller and Galgay has been expanded upon by numerous scholars who have examined ableist microaggressions in offline contexts. For example, Bell [5] conducted an interview study with people who identified as having visible disabilities, confirming Keller and Galgay’s patterns of ableist microaggressions and developing new insights on the impact of microaggressions for visibly disabled people. Bell found that microaggressions place disabled people in a “disability double bind,” which causes an internal conflict between a disabled person asserting their identity as a capable person worthy of equal treatment, while also having to request accommodations for access purposes. Bell further described five strategies that visibly disabled people use when responding to microaggressions and noted that the ten patterns proposed in Keller and Galgay’s framework are not mutually exclusive; a single microaggression can embody more than one pattern.

Olkin et al. [40] conducted a mixed-methods study with disabled women and found two new types of ableist microaggressions in addition to confirming the patterns identified by Keller and Galgay. They found that disabled women encountered microaggressions when their health-related symptoms were not believed by medical practitioners or when they were told they looked too healthy or attractive to be disabled. Xiong [56] developed a prototype scale to measure ableist microaggressions. The 93 items in this scale combine Keller and Galgay’s patterns with microaggressions not specific to a disability context. The scale has been reviewed by disabled experts, although it remains to be validated [56]. While these research advances provide the necessary foundation to study ableist microaggressions towards disabled people, all the work discussed thus far focuses on microaggressive behaviors in offline settings.

A growing body of HCI research has examined online harassment and discrimination targeted towards various marginalized communities [7, 27, 31, 37]. For example, scholars have studied harassment experienced by Black people [22, 38, 55], Asian people [23], women [39, 51, 54], LGBTQIA+ people [8, 21, 22, 51], and Muslims [15, 30]. Although studies of online hate speech and disability disclosure have uncovered some instances of ableism [6, 7, 16, 37, 38], scholars have yet to examine the specific kinds of discrimination disabled people experience online. We extend the scholarship on (1) online harassment and (2) ableist microaggressions by examining new forms of microaggressions that disabled people encounter on social media.
3 METHODS
To examine ableist microaggressions on social media, we conducted interviews with 20 disabled people.

3.1 Participant Recruitment
We recruited participants through various sampling methods, including convenience sampling (recruiting those we knew), snowball sampling (referrals from recruited participants), and stratified sampling (recruiting for a diverse balance of gender and disability) [2, 45]. Participants were screened through a Qualtrics survey to ensure they fit our criteria: self-identifying as having a permanent or long-term disability, having a visible disability profile (disability disclosed online), and using social media on a regular basis (multiple times per week or every day). We required participants to disclose their disability because we sought to understand disability-specific microaggressions. This disclosure could occur via participants’ posts, bios, and/or photos.

3.2 Data Collection
We conducted semi-structured interviews with participants via Zoom. Due to access needs, three participants responded in text through the chat feature and one participant used a combination of voice and text. All interviews were conducted in English, lasted approximately 60 minutes, and were audio-recorded with the participants’ consent. Participants received Amazon gift cards in recognition of their time and expertise.

Each interview began with introductions, a review of the consent form, and a reiteration that the participant could take a break or skip questions at any time. Questions were open-ended to empower participants to share their experiences. The interview included questions about social media use, microaggression experiences, and avenues for improvement of social media platforms. We asked participants to share what platforms they use and what content they post that reveals aspects of their identities, to help contextualize their experiences with microaggressions (e.g., how much does your profile reveal about yourself and your disability identity?). We then asked participants to tell us about a time when they were treated differently on social media, when they felt unwelcome on social media, and when they experienced subtle discrimination or microaggressions on social media. We intentionally did not use the term microaggression earlier in the interview in case participants were unfamiliar with the term. From there we asked follow-up questions to better understand what happened (e.g., when did this happen?) and how it affected their social media use (e.g., did this change the types of content you shared?). Finally, we asked participants to reflect on how microaggressions might be unique to people with disabilities (e.g., do you think people with disabilities are more or less likely to experience microaggressions on social media?) and how the platform and other users could help during microaggressive experiences. We conducted interviews until theoretical saturation was reached.

3.3 Data Analysis
Our data consisted of approximately 17 hours of audio recordings and detailed notes collected during the interviews. After transcribing the interviews, we used thematic analysis [9] to conduct open coding to discover new categories of microaggressions and deductive coding to draw from existing in-person ableist microaggressions [29, 52]. The three coders first coded two interviews together, discussing discrepancies in codes and creating a preliminary codebook. The coders then coded the rest of the interviews separately, communicating when new codes arose by annotating the codebook. Throughout the analysis, we held multiple discussions to iteratively refine the codes and reconcile disagreements through peer-debriefing [17] to ensure that our themes comprehensively represent the data. In the end, we had 182 codes (e.g., response = no response) and 16 themes (e.g., “responding is pointless”).

3.4 Participant Demographics
Table 1 presents details about participant demographics and social media use. This information uses participants’ own words. We recruited 20 participants (11 male, 7 female, 1 gender non-conforming, and 1 who preferred not to disclose) with varying long-term disabilities (e.g., blind or visually impaired, deaf and hard of hearing, and neurodivergent). Participant ages ranged from 19 to 35 years (mean = 26.7). The majority (n=17) of our participants were from the United States, while three participants resided in India, Ireland, and the UK.

3.5 Positionality
Given the sensitive nature of discussing online discrimination and microaggressions, we believe it is crucial to reflect on our stance. Our team consists of authors with rich experience of working with disabled people and authors who have experienced ableist microaggressions. Although we present individual accounts of microaggressions, we take a disability studies perspective in striving for structural changes that mitigate and ideally dismantle ableism [34]. Our goal is to bring forward the voices of disabled people in how they are excluded online, and to intervene in conversations on online governance and content moderation.

4 FINDINGS
We present the different forms of microaggressions that the disabled participants experienced on social media (Section 4.1). We then share the aftermath of these experiences, discussing ways in which disabled people responded to (Section 4.2.1) and coped with (Section 4.2.2) these microaggressions. Finally, we describe participants’ views on the long-term impact of microaggressions and ways to help prevent and mitigate the harm (Section 4.2.3).

4.1 Types of Ableist Microaggressions
Although two participants could only recall experiences of overt discrimination, the majority of our participants described microaggression-related experiences in detail. Participants often referred to these experiences as “backhanded comments” (P11) or somebody “hid[ing] the fact that they’re trying to discriminate [against] me” (P16). Some added that microaggressions were “usually unintentional” and that, due to their subtlety, these experiences could lead to “overanalyzing” the event (P3). We present our findings in the form of 12 archetypes that closely mirror the actual
Table 1: Participant Demographics and Social Media (SM) Use.

P | Gen. | Age | Disability* | Ethnicity* | SM Use | SM Platforms | Disability Disclosure*
P1 | F | 19 | Several Physical Disabilities | Caucasian | Daily | Facebook, TikTok, Snapchat, Instagram, dating apps | via public profile (Influencer)
P2 | M | 29 | 100% Vision Impairment | Indian | | LinkedIn, WhatsApp | via posts
P3 | M | 22 | Duchenne Muscular Dystrophy | Indian-American | Bi-weekly | Facebook, Instagram | via personal chats on Messenger
P4 | X | 21 | Neurodivergent, "Cocktail of disabilities" | Caucasian | Daily | Twitter, Instagram, Tumblr, Snapchat, Discord, League of Legends chat | via smaller groups i.e. Discord servers, via description
P5 | F | 22 | Blind, Fiber Myalgia, PTSD, Lupus, Autism | Caucasian | Daily | Facebook, Messenger, Reddit | via posts, groups, and service animal
P6 | M | 30 | Mild Vision Issues, Speech-related | African American | | Facebook | via posts of hobbies that indirectly revealed disability
P7 | M | 26 | Visually Impaired | Black American | Daily | Facebook, Twitter, Reddit, Instagram, LinkedIn, WhatsApp, Snapchat | via photos
P8 | F | 30 | Depression (since childhood) | Caucasian | Daily | Facebook | via public posts, writing articles
P9 | M | 25 | Speech Disability | Black | Daily | Facebook, Twitter, Reddit | via personal profile
P10 | M | 32 | Speech-related Disability, Attention Deficit | African American | Daily | Facebook, Twitter, WhatsApp, Telegram, Instagram | via public profile, bio ("disability is not inability")
P11 | M | 21 | Cerebral Palsy, Wheelchair User | Caucasian | | |
P12 | F | 26 | Autism | Black American | Daily | Facebook, Twitter, Instagram, WhatsApp | via posts, stories
P13 | M | 35 | Epilepsy | African American | Daily | Facebook, Twitter, Instagram, Dating apps, WhatsApp | via personal chats
P14 | M | 35 | Leg Problem, User of Manual Wheelchair, Standing Stick | White | Daily | Facebook, Twitter, Instagram, WhatsApp, TikTok | via public pictures, posts
P15 | M | 35 | Hard of Hearing and Other Physical Impairments | Black | Weekly | Facebook, Twitter, Instagram, TikTok | via private status, photos for friends
P16 | M | 25 | Visually Impaired (low vision) | Black American | Daily | Facebook, Twitter, Instagram, LinkedIn, WhatsApp | via posts, pictures for friends
P17 | F | 32 | Epilepsy, Autism (since childhood) | Black American | Daily | Facebook, Instagram, WhatsApp, LinkedIn | via bio description, writing articles
P18 | F | 24 | Autism | Black | Daily | Twitter, Instagram, LinkedIn, WhatsApp | via pictures, posts
P19 | X | 21 | Ehlers-Danlos Syndrome, Endometriosis, Neurological Disorder, Chronic Pain | White | Daily | Facebook, Twitter, TikTok, Youtube | via public description, bio, content
P20 | F | 23 | Wheelchair user | Pakistani & Irish | Daily | Twitter, Instagram | via bio ("wheelchair user")

* Self-identified and self-dictated by participants
manifestation of a microaggression. These archetypes were intentionally chosen to preserve the visceral impact, nuance, and emotional intensity that accompany instances of microaggressions.

4.1.1 Patronization and Infantilization. Participants shared several instances of microaggressions where they felt patronized and infantilized.

You’re so inspirational. Patronizing comments were the most common microaggression that our participants experienced. More specifically, this manifested in comments where other users exaggerated the participant’s routine activities as inspirational or even sometimes “glorif[ied]” a disabled person for living a normal life. According to P2, “[people] keep posting ‘oh, this is so cool’ and ‘you’re so inspiring’ just because you might have done something very normal.” P11 expressed the same sentiment, saying the most frequent microaggression is “oh my God you’re doing this thing independently like that’s amazing [and] I can’t believe you’re functioning as a human person... phrased in 1000 different ways.” P1 also received comments like “you’re so inspirational” and explained why this particular microaggression was upsetting:

“I hate that so much. I’m not inspirational for going to class and expressing to my friends how much I don’t want to be in class. That’s not inspirational, that is what every other college student does.”

Other instances of patronization were more subtle. P20 received patronizing comments on two occasions. In one instance, when she posted about a night out, she received comments like, “you seem so happy being out” and “that’s great for you,” while non-disabled people received “usual comments like emojis” (P20). On another occasion, when P20 posted about going on a hike, people commented, “you look so happy.” “People don’t expect me to be able to hike because I use a wheelchair,” P20 explained.

P20 connected her online experience with face-to-face patronization. Her friends patted her head and said “good girl, good boy.” While she did not think they had ill-intent, P20 thought this behavior was inappropriate, stating that she’s not a “dog” and no one likes to be “touched without permission.”

Some participants expressed discomfort when others glorified disabled people. For example, P20 found videos that glorified asking a disabled person to prom, characterizing it as a “heroic thing.”
Similarly, P2 saw this on a professional networking site, where someone commended himself for changing the life of a disabled person:

“I think about six months back there was some random post on linkedin... and this person was a stockbroker and he said I gave an internship to someone with a disability and he [the intern] made a profit... and now he [the intern] also has a girlfriend or something like that. I got really pissed... where did the disability thing and the girlfriend come into play.” (P2)

Where’s your mom? Several participants expressed another type of microaggression where people infantilized a disabled person, treating them like a child, perceiving them as “naive,” and often discounting their opinions and life experiences. For example, when P1 posted videos about her daily life on TikTok, she experienced “a lot of infantilization” like “oh where’s your mom?” and “oh you live by yourself?” She felt that people believed that she was incapable of being independent and needed assistance in doing daily chores. While she understood that this might be because of ableism or people’s limited understanding of disabilities, she felt frustrated when people commented that “grocery shopping and doing laundry is cute” (P1).

4.1.2 Disability as Inability. Several participants experienced microaggressions when people made assumptions about what a disabled person can and cannot do. In extreme cases, some microaggressions questioned disabled people’s ability to contribute to society and their very existence.

Can someone like you do that/wear that? While some microaggressions glorified disability, others directly assumed a lack of ability. P16 recalled “rude comments” from people making assumptions about his disability, such as “oh sorry, you can’t do this because you’re visually impaired.” Others shared ableist microaggressions that shamed them for participating in social media challenges. P12 explained,

“I made a dance video on a popular song and posted it on my WhatsApp [status]... someone on my friend list said I don’t have to follow all the challenges and make dance videos when I can’t move my body according to the beats.”

These ableist assumptions carried over to what a disabled person can do with their professional and career goals. For instance, P13 partook in a WhatsApp group to talk about business ideas. After P13 shared a proposal, someone asked, “can someone like you do that?” On the other hand, when a disabled person was perceived as successful, perpetrators invalidated their success by attributing their achievement to charity. As a TikTok influencer and performing artist, P1 explained how people second-guess her success in acting:

“People think that when I get hired for a job in the entertainment industry it’s simply out of pity... I get a lot of people who genuinely think that everything is handed to me because I’m disabled... people are like ‘wow I bet the government paid [for you getting the work]’, which is funny because the government doesn’t pay for anything.”

Beyond comments about ability, a few participants shared instances when people commented on how disabled people should look and dress. For example, P18 described how her disabled friend, who has a physical disability where “one side of her hand is shorter than the other,” received a comment on Instagram about her wardrobe. The perpetrator asked her to “wear something [with] more covering because of her body and the hand” (P18).

In a more severe case, P13 shared thoughts about the COVID-19 pandemic on Facebook, and a “friend” on the platform expressed surprise that P13 had not died due to his disability. “Even you survived this?” the perpetrator asked.

I would kill myself if I was disabled. Some participants experienced extreme microaggressions with eugenic undertones. P1 received a comment saying “wow you’re so brave, I would kill myself if I was disabled.” Such microaggressions were also instances of patronization. Others recalled microaggressions that indicated severe stigmatization of a disabled person. P17 received comments from family members on Facebook referring to her as a “mistake,” and P14 received a comment from a stranger on Facebook saying, “you are disturbing me with your disability.” Recalling an in-person encounter, P1 shared that “old white women” tried “to pray the disability out of [her].”

4.1.3 Denial of Disability Identity and Experience. Participants experienced a range of microaggressions where others accused them of faking their disability or questioned their disability experience.

You’re lying about your disability. While some microaggressions stemmed from assumptions about disability, one type of microaggression involved questioning whether participants were disabled in the first place and doubting the degree of their disability. P1 shared instances when “being disabled comes into question” with accusations of faking the need for a wheelchair. Similarly, P5 recalled being accused of faking her disability due to others’ lack of awareness of assistive technology for people with visual impairments:

“Somebody will [comment] ‘how are you on Facebook, or how are you on reddit if you have vision issues’ suggesting that you’re lying about having vision issues, because you wouldn’t be on social media if you [have] vision issues. Like literally it makes absolutely no sense. This one actually happened today.”

Such microaggressions were alike in both offline and online settings. For example, P5 described how she struggled with medical professionals who did not believe her when she described aspects of her disability. P5 experienced a severe migraine episode for several days where she became completely dehydrated. As she was wheeled onto an ambulance, the EMTs were “actively denying” the possibility of her having light-sensitive migraines because she is legally blind. Similarly, P19 recalled an instance that they described as “medical gaslighting.” Health professionals assumed the pain was “in [their] head,” leading to a misdiagnosis of endometriosis which “nearly killed” them.

That’s not a disability. Although not as extreme as accusing a person of faking a disability, participants recalled microaggressions when aspects of their disability experiences were invalidated. Perpetrators had their own assumptions of what “counts” as a disability experience. For instance, P4 had an invisible physical disability and
recognized that the hidden nature of their disability caused “invalidation from both non-disabled and disabled people.” On Tumblr, they experienced microaggressions from the disability community along the lines of “oh well that’s not what we meant by disability.” P4 fretted that disclosing their disability “invited ableism” and not disclosing it resulted in invalidation.

Some participants felt invalidated when they were compared to people who have other forms of disability. For example, P19 shared that people often commented that she is not disabled because she is “not in a wheelchair.” P8 described a similar situation when others invalidated her mental-health disability. P8 was diagnosed with depression at a young age and used Facebook to write about her experiences. She recalled an incident where a Facebook friend accused her of lying:

“There was a time that I wrote about myself being pregnant and being depressed... There’s some people [who] actually think that depression is... someone’s personality, they don’t believe that depression is actually a disability, so I was trying to tell everybody how I was doing and how I was taking care of myself.. A [Facebook] friend was like I should stop lying... stop saying things that are not real... I should stop doing what I’m doing because people would not appreciate what I’m doing.” (P8)

While some participants received invalidating comments from people, P5 felt invalidated when her posts were removed by the platform as a result of other users flagging her posts. She shared that at least five of her posts that disclosed her disability were reported and taken down on Reddit. For example, one post described an outing with her service dog and another was about her preference in canes that were being given out by her local government agency. Given the lack of visibility on why a post is reported, she felt frustrated that her disability experience was not only invalidated but also entirely removed from the platform. Like P5’s, P1’s posts were reported on TikTok by “trolls” who wanted to cause the platform to lock her account without any reason. P1 felt that these experiences are frequent for disabled activists who “have had their whole accounts removed and completely erased because [they] talked about issues that people are uncomfortable with.”

You’re abusing your service dog. All participants who used a service dog were accused of animal mistreatment. For instance, after posting a video of her and her service dog, P1 had a person accuse them of “abusing her dog and say[ing] [her dog] doesn’t like to work” (P1). As a service dog handler, P11 echoed a similar misconception: disabled people “force” dogs to work. In addition, P5 has been accused of faking the need for a service dog. She was denied access to a service dog-related Facebook group because the administrators thought she was too young to have one.

4.1.4 Invasion of Privacy and Denial of Meaningful Relationships. Participants recalled experiencing microaggressions in the form of invasive, personal questions regarding their disability and their sexual activity. In addition, participants received questions about their romantic relationships, indicating an assumption that a disabled person cannot form meaningful, intimate relationships.

What happened to you? Many participants reported being asked personal questions which they “didn’t feel comfortable answering” (P3). For example, someone on Reddit asked P5, “if you’re blind, how do you know when you’re on your menstrual cycle”. Some of these invasive questions related to participants’ disabilities, for example, about “how [they are] functioning in a wheelchair” (P18) and “what happened to your eyes and why are you [wearing] blocky glasses” (P16). P3 emphasized that these questions stem from a “place of curiosity, but at the same time it’s not always good to be curious about things like that... There are boundaries people have.” P11 called this “the line between being curious [and] being offensive.”

Can you have sex? Another type of microaggression participants shared was being asked inappropriate and crude questions about their sexual health and activity. P11 described instances of being asked personal questions about sex both in person and on social media:

“Random strangers inquire about my sexual function.. they see a person in a wheelchair, and they think.. if their reproductive system works properly and it’s weird but it’s happened several times. Sometimes in the rare instances where I have had conversations with strangers on social media in DMs which isn’t something I do a lot but I’ve done it a few times... People have questions about things that it just makes me stop and think for a second... you can ask me any question in the world and why is that the first thing you want to know about me.”

P1 also recalled experiences of being asked about her sexual activities. Even in public places like a grocery store, she was asked: “can you have sex?” Participants believed that the perpetrators objectified disabled people and took liberties, often crossing lines. P1 recalled how she was asked such questions before she turned 18 and her immediate thoughts were “why are you asking a 16 year old girl these questions?”

Are you sure your husband loves you? Participants described instances when perpetrators assumed that disabled people are incapable of forming intimate relationships. P8 explained that when she got married, she received comments on Facebook that ranged from disbelief to accusations of her lying about being married. She described how insensitively some people reacted, including comments like “You got married really? Are you sure your husband loves you? Are you sure you are not lying?” Similarly, P1 recounted how her disabled friend, who is a social media influencer, received harsh comments from people when she announced her relationship with a non-disabled person. P1 detailed that the trolls did not view it as a “normal human relationship,” and commented that her friend “must be paying” the boyfriend to be in the relationship.

4.1.5 Being Ignored and Excluded Online. Participants felt targeted by microaggressions when people on social media purposely ignored or “ghosted” them, making them feel unwelcome. Participants also shared experiences of feeling excluded due to the inaccessibility of social media, social media challenges, and online content in general.

Being ignored. Although the majority of microaggressions were specific actions, this type of microaggression was a lack of action or inactivity on social media. P3 labeled being ghosted or “left on read” (when the perpetrator sees the message and does not respond) as a microaggression. During this time, his friends kept making
Nothing Micro About It: Examining Ableist Microaggressions on Social Media ASSETS ’22, October 23–26, 2022, Athens, Greece

excuses not to talk or hang out with him. P3 eventually concluded that his friends didn’t want to be friends anymore because “they think I’m different [from] other people...It’s not direct [aggression] but more of a subtle thing.” Instead of being ghosted, P10 experienced a similar microaggression where he felt ignored when people did not comment on his Facebook post:

“I posted a recording of myself talking about ways that the disabled can be considered in society but. . . I didn’t get anyone who could support me in the comment section so I felt so bad that people with disabilities can be easily ignored on Facebook. I expected some comments about that so that I know people care about the disabled. I don’t [usually] delete my posts but I had to since. . . no one cared.”

Exclusion via inaccessibility & moderation. A few participants experienced microaggressions when the perpetrators purposely shared inaccessible content with them. P5 described how her roommate changed the text color to orange in a group chat on Facebook Messenger, fully knowing that the color would make the chat inaccessible to P5. Another participant, P14, found the behavior to be microaggressive when people tagged him in social media challenges (e.g., dancing challenges) that he cannot do because of his disability.

Many participants mentioned that inaccessibility of social media content was a source of feeling unwelcome and excluded. For instance, P2 felt unwelcome on social media when people in his WhatsApp groups posted pictures without captions and alt text and then engaged in conversations about them. Similarly, P1 found it frustrating when content creators on TikTok and YouTube uploaded videos without closed captions. P1 also expressed resentment toward exclusionary content moderation systems that are ill-equipped to meet the needs of disabled people. She described how TikTok flagged her videos and banned her from live streaming, believing her to be “underaged because of [her] dwarfism.” Participants also reported experiencing microaggressions when there was a delay in addressing issues they experienced online. For example, while P1 recognized that she was blocked as a measure for minor safety, she found it frustrating that her appeal remained unaddressed for over a year.

4.2 The Aftermath of Microaggressions

Having described the different types of microaggressions disabled people experience online, we now dive deeper into the aftermath of experiencing a microaggression, sharing how participants reacted, responded, and coped on the social media platform. We then share participants’ perceptions of the long-term impact and ideas of what social media can do to mitigate these experiences.

4.2.1 Responding to Microaggressions. There was a wide spectrum of reactions to microaggressions. Some participants claimed “social media is not meant for the disabled” (P14) and some “want[ed] to delete [their] whole online existence” (P19). Others felt lonely at first (P3) and some were “speechless” (P5) and “so upset to the point they didn’t even know what to do” (P8). However, with time, the way participants reacted to microaggressions also changed. For example, P14 described how initially microaggressions on social media would make him depressed and heartbroken, but later he would “calm himself down, leave social media for a week, and move on with life.” P9 referred to microaggressions as “norms,” while other participants expressed they are “used to it already” (P12) and “unfortunately no longer surprised” (P1).

Whether to Respond or Ignore. Regardless of their emotional reaction, participants decided whether to respond to or ignore the microaggression based on the identity of the perpetrator. P2 explained how if a patronizing microaggression came from a friend, he “would sit them down and speak to them” (P2). However, if the perpetrator was “really old or from a previous generation,” he would not put effort into changing their mindset because “he might not be able to change them.” Similarly, P11 called out that he might be more likely to respond to children or “younger people” who are just being “curious.” Nonetheless, responding to microaggressions exerted an emotional toll and one “might not be in the mood” to handle it (P2). P20 elaborated on the trade-offs:

“Some people will report those comments. I always try to have an open conversation with them, because the reason I am on social media is to try and educate people. So it would be remiss if I just deleted them, but then I am getting to the point where I am like how much of my physical energy is this costing?”

Similar to P20, P1 appreciated her social media followers who replied to ableist comments for her. Another participant recalled being the one to defend and respond on behalf of a disabled friend. P18 stated that she defended her disabled friends by engaging with people who were the perpetrators of microaggressions. She replied to a comment about clothes her friend should not wear and defended her friend by commenting: “she’s disabled and that doesn’t change who she is so I will be glad if you people don’t throw shit on disabled people [because] we can actually be who we want to be.” (P18)

Some participants saw a microaggression incident as an opportunity to educate the perpetrator. P8 described how she “was so pissed that [she] had to just respond.” Instead of expressing her anger, she described her mental disability to the perpetrator and outlined the lived experiences of people with such disabilities, in the hopes of “proving” to him that “this exists.” Similarly, P1 described a mantra she told to those who asked her invasive questions: “if you aren’t going to say it to a non-disabled person, don’t say it to a disabled person; it’s pretty simple.” P19 described how they felt responsible not only to educate the perpetrator but also their followers:

“If I think that it’s a learning opportunity [and] a common misconception I will screenshot it and block out the name and post it on my story. I like to explain why this isn’t okay and why it’s damaging. But I try not to do that because. . . even if you block out the name, people can stalk your videos. . . so I’m not sure, because I don’t want to send hate back. But I don’t want to let that opinion exist.”

Although participants recognized the benefit of responding as a means of education and advocacy, some wondered if it was worth the effort. P1, a disability activist on social media, questioned the effectiveness of educating about disability on social media and felt that “people are so caught up in their [own] mindsets. . . there is no point in talking to them.” Other participants decided that responding
ASSETS ’22, October 23–26, 2022, Athens, Greece Heung et al.

is pointless because they cannot change the perpetrator’s views or their own disability (P18). Therefore, some participants felt there was nothing more to do “but to accept it and move on” (P14).

Some participants decided not to respond to microaggressions because they felt that doing so would add more fuel to the fire. P8 viewed the microaggressions as a “test” of their resilience against “social media bullies who are looking for her to get mad.” P18 would “feel more weak” responding to the perpetrator, failing the test. Similarly, P1 viewed the perpetrators as people who are bored and found it most effective to not “give them attention.”

A few participants responded with humor; P11 described how he “dish[es] it right back to them” and P5 “often snapped back with a sarcastic comment.” For instance, on a local Facebook group, P5 warned others of a road where she “almost sprained her ankle” because of her “vision issues.” She described her response to the patronizing comment:

“Somebody comments well you should have just driven to your appointment then. . . I replied: if I can’t manage to step off a curb correctly because of my depth perception I don’t think you want to trust me on a road.”

4.2.2 Coping with Microaggressions. Participants described several coping strategies to reduce the risks and harms of microaggressions. While some turned to distracting or distancing themselves from the microaggression by engaging in a hobby or finding refuge in entertainment, several participants shared other ways they coped with microaggressions, which involved the use of certain social media features and changes in social media behaviors.

Deleting, Blocking, and Reporting. Rather than replying to the perpetrator, some participants responded by taking action on the platform, either by deleting and reporting offensive comments or by blocking the offenders. Five of our participants had deleted comments or posts. Not only did they delete the comment to “let it go” but also as a means to forget and recover.

Several participants opted to report the posts and comments containing microaggressions. P9 described an incident where he had posted a quote on Facebook and the perpetrator responded that he is “not supposed to post that kind of stuff” because of his disability. P9 decided to not respond because he was not in the “best mood” and did not want to infringe on the perpetrator’s “freedom of speech.” Because the perpetrator was a Facebook friend and someone he knew from college, P9 was also reluctant to block him. Instead, P9 took a “screenshot of the message and sent it directly to Facebook.” P9 elaborated:

“It was a friend of mine that I do not want to lose. I don’t want it to look like I blocked [him]... but there’s a need to actually cope with that kind of stuff. . . to stop it from recurring. . . I’m not that kind of aggressive human being [that usually] blocks or deletes. . . because you can’t just block everybody.”

Like P9, other participants also felt that blocking and unfriending people they cared about could burn bridges and “make enemies” (P7).

A few participants decided to block the perpetrator to “prevent” the microaggression from happening again. P14 explained that, “when it happened to me I just [had] to get the guy blocked and leave social media for a week.” Some participants found blocking people and deleting comments to be a better recourse than reporting. For example, P4 expressed frustration with the “current reporting systems and algorithms” and believed that microaggressions go unchecked because “there are no repercussions for the [perpetrators’] actions.” Not only are the current processes and underlying algorithms inadequate to counteract microaggressions, they often amplify undesirable behaviors instead of curbing them. P4 reported and blocked perpetrators on Twitter rather than responding, to prevent other users from seeing the original post:

“If [they] take the time to quote tweets and [the tweets] get more popular then the algorithm will support [their] posts and then people can see the original post. . . . [They are] putting that message to a bunch of people who haven’t necessarily consented to reading that [and]... what if [they] boost this person’s post to someone who actually agrees with it.”

Changing Social Media Use. Immediately after experiencing the microaggression, many participants took a break from social media and reflected on what they could do to avoid experiencing such negative slights and offensive behavior. For example, immediately after the microaggression, P12 did not post for several days, P15 reduced “visits to Facebook,” and P16 “backed off from social media” to spend more time with family.

Although this time off from social media was a short-term effect, the microaggression experiences forced participants to set boundaries on how they use social media, when they use social media, and what they post. For example, P1 and P19 described that to avoid being affected by microaggressions, they stay away from social media right after they wake up or right before bedtime. P2 even deleted his social media apps every few days as a habit to protect his mental health. P4 and P6 became “passive users” and decided to share “less information” about themselves, including their disability. Similarly, P16 described how he does not post pictures of himself to avoid being a target of microaggressions:

“I don’t really post my personal pictures due to [my] disability. I sometimes post memes or funny pictures. I [do] like other people’s posts, engage [with] comments, scroll through my feed, [and] watch other people’s stories on instagram... If I post private pictures...on my Facebook I make it private to reduce discrimination and prevent unwanted questions on vision impairment.”

Like P16, several participants believed that they were a target of microaggressions because of their disability and opted to hide their disability identity. For example, P14 avoided posting on TikTok because “there is no way [he] can do a video without showing [his] legs.” By becoming passive users and limiting the amount of disclosure of their disability, P14 explained that this will “avoid all sorts of embarrassment or harassment on social media.” P1 also took specific measures in the types of content she created to avoid going viral on TikTok. Previously she had a TikTok video with 3 million views and “90% of the comments [were] negative” with hate, which made her consciously post content that would not go viral. She highlighted how a viral video has impacted what she posts:

“I have realized that the videos that go viral for me are ones where I present an issue surrounding disability and
say this is why this is offensive. . . I call non-disabled people out for the things that they do to disabled people. Those are the kinds of videos that go viral and. . . bring me a lot of hate from non-disabled people who feel called out. . . if I post about my daily life, those things don’t go viral. People think that having a viral video is the perfect idea. . . when in reality for marginalized groups, it brings a lot more hate than it does good so a lot of activists out there don’t want their videos to go viral and. . . have left platforms.” (P1)

Although some participants explicitly said they post less frequently, a few stopped posting altogether. P8 explained how these harmful experiences have led her not to post anymore on social media even though “she wants to post but then think[s] about what people will say.” As someone who blogs about depression, she still writes but does not share her writing on social media because she does not “want to see bad comments.”

4.2.3 Mitigating the Harm of Microaggressions. Participants acknowledged that coping does not repair the harm caused by microaggressions. P18 emphasized this by saying:

“If there is a person that wants to block the person for your own good, then go ahead but what has been said has already been said. . . a person can’t take back the words and its effect on you.”

Long-Term Impact. Participants reflected on the long-term effect of microaggressions on their self-perception. P2 described how patronizing microaggressions have changed the way he interprets compliments on social media. He explained “being patronizing because of disability in certain cases is super evident, but there are cases when it might not be.” As a result, he second-guesses the comments that say “he’s handsome.” Others more broadly shared how the microaggressions have affected their self-esteem, adding to the difficulties of having a disability. P3 explained:

“It [microaggressions on social media] definitely affects your confidence and self-esteem. It [self-esteem] drops... You already have it pretty low because of your physical difficulties and things you have to deal with normal people don’t have to deal with and on top of that, you have to deal with these issues like people not wanting to be friends with you [and] ignoring you.”

Some participants felt that they were unable to stop these microaggressions. P19, who is a disability activist, described how these microaggressions left a lasting impact on her:

“Even though I do have the language to combat [microaggressions] now, I’m in such a state of shock whenever it happens because it brings back memories and because it’s just a nasty thing to experience that I don’t feel like I’m doing as good a job, as I could, of being an advocate, which is another like layer of guilt.”

In addition to shaping self-perceptions, microaggressions showed participants how mainstream society perceives disability. For example, P11 highlighted how social media in particular exposes these views since it “gives an opportunity to see the kinds of people out there and people will say a lot through a screen [that] they won’t say to your face.” P19 further explained that “disability is such a taboo topic [and] we’re so dehumanized that people don’t even realize that it matters whether we’re here or not. . . They don’t think we can feel pain or we’re intellectually disabled.” P1 revealed what she learned about society’s view on disability through microaggression experiences:

“Being disabled is publicized unfortunately if you have a platform or not everything is publicized everybody thinks that it’s public information. So your sex life, your relationships with other people, how your family [and] friends treat you people just think it’s charity. . . especially with having my platform grow on social media, [I see] the ableism that is just so prevalent in our society.”

Re-Imagining Social Media. Participants shared ways in which social media platforms could prevent the microaggression from happening in the first place. Some participants suggested that social media platforms adopt stricter community guidelines that explicitly protect disabled people and a process for removing users if they do not abide. P15 shared that social media platforms could ask all users to watch a short video about discrimination. Other participants wanted social media platforms to take responsibility for disability awareness by educating users about disability, ableism, and disability pride month. Taking this a step further, P11 recommended that platforms educate users when the microaggression occurs.

“I think exposure is the only way to combat microaggressions. . . that’s starting to happen on social media more now but it wasn’t until the past couple of years. . . you have more disabled creators on social media trying to combat those narratives but I definitely think the best way to do it is just to have proof of those things on social media being wrong. So you go to the places where the microaggressions are and prove them wrong.” (P11)

Other participants thought of ways to mitigate the harm once the microaggression has been said. For example, P16 suggested having a bot to “automatically remove” the microaggression on his behalf, delete “rude words,” or automatically “reply to [inconsiderate] private messages.” P5 furthered this notion by proposing a bot to automatically send an “informative video” when people ask “how can you be blind.”

Some participants wanted improvements to reporting and moderation on social media. P4 explained that there is a lack of transparency and accountability when reporting a post. From their personal experience reporting on Tumblr and Twitter, they were unaware if “someone who is reported has actually been punished.” However, their experience was different in League of Legends (an online game), where the administrators promptly addressed the matter. P4 felt “good to some extent” that the perpetrators were “minutely inconvenienced.” Participants recognized that the perpetrators found workarounds that prevented their comments from being removed. For example, P4 and P20 reported that the perpetrators often replaced letters with numbers or special characters in hate speech to avoid being flagged by the underlying content moderation algorithms. In addition, some participants felt frustrated when the algorithms removed hateful posts without considering context. For instance, on a Facebook group to talk about abuse and trauma, P5 had a post removed because it included hateful words that were part of someone’s experience rather than being directed at anyone. P5
explained how exasperating it was when Facebook kept removing the post, and every time they appealed, the group got more “strikes.” P1 also noted issues with the slow appeal process, wishing that there was a better way to verify age. P19 explained why ableist microaggressions are hard to automatically identify:

“With disability, the language is so vague. You can’t really pick out a word that they use. . . that’s a microaggression and that’s a hate crime. Because it’s the tone. There’s no button for saying they’ve implied that it’s okay that I die. But clearly that’s not an okay thing to say, but there’s no way of reporting that kind of thing.”

5 DISCUSSION

This paper has described 12 archetypes of ableist microaggressions experienced on social media, unpacked how disabled people respond to and cope with these incidents, and explained how they reimagine a more welcoming social media. We now discuss how microaggressive experiences are unique to social media and present design recommendations on how social media platforms could mediate and perhaps even prevent these harmful experiences.

5.1 Ableist Microaggressions on Social Media

5.1.1 Categorizing Microaggressions. “Being ignored” and “exclusion via inaccessibility and moderation” are the two novel types of microaggressions our study uncovered. “Being ignored” occurs when a disabled person is neglected or ghosted by others. This is a particularly interesting microaggression because it is characterized by the lack of action, whereas prior work has generally conceptualized microaggressions as a problematic action. Thus, a pointed silence when encountering information about a person’s disability can be just as hurtful as a negative response. A more appropriate response would involve sincere engagement with and a willingness to learn about disability and ableism. “Exclusion via inaccessibility and moderation” refers to two instances: 1) when people share inaccessible content with a disabled person despite knowing about their access needs; and 2) when moderation algorithms ban disabled users or flag their content due to normative expectations around what a legitimate user looks like. Participants reported feeling frustrated by exclusionary experiences that arise from content moderation algorithms and policies, because the mechanisms intended to protect users result in making disabled people feel unwelcome and excluded on social media.

We confirm that the ableist microaggressions identified in prior work also occur on social media. Five of our archetypes map directly onto previously-identified offline microaggressions. “You’re so inspirational” corresponds to patronization, while “where’s your mom?” embodies infantilization [29]. Similarly, “you’re lying about your disability,” “that’s not a disability,” and “you’re abusing your service dog” represent specific manifestations of the denial of disability experience [29].

Five other archetypes could be categorized into multiple types of existing microaggressions. The archetype “can someone like you do that/wear that?” simultaneously makes assumptions about a disabled person and ignores other aspects of a disabled person’s identity. Hence, it embodies the spread effect and denial of personal identity patterns [29]. The archetype “I would kill myself if I was disabled” is related to the helplessness pattern, in that it characterizes a frantic urge on the part of a nondisabled person to cure or rid another person of their disability [29]. The eugenic undertone of this archetype also questions the worth and very existence of disabled people, exemplifying second-class citizenship [29, 49]. The archetypes “what happened to you,” “can you have sex,” and “are you sure your husband loves you?” represent instances where disabled people are asked invasive and offensive questions about a person’s life that are typically considered private. These three archetypes map onto the denial of privacy pattern [29]. The apparent comfort with which nondisabled strangers inquire into a disabled person’s history or sexual activity betrays the understanding that disabled people can be treated as objects of frank curiosity. These two archetypes also embody an implicit sense of second-class citizenship for disabled people [29]. Questioning the legitimacy of intimate relationships denies the possibility that a disabled person could be a sexual being and intimate partner, and thus corresponds to the desexualization and spread effect patterns [29]. We support Bell’s assertion that the types of microaggressions are not mutually exclusive, as evidenced by the instances in which our archetypes embodied more than one type [5].

5.1.2 Responding to Microaggressions. Keller and Galgay wrote that “we understand least” the “strategies [disabled people] use to deal with” microaggressions [29]. Our work addresses this critical gap by examining how disabled people cope with microaggressions and the strategies they employ to counter, take action against, and respond to microaggressions in the moment and over the longer term. We distinguish between and describe three temporally distinct responses to microaggressions: the reaction that immediately follows a microaggression, the coping mechanisms that people use to move past a microaggression, and the longer-term strategies that people use to respond to microaggressions and prevent them from occurring in the future.

Our findings confirm known response strategies, including active self-advocacy around disability; parsing the intent behind a microaggression to determine whether it was meant to be hurtful; educating others to counter stereotypes and myths about disability; responding via humor, sarcasm, or direct communication; and ignoring microaggressions and avoiding perpetrators [5]. Additionally, we uncover response strategies unique to social media that disabled people routinely use. These include modulating the frequency and timing of social media use; being cautious about the topic and format of the content they share; blocking and reporting perpetrators or problematic content; and deleting social media content, both their own and the offender’s. These strategies align with recent work on understanding harassment experienced by content creators and Black women [38, 51]. We also unpack the decision-making that goes into choosing which strategy to employ, finding that context and the perceived likelihood of being able to change the perpetrator’s mind play a key role in this decision. Finally, our participants point to the labor that goes into experiencing microaggressions. Not only does the recipient have to spend time and energy figuring out whether an action truly was a microaggression, but they must also cope with their reaction, decide how to respond, and perhaps advocate for themselves with the
perpetrator or platform, thereby opening themselves up to more microaggressions.

5.1.3 Comparing Experiences of Online and Offline Microaggressions. Extending the study of ableist microaggressions to social media provides us with an opportunity to begin comparing how microaggressions are experienced across online and offline contexts. In both online and offline contexts, participants found it challenging to definitively establish an incident as an ableist microaggression. Microaggressions are subtle and rarely contain slurs or curse words, relying instead on tone and form. For instance, a seemingly harmless question or admiring remark makes it difficult to identify the perpetrator’s intent and makes disabled people question their felt experience: intentional hurt versus innocent curiosity, and backhanded remark versus compliment. Participants also reported that ableist microaggressions are usually unintentional. Prior work has interpreted the lack of intent to mean that the perpetrator does not mean to cause harm [29]. However, our participants said that perpetrators do intend what they say or do, but they typically have not thought through the implications of their actions.

Participants described that ableist microaggressions are more frequent online, perhaps due to the lack of consequences for actions on social media and the anonymity afforded by some platforms. Further, participants worried that the sheer scale and visibility of social media mean that online microaggressions are much more likely to affect the views of observers. That is, online microaggressions are more worrisome due to their ability to shape the views of many, leading to larger-scale ramifications in terms of how these users will think about and treat disabled people in the future. Participants also noted that perpetrators may continue harassing them on other platforms. This cross-platform harassment makes it pointless to delete individual posts and comments.

However, participants reported that online microaggressions can be easier to ignore and move past than offline ones. The relatively public nature of online microaggressions means that there is a higher chance that disabled people can rely on friends, disability advocates, and allies to intervene on their behalf. Further, some participants reported that perpetrators of online microaggressions rarely have the power to actually deny disabled people a material resource. This is unlike offline microaggressions, where the disabled person may rely on the perpetrator for something such as food, transportation, or medical treatment. In these cases, the perpetrator may also occupy a position of authority, granting a veneer of legitimacy to the microaggression and its underlying message.

Finally, the same type of microaggression can manifest differently in online and offline contexts. The archetype “that’s not a disability,” which corresponds to a denial of disability experience [29], typically manifests offline in the form of words or actions that

referring to systemic marginalization removed despite adhering to platform content policies [22]. We propose that scholars continue to develop this comparison between online and offline microaggressions for various minoritized groups, due to its potential to yield broader insights about marginality, intersectionality, and power dynamics in the digital world.

5.2 Design Recommendations

Currently, most major social media platforms incorporate moderation tools (e.g., reporting, community policies) [41] and educate their users on hate speech and harassment [14]. Therefore, it is important to consider how social media platforms might account for ableist microaggressions. We discuss difficulties in considering microaggressions within online moderation and recommend that social media designers and researchers educate users on disability and ableism.

5.2.1 The Challenge of Microaggressions Within Online Moderation. Prior work on social media moderation has documented flaws within current online moderation systems, such as unfair and false reporting [22, 27, 46], mislabelling of harassment [7], and a lack of transparency in the harassment reporting process [7]. Our participants echoed these flaws in moderation, leading us to wonder how microaggressions should fit within these systems. Several participants in our study felt uncomfortable blocking or reporting perpetrators, due to the unintentional nature of microaggressions and to avoid severing personal relationships (see Section 4.2.1). Others felt that everyone was entitled to their opinion, and that blocking was unnecessary. This aligns with prior work viewing online harassment through a restorative justice lens: removing perpetrators of microaggressions or banning their profiles may not serve justice to disabled people [47], or align with their preferences and beliefs. Overall, current moderation tools (i.e., blocking and reporting) are designed to account for overt cases of harassment. Future work should continue to investigate preferences for handling microaggressions and more subtle forms of discrimination.

Due to their inherent subtlety and nuance, microaggressions may be difficult to detect automatically. Microaggressions are often personal, and may be offensive given the prior history and type of relationship with the perpetrator. Moderators or models may not be privy to this information, and hence may be unable to classify some microaggressions.

Online moderators may lack the interpersonal and social context of microaggressions. For instance, microaggressions like “what happened to you?” or “where’s your mom?” are difficult to identify as microaggressions without additional context. On the other hand, a
invalidate a person’s disability experience. This archetype takes on a microaggression such as "I would kill myself if I was disabled" can be
diferent form on social media, that of posts and users being fagged clearly labelled as an ableist microaggression. Since the moderator
as inappropriate by other users. Participants noted that posts dis- lacks interpersonal context and relationship, we recognize that they
cussing their own disability experience or sharing disability-related may be unable to reduce the emotional toll that microaggressive
content were often reported and removed on social media platforms experiences can have. Our participants also shared that they relied
without any explanation. Participants also reported cases of dis- on friends and other users to support them by responding to the
abled activists having their accounts banned for sharing disability perpetrator. Mahar et al. have also used an approach that recruits
advocacy content. This resonates with Haimson et al. who found friends to help flter messages during harassment attacks [33]. Since
that transgender and Black social media users often had content microaggressions are highly contextual to interpersonal history and
ASSETS ’22, October 23–26, 2022, Athens, Greece Heung et al.

styles of interaction, crowdsourced and even friendsourced moderators may be unable to identify some microaggressions without further context.

Similarly, microaggressions are more complex to pinpoint by models compared to hate speech. Existing models are mostly trained to detect hate speech [19, 20, 24, 44]; however, some researchers are developing machine learning techniques to identify microaggressions (based on gender, race, and sexuality) [10]. We encourage researchers to continue investigating ways of identifying ableist microaggressions online.

5.2.2 Designing for Disability Education and Awareness. Participants suggested that social media should educate users about disability and ableism, as a preventive measure to reduce microaggressions. For example, some wanted improvements in current community guidelines, wishing for explicit policies around ableism. Such community norms can educate and disincentivize perpetrators from being ableist [41]. One participant proposed that the platform, instead of other users, can combat microaggressions when they occur. Similarly, Bennett calls on online dating platforms to dispel the notion that disabled people are asexual [6]. Although education around disability and ableism should extend beyond online settings, social media platforms have a unique opportunity to combat harassment and discrimination by educating their users.

In the event that ableist microaggressions could be detected, social media platforms can combat them proactively by nudging the perpetrator, asking them to reflect on whether their comment or post contains ableist microaggressions. Perhaps this nudge can activate only when there are indicators of ableism. This nudge can educate perpetrators on the potential harms before they engage with disabled users on the platform. Other scholars have explored educating users through nudges to alter future behaviors [53], and such lightweight interventions at the time of sharing have been found to be effective in reducing the propagation of fake news [26]. For example, Jahanbakhsh et al. explored the use of nudges in fact-checking information on social media and experimented with a variety of behavioral nudges, such as using checkboxes to assess the information or implementing text with rationales on why the information is inaccurate. We recommend that designers explore how such lightweight interventions can be adapted to prevent ableist microaggressions from being sent in the first place.

Social media platforms also need reactive approaches to mitigate the harms of ableist microaggressions. Consider the microaggression: "can disabled people have sex?" The platform can present information that: 1) answers the microaggressive question and replies for the disabled person, reducing the emotional labor, and 2) debunks the assumption that disabled people are uninterested in or incapable of sex. If the ableist microaggression is a post, there could be a public correction, much like credibility indicators used to combat misinformation and fake news [26, 46]. A public correction would not censor or augment the original post, but include more information about the content, educating all users on ableist stereotypes and misconceptions. The proposal to contradict an existing discriminatory comment on social media is similar to counter-speech, which is used to combat hate speech online [18, 23]. A recent study has found that counter-speech reduces the probability of other users becoming hateful [23]. This further motivates a design opportunity for social media platforms to contradict narratives behind ableist microaggressions. Similar to traditional media, inaccurate portrayals of disability on social media perpetuate ableism [43].

In this study, we were limited to self-reported experiences of ableist microaggressions. Future work should quantitatively investigate the breadth, frequency, and impact of these microaggressions, and examine disabled people's preferences in moderating microaggressions online. Visibly disabled people and disabled women are known to experience more ableist microaggressions [4, 28], but other aspects of a person's identity might influence the kinds and frequency of microaggressions that they are targeted for. We encourage researchers to consider the intersecting identities of those who may be more susceptible to these harmful experiences.

6 CONCLUSION
Our work is the first attempt to examine how disabled people experience and cope with ableist microaggressions on social media. From our interviews with 20 disabled people, we uncovered 12 microaggression archetypes, validating in-person microaggressions and introducing manifestations that are unique to social media. Some participants responded to perpetrators using platform features, while others either ignored them or engaged with them in the hopes of changing the perpetrators' mindset about disability. Overall, participants felt that experiencing ableist microaggressions affected their wellbeing and heavily transformed the way they used social media. We see our work as a starting point in examining exclusionary experiences of disabled people on social media. As we work towards inclusion, we call upon researchers and designers to consider disability as a facet of diversity along with race, gender, and sexuality.

ACKNOWLEDGMENTS
We are thankful to our participants who openly shared their experiences. We thank Gillian Hayes and our anonymous reviewers for helping improve the paper. This work was supported in part by the National Science Foundation Graduate Research Fellowship and the University of California President's Postdoctoral Fellowship.

REFERENCES
[1] 1990. Americans With Disabilities Act. https://www.ada.gov/pubs/adastatute08.htm
[2] 2011. The Sage Handbook of Qualitative Research.
[3] Raya Al-Jadir. 2019. The damaging effect of social media positivity on disabled people. https://disabilityhorizons.com/2019/10/the-damaging-afect-of-social-media-positivity-on-disabled-people/
[4] Deniz Aydemir-Döke and James T. Herbert. 2021. Development and Validation of the Ableist Microaggression Impact Questionnaire. Rehabilitation Counseling Bulletin (2021). https://doi.org/10.1177/00343552211014259
[5] Ayoka Bell. 2013. Nothing about us without us: A qualitative investigation of the experiences of being a target of ableist microaggressions. (2013). https://www.proquest.com/docview/1536397558
[6] Cynthia L. Bennett. 2017. Disability-Disclosure Preferences and Practices in Online Dating Communities. XRDS 24, 2 (dec 2017), 30–33. https://doi.org/10.1145/3155120
[7] Lindsay Blackwell, Jill Dimond, Sarita Schoenebeck, and Cliff Lampe. 2017. Classification and Its Consequences for Online Harassment: Design Insights from HeartMob. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 24 (dec 2017), 19 pages. https://doi.org/10.1145/3134659
[8] Lindsay Blackwell, Jean Hardy, Tawfiq Ammari, Tiffany Veinot, Cliff Lampe, and Sarita Schoenebeck. 2016. LGBT Parents and Social Media: Advocacy, Privacy, and Disclosure during Shifting Social Movements. In Proceedings of the 2016 CHI

Nothing Micro About It: Examining Ableist Microaggressions on Social Media ASSETS '22, October 23–26, 2022, Athens, Greece
Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI '16). Association for Computing Machinery, New York, NY, USA, 610–622. https://doi.org/10.1145/2858036.2858342
[9] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101. https://doi.org/10.1191/1478088706qp063oa
[10] Luke Breitfeller, Emily Ahn, David Jurgens, and Yulia Tsvetkov. 2019. Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 1664–1674. https://doi.org/10.18653/v1/D19-1176
[11] Carmit-Noa Shpigelman and Carol J. Gill. 2014. Facebook Use by Persons with Disabilities. Journal of Computer-Mediated Communication 19 (2014), 610–624. Issue 3. https://doi.org/10.1111/jcc4.12059
[12] Jessica Caron and Janice Light. 2016. "Social Media has Opened a World of 'Open Communication:'" Experiences of Adults with Cerebral Palsy who use Augmentative and Alternative Communication and Social Media. Augmentative and Alternative Communication 32, 1 (2016), 25–40. https://doi.org/10.3109/07434618.2015.1052887 PMID: 26056722.
[13] Sue Caton and Melanie Chapman. 2016. The use of social media and people with intellectual disability: A systematic review and thematic analysis. Journal of Intellectual & Developmental Disability 41, 2 (2016), 125–139. https://doi.org/10.3109/13668250.2016.1153052
[14] Meta Transparency Center. 2022. Hate Speech. https://transparency.fb.com/policies/community-standards/hate-speech/
[15] Mohit Chandra, Manvith Reddy, Shradha Sehgal, Saurabh Gupta, Arun Balaji Buduru, and Ponnurangam Kumaraguru. 2021. "A Virus Has No Religion": Analyzing Islamophobia on Twitter During the COVID-19 Outbreak. In Proceedings of the 32nd ACM Conference on Hypertext and Social Media (Virtual Event, USA) (HT '21). Association for Computing Machinery, New York, NY, USA, 67–77. https://doi.org/10.1145/3465336.3475111
[16] Dasom Choi, Uichin Lee, and Hwajung Hong. 2022. "It's Not Wrong, but I'm Quite Disappointed": Toward an Inclusive Algorithmic Experience for Content Creators with Disabilities. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 593, 19 pages. https://doi.org/10.1145/3491102.3517574
[17] John W. Creswell and Dana L. Miller. 2000. Determining Validity in Qualitative Inquiry. Theory Into Practice 39, 3 (2000), 124–130. https://doi.org/10.1207/s15430421tip3903_2
[18] Mithun Das, Binny Mathew, Punyajoy Saha, Pawan Goyal, and Animesh Mukherjee. 2020. Hate Speech in Online Social Media. SIGWEB Newsl. Autumn, Article 4 (nov 2020), 8 pages. https://doi.org/10.1145/3427478.3427482
[19] Paula Fortuna and Sérgio Nunes. 2018. A Survey on Automatic Detection of Hate Speech in Text. ACM Comput. Surv. 51, 4, Article 85 (jul 2018), 30 pages. https://doi.org/10.1145/3232676
[20] Jennifer Golbeck, Zahra Ashktorab, Rashad O. Banjo, Alexandra Berlinger, Siddharth Bhagwan, Cody Buntain, Paul Cheakalos, Alicia A. Geller, Quint Gergory, Rajesh Kumar Gnanasekaran, Raja Rajan Gunasekaran, Kelly M. Hoffman, Jenny Hottle, Vichita Jienjitlert, Shivika Khare, Ryan Lau, Marianna J. Martindale, Shalmali Naik, Heather L. Nixon, Piyush Ramachandran, Kristine M. Rogers, Lisa Rogers, Meghna Sardana Sarin, Gaurav Shahane, Jayanee Thanki, Priyanka Vengataraman, Zijian Wan, and Derek Michael Wu. 2017. A Large Labeled Corpus for Online Harassment Research. In Proceedings of the 2017 ACM on Web Science Conference (Troy, New York, USA) (WebSci '17). Association for Computing Machinery, New York, NY, USA, 229–233. https://doi.org/10.1145/3091478.3091509
[21] Oliver L. Haimson. 2017. Digital and Physical Barriers to Changing Identities. XRDS 24, 2 (dec 2017), 26–29. https://doi.org/10.1145/3155118
[22] Oliver L. Haimson, Daniel Delmonaco, Peipei Nie, and Andrea Wegner. 2021. Disproportionate Removals and Differing Content Moderation Experiences for Conservative, Transgender, and Black Social Media Users: Marginalization and Moderation Gray Areas. Proc. ACM Hum.-Comput. Interact. 5, CSCW2, Article 466 (oct 2021), 35 pages. https://doi.org/10.1145/3479610
[23] Bing He, Caleb Ziems, Sandeep Soni, Naren Ramakrishnan, Diyi Yang, and Srijan Kumar. 2021. Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (Virtual Event, Netherlands) (ASONAM '21). Association for Computing Machinery, New York, NY, USA, 90–94. https://doi.org/10.1145/3487351.3488324
[24] Jackson Houston, Peter Peterson, and Catherine Reich. 2020. Hate Speech Detection in Twitter: A Selectively Trained Ensemble Method. Master's thesis. USA. Advisor(s): Richard Maclin. AAI27997273.
[25] Amanda Hynan, Juliet Goldbart, and Janice Murray. 2015. A grounded theory of Internet and social media use by young people who use augmentative and alternative communication (AAC). Disability and Rehabilitation 37, 17 (2015), 1559–1575. https://doi.org/10.3109/09638288.2015.1056387
[26] Farnaz Jahanbakhsh, Amy X. Zhang, Adam J. Berinsky, Gordon Pennycook, David G. Rand, and David R. Karger. 2021. Exploring Lightweight Interventions at Posting Time to Reduce the Sharing of Misinformation on Social Media. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 18 (apr 2021), 42 pages. https://doi.org/10.1145/3449092
[27] Shagun Jhaver, Sucheta Ghoshal, Amy Bruckman, and Eric Gilbert. 2018. Online Harassment and Content Moderation: The Case of Blocklists. ACM Trans. Comput.-Hum. Interact. 25, 2, Article 12 (mar 2018), 33 pages. https://doi.org/10.1145/3185593
[28] Shanna Kattari. 2020. Ableist Microaggressions and the Mental Health of Disabled Adults. Community Mental Health Journal 56 (2020), 1170–1179. https://doi.org/10.1007/s10597-020-00615-6
[29] Richard Keller and Corinne Galgay. 2010. Microaggressive Experiences of People with Disabilities. Microaggressions and Marginality: Manifestation, Dynamics, and Impact (2010), 241–267.
[30] Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, Krishnaprasad Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, and Amit Sheth. 2019. Modeling Islamist Extremist Communications on Social Media Using Contextual Dimensions: Religion, Ideology, and Hate. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 151 (nov 2019), 22 pages. https://doi.org/10.1145/3359253
[31] Amanda Lenhart, Michele Ybarra, Kathryn Zickur, and Myeshia Price-Feeney. 2016. Online Harassment, Digital Abuse, and Cyberstalking in America. Data & Society (2016). https://www.datasociety.net/pubs/oh/Online_Harassment_2016.pdf
[32] Kayla Lett, Andreea Tamaian, and Bridget Klest. 2020. Impact of ableist microaggressions on university students with self-identified disabilities. Disability & Society 35, 9 (2020), 1441–1456. https://doi.org/10.1080/09687599.2019.1680344
[33] Kaitlin Mahar, Amy X. Zhang, and David Karger. 2018. Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI '18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3174160
[34] Jennifer Mankoff, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability Studies as a Source of Critical Inquiry for the Field of Assistive Technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (Orlando, Florida, USA) (ASSETS '10). Association for Computing Machinery, New York, NY, USA, 3–10. https://doi.org/10.1145/1878803.1878807
[35] Mathew D. Gayman, Robyn Lewis Brown, and Ming Cui. 2011. Depressive symptoms and bodily pain: the role of physical disability and social stress. Stress & Health 27 (2011), 52–63. https://doi.org/10.1002/smi.1319
[36] Hannah Miller, Heather Buhr, Chris Johnson, and Jerry Hoepner. 2013. AphasiaWeb: A Social Network for Individuals with Aphasia. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (Bellevue, Washington) (ASSETS '13). Association for Computing Machinery, New York, NY, USA, Article 4, 8 pages. https://doi.org/10.1145/2513383.2513439
[37] Mainack Mondal, Leandro Araújo Silva, and Fabrício Benevenuto. 2017. A Measurement Study of Hate Speech in Social Media. In Proceedings of the 28th ACM Conference on Hypertext and Social Media (Prague, Czech Republic) (HT '17). Association for Computing Machinery, New York, NY, USA, 85–94. https://doi.org/10.1145/3078714.3078723
[38] Tyler Musgrave, Alia Cummings, and Sarita Schoenebeck. 2022. Experiences of Harm, Healing, and Joy among Black Women and Femmes on Social Media. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 240, 17 pages. https://doi.org/10.1145/3491102.3517608
[39] Fayika Farhat Nova, MD. Rashidujjaman Rifat, Pratyasha Saha, Syed Ishtiaque Ahmed, and Shion Guha. 2019. Online Sexual Harassment over Anonymous Social Media in Bangladesh. In Proceedings of the Tenth International Conference on Information and Communication Technologies and Development (Ahmedabad, India) (ICTD '19). Association for Computing Machinery, New York, NY, USA, Article 1, 12 pages. https://doi.org/10.1145/3287098.3287107
[40] Rhoda Olkin, H'Sien Hayward, Melody Schaff Abene, and Goldie VanHeel. 2019. The Experiences of Microaggressions against Women with Visible and Invisible Disabilities. Journal of Social Issues 75 (2019), 757–787. https://doi.org/10.1111/josi.12342
[41] Jessica A. Pater, Moon K. Kim, Elizabeth D. Mynatt, and Casey Fiesler. 2016. Characterizations of Online Harassment: Comparing Policies Across Social Media Platforms. In Proceedings of the 19th International Conference on Supporting Group Work (Sanibel Island, Florida, USA) (GROUP '16). Association for Computing Machinery, New York, NY, USA, 369–374. https://doi.org/10.1145/2957276.2957297
[42] Chester M. Pierce, Jean V. Carew, Diane Pierce-Gonzalez, and Deborah Wills. 1977. An Experiment in Racism: TV Commercials. Education and Urban Society 10, 1 (1977), 61–87. https://doi.org/10.1177/001312457701000105
[43] Rebecca Renwick. 2016. Rarely Seen, Seldom Heard: People with Intellectual Disabilities in the Mass Media. Palgrave Macmillan UK, London, 61–75. https://doi.org/10.1057/978-1-137-52499-7_5
[44] N.D.T. Ruwandika and A.R. Weerasinghe. 2018. Identification of Hate Speech in Social Media. In 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer). 273–278. https://doi.org/10.1109/ICTER.2018.8615517
[45] Matthew J. Salganik. 2017. Bit by Bit: Social Research in the Digital Age (open review edition ed.). Princeton University Press, Princeton, NJ.
[46] Emily Saltz, Claire R Leibowicz, and Claire Wardle. 2021. Encounters with Visual Misinformation and Labels Across Platforms: An Interview and Diary Study to Inform Ecosystem Approaches to Misinformation Interventions. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411763.3451807
[47] Sarita Schoenebeck, Oliver L Haimson, and Lisa Nakamura. 2021. Drawing from justice theories to support targets of online harassment. New Media & Society 23, 5 (2021), 1278–1300. https://doi.org/10.1177/1461444820913122
[48] Woosuk Seo and Hyunggu Jung. 2017. Exploring the Community of Blind or Visually Impaired People on YouTube. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (Baltimore, Maryland, USA) (ASSETS '17). Association for Computing Machinery, New York, NY, USA, 371–372. https://doi.org/10.1145/3132525.3134801
[49] Sharon L Snyder and David T Mitchell. 2006. Cultural Locations of Disability. University of Chicago Press.
[50] Derald Wing Sue, Christina M Capodilupo, Gina C Torino, Jennifer M Bucceri, Aisha Holder, Kevin L Nadal, and Marta Esquilin. 2007. Racial microaggressions in everyday life: implications for clinical practice. American Psychologist 62, 4 (2007), 271.
[51] Kurt Thomas, Patrick Gage Kelley, Sunny Consolvo, Patrawat Samermit, and Elie Bursztein. 2022. "It's Common and a Part of Being a Content Creator": Understanding How Creators Experience and Cope with Hate and Harassment Online. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 121, 15 pages. https://doi.org/10.1145/3491102.3501879
[52] Stefan Timmermans and Iddo Tavory. 2012. Theory Construction in Qualitative Research: From Grounded Theory to Abductive Analysis. Sociological Theory 30, 3 (2012), 167–186. https://doi.org/10.1177/0735275112457914
[53] Nicole A. Vincent and Emma A. Jane. 2017. Beyond law: protecting victims through engineering and design. Number 30 in Routledge Studies in Crime and Society. Routledge, Taylor and Francis Group, United Kingdom, 209–223.
[54] Jessica Vitak, Kalyani Chadha, Linda Steiner, and Zahra Ashktorab. 2017. Identifying Women's Experiences With and Strategies for Mitigating Negative Effects of Online Harassment. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (Portland, Oregon, USA) (CSCW '17). Association for Computing Machinery, New York, NY, USA, 1231–1245. https://doi.org/10.1145/2998181.2998337
[55] Qunfang Wu, Louisa Kayah Williams, Ellen Simpson, and Bryan Semaan. 2022. Conversations About Crime: Re-Enforcing and Fighting Against Platformed Racism on Reddit. Proc. ACM Hum.-Comput. Interact. 6, CSCW1, Article 54 (apr 2022), 38 pages. https://doi.org/10.1145/3512901
[56] Susan Xiong. 2016. The Development of the Disability Microaggressions Scale. (2016). https://hdl.handle.net/10027/20980
Authoring accessible media content on social networks

Letícia Seixas Pereira (lspereira@fc.ul.pt), José Coelho (jrocoelho@fc.ul.pt), André Rodrigues (afrodrigues@fc.ul.pt)
João Guerreiro (jpguerreiro@fc.ul.pt), Tiago Guerreiro (tjvg@di.fc.ul.pt), Carlos Duarte (caduarte@fc.ul.pt)
LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal

ABSTRACT
User-generated content plays a key role in social networking, allowing a more active participation, socialisation, and collaboration among users. In particular, media content has been gaining a lot of ground, allowing users to express themselves through different types of formats such as images, GIFs and videos. The majority of this growing type of online visual content remains inaccessible to a part of the population, in particular for those who have a visual disability, despite available tools to mitigate this source of exclusion. We sought to understand how people are perceiving this type of online content in their networks and how support tools are being used. To do so, we conducted a user study with 258 social network users through an online questionnaire, followed by interviews with 20 of them – 7 blind users and 13 sighted users. Results show how the different approaches being employed by major platforms may not be sufficient to address this issue properly. Our findings reveal that users are not always aware of the possibility and the benefits of adopting accessible practices. From the general perspectives of end-users experiencing accessible practices, concerning barriers encountered, and motivational factors, we also discuss further approaches to create more user engagement and awareness.

CCS CONCEPTS
• Human-centered computing → Accessibility; Social media.

KEYWORDS
accessibility, social media, visual content, user-generated content

ACM Reference Format:
Letícia Seixas Pereira, José Coelho, André Rodrigues, João Guerreiro, Tiago Guerreiro, and Carlos Duarte. 2022. Authoring accessible media content on social networks. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3517428.3544882

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544882

1 INTRODUCTION
Social networks have permeated every facet of modern society's daily life. Following the recent events of the COVID-19 pandemic, their usage is at record levels [33]. Facebook has reported an increase of approximately 6% in users over the previous year, reaching 1.93 billion daily active users in the third quarter of 2021 [10]. The possibility to engage with one another, while physically distant, is at the moment more relevant than it ever was. For people with disabilities, these platforms also play an important role in disability advocacy, as they provide a vehicle for meeting new contacts with disabilities, learning about related issues and news, and discussing accessibility challenges and solutions for improving social media inclusion [12, 36, 38].

Despite the contributions and improvements promoted by social networks in recent years [37], main platforms still present substantial accessibility barriers for users with disabilities. The complexity of their interfaces compared to many typical websites comes from the fact that they are primarily composed of user-generated content. As such, for these platforms to be truly accessible they have to go beyond ensuring the content they control and produce is accessible: they need to ensure the content their users produce is also accessible. This is especially relevant for people with visual impairments given the prevalence of user-generated content that is mostly visual (e.g., images, GIFs, videos). As observed by Voykinska et al. [36], in order to fully engage with visual content, blind users need to overcome several challenges, in particular, the frequent lack of alternative descriptions in photos, essential to provide them proper contextual information. Most of them rely on workarounds such as searching for meta-data (author, geo-localization and even comments posted by other users), or reaching out to a nearby friend or family member to assist them. Conversely, in a survey conducted by Mathur et al. [24], friends and family members of visually impaired users conveyed that writing alternative text is time consuming and requires more thought than inaccessible uploading practices.

Efforts have been undertaken by the social network services themselves to improve the accessibility of visual content, such as automatic captions [9], or providing text inputs for alternative descriptions [35]. However, these approaches are not yet sufficient to provide adequate contextual information, or to ensure that end-users are aware of these features, and therefore they are not widely used. In addition, little is known about the barriers and motivations

ASSETS '22, October 23–26, 2022, Athens, Greece Pereira, et al.
impacting the creation of accessible content by most end-users, which is at the root of the problem. In this research, we sought to explore the current state of the accessibility of visual content in social networks. We provide further analysis of the factors hindering the creation of accessible content by end-users, considering people with and without a visual impairment, on major social media platforms. Furthermore, we also aimed to uncover what does or could motivate people to create accessible content. We set the following research questions:

• RQ1: What are the motivations for social network users to create accessible media content?
• RQ2: Which barriers do social network users encounter when sharing and authoring accessible media content?
• RQ3: What are the requirements for social network users to create accessible media content?

To answer these questions, we conducted a two-step user study comprising an online survey and user interviews. Our findings suggest that users are interested in providing accessible content on their social networks, but most of them are not aware of the steps needed to improve their practices or of the impact doing so could have. Our work leads to a better understanding of the current state of media accessibility and, in particular, of accessibility awareness among social network users. The insights collected also highlight gaps and opportunities to enable better interaction for blind people and more engagement in accessible practices by other end-users.

2 RELATED WORK

Our research is related to prior work on (1) existing accessibility approaches employed by major platforms and their impact on visually impaired users' media interpretation, (2) practices in accessible content sharing by end-users, and (3) current advances and gaps in image captioning.

2.1 Accessibility on social networks

Social networking services, such as Facebook, have a high adoption rate among blind users [39]. The same holds for Twitter, which evolved from a very simple text-based interface to one that is now filled with multimedia content [3]. The widespread usage of camera-equipped mobile phones contributes to the growth in publications containing visual content. However, this user-generated content pushes social networks to become increasingly inaccessible [3].

Numerous efforts have been made to improve the accessibility of visual content on social networks, with some of these initiatives coming from the service providers themselves. For instance, in 2016, Twitter introduced a feature allowing users to compose their own alternative descriptions for their images. However, users claim that this initial feature had its drawbacks, as it had to be enabled by the users themselves and was hard to find and to understand [34]. Even though this changed in 2020 [35] – at the moment, this resource is active and available by default – the impact of this measure has not yet been assessed or discussed. Meanwhile, Facebook chose to use automatic descriptions, tagging each uploaded image using image detection and recognition algorithms and enabling the user to edit the automatically provided alternative description [9].

Despite these efforts, as described in the most recent studies in this context, blind and visually impaired users still encounter significant barriers in interpreting visual content [4, 11, 23, 28, 38].

2.2 Content sharing

Approaches to improving the accessibility of media content have been previously discussed in the literature. For instance, Morash et al. [26] investigated approaches for web workers to author descriptions of STEM (science, technology, engineering, and mathematics) images. Also in the context of guiding users on how to provide better descriptions, Mack et al. [22] proposed two types of prototype interfaces to facilitate the authoring of alternative descriptions for images in PowerPoint presentations, as well as to provide feedback on automatic alternative descriptions. In a similar direction, Gurari et al. [17] analysed a dataset of images and alternative descriptions with a focus on improving current machine generation approaches towards real end-users' needs. Another approach focused on online content is presented by Guinness et al. [15], which, using a fully automated approach, searches the web for possible descriptions already available for a given image. This strategy addresses a current gap, since much content present on social networks is shared and widely disseminated on other pages, such as memes and GIFs. However, another type of content particular to social networks is personal images and photos, which are unlikely to be available on other pages. Considering this last aspect, one great challenge still remains: the engagement of users in providing accessible content. Through an analysis of a million images posted on Twitter, Gleason et al. [11] observed that only 0.1% contained an alternative text. Twitter itself may be accountable for this low number, as for many years it did not enable by default the feature that allows the inclusion of an alternative description. However, Gleason et al. [11] also observed that even users who enabled this feature did not always write alternative text descriptions. Visually impaired users engage in the major photo-related activities like other users, considering them part of the social network experience. However, practices such as taking and editing photos and providing an alternative text often involve undertaking workarounds or getting help from trusted sighted people [1, 24, 36, 39].

Mathur et al. [24] observed that friends and family members of visually impaired users frequently engage in accessible practices. Also, according to Wu et al. [39], users with visual impairments are much more likely to have friends who are also visually impaired. These two factors seem to play a role in the increased accessibility of images that are accessed by visually impaired people (in comparison to those accessed by sighted people) [24]. However, users also report that writing alternative text is time-consuming and requires more thought than inaccessible uploading practices, which explains – at least in part – the low percentage of images with an alternative description [11].

Concerning users' engagement, previous works suggest that users who currently provide alternative descriptions for their images are mainly motivated by personal connections to someone with a disability or by a general concern for inclusion [11, 30]. These works also reinforce the need to educate users in providing alternative descriptions, as well as to further investigate approaches that use automated description techniques to improve the descriptions provided
Authoring accessible media content on social networks ASSETS ’22, October 23–26, 2022, Athens, Greece
by authors. While these findings provide important insights into authoring and sharing practices for accessible content, they leave some unanswered questions around the perspective of end-users who are not yet aware of current accessibility approaches and of the specific needs of users with disabilities. In this paper, we focus on further exploring the challenges these users encounter when trying to engage in such activities for the first time, and on how to motivate them to create accessible media content.

2.3 Image captioning

Low user compliance in providing alternative text descriptions is a common web accessibility problem, and therefore alternative methods are usually employed to fill this gap. Stangl et al. [32] classified existing approaches to generating image descriptions as human-powered approaches, automated image description approaches, and hybrid image description technologies. Human-powered approaches are recognised by users for their accuracy and quality of responses. Techniques such as crowdsourcing may have slow response times for real-time needs and a high financial cost. Friendsourcing may improve the quality and trustworthiness of the answers received, as friends would better understand the question asked, while also removing the financial costs of the service [3]. However, the social costs of exposing one's problems and vulnerability are a serious concern for these users, as they may appear or feel less independent [3]. Automated approaches, unlike the previous techniques, are fast and cheap, allowing platforms to deploy them at scale [8, 21]. While significant efforts have been undertaken over the past years in image understanding and automated captioning, the accuracy of these captions is not yet sufficient. Caption and phrasing models have an important impact on scepticism, as blind and visually impaired people may rely more on automatically generated captions than on their intuition, making decisions based on misinformation [23, 31]. Besides that, the adequacy of these systems in open-world use is still limited, especially when captioning the wide variety of images posted to social media [31].

To fill the gap in image captions, hybrid image description technologies propose combining automatic techniques and human intervention, investigating a trade-off between the two, as explored in [16, 21, 27, 31].

Despite prior efforts in evolving alternative interpretations for visual content, user-generated content still has a great impact on the accessibility of media content in social networks. For this research, we collected feedback from users about their difficulties and possible motivations in accessible authoring practices. From that, we provide insights into how this interaction flow can be improved to support their needs, as well as to better engage them in the authoring of accessible media content.

3 METHOD

In order to address our research questions, i.e., to better understand the barriers faced by end-users in authoring accessible media content and the motivational factors for engaging in such practices, the study was structured into two different phases. Ethical approval to run the study was granted by our university's Ethics committee.

3.1 Online survey

An online survey was conducted to gather information about usage of social networks, motivations and barriers for authoring accessible media content, and awareness of accessible practices. The questionnaire took about 15 minutes to complete and was built using Microsoft Forms in four different languages (English, French, Portuguese, and Spanish) in order to reach a diverse sample of participants. It was divided into three sections: demographic questions; social network usage, considering frequency, devices, and the types of content involved in access, posting, and sharing activities; and social network accessibility practices, considering current practices as well as accessibility awareness and further motivations to engage in such practices.

Pilot study interviews to inform survey design: As a first step, we conducted pilot interviews to identify potential problems with, and improvements to, the questionnaire. Participants were recruited through the research team's network. The only inclusion criterion was to be a user of at least one social network. The pilot study included seven social network users, three of whom are blind. Participants had a variety of occupations as well as different levels of accessibility awareness: some were accessibility practitioners, while others were high school teachers with no previous knowledge of digital accessibility. Sessions were conducted remotely, taking between 30 and 40 minutes. During the sessions, a preliminary version of the online questionnaire was used. Participants were encouraged to propose suggestions and improvements to enhance the overall understanding of the survey. At this stage, we collected participants' thoughts on the information contained in the questionnaire rather than their responses; i.e., their answers were not considered in the final analysis. Participants provided important feedback on the questionnaire, such as flagging questions that were difficult to understand or suggesting new response options for closed-ended questions. Based on their contributions, a new version of the questionnaire was built, including two new questions and some adjustments to the wording of other questions.

Questionnaires: In order to reach social network users, we disseminated the questionnaire through different social media channels. The call for participation was shared by the research team, their university and research unit, and fellow organisations (such as disability-related ones). The questionnaire was online for three months, and we gathered a total of 258 answers from participants aged from 17 to 73 years old (Mean=37.35, Median=31, IQR=23), with 64 (25%) of them self-reporting some kind of disability, such as visual, hearing, motor, and/or cognitive impairments. In particular, 34 participants were blind, 12 had low vision, and 1 was colourblind.

In addition to the specific questions concerning social media usage and accessible practices, participants were also invited to share additional thoughts. Through the questionnaire, participants also stated their availability to be contacted for the next phase of this research.

3.2 User interviews

First, the participants who had stated their availability in the previous phase were contacted, and we followed up with semi-structured
interviews with 20 of them – all had reported in the questionnaire being frequent social network users. Of these participants, 7 self-reported being blind and 13 were sighted users without a disability. Furthermore, half of them stated that they did not frequently undertake accessible practices, as presented in Table 1.

Prior to scheduling the interviews, we asked participants to post media content on their usual social networks in an accessible way. Participants were instructed to post at least three different media items in order to familiarize themselves with the authoring processes (in case they were not already). In addition, they were also asked to take notes reporting in detail the activities conducted and the opinions and difficulties encountered in this process. Participants were invited to conduct these activities over a two-week period. This time frame was intended to embed accessible practices in their sharing and posting routine for two weeks, rather than asking for a specific task to create accessible content.

After the two weeks of study (i.e., after the initial contact), the semi-structured interviews were conducted. We asked questions about participants' experience with accessible practices in social networks, further motivations for accessible content authoring, and potential suggestions or additional thoughts on how to improve this process and to more fully commit end-users to accessibility practices. These interviews were also an opportunity to further discuss the answers provided by participants in the questionnaire. All interviews were conducted remotely over the phone, Skype, or Zoom, lasted 20 to 30 minutes, and were recorded with participants' prior consent.

3.3 Data analysis

The first step of the data analysis consisted of a quantitative analysis of the answers to the closed questions in the online questionnaire. A preliminary analysis was made to ensure that all answers provided were valid. For that, we searched for out-of-scope answers and found that all the answers provided by the participants were consistent with the corresponding question. In addition, all questions included in the questionnaire were optional, resulting in all entries being considered valid. In summary, all responses provided by all participants were considered in the data analysis. It is also important to highlight that several questions were multiple choice, allowing participants to choose one or multiple options from a list of possible answers.

Next, all conducted interviews were transcribed and, along with the answers provided to the open-ended questions, analysed through an inductive coding approach [25]. First, two researchers independently reviewed a subset of this data, conceptualizing a set of codes based on this subset. The coders compared their initial sets of codes and their categorisations in order to develop a unified list of codes. Then, the two coders reviewed a new subset of transcriptions, taking into account the consolidated and revised codes, in order to reach agreement, identifying a total of 150 distinct codes organized in two levels: 21 codes and 129 sub-codes. Following that, the coding of all data collected was performed. The codes and their sub-codes are available online¹.

¹The codebook is available at: https://osf.io/anmd7/?view_only=547d72f78553489498af9c1df7af1a58

4 FINDINGS

In this section, we first present key findings identified through the quantitative analysis of the data gathered through the questionnaire. Next, we present the findings obtained from the qualitative analysis of the information gathered during the user interviews. This information is organised into the following topics: (1) accessibility unawareness, (2) lack of know-how, (3) the cost of the additional effort, (4) complying with and without guidelines or features, (5) inaccessibility, and (6) accessibility motivations and concerns.

4.1 Online survey

This section provides detailed information obtained through the online survey, containing questions regarding authoring and sharing practices on social networks. The following data concerns the 258 total answers gathered.

4.1.1 Device access. We asked participants how often they interact with social networks, considering access and posting activities, according to device type.

In both groups – sighted and visually impaired participants – mobile devices were more popular for accessing and posting content on social networks. However, in both scenarios, visually impaired participants tended to carry out these activities on desktop or laptop devices more than sighted participants did, although the difference is more significant for posting activities, as presented in Figure 1.

Figure 1: Devices used by participants to access and post content on social networks

4.1.2 Social networks used. Concerning the social networking platforms accessed by our questionnaire participants, Facebook was the first choice for the majority, followed by Twitter. Only 2% of visually impaired participants declared accessing Instagram, while 25% of sighted participants reported having Instagram as their main social network.

Sighted participants reported posting most of their content on Instagram, followed by Facebook and Twitter. Visually impaired participants declared a preference for posting their content on Facebook, followed by Twitter, with a very low percentage of them declaring Instagram as their main social network for posting content. This information is presented in Figure 2. It is important to highlight that other social networks, such as LinkedIn, WhatsApp, and TikTok, were also mentioned by both groups, although in much smaller numbers.

Table 1: Demographics of interviewees: age, whether they undertake accessible practices, the social network they most access, the social network where they most post their own content, and the social network where they most share content (i.e., post content created by others, for example, sharing on Facebook or retweeting on Twitter).

ID | Age | Accessible practices | Most accessed social network | Most posts own content on | Most shares content on
Blind participants
BP1 | 20 | Yes | Facebook | Facebook | Facebook
BP2 | 63 | Yes | Facebook | Facebook | Facebook
BP3 | 53 | Yes | Twitter | Facebook | Facebook
BP4 | 52 | Yes | Twitter | Twitter | Twitter
BP5 | 50 | Yes | Twitter | Twitter | Twitter
BP6 | 21 | No | WhatsApp | WhatsApp | WhatsApp
BP7 | 17 | Yes | Twitter | Messenger | Twitter
Sighted participants
SP1 | 73 | Yes | Twitter | Twitter | Twitter
SP2 | 32 | No | WhatsApp | WhatsApp | Instagram
SP3 | 30 | No | Facebook | Facebook | Facebook
SP4 | 57 | Yes | Twitter | Facebook | Twitter
SP5 | 30 | No | WhatsApp | Instagram | Instagram
SP6 | 25 | No | Instagram | Instagram | Instagram
SP7 | 30 | No | Twitter | Messenger | Messenger
SP8 | 33 | No | Facebook | Facebook | Facebook
SP9 | 41 | No | Facebook | Facebook | -
SP10 | 27 | Yes | Instagram | Instagram | Instagram
SP11 | 29 | No | WhatsApp | WhatsApp | Instagram
SP12 | 34 | Yes | Instagram | Instagram | Instagram
SP13 | 30 | No | WhatsApp | WhatsApp | WhatsApp

Figure 2: Social networks used by participants

4.1.3 Type of content. We also asked participants about the content they most post on social networks. Although the numbers obtained for text, audio, and video content do not show a significant difference, the same was not observed for visual-only content. Sighted participants reported posting 5% of image content, while this number was 2% for visually impaired participants. Even though users with a visual impairment edit and post photos on social media, they still encounter several challenges using these apps [1], which may contribute to their low participation levels in image-centric social networks, such as Instagram. In particular, among other types of visual content, GIFs and memes were highly unpopular among visually impaired participants, with all of them reporting never sharing this type of content. As explored in previous works, these kinds of media represent a challenge for accessibility. Besides not being properly supported by major platforms [30], they carry cultural context or hidden meaning related to an emotional tone or humorous aspect [13, 14], making them heavily dependent on alternative descriptions provided by the authors themselves.

4.1.4 Accessible practices. Most visually impaired participants (70%) reported providing an alternative description for the last media content they shared, while most sighted participants (71%) declared not having provided one.

We also asked participants who did provide an alternative description for their last three posted media contents how they did it. Most of them (25%) used the functionality provided by the social network, while 23% chose to integrate the description into the text of the post.

Among those not providing alternative descriptions, when asked about their reasons for not engaging in this activity, most declared not knowing that it was possible, followed by those who declared not knowing where to write an alternative description (Figure 3).
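As noted in the data analysis, several survey questions were multiple-select, so percentages like those above are computed per respondent and need not sum to 100%. A minimal sketch of this tallying logic follows; the response data below is invented for illustration and is not the study's dataset:

```python
from collections import Counter

# Invented multi-select answers: each respondent may tick several reasons.
responses = [
    ["did not know it was possible"],
    ["did not know it was possible", "takes too much time"],
    ["did not know where to write it"],
    ["did not know it was possible", "did not know where to write it"],
]

# Count how many respondents ticked each option.
ticks = Counter(option for answer in responses for option in answer)

# Percentage of respondents selecting each option; the total may exceed 100%.
percentages = {opt: 100 * n / len(responses) for opt, n in ticks.items()}

for opt, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
    print(f"{opt}: {pct:.0f}%")
```

Reporting per-respondent shares rather than shares of all ticks matches how the results are phrased in this section, e.g. "most of them (25%) used the functionality provided by the social network."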
Figure 3: Reasons for not providing an alternative description

We also observed through the questionnaire responses a correlation between adopting accessible practices and familiarity with the needs of people with disabilities. The majority of sighted participants (56%) had no idea about the accessibility practices adopted by people around them, while more than half of visually impaired participants (51%) stated having friends or family members who post or share accessible content.

Figure 4 summarises participants' opinions on why social network users in general do not provide alternative text descriptions for the media content they post or share. Most participants consider that other people do not know that this is possible, followed by those considering that other people do not think it has any impact.

Figure 4: Reasons for others not providing alternative descriptions

4.2 Interviews with Social Network Users

In this section, we present the information collected during the interviews conducted with 20 social network users. Participants were asked about their experience with accessible practices in social networks, further motivations for accessible content authoring, and potential suggestions or additional thoughts on how to improve this process.

4.2.1 Accessibility unawareness. Considering the data obtained, we observed an overall lack of awareness among sighted participants, but also some stigma associated with people with disabilities using technology, in particular with blind people accessing visual content. Further investigation conducted through the interviews allowed us to observe that most of them were not aware of the use of technology by blind people and of how this content is consumed by them, which makes it more difficult to understand what accessible content really means:

"I am describing this image, theoretically for a blind person, but then how will the platform use it? Does the person have the possibility to listen? Is there going to be a...? I don't know how this is used in the end, it seems silly, but how does it reach the person? How is it accessible in fact, at the end of the experience?" – SP13

A first layer of this unawareness is reflected in the lack of knowledge about what they should do to improve the accessibility of their content. Most sighted participants, although frequent users of social networks, only discovered this possibility during this study.

Some sighted participants also felt that it was not necessary to provide alternative access to their content because they do not know anyone with a disability, underestimating the reach of authoring and sharing activities. However, while some sighted participants were used to privately sharing some family pictures and may follow this line of argument, this misjudgement about the impact and needs of accessibility may go further, reinforcing the stigma by implying that visual content is not for blind users:

"I also share a bit of this stigma, which is that, if Instagram is highly visual, a blind person will use it? [...] I don't know if there is this motivation on the part of the blind people to use a tool that is so visual." – SP5

Under the assumption that blind people are not interested or motivated in accessing visual content, the accessibility of this content is not taken into account. This is particularly evident in image-centric social networks such as Instagram. Consequently, blind participants reported being excluded from accessing this content and thus from using these platforms.

From the perspective of two blind interviewees, some people just do not care enough about the subject: partly because many of them do not know someone with a disability, so they do not reflect on it, but also because some may be focused on simply reaching as many people as possible, neglecting minority user groups:

"Other people don't care, no matter how much we tell them." – BP5

4.2.2 Lack of know-how. During the interviews, participants who embed alternative descriptions in the content of their posts mentioned having discovered this practice through other users' publications and, therefore, using those as a model example to complete the task. This behaviour was also observed by Gleason et al. [11] in a previous study. Although not ideal, it may increase the visibility of issues related to content accessibility. However, this choice was mainly driven by sighted interviewees not perceiving or discovering the specific accessibility feature provided by the platform. Interviewees further described the difficulty of discovering where to write an alternative text description as being related to the non-user-friendly
and hard-to-find accessibility approaches currently employed by social network service providers:

"If they told me 'Oh yes, you can make these things accessible, you just have to take a lot of very complicated steps to get it done'. Obviously, this will make it very difficult for people who are interested in doing it. Accessibility features are difficult to find and not very discoverable, they're hidden." – SP8

Furthermore, sighted interviewees also discussed their difficulty in not knowing how to write a suitable alternative description. While they conveyed that it was a challenge to represent an image through text, they also stated finding no guidance or information on what is considered a good alternative description:

"My biggest difficulty was this: not knowing if there was any standard or not, what would be necessary to include or not, if I should put more information, less information, if I should be more specific, give more details..." – SP13

It was a consensus among all our interviewees that, as a first step, it is essential to create approaches to raise awareness of accessible practices and their benefits among social network users. Most of them suggested that one way to achieve equity in social media is to emphasise accessibility features since, currently, these do not draw enough of users' attention to the matter:

"They have to be more visible in the authoring process [...], even if they weren't hidden, they could also be promoted, which they aren't either." – SP7

Besides, accessibility must be part of the authoring process. Interviewees consider that an edit field for the alternative description should be provided within the authoring flow, with a content warning, not only to make it mandatory but, especially, to get people used to providing this information.

Some participants also reported that having a support feature to suggest alternative descriptions would help them, first by providing an example of what is considered an appropriate alternative description, and second by giving them the opportunity to improve machine-generated descriptions:

"This could become standard [...] show the description and say: 'Do you think this is a good description for this image?' because in this case, it [Facebook] simply decides the description itself and also, if it is not correct, I have no way of indicating it." – SP8

4.2.3 The cost of the additional effort. The third most popular reason mentioned by questionnaire participants for not providing an alternative description for their media content was the time it takes to do so. While further exploring this subject during the interviews, some sighted interviewees also declared that accessible practices are not integrated into their current publishing interaction flow. Providing an alternative description for their images adds a new task that involves considerable time for reflection and makes their activities much more time-consuming, especially given the spontaneous nature of social media content sharing for personal usage:

"Depending on the elements that are in the picture [it] can be a bit exhausting, especially if you have many people and I have to be describing each person..." – SP5

Blind participants shared the additional burden of often having to be the activist in their social circles, raising awareness and reminding others to ensure they start producing accessible content. This leads some to feel obnoxious for having to constantly remind their acquaintances that they are not able to fully understand the content shared:

"When we have a disability and we're faced with it, we're tired of saying 'hey, don't forget, publish something accessible'." – BP5

As reported by our blind interviewees, and more broadly discussed in previous works [23, 27, 32, 36, 38, 40], the alternative descriptions currently provided by major platforms do not offer enough contextual information for visually impaired users to properly interpret media content. Therefore, they reinforced the importance of promoting user engagement so that, through human description, authors convey this additional context and their own intentions and purposes in publishing the content:

"When it comes to being accessible, you can't take it out of somebody's hands [...] Let them decide what they want to say, [...] how they want to describe it, whatever they were thinking of, why they were posting, that image should be what they're posting." – BP4

4.2.4 Lack of standardisation. Major platforms employ different approaches to support accessible media authoring. This lack of standardisation requires users to identify and learn how to use each of the accessibility features provided by every platform they use. On Facebook, only two of the participants publishing accessible content for the first time were able to discover this feature and, as stated by one of them, only after thorough research.

"On Facebook, I really, really had to go search, I tried a lot, I tried and eventually I had to search what Facebook had for accessibility because I could not find it." – SP7

They also reported not finding this feature in the native app, being able to edit the automatically generated description only on their desktop, through the web interface. One participant also shared that, even though he is blind and a frequent Facebook user, he was not aware of the possibility of editing these descriptions.

As for Twitter, the feature enabling the user to provide an alternative description for their own images was considered the best approach by one blind user. However, as described by other participants, its effective use is jeopardised by not being part of the standard Twitter authoring process:

"Twitter's way of making a field for people to put in the alt text is the best way to do it, but I can't tell you how many people can find it." – BP4

Only one sighted participant tried to explore the accessibility feature provided by Instagram during this study, and also mentioned the difficulty of finding it:
ASSETS ’22, October 23–26, 2022, Athens, Greece Pereira, et al.

“In Instagram, the first thing I think I had heard about is the alternative text, but I tried to figure it out and I found it out, but with some difficulty, because it was in some advanced settings in the post, it’s not very visible.” – SP7

During the interviews, Facebook, Twitter, and Instagram were the most cited social networks among participants. Each of these platforms adopted a different approach for providing alternative descriptions for user-generated content. Twitter provides an input field for users, while Facebook and Instagram rely on machine-generated descriptions. As observed by Sacramento et al. [30], major social networks are also not consistent when it comes to providing accessibility features and compliance across different platforms, such as mobile/desktop interfaces or iOS/Android mobile applications. However, the non-user-friendly interface is a common ground between them.

Therefore, concerning the roles of different stakeholders involved in the accessibility context, most interviewees consider that these platforms have the largest influence and, currently, they are not fully engaged in promoting accessibility:

“The platform is responsible for ensuring a good user experience and they have a good percentage of users who have these specific needs, and they are the ones who have to ensure this experience also applies to them.” – SP7

Some interviewees also considered that this responsibility should go further and be shared with the users themselves. As they play an essential role in this process, they have to be more committed to accessible practices.

Governments’ actions and the legal context of digital media regulation were also mentioned during the interviews as a possible way forward. Two participants also believe it involves providing better public policies for social inclusion in general. For instance, one participant made a comparison with current accessibility legislation for physical structures, such as ramps and handrails.

4.2.5 Inaccessibility. The inaccessibility of accessibility features is a paradox currently encountered in these services. Just like sighted participants, blind interviewees also reported having difficulties identifying the proper feature provided by major platforms even though they were familiar with its availability. This difficulty is further reinforced by the constant updates of these systems, requiring the acquisition of new knowledge about the structure of these new interfaces, as also pointed out by Voykinska et al. [36].

Furthermore, while most blind questionnaire participants declared creating accessible content on their social networks, in the interviews it was possible to observe that this practice is not necessarily being enabled by accessible features provided by platforms. During the interviews, they mentioned asking others for help in order to confirm the elements contained in the media to be published, making them dependent on sighted friends or family members. Moreover, one of them stated only sharing content already accessible. Therefore, these users declared missing a feature to assist them in creating their own descriptions for their images:

“In this case [for this study], I had to ask my brother how the photos were like for me to describe them.” – BP6

This lack of support for blind content authors reinforces the social cost, also observed by Brady et al. [3], and their sense of exclusion as they are unable to fully experience this aspect of social networks:

“I don’t think it’s because we’re blind or have any other type of disability that we lose the right to express ourselves, or to send any kind of joke, let’s say, those little stickers... when talking, we who are blind, are excluded from doing it.” – BP1

As mentioned by a blind participant, the constant emergence of new media content such as GIFs, Stickers, Memes, and images with embedded text or screenshots enabled by major platforms pushes these social networks to be less and less accessible for them. This diversity also comprises stories, a new format that has become quite popular on major platforms. This format supports a mix of images, videos, text, and even stickers. This type of content raises another challenge for users as, once more, they don’t know how to provide accessibility for this type of content. Another issue raised by participants concerned links that, when posted on social networks, often generate a preview of the website and often do not contain any information about it. This diverse range of content often fails to match the accessibility features available, which allows us to better understand why none of the questionnaire participants reported posting this type of content.

4.2.6 Accessibility motivations & concerns. Participants currently publishing accessible content are mostly driven by having a disability themselves or having acquaintances with disabilities. Two sighted participants, not used to sharing accessible content, declared that they would be more engaged in these practices if they had a personal connection to someone with a disability.

Sighted participants who declared publishing accessible content agreed that the main reason for them to engage in these practices is that it is the right thing to do. Sighted interviewees not used to sharing accessible content expressed their interest in contributing to the inclusion of people with disabilities and to making information reach as many people as possible.

Most interviewees also conveyed that providing platform support is an important factor in creating more user engagement. One blind participant, who first discovered accessibility features during the study, stated that now that he is aware of it, he has already included alternative descriptions for all his previously posted pictures and intends to adopt this practice in the future. Nevertheless, the difficulty encountered may discourage some users, causing the opposite effect on people who, at first, are more likely to give it a try.

It was also possible to observe that some sighted interviewees may consider accessibility approaches as creating setbacks to their experience, reinforcing the unawareness about accessible practices – in particular the behaviour of screen readers and alternative descriptions.
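The in-flow confirmation step participants asked for — surfacing the machine-generated description and letting the author approve or rewrite it before posting — can be sketched as a minimal flow. This sketch is illustrative only: the `Post` structure, the callback names, and the sample caption are assumptions made for this example, not any platform's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Post:
    image_path: str
    text: str
    alt_text: Optional[str] = None

def author_alt_text(
    post: Post,
    auto_describe: Callable[[str], str],
    confirm: Callable[[str], Optional[str]],
) -> Post:
    """Offer the machine-generated description as an editable
    suggestion instead of silently attaching it to the post."""
    suggestion = auto_describe(post.image_path)
    # `confirm` stands in for the prompt "Do you think this is a
    # good description for this image?"; it returns the author's
    # edited text, or None to accept the suggestion unchanged.
    edited = confirm(suggestion)
    post.alt_text = edited if edited is not None else suggestion
    return post

# Example: the author refines a generic machine caption in place.
post = author_alt_text(
    Post("beach.jpg", "Sunset session!"),
    auto_describe=lambda path: "May be an image of sky and sea",
    confirm=lambda s: s + ", with two people surfing at sunset",
)
print(post.alt_text)
```

Routing a rejected or edited suggestion back to the platform would also address the complaint above that there is currently no way of indicating an incorrect machine-generated description.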
Authoring accessible media content on social networks ASSETS ’22, October 23–26, 2022, Athens, Greece

The previously mentioned strategy of embedding an alternative description text in the post’s content, while it may be perceived by some as an example of a good practice, was mentioned by one participant as a downside of accessibility, as it makes a post very long and somehow redundant for those not using a screen reader. For that, she considers that this information should be embedded in a way that is only perceived by screen reader users. Loading and scroll speed were also considered a setback by another sighted interviewee, as accessible practices involve including additional information to be loaded by apps and websites:

“It occurred to me, people could describe just like I did, or record a short audio to be heard by these people, but this will take a lot out of the dynamism that people have gotten used to in Instagram. [. . . ] it would be great for them, but I think it represents a setback for non-visually impaired people.” – SP5

5 DISCUSSION

In what follows, we first discuss how these findings can be used to answer our research questions, followed by further contributions provided by this work.

5.1 Research questions

RQ1: What are the motivations for social network users to create accessible media content? Our participants were mainly motivated by doing the right thing and promoting inclusion for people with disabilities. Although some of them are used to sharing media content only for a private audience, such as family and close friends, many of them are interested in enabling access to information for other people as well.

Considering the willingness to include people with different abilities and cultural contexts shown by interviewees, some strategies may be applied in order to motivate more users to create accessible media content. Participants tended to be more aware when in contact with a person with a disability, but also when confronted with current accessibility approaches. They showed curiosity and interest in knowing more about the subject, and they are motivated to understand how their content is being consumed by blind users. One major challenge identified among interviewees was the unawareness of how and why blind people use social networks. Making end-users part of this process, integrating accessibility features more prominently into platforms’ authoring flow, educating them about accessible practices and alternative access, and providing tutorials or scenarios of people with disabilities using the Web, for instance, may increase awareness and thus encourage them to become more frequently engaged in such practices. From that, we reinforce the conclusions reached by previous studies [11, 30] on the need for additional tooling and training for social network users on all major platforms.

RQ2: Which barriers do social network users encounter to share and author accessible media content? First, it is important to highlight that the barriers encountered and reported by participants are strongly related to their familiarity with people with disabilities, assistive technologies, and accessibility features in general. For this reason, it is extremely important to educate people about accessibility, including how different disabilities affect the way people interact with technology, their challenges, and how users can publish accessible, inclusive digital content. While society at large should be responsible for seeking and providing such education, social networking sites may leverage their platform to educate users and enable more inclusive sharing practices. Instead, accessible practices and features remain unknown to most users. Most sighted participants were not aware of the steps they can take to make their content more accessible, and they found no guidance on major platforms to assist them in this process. In addition, even when they actively search for guidance, they report difficulty in learning about or in finding accessibility features. Given this context, accessibility practices are being perceived by many sighted users as an activity that requires a significant additional effort on their part.

We also identified that some sighted users still have a certain stigma associated with accessibility, arguing that accessibility should be employed only when necessary or, even worse, that accessibility compliance may compromise their current experience on social media. Platforms not making this a requirement, or not providing a proper prompt warning, may also be contributing to this first line of thought. On the other hand, there is still a lack of support for blind users to create accessible media content, as they find no features to assist them in this activity. Even though they recognise the latest advances promoted by major platforms, accessibility issues have been present in these services for a long time [5, 6] and blind users are still highly dependent on other people to participate in social networks, reinforcing the social cost constantly experienced by them.

RQ3: What are the requirements for social network users to create accessible media content? As a first step, major platforms must provide a more user-friendly and accessible interface in order to make accessibility features more noticeable and easier to use. Another critical issue identified concerns the different approaches adopted by these platforms to provide accessibility features, making it difficult for users to identify and learn how to use each one of these resources. For that, we suggest that accessible approaches should be standardised among platforms in order to create an easily recognised pattern, making accessible posting and sharing an inherent routine. Moreover, machine-generated descriptions introduced by some platforms often do not provide blind users with enough information or context for them to properly understand the corresponding media content, as also observed in previous works [23, 27, 32, 36, 38, 40]. One promising avenue is investing in hybrid solutions: exploring the balance between the benefits of technological advances in automatic image recognition and machine-generated descriptions, and involving users to fill the gap concerning context details and, in particular, their own intention and purpose in posting a certain media content. This approach also harnesses automatic descriptions to guide and support users in creating their own description, reducing the work and time cost perceived by some of them. Furthermore, blind users authoring accessible content will also benefit from it, reducing the social cost previously discussed.

5.2 Perspectives on accessible social media content authoring

While this study was not intended to be a comprehensive study of the accessibility features or barriers of social networks as a whole, the different user perspectives on accessible practices in social networks gathered in our study allowed us to identify potential avenues for future research in accessible media content authoring.

Although machine-generated descriptions are not currently providing sufficient contextual information about visual content, their main advantage is that they can potentially be deployed at a large scale. For that, research on text alternatives best practices and user preferences [29, 31, 32] may be used to further improve these descriptions so that their quality becomes acceptable. From a hybrid perspective, providing suggestions of appropriate alternative descriptions may be useful to educate users and, therefore, to create more engagement with accessible practices.

Another opportunity suggested by some of our sighted interviewees is providing users with different ways of including alternative descriptions, such as audio descriptions. Previous studies [7, 20] identified that human narration – besides being faster – allows better image comprehension by blind users, and it helps to establish a connection between user and content author. Marques et al. [7] also suggest a scenario where blind users could send a request to the author of an image so that he would record an audio description. While this solution possibly reinforces the existing burden imposed on blind people, collaborative approaches to alternative descriptions are also an interesting aspect to be further explored. As suggested by Sacramento et al. [30], providing users the possibility of including alternative descriptions for visual content they encounter may be employed by sighted users who are already motivated and currently engaged in accessible practices. This approach could also be a complementary action to create more awareness among social media users.

Moreover, this research is not intended to scrutinize any particular social network. The findings and discussions presented relied solely on the data collected, in which Facebook, Twitter, and Instagram were the social networks most used by study participants, corroborating earlier studies in similar areas of research [1, 3, 13, 32, 36, 39]. Further research with participants using other social networks to create accessible content, in particular those involving different forms of interaction, such as instant messengers, would yield important insights into this context. Another interesting perspective to be further investigated and discussed is the current status of compliance of these platforms with current guidelines, in particular the WCAG (Web Content Accessibility Guidelines) and ATAG (Authoring Tool Accessibility Guidelines) – which are often overlooked but might also be very fitting to this context. Consider, for instance, the case of Facebook and Instagram: even though they provide an alternative description for each image, as discussed earlier, this does not necessarily entail accessible content. This context is especially challenging since these companies are not currently required by law to comply with any of these guidelines, leaving each of these platforms to decide the level of commitment they are willing to assume.

Finally, even though this work focused on visual disabilities, accessibility in media content also has an impact on the interaction of people with other disabilities, for instance, video captions for users with hearing impairments [2] or alternative descriptions for other screen reader users, such as people with cognitive impairments [18, 19]. Therefore, most topics discussed in our research questions also apply to different kinds of impairments, in particular concerning accessibility awareness and motivational strategies.

6 CONCLUSIONS

In this study, we presented an overview of the usage and the accessible practices employed by 258 surveyed social network users. These findings suggest that people with disabilities are interacting in social networks as much as users without disabilities. However, our blind participants were more frequently engaged with non-visual content, such as text and audio, than with visual-only content, such as images, in particular GIFs and memes. This may be partly explained by another finding: most sighted participants are not sharing accessible media content because, as reported by most of them, they were not aware of this possibility until participating in this study.

Following that, we conducted interviews with 20 of these participants in order to better explore their experiences, challenges, and motivations concerning accessibility and visual content in social networks. While sighted interviewees were interested in being more engaged in accessible practices, most of them did not find proper assistance or support on major platforms to guide them in enhancing the accessibility of their content. At the same time, blind users are not being provided with proper alternative descriptions, whether by authors or by the machine-generated descriptions provided by some major platforms. Moreover, the burden of educating others and promoting the authoring and sharing of accessible content falls, unfairly, upon their ability to compel others to act.

This research does not aim to provide a thorough analysis of current accessibility features, but rather to provide insights into the current status of accessibility awareness among social network users, and potential future directions. Our work complements previous research on visual content in social networks by providing insights into how sighted users are experiencing accessible practices, but also brings up the importance of employing hybrid solutions to fill the current gap. Platforms must go beyond just deploying accessibility features and must be more invested in approaches that create user awareness, engaging more people in adopting accessible practices in their daily posting routine.

ACKNOWLEDGMENTS

This work was supported by the European Commission through project SONAAR (grant agreement LC-01409741), and by FCT through the LASIGE Research Unit, ref. UIDB/00408/2020 and ref. UIDP/00408/2020.

REFERENCES

[1] Cynthia L. Bennett, Jane E, Martez E. Mott, Edward Cutrell, and Meredith Ringel Morris. 2018. How Teens with Visual Impairments Take, Edit, and Share Photos on Social Media. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems - CHI ’18. ACM Press, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173650
[2] Larwan Berke, Matthew Seita, and Matt Huenerfauth. 2020. Deaf and hard-of-hearing users’ prioritization of genres of online video content requiring accurate captions. In Proceedings of the 17th International Web for All Conference, W4A 2020. https://doi.org/10.1145/3371300.3383337
[3] Erin Brady, Yu Zhong, Meredith Ringel Morris, and Jeffrey P. Bigham. 2013. Investigating the appropriateness of social network question asking as a resource for blind users. Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW (2013), 1225–1236. https://doi.org/10.1145/2441776.2441915

[4] Julian Brinkley and Nasseh Tabrizi. 2017. A Desktop Usability Evaluation of the Facebook Mobile Interface using the JAWS Screen Reader with Blind Users. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 61, 1 (sep 2017), 828–832. https://doi.org/10.1177/1541931213601699
[5] Maria Claudia Buzzi, Marina Buzzi, and Barbara Leporini. 2011. Web 2.0: Twitter and the blind. In Proceedings of the 9th ACM SIGCHI Italian Chapter International Conference on Computer-Human Interaction: Facing Complexity. 151–156.
[6] Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, and Fahim Akhter. 2010. Is Facebook really "open" to all?. In 2010 IEEE International Symposium on Technology and Society. IEEE, 327–336.
[7] João Marcelo dos Santos Marques, Luiz Fernando Gopi Valente, Simone Bacellar Leal Ferreira, Claudia Cappelli, and Luciana Salgado. 2017. Audio Description on Instagram: Evaluating and Comparing Two Ways of Describing Images for Visually Impaired. In Proceedings of the 19th International Conference on Enterprise Information Systems. SCITEPRESS - Science and Technology Publications, 29–40. https://doi.org/10.5220/0006282500290040
[8] Carlos Duarte, Carlos M. Duarte, and Luís Carriço. 2019. Combining Semantic Tools for Automatic Evaluation of Alternative Texts. In Proceedings of the 16th Web For All 2019 Personalization - Personalizing the Web. ACM, New York, NY, USA, 1–4. https://doi.org/10.1145/3315002.3317558
[9] Facebook. 2020. How do I edit the alternative text for a photo on Facebook? https://www.facebook.com/help/214124458607871
[10] Facebook. 2021. Facebook Reports Third Quarter 2021 Results. https://investor.fb.com/investor-news/press-release-details/2021/Facebook-Reports-Third-Quarter-2021-Results/default.aspx
[11] Cole Gleason, Patrick Carrington, Cameron Cassidy, Meredith Ringel Morris, Kris M. Kitani, and Jeffrey P. Bigham. 2019. “It’s almost like they’re trying to hide it”: How User-Provided Image Descriptions Have Failed to Make Twitter Accessible. In The World Wide Web Conference - WWW ’19. ACM Press, New York, NY, USA, 549–559. https://doi.org/10.1145/3308558.3313605
[12] Cole Gleason, Patrick Carrington, Lydia B. Chilton, Benjamin Gorman, Hernisa Kacorri, Andrés Monroy-Hernández, Meredith Ringel Morris, Garreth Tigwell, and Shaomei Wu. 2020. Future research directions for accessible social media. ACM SIGACCESS Accessibility and Computing 127 (jul 2020), 1–12. https://doi.org/10.1145/3412836.3412839
[13] Cole Gleason, Amy Pavel, Himalini Gururaj, Kris Kitani, and Jeffrey Bigham. 2020. Making GIFs Accessible. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. ACM, New York, NY, USA, 1–10. https://doi.org/10.1145/3373625.3417027
[14] Cole Gleason, Amy Pavel, Xingyu Liu, Patrick Carrington, Lydia B. Chilton, and Jeffrey P. Bigham. 2019. Making memes accessible. In ASSETS 2019 - 21st International ACM SIGACCESS Conference on Computers and Accessibility. 367–376. https://doi.org/10.1145/3308561.3353792
[15] Darren Guinness, Edward Cutrell, and Meredith Ringel Morris. 2018. Caption Crawler: Enabling Reusable Alternative Text Descriptions Using Reverse Image Search. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3173574.3174092
[16] Darren Guinness, Edward Cutrell, and Meredith Ringel Morris. 2018. Caption Crawler: Enabling Reusable Alternative Text Descriptions using Reverse Image Search. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems - CHI ’18. ACM Press, New York, NY, USA, 1–11. https://doi.org/10.1145/3173574.3174092
[17] Danna Gurari, Yinan Zhao, Meng Zhang, and Nilavra Bhattacharya. 2020. Captioning Images Taken by People Who Are Blind. In Computer Vision – ECCV 2020, Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). Springer International Publishing, Cham, 417–434. https://doi.org/10.1007/978-3-030-58520-4_25
[18] Bronwyn Hemsley, Stephen Dann, Stuart Palmer, Meredith Allan, and Susan Balandin. 2015. "We definitely need an audience": Experiences of Twitter, Twitter networks and tweet content in adults with severe communication disabilities who use augmentative and alternative communication (AAC). Disability and Rehabilitation 37, 17 (2015), 1531–1542. https://doi.org/10.3109/09638288.2015.1045990
[19] Amanda Hynan, Juliet Goldbart, and Janice Murray. 2015. A grounded theory of Internet and social media use by young people who use augmentative and alternative communication (AAC). Disability and Rehabilitation 37, 17 (2015), 1559–1575. https://doi.org/10.3109/09638288.2015.1056387
[20] Lawrence H. Kim, Abena Boadi-Agyemang, Alexa Fay Siu, and John Tang. 2020. When to Add Human Narration to Photo-Sharing Social Media. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. ACM, New York, NY, USA, 1–3. https://doi.org/10.1145/3373625.3418013
[21] Christina Low, Emma McCamey, Cole Gleason, Patrick Carrington, Jeffrey P. Bigham, and Amy Pavel. 2019. Twitter A11y: A Browser Extension to Describe Images. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility - ASSETS ’19. ACM Press, New York, NY, USA, 551–553. https://doi.org/10.1145/3308561.3354629
[22] Kelly Mack, Edward Cutrell, Bongshin Lee, and Meredith Ringel Morris. 2021. Designing Tools for High-Quality Alt Text Authoring. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–14. https://doi.org/10.1145/3441852.3471207
[23] Haley MacLeod, Cynthia L. Bennett, Meredith Ringel Morris, and Edward Cutrell. 2017. Understanding blind people’s experiences with computer-generated captions of social media images. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 5988–5999. https://doi.org/10.1145/3025453.3025814
[24] Reeti Mathur and Erin Brady. 2018. Mixed-Ability Collaboration for Accessible Photo Sharing. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility - ASSETS ’18. ACM Press, New York, NY, USA, 370–372. https://doi.org/10.1145/3234695.3240994
[25] Nora McDonald, Sarita Schoenebeck, and Andrea Forte. 2019. Reliability and inter-rater reliability in qualitative research: Norms and guidelines for CSCW and HCI practice. Proceedings of the ACM on Human-Computer Interaction 3 (2019). Issue CSCW. https://doi.org/10.1145/3359174
[26] Valerie S. Morash, Yue-Ting Siu, Joshua A. Miele, Lucia Hasty, and Steven Landau. 2015. Guiding Novice Web Workers in Making Image Descriptions Using Templates. ACM Trans. Access. Comput. 7, 4, Article 12 (nov 2015), 21 pages. https://doi.org/10.1145/2764916
[27] Meredith Ringel Morris, Jazette Johnson, Cynthia L. Bennett, and Edward Cutrell. 2018. Rich representations of visual content for screen reader users. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–11. https://doi.org/10.1145/3173574.3173633
[28] Meredith Ringel Morris, Annuska Zolyomi, Catherine Yao, Sina Bahram, Jeffrey P. Bigham, and Shaun K. Kane. 2016. "With most of it being pictures now, I rarely use it": Understanding Twitter’s Evolving Accessibility to Blind Users. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 5506–5516. https://doi.org/10.1145/2858036.2858116
[29] Helen Petrie, Chandra Harrison, and Sundeep Dev. 2005. Describing images on the web: a survey of current practice and prospects for the future. In Proceedings of Human Computer Interaction International (HCII), Vol. 71.
[30] Carolina Sacramento, Leonardo Nardi, Simone Bacellar Leal Ferreira, and João Marcelo dos Santos Marques. 2020. #PraCegoVer: Investigating the description of visual content in Brazilian online social media. In Proceedings of the 19th Brazilian Symposium on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–10. https://doi.org/10.1145/3424953.3426489
[31] Elliot Salisbury, Ece Kamar, and Meredith Ringel Morris. 2017. Toward Scalable Social Alt Text: Conversational Crowdsourcing as a Tool for Refining Vision-to-Language Technology for the Blind. In AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2017). 147–156.
[32] Abigale Stangl, Meredith Ringel Morris, and Danna Gurari. 2020. "Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376404
[33] The New York Times. 2020. The Coronavirus Revives Facebook as News Powerhouse. https://www.nytimes.com/2020/03/23/technology/coronavirus-facebook-news.html Accessed: 2020-08-12.
[34] Twitter. 2016. Accessible images for everyone. https://blog.twitter.com/en_us/a/2016/accessible-images-for-everyone.html
[35] Twitter. 2020. Twitter Accessibility. https://twitter.com/TwitterA11y/status/1265689579371323392
[36] Violeta Voykinska, Shiri Azenkot, Shaomei Wu, and Gilly Leshed. 2016. How Blind People Interact with Visual Content on Social Networking Services. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing - CSCW ’16. ACM Press, New York, NY, USA, 1582–1593. https://doi.org/10.1145/2818048.2820013
[37] WebAIM. 2017. Screen Reader User Survey. https://webaim.org/projects/screenreadersurvey7/
[38] Gill Whitney and Irena Kolar. 2020. Am I missing something? Universal Access in the Information Society 19, 2 (jun 2020), 461–469. https://doi.org/10.1007/s10209-019-00648-z
[39] Shaomei Wu and Lada A. Adamic. 2014. Visually impaired users on an online social network. In Proceedings of the 32nd annual ACM conference on Human factors in computing systems - CHI ’14. ACM Press, New York, NY, USA, 3133–3142. https://doi.org/10.1145/2556288.2557415
[40] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2017. The effect of computer-generated descriptions on photo-sharing experiences of people with visual impairments. Proceedings of the ACM on Human-Computer Interaction 1, CSCW (2017), 1–22. https://doi.org/10.1145/3134756
Support in the Moment: Benefits and use of video-span selection
and search for sign-language video comprehension among ASL
learners
Saad Hassan∗, Akhter Al Amin∗, Caluã de Lacerda Pataca∗
Computing and Information Science, Rochester Institute of Technology
Rochester, NY, USA
{sh2513,aa7510,cd4610}@rit.edu

Diego Navarro
National Institute for the Deaf, Rochester Institute of Technology
Rochester, NY, USA
don6763@rit.edu

Alexis Gordon, Sooyeon Lee, Matt Huenerfauth
School of Information, Rochester Institute of Technology
Rochester, NY, USA
{aag7593,slics,matt.huenerfauth}@rit.edu

ABSTRACT
As they develop comprehension skills, American Sign Language (ASL) learners often view challenging ASL videos, which may contain unfamiliar signs. Current dictionary tools require students to isolate a single sign they do not understand and input a search query, by selecting linguistic properties or by performing the sign into a webcam. Students may struggle with extracting and re-creating an unfamiliar sign, and they must leave the video-watching task to use an external dictionary tool. We investigate a technology that enables users, in the moment, i.e., while they are viewing a video, to select a span of one or more signs that they do not understand, to view dictionary results. We interviewed 14 American Sign Language (ASL) learners about their challenges in understanding ASL video and workarounds for unfamiliar vocabulary. We then conducted a comparative study and an in-depth analysis with 15 ASL learners to investigate the benefits of using video sub-spans for searching, and their interactions with a Wizard-of-Oz prototype during a video-comprehension task. Our findings revealed benefits of our tool in terms of the quality of video translations produced and the perceived workload to produce translations. Our in-depth analysis also revealed benefits of an integrated search tool and the use of span-selection to constrain video play. These findings inform future designers of such systems, computer-vision researchers working on the underlying sign-matching technologies, and sign language educators.

CCS CONCEPTS
• Human-centered computing → Accessibility systems and tools; Graphical user interfaces; Empirical studies in interaction design; User interface programming.

KEYWORDS
American Sign Language, Sign Languages, Continuous Signing, Sign Language Videos, Sign Look-up, Video Selection, Search Interface, Integrated Search, Sign Language Learning, ASL Learning

ACM Reference Format:
Saad Hassan, Akhter Al Amin, Caluã de Lacerda Pataca, Diego Navarro, Alexis Gordon, Sooyeon Lee, and Matt Huenerfauth. 2022. Support in the Moment: Benefits and use of video-span selection and search for sign-language video comprehension among ASL learners. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544883

∗ These authors contributed equally to this research.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544883

1 INTRODUCTION
Over 70 million Deaf and Hard of Hearing (DHH) people worldwide use one of the over 300 sign languages recognized by the World Federation of the Deaf [12, 48]. In the U.S., increasing numbers of DHH and hearing individuals are motivated to learn American Sign Language (ASL), which is used by about 500,000 people as a primary form of communication [46]. There are nearly 200,000 students in ASL classes [15] at schools or universities, and ASL has one of the fastest-growing enrollments among language classes [17]. Learning ASL can promote interactions between DHH and hearing individuals, to support greater inclusion, mutual understanding, and participation across society. Further, if DHH children cannot access spoken language nor learn sign language during critical developmental years, they may experience language deprivation [20]. As most DHH children are born to hearing parents, their family or teachers are motivated to learn sign languages [53, 61].

Having students work to understand a challenging video is part of sign language education, to develop comprehension skills [18, 29, 38]. While there have been advances in machine translation to automatically convert an ASL video into English text, such technology is still under development [51], and its use would bypass the educational activity of students working to understand a video themselves. Technology is needed to support learners during a video-comprehension task without fully automating the process.

In foreign-language-learning contexts, dictionaries are a valuable tool for students when faced with an unknown word in a text or audio recording. However, students learning sign languages face challenges when encountering a sign whose meaning they do not know, given the lack of a standard writing system or an easy way
for students to search for the meaning of a sign based on its visual appearance. Existing search tools are not sufficient [3, 24, 25], and it is difficult for students to use websites that ask them to enter linguistic features of the sign and browse a list of results to find the sign that matches what they see [1, 6, 7, 40, 45, 56, 59]. Based on automatic ASL-recognition technology for video matching, recent tools have been researched to enable students to submit a video of a single sign to conduct a search within an ASL dictionary [3, 5, 11, 13, 26, 37, 59, 63]. However, students trying to understand an ASL video may not be able to accurately extract just one sign nor replicate the sign themselves into a webcam to initiate a search [3, 26]. For this reason, we investigate technologies for enabling users to quickly select a span of a video of ASL signing that contains one or more signs that they do not understand, and then trigger a video-analysis search that will return a set of dictionary results with likely matches to the signs contained within the span.

Rather than considering separately the tasks of watching a video and looking up the meaning of an unknown sign, we instead focus on the overall task. Based on feedback from ASL students during an interview study, we investigate an integrated video-player and sign-search tool. In our second study, some students watched videos containing some signs that they were not familiar with, while using a Wizard-of-Oz prototype for playing videos, selecting spans of video, and performing searches for signs. Other students used a baseline prototype without the searching functionality. This study shed light on how ASL learners interact with such a system. This study also investigated the benefit of providing integrated dictionary search to ASL learners (hearing university students) for video comprehension, in comparison to their use of an existing dictionary website. Our contributions include:
• We present an interview-based study with ASL learners about their experiences watching challenging ASL videos. Our novel findings reveal their desire to view videos of various genres from multiple platforms, factors that lead to challenges in video comprehension, and their current workarounds when facing unfamiliar signs.
• In the task context of ASL learners translating difficult ASL videos, we present the first comparative study between our video-player prototype with integrated dictionary-search and an existing search-by-feature dictionary website. Our findings reveal benefits in terms of quality of translations produced and reduced workload.
• We present the first observational study of ASL learners engaged in the task of translating difficult ASL videos while using search technology, specifically a Wizard-of-Oz prototype of our proposed system. Our analysis revealed how users selected sub-spans and conducted searches, and we characterize how users benefited from an integrated tool that presented search results alongside the video (enabling checking of results in context). We found that usage varied depending on the genre of signing video, and we observed unexpected use of the sub-span selection tool for the purpose of constraining the video play-head.

2 BACKGROUND AND RELATED WORK
Background on sign-language linguistics is provided below to explain key terminology, and prior work on sign-language pedagogy is discussed, to contextualize our work within that domain. Next, section 2.2 discusses the current state of sign-language look-up technologies, to illuminate key limitations of existing resources.

2.1 Sign Language Comprehension
ASL linguistic phenomena contribute to challenges students may face in comprehending an ASL video. Although students can browse dictionaries that show videos of an ASL sign's citation form, i.e., the standard way in which a sign may appear when produced in isolation, when signs are produced during sentences in a continuous manner, the appearance may differ. For instance, there is natural diversity in the production of signs across individual signers, which may be based on demographic or geographic regional variation. Further, two or more ASL signs may linguistically combine into a compound sign [41]; novice ASL learners may have difficulty segmenting them appropriately to look up a meaning in an ASL dictionary [8]. Coarticulation, broadly, refers to how the production of one sign may affect the way in which other nearby signs are produced in continuous signing [19, 54], e.g., the ending location or handshape of one sign may affect the location or handshape of the next. Coarticulation effects may lead the production of a sign in context to differ from its citation form. When ASL signers produce rapid sequences of handshapes during fingerspelling, i.e., when specific words are spelled alphabetically, coarticulation effects are also possible [35], leading to the fingerspelled word not being a simple concatenation of the individual alphabet handshapes. Finally, ASL signing may include depiction, in which particular linguistic constructions, often referred to as "classifiers," convey spatial information about the position, movement, or shape of entities [58].

Most prior work on sign-language video comprehension has focused on Deaf users. Little prior work, and no prior observational studies, have investigated the behavior of hearing people when watching a challenging ASL video or their workarounds for unknown signs, e.g., using sign look-up tools. A recent review of prior eye-tracking studies with DHH participants [2] discussed a study that examined differences in gaze patterns between Deaf and hearing individuals when looking at a live signer [55]. Although understanding sign language in person is different than watching a video (and no sign look-up technologies had been used), that study characterized gaze patterns of hearing people when trying to comprehend sign language. Observational studies, with eye-tracking or other means, may lead to insights about behaviors of ASL learners during video comprehension, especially given limited prior work. In contrast, substantial prior literature exists on non-native learners of various spoken languages engaging in video comprehension, and there has been work on spoken/written language translation tasks while learners use various electronic resources [4, 16, 27, 44, 57, 64].

2.2 State of Sign Language Lookup Resources
Dictionaries are an important tool used by second-language learners when looking up an unfamiliar word. However, when someone encounters a sign whose meaning they do not know when viewing sign language, it is more difficult to look up the word, since sign
languages lack a common writing system and users cannot use a text-search or alphabetical listing to search for a sign [5, 28].

Some sign-language dictionary systems expect users to recall linguistic properties of the sign that they are looking for, e.g., hand configuration, orientation, location, movement, and to enter these properties into a search-query interface to obtain a list of matching signs [1, 6, 7, 40, 45, 56, 59]. Prior work on such search-by-feature systems has revealed that they are challenging for ASL students [6]. Other proposed sign-language dictionary systems expect users to submit a video of a single sign that they have extracted from a longer video, or to recall and perform a sign into a webcam [5, 7, 11, 13, 37, 59, 63]; sign-recognition technology performs a video search against a dictionary to provide potential matches. Even if a student were able to remember and produce a sign into a webcam, there are technical challenges in recognizing signs from video due to various factors [52, 62]. Despite recent advancements [47, 51], state-of-the-art continuous sign-recognition software is still imperfect. To mitigate inaccuracies in the video-to-sign matching, some proposed dictionaries provide users with post-query filtering options, to narrow the set of results that are returned [26]. Overall, existing dictionary systems face several limitations: They expect that the ASL learner can recall linguistic properties of the desired ASL sign or accurately perform the sign from memory. In addition, systems assume that a user is able to precisely identify the starting and ending of a sign they encountered in a video or in a conversation. Fast signing speed or various linguistic factors (section 2.1) make it difficult for ASL learners to precisely select signs in videos. Finally, the user must launch an additional task of querying a dictionary while engaging in a video-watching and comprehension task. Using a separate tool to perform the search may cause users to lose context from the video they were watching.

In contrast to prior work, we investigate a dictionary-search system that enables the user to select a span (of potentially multiple signs) from a video of continuous sign language, as the basis for a query to search for potential matching signs from a dictionary system, with the results presented in an integrated video-player and search-results interface. This approach may mitigate the need for users to recall specific linguistic properties of the unknown sign, mitigate the need to identify the specific start/end of signs in a continuous video, and enable the user to remain in context in their video-watching-and-comprehension task.

Some recent research has investigated ASL learners interacting with Wizard-of-Oz prototype systems for ASL dictionary search, to identify factors that affect users' satisfaction [3, 23–26]. Methodologically, our studies also employ a Wizard-of-Oz prototype of an ASL dictionary-search system to understand users' interaction and potential benefits. However, in those prior studies, users had been shown a stimulus video of a native signer performing a single isolated sign (in citation form), and the user was asked to use a dictionary system to identify the sign's meaning. In contrast, our studies examine how ASL learners engage in a search task while in the midst of a video-watching-and-comprehension task.

2.3 Research Questions
There has been limited prior research on the experience of ASL learners who are engaged in the educational activity of watching a sign-language video that is difficult for them, especially how technologies for ASL dictionary look-up may benefit these users. In addition, no prior work has examined how users might benefit from an integrated tool for viewing ASL videos, with users able to select spans of the video as the basis for dictionary search. To address these gaps, we investigate the following research questions:
RQ1 What are the challenges that ASL learners currently experience when trying to understand a difficult sign-language video, and what workarounds do they employ?
RQ2 Comparing the experience of users who used our tool and those who used an existing feature-based ASL-English reverse dictionary, is there a difference in translation quality or in perceived workload to produce translations?
RQ3 How do users interact with a Wizard-of-Oz prototype for viewing an ASL video and conducting dictionary-search on selected spans of video, during the video-watching and comprehension task?

3 STUDY 1: INTERVIEW STUDY
This paper presents two studies: the goal of study 1 was to understand ASL learners' challenges with video comprehension and the current workarounds they use. The findings from study 1 informed the study design, videos, and prototypes included in study 2.

3.1 Study Design
This IRB-approved study was conducted in person or remotely, based on the preferences of participants during the COVID-19 pandemic. After informed consent was obtained, the semi-structured interview began with questions about participants' prior experiences watching ASL videos. They were asked about the types of videos they watch, their experiences when they have difficulty understanding, and any workarounds they use. To provide context for later questions about how difficult it may be for participants to select an individual sign or a span of multiple signs they do not understand, we displayed several example videos to participants as a basis for discussion. These videos were taken from advanced ASL or ASL-English interpreting classes, conversational videos between expert signers on YouTube, signing performances at theatres, and interpreted poetry and music. (Video details appear in electronic supplementary files.) Videos contained a variety of linguistic phenomena discussed in section 2.1. Participants were asked how hard it would be to select a sub-span containing one or multiple signs and how they would select a time-range of a video. The average length of each interview was 36.5 minutes (σ = 5.46 minutes).

3.2 Participants and Recruitment
Participants were recruited by posting an advertisement on an ASL Reddit channel and by contacting professors of introductory ASL courses, who shared an advertisement by email with their students, containing two screening questions: "Are you currently learning American Sign Language?" and "Have you completed an introductory or intermediate ASL course in the past five years?" Participants were recruited if they responded yes to at least one question. We recruited a total of 14 participants for our first study, which included 2 men, 11 women, and 1 non-binary individual. The median age was 21 (σ = 3.67). Participants had studied ASL for
a mean of 3.4 years, and all participants confirmed that they had taken fewer than 3 years of formal ASL classes.

3.3 Analysis and Findings
We employed a mixture of deductive and inductive approaches in our qualitative data analysis. To become familiar with the interview transcripts, two authors read all 14; then, during a subsequent reading, they individually took notes to produce initial codes, which they collated and collapsed into two individual code-books. Each of these authors then investigated underlying patterns among their codes and formed initial categories, and they consulted the interviewers to get feedback on their initial categories and further improved them. The authors then met to review all of their initial categories, to identify similarities and differences. During two three-hour meetings, the authors performed an initial thematic grouping, which led to final high-level categories. These high-level categories were then presented to the rest of the team to arrive at the final set of themes and sub-themes presented in this section.

3.3.1 Prior experiences and challenges associated with watching signed video content. Participants discussed various motivations for viewing signing content. Twelve participants mentioned engaging with ASL videos during classroom-related or homework activities. Ten participants also mentioned watching signed content outside of the classroom for their own enrichment or personal exposure to other types of signing, e.g., Deaf theatre or ASL songs. P11 said, "It's both in-class, we have different assignments the teacher will give us, and then I also do it on my own time, if I'm looking for a deeper understanding about things, or if I'm looking for specific signs. And I also follow some deaf content creators as well."

Participants also discussed how their lack of familiarity with regional or dialectal variation in signing, such as Black ASL [43] used among some African-American signers in the U.S., led to challenges in understanding videos. P5 described their experience in understanding signing among various communities: "I know some white people in the community, [but the] black Deaf community and the interpreter community, I still find hard."

Participants discussed how various linguistic types of signs posed comprehension challenges. For instance, P1 described needing to consciously "switch my brain from a sign to actually each letter" when encountering fingerspelling. P11 discussed challenges with "fingerspelling, classifiers, compound words, any of that kind of stuff... fingerspelling is definitely a little tougher for me." Participants discussed how fingerspelled names were challenging to understand in a video, especially when there were multiple individuals with similar names. Participants also described challenges with understanding numbers, e.g., P5 said, "numbers are hard for me, for some reason, I don't know why." Participants also discussed challenges with compound signs, e.g., P13 said, "I wasn't sure if that was one or two separate signs. So there were definitely points in the video where they were blending together a little bit, and I wasn't sure."

Overall, participants discussed how different content sources or genres pose challenges for comprehension. Participants mentioned viewing signed content on various streaming services, e.g., YouTube and Netflix, as well as on social media, e.g., Instagram and TikTok. P14 said, "I watch ASL videos when I am going through Instagram because I follow some Deaf creators." Participants discussed how the signed content on social media is shorter and more unpredictable in nature, with the topic of the video not always well defined, which poses challenges for comprehension. Participants also discussed how factual signing, e.g., in a documentary, was difficult, due to complex vocabulary or increased use of fingerspelling. Other participants mentioned watching videos of ASL poetry and ASL translations of popular songs, contexts in which they described signers as using more depiction and having "their own flow, and they have their own rhythm" (P11). Participants described how videos with multiple signers, e.g., Deaf theatre, pose challenges, as P8 described, "my brain is used to practicing with one signer." Similarly, participants mentioned how natural conversations were difficult to understand, e.g., P6 discussed how signing in such videos tends to be "quicker, and they're a little bit more relaxed."

3.3.2 Workarounds. Participants mentioned several workarounds that were useful in understanding challenging signed video content. For instance, several participants discussed using the context of a video to understand unknown signs. Participants would consider the description or title of the video, such as on YouTube, or they would consider what was said before or after any unknown signs. For instance, P3 described a situation in which they figured out the sign for a citrus fruit by considering the context of the surrounding signing, which had mentioned lemons. P3 discussed how understanding later signing may clarify a portion of signing that had not been previously understood, explaining how if they become confused then they "really focus on the next thing they're saying, so I can piece together what they might have said, so I can understand it." Participants discussed various strategies that involved controlling the flow of the video player:
• Periodically pausing was a strategy among several participants. For instance, participants discussed how they paused videos in-between conversational turns in videos with multiple signers; P8 described how they "pause in between each speaker... just enough time to grasp" what had been said.
• Backtracking and replaying was another common approach, as P12 explained, "pausing it and replaying it." P3 also discussed how they will "backtrack the video" if needed.
• Slowing down the video was also popular, if possible within the video player. For instance, P7 explained how they will "slow down the fingerspelling if...it's on YouTube. If I could alter the speed, I might try to slow it down."

Regarding current strategies for seeking the meaning of an unknown sign, six participants mentioned using English-to-ASL dictionaries, i.e., guessing English meanings of the sign they did not understand, to look up that English word in the dictionary and see if the sign displayed visually matched the sign that had not been understood. Participants also mentioned using ASL-to-English "reverse" dictionaries, i.e., websites that allow someone to enter linguistic properties to search for the English translation of a sign. Participants discussed challenges, e.g., P7 said, "if I think I have an idea of what the sign is I might use Handspeak, or there's another one I use... [It's] hard to specify handshape in current dictionaries." P6 discussed struggling to enter linguistic properties when constructing a query: "I definitely tried using the reverse dictionary stuff online. Usually it doesn't end up being successful, and I have to just end up moving on. Because, the way it's structured, you have the
handshape, and the movement, and the location. Sometimes it's a little ambiguous, especially if you don't actually know what that sign is; so, it's hard to end up looking [it] up." Participants also expressed their frustration with having to launch a web-dictionary in another window while trying to understand a video. P11 said, "It's pretty frustrating sometimes when I'm trying to find a specific sign, I have to like go to Google and...then go through all the different pages... If I could just scroll and have the source material right there, I think it would be much more efficient."

Rather than use a specific dictionary website, other participants mentioned typing descriptions of what a sign looked like into a Google search, e.g., P11 said, "I've definitely tried to Google it before, but it's so hard to sometimes describe what it is that you're looking for. I end up being very vague... It's very rare that I go to Google and find what I'm looking for as far as trying to describe a sign." Finally, several participants mentioned that, if other people are available, they may ask a teacher or a peer. As P9 said, "If I'm in class I would ask the teacher. If it's for a class I would either look it up online or if it's in a vocabulary learning unit."

4 STUDY 2: PROTOTYPE STUDY
While study 1 motivated the need for a tool that provides an integrated video-playing and sign-search experience, study 2 investigated how users would interact with such a tool and whether it offers benefits as compared to existing systems. In this prototype study, users interacted with a Wizard-of-Oz prototype at a desktop computer in a lab setting. Findings from study 1 informed both the design of the prototypes used and the selection of videos included in the prototypes, as described below.

4.1 Prototype Design
4.1.1 Integrated Search. Since the focus of this study was on users' interaction and behavior, an interactive Wizard-of-Oz prototype was designed (Figure 1), in which the underlying sign-recognition technology was simulated, without any automatic video analysis. On this web-based prototype, participants entered a participant ID on the first screen. Next, they were provided a calibration screen to ensure that the size and aspect ratio of their browser window was consistent. As shown in Figure 1, the interface displayed an ASL video with a play/pause button at the bottom-left corner of the screen. On a video-timeline at the bottom of the screen, a vertical white line indicated the video playhead (the current position of the video). Users can select a video span by dragging the yellow edges of a selection bar on this timeline, and they can press a "Play selection" button to play only the portion of the video in that span. Once satisfied with their span selection, users can click on the yellow "Search selection" button at the top-right corner of the screen to search for the signs. The results were displayed in a scrollable window on the right side of the screen. Each result consisted of a video of a sign from the American Sign Language Lexicon Video Dataset (ASLLVD) [50] and a label below showing the closest English gloss for that sign. When clicking on a "more information" icon for each search result, linguistic properties for the item were displayed, as illustrated on the right side of Figure 1.

Study 1 findings informed the prototype design: For instance, participants expressed frustration with needing to leave the context of watching a video to use electronic dictionaries, and this informed our decision to display dictionary results on the same page. In addition, participants' explanations of workaround strategies they used when encountering difficult video, e.g., re-playing and backtracking a video, led us to provide the "Play selection" button, which played the video while the playhead was constrained to the selected span, to support users in replaying a short segment of the video.

Findings from study 1 also informed the selection of videos shown to participants during the study. Based on participants' comments about how the genre or linguistic phenomena in a video relate to how challenging it is, we selected three genres of videos to display in this study: educational videos; conversational videos with at least two signers, with turn-taking between them; and Deaf theatre and poetry videos. Since participants had discussed how they face particular challenges when encountering fingerspelling or compound signs, we ensured that the videos included instances of these types of signing. In addition, we selected videos that included multiple signers engaged in conversational signing, as well as a video with different regional dialects of signing, since both had been discussed by participants in study 1. The videos used in this study were obtained from online sources and material from fourth-year ASL interpreting courses. Videos were an average of 23.7 seconds in length, and details about each video are provided in electronic supplementary files.

4.1.2 Selection of signs appearing in the results list. Given the rapid pace of advancement in the field of sign-recognition technology, and since the purpose of this observational study was to understand participants' behavior and interaction, we selected a Wizard-of-Oz approach to simulate an automatic search-recognition system. Therefore, our system returned a pre-determined list of results, in which the actual sign appeared somewhere in the results list. The selection of signs that appeared in the results was based on pre-processing of the videos. The protocol is described below:
(1) A Deaf member of our team with native ASL fluency watched all 9 of the ASL videos in advance to identify the sequence of signs appearing in each video, along with the starting and ending time-stamps of each sign.
(2) For each sign, a set of dictionary-search results was manually prepared, to simulate the type of results someone would see if using a real automatic dictionary-search system in the future. Specifically, for each sign, a native signer carefully selected the closest match and 11 other signs that were similar in appearance to the sign, from a collection of signs from the ASLLVD [50]. The researcher prioritized selecting signs for this "match list" with as many properties in common as possible with the given sign, i.e., the same handshape, number of hands, movement, and location.
(3) When a participant selects a span of video, it is possible that the start or end of the span is within the duration of a sign in the video, rather than precisely at a boundary between signs. We established a rule that the prototype would consider a sign to be within a span selected by the participant if at least half of the sign appears within the selected span.
(4) Since a selected span may contain multiple signs, the list of dictionary-search results displayed combined results from the match lists for all signs within that span, as follows:
ASSETS ’22, October 23–26, 2022, Athens, Greece Hassan, et al.

Figure 1: Screen image of the prototype, displaying labels for the various interface regions, including the video player at the
top left, a text box where translations can be typed below, a video timeline with a span-selection interface along the bottom,
and a dictionary-search panel on the right.

From among the match lists for all signs in the span, one
match list was selected randomly, and the top item from
that list was taken (without replacement) for inclusion on
the combined results list. This process was repeated until all
match lists were empty, to produce a combined list of results
that contained the union of the original match lists. The frst
50 items from this combined list were displayed to users.
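Rules (3) and (4) above are procedural enough to sketch in code. The following is only an illustrative sketch (the function names and data layout are ours, not the paper's; the actual prototype was Wizard-of-Oz, with match lists prepared by hand):

```python
import random

def signs_in_span(span, sign_intervals):
    """Rule (3): a sign counts as inside the selected span if at least
    half of the sign's duration falls within the span."""
    start, end = span
    selected = []
    for gloss, (s, e) in sign_intervals:
        overlap = max(0.0, min(end, e) - max(start, s))
        if overlap >= (e - s) / 2:
            selected.append(gloss)
    return selected

def combined_results(match_lists, limit=50):
    """Rule (4): repeatedly pick a random non-empty match list and take
    its top item (without replacement) until all lists are empty; the
    combined list is the union of the originals, truncated to `limit`."""
    lists = [list(ml) for ml in match_lists]
    combined = []
    while any(lists):
        chosen = random.choice([ml for ml in lists if ml])
        combined.append(chosen.pop(0))
    return combined[:limit]
```

Note that because lists are drained from the top, each match list's own ranking is preserved in the merged output, while the interleaving across signs is randomized.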
4.1.3 Baseline Prototype. We also designed a "baseline" prototype identical to the one described above but without the integrated dictionary-search option, as shown in Figure 2(a). When viewing a video with this baseline, participants were instructed to open a second web-browser window to use the handSpeak reverse dictionary¹, as shown in Figure 2(b). That site enables users to look for an isolated ASL sign by selecting text labels that represent various linguistic properties of a sign, e.g., handshape, hand location, movement, and orientation. Based on the query options selected, results appear as a list of English gloss labels for matching signs. Participants were not allowed to visit other websites or other resources.

4.2 Study Design and Analysis Plan

After providing informed consent in this IRB-approved study, participants were asked to view a video and produce an English translation text for it using the integrated-search prototype or the baseline prototype. Using a sample video, the researcher first demonstrated the prototype (details in section 4.2). After indicating that they understood the prototype, each participant viewed 9 videos. The order of videos was randomized. Their interaction was recorded:

(1) The software prototype was designed to automatically record the starting and ending points on the video timeline of every

Figure 2: (a) The prototype for the baseline condition in Study 3, identical to the one from Study 2 but without any dictionary-search ability. (b) The handSpeak ASL-English reverse dictionary website, which participants used in the Study-3 baseline condition. As users click on linguistic properties, the list of English gloss labels at the bottom of the window updates to list matching signs.

1 https://www.handspeak.com/word/asl-eng/
Support in the Moment ASSETS ’22, October 23–26, 2022, Athens, Greece

span the participant selected, the number of signs within each span, whenever the participant triggered a search, and the text of the English translation typed by the participant.

(2) Each participant's face was approximately 65cm from a 19-inch monitor, to which a Tobii Nano [49] 60Hz screen-based eye-tracking device was attached. iMotions (v9.1) [30] software recorded each participant's gaze.

(3) A researcher, who was a fourth-year English-ASL interpreting student at the university, sat 2m away and took observational notes during the experiment. The iMotions software enabled the researcher to monitor the participant's gaze on the user-interface in real-time on a secondary display.

At the end of the entire session, a debriefing interview was conducted to gather participants' impressions of the system, perception of how they interacted with the device, and other recommendations. The interview data was transcribed and coded using the same methodology as in study 1.

Qualitative analysis of the data listed above was performed by two members of our research team who reviewed and coded this data, from the perspective of identifying typical sequences of interaction behavior during each video session. They reviewed recordings of the screen and eye-gaze, plotted eye-gaze patterns, analyzed data captured by the software prototype, and reviewed the observer's notes. The researchers discussed their notes and agreed upon a categorization of the behaviors observed, as presented in Findings section 4.5. After viewing and translating each video for both conditions, participants' English translation texts were saved, and participants completed a NASA TLX [21, 22].

4.3 Participants and Recruitment

We recruited a total of 15 ASL students for study 2, using recruitment criteria and approaches identical to study 1. 8 participants were assigned to the integrated search condition whereas 7 were assigned to the baseline condition.

The median age of participants in the integrated search condition was 20, and this included 7 women and 1 non-binary individual. Participants had studied ASL for a mean of 3.5 years, and all participants confirmed that they had taken fewer than 3 years of formal ASL classes.

The median age of participants who used the baseline prototype was 21, and they included 4 women and 3 men. A single recruitment process was conducted and participants were randomly assigned to either the prototype-with-dictionary-search or baseline-prototype condition. Participants using the baseline prototype had studied ASL for a mean of 3.7 years, and all participants confirmed that they had taken fewer than 3 years of formal ASL classes.

4.4 Findings: Comparative Study

To assess translation quality, we adapted a prior approach [9], in which a human judge looked for translation errors in a text (e.g., wrong or omitted words) and then assigned an overall translation-accuracy score (out of 10). In our study, a fourth-year ASL interpreting student, who had completed a university course on ASL linguistics, analyzed the transcripts to identify errors and assign translation-accuracy scores—without knowing which translations had been produced using the dictionary-search prototype and which had been produced using the baseline prototype. The average translation-accuracy score assigned by the researcher was 8.03 for translations produced using the dictionary-search prototype and 6.67 for the baseline prototype. The distributions in scores between the two conditions differed significantly (Mann–Whitney U = 10, n1=8, n2=7, P = 0.0424 < 0.05 two-tailed, r = 0.24).

Figure 3 shows scaled mean raw NASA-TLX sub-scores (physical demand, temporal demand, performance, effort, and frustration) and results of two-tailed Mann-Whitney U tests comparing sub-scores across conditions. Participants who used the prototype with dictionary-search gave significantly lower values for mental demand (how much mental and perceptual activity was required), temporal demand (how much time pressure was felt), and frustration (how insecure, discouraged, irritated, or stressed they felt). The NASA TLX instrument and details of these scales appear in [21, 22].

4.5 Findings: In-depth Analysis

We present an in-depth analysis of how participants interacted with the integrated search prototype. Since prior work had investigated the experience of students with existing dictionary-search websites, e.g., [6, 26], we do not present a detailed observational analysis of users interacting with the baseline, given the limited novelty of such an analysis, amid prior literature.

4.5.1 Using the span selection to constrain the playhead. Six participants used the span selection tool to constrain the portion of the video that would play at one time, to enable them to progress incrementally through the videos. Participants selected spans of average duration 11.43s for viewing videos in this manner, and they progressively selected spans of video as they typed the English translation text. Figure 4(a) illustrates this trend, by plotting the positions of spans a participant selected over time. Span 1, shown at the bottom of the image, indicates the first span selected.

Comments from debriefing interviews also support this observation. P7 described how they selected a span of a particular width, and then dragged it along the video, to watch portions of video progressively: "I like how you can just maintain the length and you just drag it over so you're getting the same length of a chunk of the video; that was easy to use."

4.5.2 Approaching task linearly, sometimes after initial overview. Participants viewed the videos in a linear manner and produced transcripts as they watched short segments of video. In some cases, participants first viewed the entire video, and then they returned to the beginning of the video to progressively view short segments of video in a linear manner, as illustrated in Figure 4(b), which shows a full-video span prior to progressive short spans.

4.5.3 Using dictionary search to inform translation. In 62 out of 72 video sessions, participants made use of the dictionary-search feature to look up the meaning of unknown signs in the video. As illustrated in Figure 5, the results of the search tool informed participants' translation decisions as they linearly progressed through the video. During the debriefing interview, P6 described how the tool helped: "I knew what he was saying in general, I just couldn't think of the exact English words and that one came up right away." P7 discussed the benefits of the tool during fast signing, "It was definitely useful, especially when the signers were going really fast

Figure 3: NASA TLX sub-scale scores from participants in the dictionary-search (n=8) and baseline prototype (n=7) conditions, with scores scaled to a 0-to-100 range. For all sub-scales, lower scores are better, i.e., indicating less perceived demand, less effort needed, less frustration, or better sense of performance success. Significance testing results from two-tailed Mann-Whitney U tests are also presented on top of the bars.
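The comparisons above use two-tailed Mann–Whitney U tests, appropriate for small, non-normal samples (n1=8, n2=7). As a quick illustration of the statistic itself (the scores below are hypothetical, not the study's data), U can be computed in the tie-aware pairwise form:

```python
def mann_whitney_u(a, b):
    """U statistic for sample `a` vs `b`: the number of pairs (x, y)
    with x > y, counting ties as 1/2.  Equivalent to the rank-sum form
    U1 = R1 - n1*(n1 + 1)/2."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

# Hypothetical translation-accuracy scores (NOT the study's data):
dictionary_cond = [9, 8, 8.5, 7, 9.5, 8, 7.5, 8]
baseline_cond = [6, 7, 6.5, 7.5, 6, 7, 6.5]

u1 = mann_whitney_u(dictionary_cond, baseline_cond)
u2 = mann_whitney_u(baseline_cond, dictionary_cond)
# u1 + u2 always equals n1 * n2; the reported statistic is
# conventionally min(u1, u2), with p obtained from tables or software.
```

The paper's reported U = 10 with these group sizes corresponds to this min(U1, U2) convention.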

because then I could double check to make sure that what I thought I saw was actually what I saw."

4.5.4 Gradually making the span shorter prior to search. When encountering a difficult portion of video, participants often reduced the size of their span before initiating a dictionary search. When selecting a span to view a portion of video (and not to conduct a search), the average span width was 8.17 seconds (10.83 signs), while the length of spans immediately prior to a search request was 2.33 seconds (3.25 signs). If a search result did not enable a participant to identify the meaning of signs in a particular span, some participants progressively narrowed the span width, to more precisely select the specific video portion where they were confused, and then requested additional dictionary searches, as shown in Figure 6(a). While adjusting spans, participants' gaze alternated between the main video region and the span-selection control, as shown in Figure 6(b).

In debriefing interviews, participants described watching longer segments to understand the context and then narrowing in on a shorter span that was difficult to understand. P5 characterized their approach as "narrowing it down and then pressing search." Other participants discussed the benefits of beginning with a search of a wider span initially, e.g., as P7 explained, "if I was just trying to get the general idea of a section, it was helpful that sometimes there were more signs in the up results besides like the specific signs that I had selected because it gave more context and it was easier to understand."

4.5.5 Using dictionary-search to confirm results after initial translation. Among the 62 video sessions in which participants made use of the dictionary-search tool, in 40 cases, we observed participants using the search tool after they had already completed a full translation of the entire video. As illustrated in Figure 7, after writing a translation for the whole video, the participant reviewed specific earlier portions, using the search tool, to confirm that they had correctly understood specific signs.

During debriefing interviews, participants talked about various ways in which the search tool was useful in producing a more accurate translation. For instance, P7 discussed how the search results motivated them to adjust their wording in the translation, e.g., saying, "I would go back and use the tool to make my translation more precise I guess. So, I could fix the sentences and the wording." Other times, the search results simply boosted their confidence in how they had understood the video, e.g., with P7 saying "sometimes that helped to confirm what I thought I saw."

4.5.6 Participants struggled to find a sign if a different version was being signed in the results. In several instances, participants were confused when the citation-form of the sign displayed in the search results did not match the variation of the sign in the main video, often in the case of compound signs and depiction. In Figure 8, P5 performed a search, but the specific appearance of the sign in the video differed from the citation form shown in the dictionary-search results, which led the participant to glance back and forth between them to compare. Ultimately, the participant did not produce the correct translation for that portion of the video, suggesting that they did not realize this was a match to what they had seen.

In debriefing interviews, participants described how matching dictionary-search results to signs in the video was more difficult for some video genres. For instance, P4 discussed how when a sign was produced with great emotion, e.g., in a theater video, then it was difficult to match it to a dictionary-search result with more neutral affect. P4 described their difficulty with a video in which the signer "was showing emotion and then you would go in the searches and they wouldn't. So it's like, I guess you can get mixed up about the emotion." Participants suggested that the dictionary-search system could be

improved by providing dialectical variations of each sign result and an example of each sign's use in a sentence, e.g., P3 said, "It would be nice to see the sign in a context with more facial expressions or in a sentence like you have in other dictionaries."

Figure 4: In (a), P5 viewed a video in a linear manner, using the span selection to progressively view short video segments. Horizontal bars indicate spans selected with respect to the total duration of the video. The first span selected is at the bottom of the y-axis, with subsequent spans appearing higher on the y-axis. Spans in blue dotted lines are those for which the participant pressed the "Search" button to conduct a dictionary search. The black lines at the bottom show the actual signs on the video timeline. In (b), P8 first selected the entire video, played it, and then reduced the span to a smaller duration and moved the span forward progressively.

Figure 5: P5 using the tool to produce a translation: (a) browsing search results for a span and (b) after identifying the meaning of the sign FIGURE-OUT, typing into the translation text box to continue the sentence: "They give books to learn from and figure out..."

4.5.7 Differences in span selections across genres of videos. As described above, the 9 videos in the study were from three genres: natural conversations, educational videos, and theatre/poetry performances. An analysis of the span-selection data captured by the prototype revealed that participants selected wider spans for theater videos, as illustrated in Figure 9. When users were selecting a span simply to constrain the playhead to view a portion of the video, a Kruskal-Wallis test [H(2)=24.28, p<0.00001, η² = 0.043 (Small Effect)] with Mann-Whitney post-hoc testing with Bonferroni corrections revealed that users selected wider spans for theatre videos. Similarly, when users were selecting a span as input to a search, the testing [H(2)=24.3031, p<0.00001, η² = 0.12 (Medium Effect)] revealed that users also selected wider spans for theatre videos.

In debriefing interviews, participants discussed why they selected spans of different widths for different genres. P6 discussed how theatre/poetry videos were challenging to understand generally, "This one had more of a poetic meaning and display; so, it did take more focus to understand it for the translation." P5 described selecting spans while watching conversational videos in contrast to the theatre/poetry performances: "the conversational ones, when I chose these subspans were shorter, because they go back and forth a lot [between signers]. But the poetry ones I feel are more conceptual... so you can watch longer pieces. You don't need to cut it down."

P7 discussed how the width of selected span depended on the overall signing pace, but longer spans were generally needed for theatre/poetry videos due to the style of signing: "Some of the ones that were very visual, like the mushroom one and the moon one and all those ones that were ASL storytelling type of very figurative language... It's typically slower paced, and sometimes there's a lot of repeated signs. Or there's a lot of just a depiction that's very visual and doesn't have a lot of strictly vocabulary to go with it, but it's more classifiers. I found that I would sometimes need a longer chunk in order to use the tool and actually get relevant results of what was being signed."

Participants were free to select spans that did not align precisely with the boundaries of when one sign ends and the next begins; however, the results of dictionary-search could be controlled more precisely if spans were selected more accurately. An analysis of the mean error (in seconds) between each span selection boundary and the nearest actual sign boundary revealed that users were less accurate when selecting spans during theatre/poetry videos. The mean error for natural conversation videos was 0.19 seconds, for educational videos was 0.23 seconds, and for theatre/poetry videos was 0.55 seconds. A Kruskal-Wallis test [H(2)=5.2174, p=0.02236, η² = 0.041 (Small Effect)] and Mann-Whitney post-hoc testing with

Bonferroni correction revealed that the error in the case of theatre/poetry videos was higher than for the other two genres. In debriefing interviews, P2, P6, and P7 discussed how aligning span selection with actual sign boundaries was more challenging for theatre/poetry videos. P7 explained that the type of signs within such videos was a factor, "because of depiction... there weren't clear boundaries as much in the signs because it was using classifiers."

Figure 6: P1 adjusting span width over time, while performing repeated search requests. (a) Blue dotted lines indicate spans for which dictionary-search was requested, and red solid lines indicate spans for which no search was requested. For spans 7 to 10, the width reduces over time. (b) When fine-tuning span width, participant 8's gaze moves between the video region and the span control. The yellow lines going from the span to the video are showing the gaze pattern.

Figure 7: P7 had already completed an English translation for the entire video, and then they returned to a few earlier regions of the video and requested dictionary searches to confirm their translation for specific segments of video.

Figure 8: P5 glanced between a dictionary result for REFLECT and the corresponding portion of the video, where the sign had been produced in a different manner, leading P5 to believe they were different signs and ultimately producing an incorrect translation. The yellow lines going from the span to the video are showing the gaze pattern.

5 DISCUSSION

While prior second-language pedagogical research had investigated the challenges faced by students trying to understand texts with difficult vocabulary [16], no prior study had investigated ASL students' experiences and challenges through firsthand interview and observational methodologies. Our participants described how dialectical variation, linguistic types of signs, and various genres of ASL content led to comprehension challenges. We investigated how students currently approach understanding ASL videos containing unknown signs, revealing how students turn to online video streaming, video sharing, and social media sites in order to gain experience at understanding more diverse and natural examples of signing. While the proliferation of video social media and streaming services has diversified and increased available ASL content [60], most prior ASL pedagogical research on video comprehension predates the ubiquity of such sources, e.g., [14]. Our findings motivate the need for a tool to (a) support students viewing videos they seek from diverse sources and (b) support developing comprehension skills with challenging content.

Study 1 also investigated students' current workarounds for videos with difficult signing, revealing their dissatisfaction with existing ASL dictionary resources. These findings aligned with prior work, e.g., [6], on the need for better tools to enable students to identify the meaning of an unknown sign. A unique focus of our study was that participants considered dictionary-searching challenges within the context of trying to understand a difficult video: We found students were frustrated at needing to leave their video to look up a sign in a separate website and their use of workarounds like repeated pausing and rewinding of videos. These findings specifically motivate HCI research on tools that support: (a) viewing and repeating short spans of video and (b) integrating the dictionary-search tool into the video-playing experience.

Study 2 investigated the potential benefit of these tools. It first compared the full prototype and a baseline prototype in which span selection was still available for constraining the playhead, yet the students had to use an external, existing ASL dictionary website

Figure 9: Box and whisker plots illustrating span widths for each video genre, in two cases: spans selected immediately prior to a search (top of graph) and spans selected not immediately prior to a search (bottom of graph). In each case, participants selected wider spans for videos in the theater genre. Medians are indicated by vertical bars within each box, means by x, and outliers by dots. Significant pairwise differences are marked with asterisks: ∗∗ if p < 0.01, and ∗ ∗ ∗ if p < 0.001.
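The span-boundary accuracy analysis in section 4.5.7 rests on a simple metric: the distance from each edge of a selected span to the nearest true sign boundary in the video. A minimal sketch of that metric (the function name and data layout are ours, not the paper's analysis code):

```python
def boundary_error(span, sign_boundaries):
    """Mean absolute distance (in seconds) from each edge of a selected
    span to the nearest annotated sign boundary in the video."""
    return sum(min(abs(edge - b) for b in sign_boundaries)
               for edge in span) / len(span)

# For example, with sign boundaries at 0, 1, 2, and 3 seconds, a span
# selected from 0.9s to 2.4s has edge errors of about 0.1s and 0.4s.
err = boundary_error((0.9, 2.4), [0.0, 1.0, 2.0, 3.0])
```

Averaging this per-span error within each genre would yield per-genre means like the 0.19s / 0.23s / 0.55s values reported above.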

as a reference tool. Our findings revealed that the integrated search tool led to translations of greater accuracy, and participants rated the workload lower. Prior work had examined how using in-situ dictionaries improves comprehension for non-native speakers [36], but this was the first study to explore this in the sign-language context. Study 2 also investigated how users would actually interact with such a system, using a Wizard-of-Oz prototype methodology. While prior HCI research had investigated students' interactions with ASL dictionary-search interfaces, e.g., [3, 6, 25, 26], that work had focused on students trying to look up a specific sign from their memory or search using a video of an isolated sign. Our novel observational findings include how students engage with a dictionary search tool within the context of the overall video comprehension task, e.g., replaying and comparing the portions of the original video to potential sign matches. These observational findings in Study 2 aligned with the interview-based findings from Study 1, in which students had expressed frustration at losing their video-watching context when using a separate dictionary tool.

Our observational findings also revealed how students engaged in a dual use of the video player's span-selection interface: (a) to constrain the playhead to progressive portions of video and (b) to specify input for dictionary search. Notably, usage (a) served as a probe within our analysis to identify the "window size" of video that students viewed as they worked through the challenging video. While some prior research on video analysis or annotation tools for linguists analyzing ASL videos had incorporated a span-selection interface for labeling regions of videos, e.g., [39, 50], no prior research had investigated span selection in the context of ASL learners viewing video. Our findings therefore motivate further HCI research on span-selection interfaces within this new context.

Our analysis also revealed that video players with span-selection interfaces may be a useful research probe: observing how participants use such a tool to view videos may reveal their comprehension strategies. For instance, our analysis revealed differences in the span widths that users selected for different genres, and participants discussed how their selection of span width related to the linguistic properties, with wider spans for challenging theater/poetry videos. Similarly, when span-selection boundaries were compared to actual sign boundaries in videos, we observed more error during theater/poetry videos. Thus, post-study analysis of the spans selected by individuals who view ASL videos may reveal insights for ASL linguistics or education researchers, and real-time analysis of spans could enable adaptive educational software capable of identifying when students are currently struggling while viewing a video.

Overall, the findings of this paper are relevant to several audiences: The current experience and workarounds of students that motivate research on integrated tools for students viewing challenging ASL videos will inform the work of accessibility and HCI researchers. Students' perspective on factors that lead to challenges in ASL video comprehension, as well as the potential of span-selection video players as a research tool, will be relevant to ASL linguistics and education researchers. In addition, our findings on the benefits of ASL dictionary search using extracted spans of continuous-signing video inform the work of computer-vision researchers. Specifically, our findings reveal a whole new sign-recognition task of using a sub-segment of a continuous video as

input for ASL recognition. Traditionally, when ASL recognition researchers consider processing continuous ASL videos, it is in the context of machine-translation to English text. Rather than providing fully automatic translation of a video, our findings have revealed interest and benefits for students in attempting to understand a challenging ASL video on their own with some integrated sign-searching supports. In addition, compared to recognition of videos of entire utterances, there are unique challenges when the input is a fragment extracted from a longer video. The selected span may not perfectly align with sign boundaries, the signs at the boundary may have been subject to co-articulation effects of signing beyond the span selection, and there will be less contextual information for the recognition system to consider.

More broadly, our findings speak to the literature on users interacting with videos, especially when carefully scrutinizing video, e.g., when interacting with specialized video-editing software. While span-selection is less prevalent in video-player systems, several commercial video-editing systems, e.g., [31–33], allow users to select a segment of video. Some prior observational research had investigated users selecting spans while engaged in a video-editing task [34], and our study has extended span-selection interaction to the ASL education and video-search context. Other prior work had investigated educational software tools for students to select a segment of spoken-language lecture videos while taking notes [10, 42] or integrated approaches to editing, sharing, and controlling spoken-language educational lecture videos [10, 57]. While there are differences between the task of understanding an ASL video and understanding an educational lecture video in spoken language, our findings on the benefits of span-selection interfaces for constraining the playhead and serving as a basis for integrated search tools may be relevant to that domain.

6 LIMITATIONS AND FUTURE WORK

Video stimuli in study 2 varied in frame-rate, bit-depth, compression, and frame scan (i.e., interlaced vs. progressive); these factors can affect video comprehension [29]. Since the videos were consistent in both prototype conditions in study 2, these factors did not affect our results, but future research could investigate the experience of ASL students viewing videos that vary in these dimensions. While our research has focused on ASL learners and videos, future work could be extended to learners of other sign languages.

While the Wizard-of-Oz dictionary-search output in our prototype simulated a single level of sign-recognition output accuracy, future research could examine how variations in the output quality would affect users' experience, to inform computer vision researchers on the level of accuracy required. Analogous research has been conducted on isolated-sign search-by-video systems [3, 24, 25].

While our work revealed benefits of integrated, span-based dictionary-search during ASL video comprehension, future work could further explore the design space of span selection or presentation of search results. Further, while study 2 revealed benefits in translation quality and lower workload scores, it did not investigate whether engaging in the task of watching a challenging video led to measurable learning effects among students; future work could examine potential short- and long-term benefits of such activity.

Finally, our studies have focused on ASL learners, but the video-playing and span-searching tool could be useful for other user groups, e.g., linguists annotating videos of ASL, or experienced ASL interpreters translating complex technical videos. Future research could investigate the design of tools for these related tasks.

7 CONCLUSION

Students trying to understand a challenging video is a key part of comprehension-skill development among ASL learners. However, there are limitations with existing tools for looking up signs. Our interview study with ASL learners revealed users' firsthand perspective on challenges they face during comprehension of such videos and their existing workarounds. These findings motivated and suggested the design of a Wizard-of-Oz video-player prototype with an integrated dictionary-search, based on span selection. Our study comparing this prototype with a baseline (use of an existing ASL dictionary website) revealed benefits for students' accuracy of translation of ASL videos and subjective rating of workload. An analysis of how users interacted with the integrated system also revealed differences in use based on the genre and linguistic complexity of the video, benefits of the integrated design, and a dual use of span selection both as input to dictionary search and for constraining the video playhead.

Our work motivates research and tools to address the needs of ASL learners engaged in comprehension of challenging videos, and our findings inform future designers of ASL systems, computer vision researchers working on sign-matching technologies, and sign-language educators or linguists. These findings may also inform the design of video comprehension tools for other contexts.

ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under Grant No. 1763569, 2212303, and 2125362.

REFERENCES

[1] Alikhan Abutalipov, Aigerim Janaliyeva, Medet Mukushev, Antonio Cerone, and Anara Sandygulova. 2021. Handshape Classification in a Reverse Dictionary of Sign Languages for the Deaf. In From Data to Models and Back, Juliana Bowles, Giovanna Broccia, and Mirco Nanni (Eds.). Springer International Publishing, Cham, 217–226.
[2] Chanchal Agrawal and Roshan L Peiris. 2021. I see what you're saying: A literature review of eye tracking research in communication of Deaf or Hard of Hearing Users. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 41, 13 pages. https://doi.org/10.1145/3441852.3471209
[3] Oliver Alonzo, Abraham Glasser, and Matt Huenerfauth. 2019. Effect of automatic sign recognition performance on the usability of video-based search interfaces for sign language dictionaries. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 56–67. https://doi.org/10.1145/3308561.3353791
[4] Stavroula Sokoli. 2007. Learning via Subtitling (LvS): A tool for the creation of foreign language learning activities based on film subtitling. In MuTra 2006 – Audiovisual Translation Scenarios: Conference Proceedings. MuTra, Copenhagen, Denmark, 8 pages.
[5] Vassilis Athitsos, Carol Neidle, Stan Sclaroff, Joan Nash, Alexandra Stefan, Ashwin Thangali, Haijing Wang, and Quan Yuan. 2010. Large lexicon project: American Sign Language video corpus and sign language indexing/retrieval algorithms. In Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies (CSLT), Vol. 2. European Language Resources Association (ELRA), Valletta, Malta, 11–14.
[6] Danielle Bragg, Kyle Rector, and Richard E. Ladner. 2015. A user-powered American Sign Language dictionary. In Proceedings of the 18th ACM Conference
Support in the Moment ASSETS ’22, October 23–26, 2022, Athens, Greece

on Computer Supported Cooperative Work and Social Computing (Vancouver, BC, [27] Saad Hassan, Aiza Hasib, Suleman Shahid, Sana Asif, and Arsalan Khan. 2019.
Canada) (CSCW ’15). Association for Computing Machinery, New York, NY, USA, Kahaniyan - Designing for Acquisition of Urdu as a Second Language. In Human-
1837–1848. https://doi.org/10.1145/2675133.2675226 Computer Interaction – INTERACT 2019, David Lamas, Fernando Loizides, Lennart
[7] Fabio Buttussi, Luca Chittaro, and Marco Coppo. 2007. Using web3D technologies Nacke, Helen Petrie, Marco Winckler, and Panayiotis Zaphiris (Eds.). Springer
for visualization and search of signs in an international sign language dictionary. International Publishing, Cham, 207–216.
In Proceedings of the Twelfth International Conference on 3D Web Technology [28] Robert J Hofmeister. 2000. A piece of the puzzle: ASL and reading comprehension
(Perugia, Italy) (Web3D ’07). Association for Computing Machinery, New York, in deaf children. Mahwah, N.J. : Lawrence Erlbaum Associates, New Jersey, USA.
NY, USA, 61–70. https://doi.org/10.1145/1229390.1229401 143–163 pages.
[8] Naomi K. Caselli, Zed Sevcikova Sehyr, Ariel M. Cohen-Goldberg, and Karen Em- [29] Simon Hooper, Charles Miller, Susan Rose, and George Veletsianos. 2007. The
morey. 2017. ASL-LEX: A lexical database of American Sign Language. Behavior efects of digital video quality on learner comprehension in an American Sign
Research Methods 49, 2 (01 Apr 2017), 784–801. https://doi.org/10.3758/s13428- Language assessment environment. Sign Language Studies 8, 1 (2007), 42–58.
016-0742-0 [30] iMotions A/S. 2019. iMotions Biometric Research Platform. imotions. https:
[9] Sheila Castilho, Stephen Doherty, Federico Gaspari, and Joss Moorkens. 2018. //imotions.com/academy/
Approaches to human and machine translation quality assessment. Springer Inter- [31] Adobe Inc. 2008. Adobe Premiere Pro. https://www.adobe.com/products/
national Publishing, Cham, 9–38. https://doi.org/10.1007/978-3-319-91241-7_2 premiere.html. [Online; accessed 03-March-2022].
[10] Konstantinos Chorianopoulos and Michail N. Giannakos. 2013. Usability design [32] Apple Inc. 2008. Apple Finalcut. http://aiweb.techfak.uni-bielefeld.de/content/
for video lectures. In Proceedings of the 11th European Conference on Interactive bworld-robot-control-software/. [Online; accessed 03-March-2022].
TV and Video (Como, Italy) (EuroITV ’13). Association for Computing Machinery, [33] Apple Inc. 2008. Apple iMovie. https://www.apple.com/imovie/. [Online;
New York, NY, USA, 163–164. https://doi.org/10.1145/2465958.2465982 accessed 03-March-2022].
[11] Christopher Conly, Zhong Zhang, and Vassilis Athitsos. 2015. An integrated [34] Tero Jokela, Minna Karukka, and Kaj Mäkelä. 2007. Mobile Video Editor: Design
RGB-D system for looking up the meaning of signs. In Proceedings of the 8th ACM and Evaluation. In Proceedings of the 12th International Conference on Human-
International Conference on PErvasive Technologies Related to Assistive Environ- Computer Interaction: Interaction Platforms and Techniques (Beijing, China)
ments (Corfu, Greece) (PETRA ’15). Association for Computing Machinery, New (HCI’07). Springer-Verlag, Berlin, Heidelberg, 344–353.
York, NY, USA, Article 24, 8 pages. https://doi.org/10.1145/2769493.2769534 [35] Jonathan Keane, Diane Brentari, and Jason Riggle. 2012. Coarticulation in ASL
[12] Eberhard, David M., Gary F. Simons, and Charles D. Fennig (eds.). 2021. Sign fngerspelling.
language. https://www.ethnologue.com/subgroups/sign-language [36] Annette Klosa-Kückelhaus and Frank Michaelis. 2022. The Design of Internet
[13] Ralph Elliott, Helen Cooper, John Glauert, Richard Bowden, and François Dictionaries. The Bloomsbury Handbook of Lexicography 1 (2022), 405.
Lefebvre-Albaret. 2011. Search-by-example in multilingual sign language [37] Pradeep Kumar, Rajkumar Saini, Partha Pratim Roy, and Debi Prosad Dogra. 2018.
databases. In Proceedings of the Second International Workshop on Sign Language A position and rotation invariant framework for sign language recognition (SLR)
Translation and Avatar Technology (SLTAT). SLTAT, Dundee, Scotland, 8 pages. using Kinect. Multimedia Tools and Applications 77, 7 (2018), 8823–8846.
[14] Karen Emmorey, Robin Thompson, and Rachael Colvin. 2009. Eye gaze during [38] Marlon Kuntze, Debbie Golos, and Charlotte Enns. 2014. Rethinking literacy:
comprehension of American Sign Language by native and beginning signers. Broadening opportunities for visual learners. Sign Language Studies 14, 2 (2014),
Journal of deaf studies and deaf education 14, 2 (2009), 237–243. 203–224.
[15] National Center for Education Statistics (NCES). 2018. Digest of education [39] The language archive. 2018. ELAN - The Max Planck Institute for Psycholinguis-
statistics number and percentage distribution of course enrollments in languages tics. https://archive.mpi.nl/tla/elan
other than English at degree-granting postsecondary institutions, by language [40] J. Lapiak. 2021. Handspeak. https://www.handspeak.com/
and enrollment level: Selected years, 2002 through 2016. https://nces.ed.gov/ [41] Scott K Liddell and Robert E Johnson. 1986. American Sign Language compound
programs/digest/d18/tables/dt18_311.80.asp formation processes, lexicalization, and phonological remnants. Natural Language
[16] Susan M Gass, Jennifer Behney, and Luke Plonsky. 2020. Second language acqui- & Linguistic Theory 4, 4 (1986), 445–513.
sition: An introductory course (5 ed.). Routledge, New York. 774 pages. [42] Ching (Jean) Liu, Chi-Lan Yang, Joseph Jay Williams, and Hao-Chuan Wang.
[17] David Goldberg, Dennis Looney, and Natalia Lusin. 2015. Enrollments in lan- 2019. NoteStruct: Scafolding Note-Taking While Learning from Online Videos.
guages other than English in United States Institutions of Higher Education, Fall In Extended Abstracts of the 2019 CHI Conference on Human Factors in Comput-
2013. ing Systems (Glasgow, Scotland Uk) (CHI EA ’19). Association for Computing
[18] Debbie B Golos and Annie M Moses. 2011. How teacher mediation during video Machinery, New York, NY, USA, 1–6. https://doi.org/10.1145/3290607.3312878
viewing facilitates literacy behaviors. Sign Language Studies 12, 1 (2011), 98–118. [43] Carolyn McCaskill, Ceil Lucas, Robert Bayley, and Joseph Christopher Hill. 2011.
[19] Michael Andrew Grosvald. 2009. Long-distance coarticulation: A production and The hidden treasure of Black ASL: Its history and structure. Gallaudet University
perception study of English and American Sign Language. University of California, Press Washington, DC, Gallaudet University Press, 800 Florida Avenue, NE,
Davis, 1 Shields Ave, Davis, CA 95616. Washington, DC 20002-3695.
[20] Wyatte C Hall, Leonard L Levin, and Melissa L Anderson. 2017. Language de- [44] John Milton and Vivying S. Y. Cheng. 2010. A toolkit to assist L2 learners
privation syndrome: A possible neurodevelopmental disorder with sociocultural become independent writers. In Proceedings of the NAACL HLT 2010 Workshop on
origins. Social psychiatry and psychiatric epidemiology 52, 6 (2017), 761–776. Computational Linguistics and Writing: Writing Processes and Authoring Aids (Los
[21] Sandra G Hart. 2006. NASA-task Load Index (NASA-TLX); 20 years later. In Angeles, California) (CL&amp;W ’10). Association for Computational Linguistics,
Proceedings of the human factors and ergonomics society annual meeting, Vol. 50. USA, 33–41.
Sage Publications Sage CA, Sage publications, Los Angeles, CA, 904–908. [45] Daniel Mitchell. 2021. British Sign Language BSL dictionary. https://www.
[22] Sandra G Hart and Lowell E Staveland. 1988. Development of NASA-TLX (Task signbsl.com/
Load Index): Results of empirical and theoretical research. In Advances in psy- [46] Ross Mitchell, Travas Young, Bellamie Bachleda, and Michael Karchmer. 2006.
chology. Vol. 52. Elsevier, Amsterdam, Netherlands, 139–183. How many people use ASL in the United States? Why estimates need updating.
[23] Saad Hassan. 2022. Designing and experimentally evaluating a video-based Sign Language Studies 6 (03 2006). https://doi.org/10.1353/sls.2006.0019
American Sign Language look-up system. In ACM SIGIR Conference on Hu- [47] Anshul Mittal, Pradeep Kumar, Partha Pratim Roy, Raman Balasubramanian, and
man Information Interaction and Retrieval (Regensburg, Germany) (CHIIR ’22). Bidyut B Chaudhuri. 2019. A modifed LSTM model for continuous sign language
Association for Computing Machinery, New York, NY, USA, 383–386. https: recognition using leap motion. IEEE Sensors Journal 19, 16 (2019), 7056–7063.
//doi.org/10.1145/3498366.3505804 [48] J Murray. 2020. World Federation of the deaf. http://wfdeaf.org/our-work/
[24] Saad Hassan, Oliver Alonzo, Abraham Glasser, and Matt Huenerfauth. 2020. [49] Tobii Pro Nano. 2014. Tobii Pro Lab. Tobii Technology. https://www.tobiipro.com/
Efect of ranking and precision of results on users’ satisfaction with search- [50] Carol Neidle and Christian Vogler. 2012. A new web interface to facilitate access
by-video sign-language dictionaries. In Sign Language Recognition, Translation to corpora: Development of the ASLLRP data access interface (DAI). In Proc. 5th
and Production (SLRTP) Workshop-Extended Abstracts, Vol. 4. Computer Vision – Workshop on the Representation and Processing of Sign Languages: Interactions
ECCV 2020 Workshops, Virtual, 6 pages. between Corpus and Lexicon, LREC. Citeseer, OpenBU, Istanbul, Turkey, 8 pages.
[25] Saad Hassan, Oliver Alonzo, Abraham Glasser, and Matt Huenerfauth. 2021. Efect https://open.bu.edu/handle/2144/31886
of Sign-Recognition Performance on the Usability of Sign-Language Dictionary [51] Razieh Rastgoo, Kourosh Kiani, and Sergio Escalera. 2021. Sign language recog-
Search. ACM Trans. Access. Comput. 14, 4, Article 18 (oct 2021), 33 pages. https: nition: A deep survey. Expert Systems with Applications 164 (2021), 113794.
//doi.org/10.1145/3470650 https://doi.org/10.1016/j.eswa.2020.113794
[26] Saad Hassan, Akhter Al Amin, Alexis Gordon, Sooyeon Lee, and Matt Huener- [52] Kishore K Reddy and Mubarak Shah. 2013. Recognizing 50 human action cate-
fauth. 2022. Design and Evaluation of Hybrid Search for American Sign Language gories of web videos. Machine vision and applications 24, 5 (2013), 971–981.
to English Dictionaries: Making the Most of Imperfect Sign Recognition. In CHI [53] Jerry Schnepp, Rosalee Wolfe, Gilbert Brionez, Souad Baowidan, Ronan Johnson,
Conference on Human Factors in Computing Systems (New Orleans, LA, USA) and John McDonald. 2020. Human-Centered design for a sign language learning
(CHI ’22). Association for Computing Machinery, New York, NY, USA, Article application. In Proceedings of the 13th ACM International Conference on PErva-
195, 13 pages. https://doi.org/10.1145/3491102.3501986 sive Technologies Related to Assistive Environments (Corfu, Greece) (PETRA ’20).
Association for Computing Machinery, New York, NY, USA, Article 60, 5 pages.
ASSETS ’22, October 23–26, 2022, Athens, Greece Hassan, et al.

https://doi.org/10.1145/3389189.3398007 [60] Carolina Tannenbaum-Baruchi and Paula Feder-Bubis. 2018. New sign language
[54] Jérémie Segouat. 2009. A study of sign language coarticulation. SIGACCESS new (S): the globalization of sign language in the smartphone era. Disability &
Accessible Computing 1, 93 (Jan 2009), 31–38. https://doi.org/10.1145/1531930. society 33, 2 (2018), 309–312.
1531935 [61] Kimberly A. Weaver and Thad Starner. 2011. We need to communicate! Helping
[55] Zatorre RJ Shiell MM, Champoux F. 2014. Enhancement of visual motion detection hearing parents of Deaf children learn American Sign Language. In The Pro-
thresholds in early Deaf people. PloS one 9, 2 (2014), e90498. https://doi.org/10. ceedings of the 13th International ACM SIGACCESS Conference on Computers and
1371/journal.pone.0090498 Accessibility (Dundee, Scotland, UK) (ASSETS ’11). Association for Computing
[56] ShuR. 2021. SLintoDictionary. http://slinto.com/us Machinery, New York, NY, USA, 91–98. https://doi.org/10.1145/2049536.2049554
[57] Namrata Srivastava, Sadia Nawaz, Joshua Newn, Jason Lodge, Eduardo Velloso, [62] Polina Yanovich, Carol Neidle, and Dimitris Metaxas. 2016. Detection of major
Sarah M. Erfani, Dragan Gasevic, and James Bailey. 2021. Are You with Me? ASL sign types in continuous signing for ASL Recognition. In Proceedings of the
Measurement of Learners’ Video-Watching Attention with Eye Tracking. In Tenth International Conference on Language Resources and Evaluation (LREC’16).
LAK21: 11th International Learning Analytics and Knowledge Conference (Irvine, European Language Resources Association (ELRA), Portorož, Slovenia, 3067–
CA, USA) (LAK21). Association for Computing Machinery, New York, NY, USA, 3073. https://www.aclweb.org/anthology/L16-1490
88–98. https://doi.org/10.1145/3448139.3448148 [63] Zahoor Zafrulla, Helene Brashear, Thad Starner, Harley Hamilton, and Peter
[58] Ted Supalla. 1982. Structure and acquisition of verbs of motion and location in Presti. 2011. American Sign Language recognition with the Kinect. In Proceedings
American Sign Language. Ph.D. Dissertation. University of California, San Diego. of the 13th International Conference on Multimodal Interfaces (Alicante, Spain)
[59] Nazif Can Tamer and Murat Saraçlar. 2020. Improving keyword search perfor- (ICMI ’11). Association for Computing Machinery, New York, NY, USA, 279–286.
mance in sign language with hand shape features. In Computer Vision – ECCV https://doi.org/10.1145/2070481.2070532
2020 Workshops, Adrien Bartoli and Andrea Fusiello (Eds.). Springer International [64] Mikhail A. Zagot and Vladimir V. Vozdvizhensky. 2014. Translating Video:
Publishing, Cham, 322–333. Obstacles and challenges. Procedia - Social and Behavioral Sciences 154 (2014),
268–271. https://doi.org/10.1016/j.sbspro.2014.10.149
A Dataset of Alt Texts from HCI Publications
Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers

Sanjana Chintalapati* (sanjanac@cs.washington.edu), University of Washington, Seattle, WA, USA
Jonathan Bragg (jbragg@allenai.org), Allen Institute for AI, Seattle, WA, USA
Lucy Lu Wang (lucylw@uw.edu), Allen Institute for AI; University of Washington, Seattle, WA, USA

ABSTRACT
Figures in scientific publications contain important information and results, and alt text is needed for blind and low vision readers to engage with their content. We conduct a study to characterize the semantic content of alt text in HCI publications based on a framework introduced by Lundgard and Satyanarayan [30]. Our study focuses on alt text for graphs, charts, and plots extracted from HCI and accessibility publications; we focus on these communities due to the lack of alt text in papers published outside of these disciplines. We find that the capacity of author-written alt text to fulfill blind and low vision user needs is mixed; for example, only 50% of alt texts in our sample contain information about extrema or outliers, and only 31% contain information about major trends or comparisons conveyed by the graph. We release our collected dataset of author-written alt text, and outline possible ways that it can be used to develop tools and models to assist future authors in writing better alt text. Based on our findings, we also discuss recommendations that can be acted upon by publishers and authors to encourage inclusion of more types of semantic content in alt text.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility.

KEYWORDS
accessibility, scientific documents, alt text, dataset

ACM Reference Format:
Sanjana Chintalapati, Jonathan Bragg, and Lucy Lu Wang. 2022. A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3517428.3544796

*Work conducted during internship at Allen Institute for AI.

This work is licensed under a Creative Commons Attribution International 4.0 License.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544796

1 INTRODUCTION
Alternative text (or "alt text") describes the content of a visual graphic or image to those who cannot see it. As such, alt text is an important component of accessible design. Most scientific documents use graphics to communicate information alongside text; scientific documents can be especially difficult to make accessible to BLV readers [5], with a large majority of these paper PDFs lacking usable alt text [47]. Mack et al. [31] conducted a study of BLV users and what they need from alt text, and found that graphs and charts are of special importance to these users, as they can be especially important for conveying results. This, coupled with the finding that the vast majority of scientific figures lack alt text altogether, suggests that even if the rest of the text in a scientific paper were accessible to a BLV reader, a significant portion of the informational content of these works (the figures) would remain inaccessible, which can negatively impact reader experience.

Though there has been progress in automatically generating alt text descriptions of images on web and social media platforms [17, 35, 48], these methods do not apply as well to scientific images. From a machine learning perspective, much of the advancement in image recognition and scene understanding in recent years has derived from training neural models on large-scale labeled image datasets (such as ImageNet [42] and Google Open Images [25]), datasets that are primarily composed of natural images, which represent only a small proportion of the types of images found in scientific publications. Figures from scientific papers run the gamut of image types, including but not limited to natural images, medical images, diagrams, schematics, and a wide array of graphs and charts, as well as combinations of these types, e.g., a medical image annotated with a histogram. Correspondingly, many established image understanding models cannot be directly or easily adapted for the scientific domain. Hybrid crowdsourcing solutions that integrate human experience with machine functionality may offer useful alternatives [17, 19, 39, 43]. For example, Qian et al. [39] advocate for a hybrid approach to image captioning where machines generate caption units and humans perform stitching. To support these solutions for scientific alt text, we need to better understand the current status of alt text content, and develop tools that can support authors in writing more useful descriptions of scientific figures.

Several prior studies have attempted to quantify the availability of alt text in scientific documents [7, 26, 47], though none have investigated the content of author-written alt texts and whether they convey adequate information about figures to blind and low vision
(BLV) readers. In this work, we extract and analyze the content of author-written alt text from papers published by the accessibility and HCI communities, and provide recommendations toward encouraging the inclusion of more types of descriptive information in alt text that may be useful to BLV readers. By extracting realistic author-written alt text, we also provide a useful data resource that can be used to support authors in writing better alt text and to study how image understanding models and crowd-authoring techniques can be adapted to more effectively produce figure alt text in the scientific domain.

We process and extract author-written alt text from over 25K publications in the domains of accessibility and HCI, identifying nearly 3.4K pieces of valid alt text from 899 papers. To assess the type of information contained in these alt texts, we use the framework introduced by Lundgard and Satyanarayan [30], which accounts for four different levels of semantic content that may be conveyed by graphical data visualizations. We assess the semantic content present in the alt text corresponding to figures of graphs, charts, and plots (data visualizations), images that are prevalent in scientific papers, and for which the alt text content can be suitably represented using the levels introduced in the Lundgard and Satyanarayan [30] framework. We find that though most alt texts contain basic information about the graph type, axes labels, and what is plotted, far fewer contain information beyond this. For example, only 50% of alt texts in our sample discuss extrema or outliers in the data, and only 31% discuss trends or comparisons. The lack of this type of semantic content in alt text can make it difficult for a BLV user to understand these kinds of images in the way they desire, as found by Lundgard and Satyanarayan [30].

Our contributions in this work can therefore be summarized as:
• An assessment of the semantic information conveyed by author-written alt text of graph and chart figures extracted from papers published in venues representing work in accessibility, HCI, and related areas. We found that levels of covered content are inadequate, even at accessibility and HCI conferences, which have alt text requirements and writing guidelines.
• A dataset of 3386 author-written alt texts from HCI publications, of which 547 have been annotated with semantic levels.1 The methods used to construct this dataset can be extended to study trends in scientific figure alt text more broadly, and our dataset can be used to develop tools and models to support alt text authoring. For example, we experiment with training a classifier that identifies semantic levels in text, which could be used to provide feedback to authors as they are writing alt text. We discuss and explore additional opportunities in Section 5.

1 The dataset and annotations are available at https://github.com/allenai/hci-alt-texts.

2 RELATED WORK
We briefly discuss related work on how to write useful scientific alt text (Section 2.1), interfaces for authoring alt text (Section 2.2), resources and methods for scientific figure understanding (Section 2.3), resources and methods for automatic alt text generation (Section 2.4), and other methods for improving figure accessibility (Section 2.5).

2.1 Guidelines for writing scientific alt text
The Web Content Accessibility Guidelines (WCAG) [9, 13] contain guidance on when alt text should be provided and the suggested content for the alt text. The National Center for Accessible Media (NCAM) has published guidelines including high-level recommendations for writing alt text for graphs, suggesting that a complete description should include text describing (i) the layout of the graph, (ii) the location of variables on the graph, and (iii) for static graphs, the overall trends presented, and for dynamic graphs, summary information such as the range of the axes.2 The Benetech Diagram Center also provides image description guidelines with the goal of making it easier, cheaper, and faster to create and use accessible digital images.3 The referenced documentation includes both general best practices concerning aspects such as style and language that apply to every type of image, along with specific considerations for bar graphs, pie graphs, line graphs, and scatter plots, such as listing the numbers in a pie graph from smallest to largest, and focusing on the change of concentration in scatter plots. Several academic publishers have also provided research-based guidelines for improving the accessibility of digital media. For example, the Association for Computing Machinery (ACM) strongly encourages authors to provide alt text for images and charts, and includes instructions for authors such as not duplicating the caption text and providing keywords.4

2 https://www.wgbh.org/foundation/ncam/guidelines/accessible-digital-media-guidelines
3 http://diagramcenter.org/making-images-accessible.html
4 https://authors.acm.org/proceedings/production-information/describing-figures

Lundgard and Satyanarayan [30] introduced a four-level conceptual model describing the semantic content of information that should be present in alt text descriptions of scientific data visualizations. Level 1 includes construction details such as the type of figure (e.g., bar plot or line plot), and labels of the axes. Level 2 includes statistics about the figure data, such as extremes and correlations. Level 3 includes larger takeaways, such as trends and patterns in the data. Level 4 includes domain-specific insights and societal context for the figure data. The authors conducted studies including BLV users, and found that these users gained the most information from textual descriptions conveying information from semantic levels 1–3 [30]. This finding corresponds to the recommendations of alt text content made by the NCAM. Though this framework is not a guideline document per se, we adopt it for this study in order to evaluate the quality of author-written alt text for graphs and charts found in scientific publications. It has been validated through studies with BLV participants, and we are unaware of alternative frameworks.

Towards alt text preferences, Bennett et al. [4] conducted a study on best practices for describing race, gender, and disability status in alt text, and found that people in photographs preferred to be described with the language that they use to talk about themselves, and that descriptions of concrete visual details were more appropriate than language around identities. Though this does not apply directly to scientific figures, alt text written by someone other than the authors of a paper may want to consider how the authors intended for the figure to be understood as well as the language used by the authors in the rest of the publication. After reviewing BLV people's experiences with digital image types
such as news articles and employment websites, Stangl et al. [46] found that a one-size-fits-all approach for image descriptions is not ideal. Similarly, through interviews with screen reader users, Mack et al. [31] found that different BLV users had varying preferences about the level of detail that they found to be most helpful in alt text, although most users concluded that both brevity and the availability of detailed information were desirable traits. Such findings should be kept in mind when authoring effective alt text.

2.2 Interfaces for authoring alt text
Morash et al. [33] developed interfaces to guide novice web workers in writing descriptions of scientific images. The authors queried workers for information about select image attributes based on the NCAM guidelines, such as image type, title, and units shown. They found that the templated query method was preferred by the workers and produced better image descriptions.

Mack et al. [31] built a prototype interface for authoring alt text, and measured the quality of alt text on a four-point scale based on three interface variations: the current PowerPoint interface; a free-form interface, where suggestions were presented as a bulleted list; and a template interface, where each prompt was listed separately and included a text box to respond to that prompt. Participants who use screen readers were asked to rank the quality of alt text written under the PowerPoint, free-form, and template interfaces; the authors found that, in general, the free-form interface encouraged authors to write alt text that is more closely aligned with the preferences of screen reader users.

2.3 Automated methods for scientific figure understanding
Scientific figure understanding tasks such as figure classification, visual question-answering (VQA), or image captioning have received significant attention from the AI community in recent years [10, 11, 20, 21, 23, 24, 27, 32, 34, 38, 40, 45]. In Table 1, we describe a number of datasets that have been introduced to train models and evaluate their performance on these tasks.

Datasets introduced for scientific image classification include FigureSeer [45], DocFigure [21], and SlideImages [34]. All three datasets include realistic images extracted from scientific papers, along with labels for classes such as graph, medical image, or natural image. These datasets have been used to train models that can detect figure type, which is an essential piece of information that should be available in alt text. However, figure type is only one of many pieces of information that BLV users may need to understand the content of an image, and therefore these datasets are of limited use in the alt text generation setting.

Towards more detailed figure understanding, datasets such as […] between two data series?) Models trained on these datasets have shown improving performance [24, 27], though because the plots in these datasets are generated synthetically, they may not transfer well to graphs found in actual scientific publications, which are significantly more noisy, diverse, and variable than those found in these datasets. Also, though figure understanding through VQA is related to the task of producing alt text, the task of VQA itself does not produce a coherent textual description of the figure, which is the desired outcome for alt text. The FigCAP project [10, 11] attempts to bridge this divide by deriving figure descriptive text from the questions and answers of the FigureQA [24] dataset; though FigCAP refers to itself as a figure captioning dataset, the "captions" it provides are more similar to the notion of alt text, including descriptions of the figure structure, axes labels, data values, etc.

Recent work has aimed to automatically generate captions for both natural images [12, 44] and figures [20, 38]; however, caption generation and alt text generation are not the same task, and could be said to have differing goals. Captions are intended to be consumed by all readers, and contain information that complements the content of the image, while alt text is meant to explain the informational content of the image for users who cannot see it. SciCap [20] introduces a large dataset of graph figures and captions derived from arXiv; captions are those originally written by authors, and are post-processed to remove tokens corresponding to numbers and equations. The ImageClefMed Captioning task released a relatively much smaller dataset focused on captioning of medical images derived from scientific publications [38].

As far as we know, there are no datasets available for studying scientific figure alt text generation with realistic, author-written alt text. Though FigCAP [10, 11] and FigJAM [40] explore text generation in the alt text setting, the images and target texts used are synthetic (derived from FigureQA [24]) and not representative of the analogous task in a realistic setting.

2.4 Automated methods for alt text generation
Office 365 generates alt text for any image or figure pasted into Microsoft PowerPoint.5 Though easy to use, the feature shows limited performance on scientific figures, usually only describing the type of the figure. For example, for the figures in Table 2, the corresponding alt texts generated by Office 365 are, respectively: "Chart, bar chart," "Chart, line chart," and "Chart." Though Office 365 usually identifies the correct type of chart, no other information about the figure content is generated, and the resulting alt text is of limited use to the reader.

Qian et al. [40] created synthetic datasets for figure alt text generation by adapting the FigureQA [24] and DVQA [23] datasets for figure VQA. Figures in both datasets are synthetic (not from
FigureQA [24], DVQA [23], and PlotQA [32] have been introduced. scientifc papers), and alt text units are derived from the data and
These datasets are made up of synthetically generated graphs and information used to construct each fgure. The authors then trained
charts, along with associated questions and answers about the graph the FigJAM model, which generates alt text descriptions based on
contents. For example, questions may be related to the graph title, the multimodal inputs of the raw fgure image and fgure metadata
axes, the � and �-axes values associated with specifc data series, [40]. Though performance on synthetic data was shown to be good,
or the names of each data series. In PlotQA [23], many questions
also go beyond the structure of the plot and may require data re-
trieval or additional reasoning (e.g. What is the average diference 5 https://www.microsoft.com/en-us/microsoft-365/powerpoint
ASSETS ’22, October 23–26, 2022, Athens, Greece S. Chintalapati, J. Bragg, and L.L. Wang

Dataset | Domain | Realistic/Synthetic | Task | Size of dataset
FigureSeer [45] | Scientific figures | Realistic | Image classification (6 classes) | 60K figures
DocFigure [21] | Scientific figures | Realistic | Image classification (28 classes) | 33K images
SlideImages [34] | Educational illustrations | Realistic | Image classification (8 classes) | 3K images
FigureQA [24] | Scientific graphs and charts | Synthetic | VQA | 100K images
DVQA [23] | Bar charts | Synthetic | VQA | 300K images
PlotQA [32] | Plots | Synthetic plots; realistic data | VQA | 224K images
SciCap [20] | Scientific figures | Realistic | Image captioning | 416K figures
ImageClefMed Caption [38] | Medical images | Realistic | Image captioning | 5K images
FigCAP [10, 11] | Bar charts, pie charts, line plots | Synthetic | Alt text generation | 110K figures

Table 1: Datasets for scientific figure understanding

the model may not generalize to realistic figures found in scientific papers, which exhibit significantly more variability than those evaluated by Qian et al. [40].6

Researchers have developed tools for generating alt text information for images on non-scientific social platforms. Gleason et al. [17] addressed the accessibility barrier on Twitter by creating a browser extension which adds alt text to Twitter using six methods, such as reverse image searching and automatic image captioning. However, this project focused on Twitter images, which differ from realistic scientific figures. Additionally, Wu et al. [48] deployed an automatic alt text system to identify faces, objects, and themes for photos on Facebook in order to make them more accessible to screen reader users. The domain of natural images on social media again significantly differs from our domain of scientific graphs and charts.

2.5 Other methods for improving figure accessibility

Researchers have also developed alternatives to alt text for improving figure accessibility. ChartSense [22] and PlotDigitizer7 are chart data extraction methods which convert chart images into structured data tables. However, charts of the same type are too diverse in style to apply a single extraction algorithm, algorithms have trouble interpreting overlapped visual entities, and there is no text-region detection algorithm for chart images with sufficient accuracy [22]. Auditory graphs, tactile graphs, and various multimedia approaches have also been introduced to improve the accessibility of graphs and charts [15, 36].

Crowdsourcing, in which a group of non-experts completes a task that is currently infeasible to accomplish via automated methods, has also been used to good effect for making images more accessible. Salisbury et al. [43] explored a novel approach in which crowdworkers were paired together to create image descriptions. Crowdworkers were asked questions to extract desired details about an image, such as the location of the image and what emotions the image evoked. Platforms such as VizWiz [6] or Be My Eyes8 connect BLV users to sighted crowdworkers and volunteers via an app for assistance with questions and daily tasks. Building upon the successes of the VizWiz Grand Challenge [19], a similar solution could be created to connect BLV researchers with volunteers who can provide suitable descriptions or answers to questions about scientific figures. Mack et al. [31] built user interfaces for both authoring alt text and providing feedback on automatic alt text, which could theoretically be connected to a crowdsourcing platform and adapted to collect alt text for scientific figures based on user demand.

Lastly, researchers have developed methods for surfacing alt text that do not involve manual effort from humans or automated alt text generation. Guinness et al. [18] found that many images appear in several places across the web, and used this insight to develop Caption Crawler. This system uses reverse image search to find alt text from similar images available on the web and surfaces these alternate descriptions to the user. This method works quite well in the domain of natural images, where many similar images of places or things can be found on the open internet.

6 The authors do not release their dataset or pretrained models, so we are unable to perform a comparative analysis on realistic figures derived from scientific papers.
7 http://plotdigitizer.sourceforge.net/
8 https://www.bemyeyes.com/

3 METHODS

We extract and analyze the presence and content of alt text for graphs, charts, and plots in scientific papers. These figures are typically used to visualize data and results, are of special importance to BLV readers [31], and are the types of images for which Lundgard and Satyanarayan [30] have defined semantic levels. The Lundgard and Satyanarayan [30] framework describes four levels of semantic content, organized by increasing complexity:
A Dataset of Alt Texts from HCI Publications ASSETS ’22, October 23–26, 2022, Athens, Greece

Level 1: enumerating visualization construction details (e.g., type, marks, and encodings)
Level 2: identifying statistical concepts and relations (e.g., extremes and correlations)
Level 3: characterizing perceptual and cognitive phenomena (e.g., trends and patterns)
Level 4: articulating domain-specific insights or societal context

Lundgard and Satyanarayan [30] found that levels 1–3 were reported most useful by blind and low vision readers. Level 4, which incorporates significantly more subjective information, was found to be less essential to understanding; in fact, a majority of blind readers in their study (63%, n=19) believed that figure alt text should not contain level 4 content.

3.1 Sampling papers and extracting author-written alt text

Our goal is to construct a dataset of author-written alt text by automatically sampling and extracting alt text from papers. We start with the set of papers from two conferences: the ACM CHI Conference on Human Factors in Computing Systems (CHI) and the ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), published in the years 2010-2020. We identify these papers using the ACM's reported DOIs, and link these to PDFs in the Semantic Scholar corpus [1]. This yields 5218 PDFs. We then iteratively extend our paper sample to include all papers written by authors who have published in CHI and/or ASSETS. We prioritize authors based on the frequency of their publications in CHI and ASSETS, and retrieve these authors' other publications using the Semantic Scholar API.9 From these queries, we assemble a further sample of 20000 paper PDFs to process and extract alt text.

CHI and ASSETS are premier conferences on human-computer interaction and accessible computing, and both conferences have a history of soliciting accessible paper submissions10 and requiring authors to provide alt text for figures. We therefore expect a much higher percentage of papers published at these venues to contain valid alt text. Similarly, our rationale for sampling additional papers published at other venues by authors who have published at CHI and ASSETS is based on our hypothesis that such authors are more likely to write alt text in general, even if another publishing venue may not require alt text for submission. Initially, we intended to extract alt text from a stratified random sample of papers representing all fields of study, but a pilot attempt showed that random sampling would yield virtually no alt text. We arrived at this conclusion after processing a stratified sample of 5000 PDFs from 2010 to 2020 and extracting only a single piece of descriptive alt text. Given the time and expense of processing PDFs to extract alt text at scale, we decided that the stratified random sample strategy was untenable.

For each paper in our sample, we followed a three-step process to extract alt text. First, we processed the PDF using Adobe Acrobat Pro11 to convert the PDF to HTML using the Adobe Acrobat Pro Action Wizard. We used Adobe Acrobat Pro rather than an alternate programmatic approach because we were unable to identify a PDF processing library capable of extracting alt text.12 Second, we extracted alt text from the converted HTML document. Third, we filtered the extracted alt text according to a set of heuristics. Our pilot attempts at extracting alt text revealed that most of the extracted alt text consisted of uninformative short descriptions like "Image" or file paths like "C:\\path_to\figure1.jpeg," so we defined a set of filtering criteria to remove these. We also filtered out alt text shorter than 80 characters, since many shorter alt texts fall under the category of uninformative short descriptions. To determine this 80-character threshold, we analyzed a sample of 100 extracted alt texts to find a limit that maximizes recall without sacrificing precision. We refer to the alt text that pass these filtering criteria as "valid alt text."

For all valid alt texts, we identify those that are likely to correspond to graphs, charts, and plots. We iteratively defined another set of heuristics: a list of words and phrases that correspond to graphs and charts (e.g., "graph", "chart", "error bar"; the full list of terms is provided in Appendix A). For each figure, we search for token matches in the alt text and image caption against this list of terms, and retain only figures and alt text matching at least one term. The alt text of the matching figures are then annotated with the Lundgard and Satyanarayan [30] semantic levels.

3.2 Annotation of alt text semantic levels

To study what types of content are present in author-written alt text, we ask annotators to assess the semantic content levels present in each sentence of each piece of alt text. We split alt text into sentences using the scispaCy NLP library [37]. Annotators are shown these sentences along with the corresponding figure caption and a link to view the figure.

Six label options were provided for each sentence:
• Level 1: Figure logistics
• Level 2: Statistical properties and comparisons
• Level 3: Complex trends and patterns in data
• Level 4: Domain-specific insights or societal concepts to help explain Level 3 trends
• This alt text contains no levels of content
• This image is not a graph or chart

If a figure is not a graph or chart, the annotator is instructed to select the last option. Otherwise, up to three semantic levels could be selected for each sentence. These label options were adapted from the level descriptions given by Lundgard and Satyanarayan [30] and shortened and simplified to make them easier for annotators to understand. Annotators were given an additional instruction document that provides more detail on each of the label options, along with examples of sentences corresponding to each label option.

The annotators were instructed to exhaustively label each sentence with all of the levels it contains. We recruited two annotators

9 https://api.semanticscholar.org/
10 CHI directs authors to the SIGCHI guidelines for an accessible submission: https://sigchi.org/conferences/author-resources/accessibility-guide/ and ASSETS provides these instructions: https://assets21.sigaccess.org/creating_accessible_pdfs.html
11 https://www.adobe.com/acrobat/acrobat-pro.html
12 We experimented with several other widely available PDF libraries and conversion tools, including PDFTOHTML (http://pdftohtml.sourceforge.net/), PDFMiner (https://github.com/euske/pdfminer), and pdf2xml (https://sourceforge.net/projects/pdf2xml/), and found that none of these tools allowed access to the embedded alt text. Scaling in Adobe Acrobat Pro was the only working solution we were able to identify.
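The validity filter and chart keyword matching described in Section 3.1 could look roughly like the sketch below. The function names, the regular expression, and the term list are illustrative assumptions on our part; the paper's full term list appears in its Appendix A.

```python
import re

MIN_LENGTH = 80  # character threshold chosen to balance precision and recall
# Illustrative subset of chart terms; the full list appears in Appendix A.
CHART_TERMS = {"graph", "chart", "plot", "axis", "error bar"}

def is_valid_alt_text(alt: str) -> bool:
    """Heuristically reject uninformative alt text (short strings, file paths)."""
    text = alt.strip()
    if len(text) < MIN_LENGTH:
        return False
    # Reject strings that look like file paths, e.g. "C:\\path_to\\figure1.jpeg"
    if re.search(r"[\\/][\w\-. ]+\.(jpe?g|png|gif|pdf)\b", text, re.IGNORECASE):
        return False
    return True

def looks_like_chart(alt: str, caption: str) -> bool:
    """Retain a figure if its alt text or caption matches at least one chart term."""
    text = f"{alt} {caption}".lower()
    tokens = set(re.findall(r"[a-z]+", text))
    # Single-word terms are matched as whole tokens (so "photograph" does not
    # match "graph"); multi-word terms are matched as substrings.
    return any((t in text) if " " in t else (t in tokens) for t in CHART_TERMS)
```

Matching on whole tokens rather than raw substrings mirrors the token-match heuristic the paper describes and avoids false positives such as "photograph" matching "graph".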

Figure 1: The proportion of PDFs in our analyzed sample which contain valid alt text over time, with 95% confidence intervals computed through bootstrap resampling.
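The bootstrap resampling referenced in the Figure 1 caption can be sketched as follows. This is a minimal illustration with fabricated counts, not the authors' analysis code; the function name and resample count are our assumptions.

```python
import random

def bootstrap_ci(flags, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a proportion over binary indicators."""
    rng = random.Random(seed)
    n = len(flags)
    # Resample with replacement, recompute the proportion each time
    props = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_boot))
    k = int(n_boot * alpha / 2)
    return props[k], props[n_boot - 1 - k]

# Fabricated example: 12 of 200 processed PDFs from one year contain valid alt text
lo, hi = bootstrap_ci([1] * 12 + [0] * 188)
```

With a few thousand resamples the percentile bounds are stable enough for year-level proportions of this size.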

through the UpWork platform.13 The annotators had undergraduate-level education in math, statistics, and materials science, and had previous experience reading graphs, plots, and other scientific figures. The alt text retained after the filtering steps described in the previous section were split into sets of 100 alt texts each for annotation. Two individuals annotated an initial sample of 100 alt texts (298 sentences) to refine the task and ensure high annotator agreement. The inter-annotator agreement computed over this sample was 87.6%, with κ = 0.80, indicating very good agreement. We further clarified the instructions following a discussion of disagreements. Given the high agreement level, a single annotator annotated the remaining alt text. For the final analysis, the first annotator's annotations are used for the sample that was doubly annotated. Examples of extracted alt text and the corresponding annotated semantic levels are provided in Table 2.

4 RESULTS

We analyze the semantic content of the extracted alt text, and attempt to answer the following research questions:

• RQ1: What is the distribution of semantic content in author-written alt text? We want to determine the proportion of alt text containing level 1, 2, 3, and 4 content. Of these, what proportion contains levels 1–3 content, which satisfies most BLV user needs? Correspondingly, which semantic levels are most often missing?

• RQ2: How does the distribution of semantic content in alt text change over time? We want to determine whether the presence of levels 1–3 semantic content is increasing over time, and by how much. We expect that with improvements in alt text awareness and workflows over time, the amount of content available in alt text should correspondingly increase.

• RQ3: How does length of alt text correlate with semantic levels? Mack et al. [31] find that there is tension between detail and brevity in alt text. The ideal alt text may vary based on user needs, but should balance length and completeness of semantic content. We determine the relationship between length and presence of semantic levels using our data.

4.1 Descriptive statistics

We process 25218 total paper PDFs to extract alt text. Of these, 19500 (77.3%) are successfully converted to HTML by Adobe Acrobat Pro. Only 897 (4.6%) of these converted documents contain at least one piece of valid alt text. Around 2048 pieces of alt text corresponded to file paths and 2545 alt texts did not meet our length criteria, and were filtered out; all other alt texts that were filtered out did not have any content besides "image." After this filtering, the 897 papers contain 3386 valid author-written alt texts. Using our keyword heuristics, we determine that 1085 of these alt texts are likely to correspond to graphs or charts. Based on a cursory examination, the alt texts that were filtered out using these keyword heuristics consist primarily of natural images and diagrams.

We ask annotators to assess the semantic levels present in each sentence of the remaining 1085 alt texts. Of these, 547 figure alt texts (consisting of 2127 sentences) are labeled as belonging to graphs, charts, and plots by our annotators, indicating that our keyword heuristics have approximately 50% precision. Alt texts of figures corresponding to these data visualizations are further annotated for semantic content. Several examples of author-written alt text and semantic level annotations are given in Table 2. In Table 3, we provide the numbers of PDFs processed and alt text retained after each filtering step.

Figure 1 shows the proportion of PDFs in our sample that contain any valid alt text, and how this proportion changes in our sample over the last decade. There is a slight increase in alt text coverage in 2014; this is the same year that CHI specified that alt text is required in submissions. We also observe that the proportion of papers with valid alt text has improved over time, especially in the past few years, although the overall proportion is still quite low (below 15%). We note that Figure 1 does not indicate the actual proportions of papers from these years that have valid alt text (we do not process the version of record for all papers in our sample due to the difficulty in ascertaining these versions and copyright challenges in obtaining them; we also cannot guarantee that the pipeline we use succeeds in extracting alt text for all of the papers

13 https://www.upwork.com/
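The agreement statistics reported above (87.6% raw agreement, κ = 0.80) are standard Cohen's kappa quantities, which can be computed as in the toy sketch below. The labels are fabricated for illustration, and for simplicity the sketch assumes one label per sentence, whereas the paper's annotation task is multi-label.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's marginal label distribution
    ca, cb = Counter(a), Counter(b)
    chance = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)
    return (observed - chance) / (1 - chance)

# Fabricated per-sentence labels from two annotators
ann1 = ["L1", "L1", "L2", "L3", "L1", "L2", "L3", "L1"]
ann2 = ["L1", "L1", "L2", "L3", "L2", "L2", "L3", "L1"]
kappa = cohens_kappa(ann1, ann2)  # observed agreement 7/8; kappa ≈ 0.81
```

Kappa discounts the agreement expected by chance given each annotator's label distribution, which is why it is reported alongside the raw agreement percentage.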

Figure & Source | Author-written alt text with annotations

Figure 4 reproduced from Bragg et al. [8]:
"Figure 4: Materials participants reported wanting to read in ASL text (L1). This figure presents a bar chart, with separate bars for DHH (light blue) and hearing (dark blue) populations (L1). Y-axis is % participants, ranging from 0-70 (L1). X-axis is Material Desired in ASL Text (L1). sorted by DHH popularity (most popular first): Website content, Printed content, Email, Texts/SMS, Video captions, Other, None (L1)."

Figure 4 reproduced from Baker et al. [2]:
"A line chart showing the average time it took participants on all tasks (L1). The y-axis of the chart is time in seconds (ranges from 0 to 60), the x-axis of the chart is session number (ranges from 1 to 6) (L1). There is a line for the three modes: Silent, Verbal and Finger Pointing (L1). They all appear to be going down, but there is a big spike in the Verbal mode line at session 4 (L3). In general, the Finger Pointing mode is the highest (takes the most time), the Silent mode is next and the Verbal takes the least amount of time, although in the fourth and sixth sessions, the Verbal line is above the Silent one (L3). There is a dot corresponding to the Braille mode at Session 6, it is between the Verbal and Silent modes (L1)."

Figure 7 reproduced from Reinecke et al. [41]:
"Plot of mean proportion of image pixels differentiable (independent) for 0% - 100% of the population (dependent) for websites and infographics (L1). Increasing from 0% of the population, both plots start at 100% differentiable and gradually fall to 80% differentiable at 75% of the population (L2, L3). Plots begin to diverge here as they both fall off more quickly until a discontinuity plateau is reached at 88% of the population (websites = 60% differentiable, infographics = 50% differentiable) (L2, L3). Plateau gradually declines to 99% of population (websites = 55% differentiable, infographics = 45% differentiable), and then both plots fall to 0% differentiable for 100% of the population (L2, L3)."

Table 2: Example figures and author-written alt text with annotated semantic levels added to the end of each sentence, with the prefix L, in parentheses and colored (blue).

we process). Rather, the figure describes our success rate in creating this dataset, and provides some sense of the trend towards more valid alt text in recent years.

4.2 Analysis of alt text semantic content

Distribution of semantic content in alt text (RQ1). In Figure 2 (left), we present the maximum level of content found in each figure alt text in our sample. We observe a fairly even distribution between max levels 1, 2, and 3. Recall that in Lundgard and Satyanarayan [30], BLV users found levels 1–3 content to be most useful. Although some figures in our sample have alt text with level 3 content, over two-thirds of the figure alt texts that we examined do not contain level 3 content (level 3 content describes trends and patterns). At the same time, over one-third of the figure alt texts lack both level 2 and level 3 content, meaning that there is no information on extrema and outliers in addition to trends and patterns. The lack of such content can make it more difficult for BLV users to acquire the information that they need from these graphs and charts.

In Figure 2 (right), we show the total number of levels of content present in all figure alt text we analyzed. We find that the vast

Processing step | Count
Total PDFs processed | 25218
PDFs successfully converted to HTML | 19500 (77.3%)
Papers with at least one valid alt text | 897 (4.6%)
Pieces of valid alt text | 3386
Figure alt texts annotated (heuristically filtered as likely graphs or charts) | 1085
Annotated figure alt texts corresponding to graphs or charts | 547

Table 3: The numbers of papers and figure alt text that remain after each filtering step.

Figure 2: Distribution of the maximum level (left) and total number of levels (right) of semantic content found in the sample of annotated author-written alt text.

majority of alt text in our sample only contain one or two levels. This means that though the maximum levels of content are somewhat evenly distributed between levels 1, 2, and 3, only one or two of these levels are present in most figure alt texts.

Semantic content in alt text over time (RQ2). Figure 3 shows the proportion of alt text from each year that contain text of each of the semantic levels. Though the vast majority of alt text contain level 1 information, a much lower proportion contain level 2 and 3 information. Over time, there have not been significant changes to the proportion of alt text that contains level 2 and 3 information.

Relationship between alt text length and content (RQ3). In Figure 4, we show the relationship between the length of alt text and the number of levels of content that it contains. Alt text containing more levels of information tend to be longer, though alt text of comparable length can have different numbers of levels present (as indicated by the overlapping boxes across levels). This suggests that alt text does not have to be longer in order to have more levels of content. In regards to the balance between brevity and detail, authors may want to optimize for the amount of information included in alt text without resorting to writing something overly long.

5 DATASET USES

Alongside our analysis, we release our dataset of 3386 alt text collected from 897 HCI publications.14 Of these, the contents of 547 alt texts (2127 sentences) are annotated with the semantic levels introduced by Lundgard and Satyanarayan [30]. This dataset can be used to develop tools to support authors in writing better alt text, or to develop authoring tools or models for producing alt text where none are available. We discuss some of these potential applications below.

5.1 Improving author-written alt text & supporting reading interfaces

To improve authoring of alt text, we can develop tools to help identify potentially missing content in alt text and prompt authors to add such information during the authoring and editing process. For graphs and charts, the semantic levels serve as a proxy for content. Based on the findings by Lundgard and Satyanarayan [30], BLV users found levels 1–3 information most beneficial, and a tool aimed to improve alt text for graphs and charts could assess alt text quality based on the presence or absence of information at these levels. In other words, we could train a classifier based on the semantic level annotations in our dataset to detect which levels are present and build a tool that prompts authors for the missing levels, similar to interfaces that have been built in the past for other tasks like providing peer feedback [16]. Such a classifier could also be used to automate trend monitoring for alt text content, enabling expanded and continually updating versions of our analysis with reduced human labor.

To test the viability of this theory, we train several classification models using our collected annotations. We approach this as a multi-class, multi-label classification problem. The classes are the four semantic levels, and up to four labels can be assigned to a single piece of text. The input to the model is a single sentence of alt text, and the output is a distribution of labels over the four classes. We experiment with a Random Forest classifier using tf-idf

14 Data and analyses are available at https://github.com/allenai/hci-alt-texts.

word representations, as well as classifiers based on BERT [14] and SciBERT [3]. We use 5-fold cross-validation to train and evaluate all models. The training data is split into folds preserving each alt text as a unit (547 instances), while text and labels are provided to the model at the sentence level (2127 sentences). We report mean accuracy and F1 over all five folds for each model in Table 4.

Model | Accuracy | Micro-F1
Random Forest (tf-idf) | 0.689 (0.021) | 0.782 (0.011)
BERT-base | 0.912 (0.016) | 0.824 (0.032)
SciBERT-base | 0.910 (0.010) | 0.819 (0.021)

Table 4: Model performance averaged over five folds, shown with standard deviations in parentheses.

Baseline performance of these models suggests that they are reasonably good at identifying the correct semantic levels present in an alt text sentence. Performance could be improved further through additional model tuning or data annotation. The outputs of these models can feasibly be used to provide feedback to authors who are writing alt text, to indicate when content of certain semantic levels may be lacking.

In addition to helping to improve author-written alt text, such a classifier could be used to support improved reading experiences of existing alt text. For example, semantic level classification could enable users to make informed decisions about whether and how to read alt text by filtering for semantic levels that may be more relevant to their needs. Morash et al. [33] described similar types of personalized reading experiences, which could be enabled for alt text that has been written using standardized templates. We leave the implementation of these authoring and reading interfaces, as well as explorations on user interface design, to future work.

Figure 3: Proportion of alt text containing text of each semantic level over time, with 95% confidence intervals computed using bootstrap resampling.

Figure 4: Relationship between length of alt text (character count) and the number of semantic levels of information represented. Though length correlates with the number of levels, there are many longer alt texts that do not necessarily contain more levels of semantic content.

5.2 Training and evaluating NLP models for alt text generation

Recent developments in multimodal image-language pretraining [28, 29, 49] hold promise towards the eventual automatability of scientific figure alt text generation. Currently, figure alt text generation is hampered by the lack of realistic training and evaluation data. Though the size of our dataset is small and insufficient for training neural models, it may still be useful to help scale the collection of training data. Alt text from this dataset can be used to provide high-quality examples to annotators. Additionally, a classifier trained to predict alt text semantic levels, such as the one introduced in Section 5.1, could be used to offer feedback to annotators during
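The tf-idf baseline described in Section 5.1 can be sketched with scikit-learn as below. The sentences, labels, and grouping are fabricated, and this is a simplified stand-in rather than the authors' released training code; folds are grouped so that sentences from the same alt text stay together, mirroring the split the paper describes.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GroupKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Fabricated sentences, each tagged with one or more semantic levels
sentences = [
    "A bar chart with separate bars for two populations.",
    "The y-axis shows percent of participants from 0 to 70.",
    "Values peak at session four before declining.",
    "Times trend downward across all six sessions.",
] * 5
labels = [["L1"], ["L1"], ["L2", "L3"], ["L3"]] * 5
groups = [i // 2 for i in range(len(sentences))]  # two sentences per alt text

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)  # multi-label binary indicator matrix

model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=0),  # accepts multi-label y
)

# Cross-validation grouped by alt text, so sentences from one alt text
# never straddle the train/test boundary
for train_idx, test_idx in GroupKFold(n_splits=5).split(sentences, y, groups):
    model.fit([sentences[i] for i in train_idx], y[train_idx])
    preds = model.predict([sentences[i] for i in test_idx])
```

Grouping by alt text prevents near-duplicate sentences from the same description leaking across folds, which would inflate the reported scores.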

the annotation process, e.g., by indicating when information of a certain content level is missing in the annotation. Instruction specific to the missing level could be provided to the annotator to elicit further description information, as in the techniques employed by Morash et al. [33]. Several works propose a hybrid approach that combines machine learning model output with human writers to create better image descriptions [19, 31, 39]. Our dataset and classifiers could be used in collaboration with human writers to produce more descriptive alt text.

The alt texts in our dataset could also serve as part of a viable evaluation corpus. Though not all alt text in the dataset contain semantic level annotations, the texts themselves are written by the original paper authors, and are therefore more likely to be faithful to the original intent and content of the paper. We release all 3386 valid alt text extracted from our sample of papers, which includes alt text in addition to the 547 alt texts belonging to data visualizations which we annotate for semantic content. These alt texts can be used to assess pieces of information that authors found important enough to include in the image description. The output of a general-purpose scientific alt text generation model can be evaluated against the information contained in the original author-written alt text associated with these figures.

6 DISCUSSION & CONCLUSION

In regards to scientific alt text, availability is still the primary issue. However, in the alt text we were able to extract from HCI publications, we observed that for papers where authors have taken the time to write alt text, the content and level of detail available in these alt texts is also worth considering. What does it mean to write useful alt text? What does it mean to include enough detail such that the content of an image can be understood by a BLV user? For graphs and charts, we propose that authors leverage the framework introduced in Lundgard and Satyanarayan [30] to ensure that some

We emphasize that the lack of certain semantic levels in alt text is not equivalent to an assessment about the alt text's quality. Different users may want different information out of alt text, and there is no one-size-fits-all solution [31]. Rather, we use the framework as a proxy for content availability, which can be used to elicit different kinds of descriptive information that may be missing in an author's original alt text. Finally, in Lundgard and Satyanarayan [30], the authors assessed whether the levels were useful for BLV users, but not whether all of levels 1–3 were necessary for an alt text to be considered complete. Further study is needed to determine the appropriate balance of depth of information, completeness, and brevity in relation to the usefulness of alt text.

We propose in Section 5 several uses of this dataset towards improving author and publisher workflows around writing figure alt text. A classifier trained to detect semantic levels can be used to provide feedback to authors who are writing alt text, and can be used to elicit alt text containing more levels of information. The dataset can also be used to help develop machine learning models that can generate better alt text based on the image itself. We believe that coupling machine learning models with crowdsourced image descriptions may provide a reasonable solution to problems around alt text availability, and we plan to explore such solutions in future work.

Policy clearly matters. We faced significant challenges when attempting to extract alt text from a broad swathe of scientific publications, and had to limit ourselves ultimately to HCI publishing venues such as ASSETS and CHI. There is no doubt in our minds that the alt text we were able to extract are only there because of the efforts of the accessibility and HCI research community and the importance that members have placed on digital accessibility. Significant work remains to encourage researchers outside of these communities to participate in making their work accessible. Within the community, there are also ways that we can improve figure accessibility, by providing information that are described
basic semantic information is provided, enough such that BLV users by BLV users as being more relevant or more important towards
can understand the structure of the graph, its extrema and outliers, interpreting these images.
as well as the obvious trends and comparisons that can be made. In
our current analysis, we fnd that many author-written alt text are
ACKNOWLEDGMENTS
not yet meeting these thresholds.
We recognize the limitations of our techniques. The alt text and We thank Donal Fitzpatrick, Doug Downey, the reviewers, and
fgures included in our dataset make up a biased sample, containing members of the Semantic Scholar research team for their valuable
only papers from CHI and ASSETS and from the authors publishing feedback. We thank Bailey Kuehl and Ihsan Allah Rakha for their
in these venues. They are not representative of fgures in all schol- assistance in the annotation process.
arly documents. Though we would have liked to construct a more
representative sample of scientifc fgures, the overwhelming lack REFERENCES
of fgure alt text in scholarly publications prevents us from doing [1] Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Craw-
so. As more authors from other disciplines begin including alt text ford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu A.
Ha, Rodney Michael Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler C. Murray, Hsu-
and the barriers to adding alt text to scientifc fgures decreases, we Han Ooi, Matthew E. Peters, Joanna L. Power, Sam Skjonsberg, Lucy Lu Wang,
hope that it will become easier to create such a dataset. Christopher Wilhelm, Zheng Yuan, Madeleine van Zuylen, and Oren Etzioni.
Additionally, our analysis and annotations are limited to fgures 2018. Construction of the Literature Graph in Semantic Scholar. In NAACL.
[2] Catherine M. Baker, Lauren R. Milne, Jefrey Scofeld, Cynthia L. Bennett, and
containing data visualizations (graphs, charts, and plots), which Richard E. Ladner. 2014. Tactile graphics with a voice: using QR codes to access
are only one of many types of images present in scientifc publi- text in tactile graphics. Proceedings of the 16th international ACM SIGACCESS
conference on Computers & accessibility (2014).
cations. Our results on the suitability or missingness of semantic [3] Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A Pretrained Language
content in author-written alt text cannot generalize beyond these Model for Scientifc Text. In EMNLP.
image types. Going beyond graphs, frameworks other than that [4] Cynthia L. Bennett, Cole Gleason, Morgan Klaus Scheuerman, Jefrey P. Bigham,
Anhong Guo, and Alexandra To. 2021. “It’s Complicated”: Negotiating Accessibil-
introduced in Lundgard and Satyanarayan [30] may be needed to ity and (Mis)Representation in Image Descriptions of Race, Gender, and Disability.
capture the availability and distribution of informational content. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
A Dataset of Alt Texts from HCI Publications ASSETS ’22, October 23–26, 2022, Athens, Greece

(Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, [27] Matan Levy, Rami Ben-Ari, and Dani Lischinski. 2021. Classifcation-Regression
NY, USA, Article 375, 19 pages. https://doi.org/10.1145/3411764.3445498 for Chart Comprehension.
[5] Jefrey P. Bigham, E. Brady, Cole Gleason, Anhong Guo, and D. Shamma. 2016. [28] Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, and Kai-Wei Chang.
An Uninteresting Tour Through Why Our Research Papers Aren’t Accessible. 2019. VisualBERT: A Simple and Performant Baseline for Vision and Language.
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in ArXiv abs/1908.03557 (2019).
Computing Systems (2016). [29] Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. ViLBERT: Pretraining
[6] Jefrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Rob Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.
Miller, Rob Miller, Aubrey Tatarowicz, Brandyn Allen White, Samuel White, and In NeurIPS.
Tom Yeh. 2010. VizWiz: nearly real-time answers to visual questions. Proceedings [30] Alan Lundgard and Arvind Satyanarayan. 2022. Accessible Visualization via
of the 23nd annual ACM symposium on User interface software and technology Natural Language Descriptions: A Four-Level Model of Semantic Content. IEEE
(2010). Transactions on Visualization and Computer Graphics 28 (2022), 1073–1083.
[7] E. Brady, Y. Zhong, and Jefrey P. Bigham. 2015. Creating accessible PDFs for [31] Kelly M. Mack, Edward Cutrell, Bongshin Lee, and Meredith Ringel Morris. 2021.
conference proceedings. Proceedings of the 12th Web for All Conference (2015). Designing Tools for High-Quality Alt Text Authoring. The 23rd International
[8] Danielle Bragg, Raja S. Kushalnagar, and Richard E. Ladner. 2018. Designing an ACM SIGACCESS Conference on Computers and Accessibility (2021).
Animated Character System for American Sign Language. Proceedings of the 20th [32] Nitesh Methani, Pritha Ganguly, Mitesh M. Khapra, and Pratyush Kumar. 2020.
International ACM SIGACCESS Conference on Computers and Accessibility (2018). PlotQA: Reasoning over Scientifc Plots. 2020 IEEE Winter Conference on Applica-
[9] B. Caldwell, M. Cooper, Loretta Guarino Reid, and G. Vanderheiden. 2008. Web tions of Computer Vision (WACV) (2020), 1516–1525.
Content Accessibility Guidelines (WCAG) 2.0. [33] Valerie S. Morash, Yue-Ting Siu, Joshua A. Miele, Lucia Hasty, and Steven Lan-
[10] Chen Chen, Ruiyi Zhang, Sungchul Kim, Scott D. Cohen, Tong Yu, Ryan A. Rossi, dau. 2015. Guiding Novice Web Workers in Making Image Descriptions Using
and Razvan C. Bunescu. 2019. Neural caption generation over fgures. Adjunct Templates. ACM Transactions on Accessible Computing (TACCESS) 7 (2015), 1 –
Proceedings of the 2019 ACM International Joint Conference on Pervasive and 21.
Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium [34] David Morris, Eric Müller-Budack, and Ralph Ewerth. 2020. SlideImages: A
on Wearable Computers (2019). Dataset for Educational Image Classifcation. Advances in Information Retrieval
[11] Charles C. Chen, Ruiyi Zhang, Eunyee Koh, Sungchul Kim, Scott D. Cohen, Tong 12036 (2020), 289 – 296.
Yu, Razvan C. Bunescu, and Razvan C. Bunescu. 2019. Figure Captioning with [35] Meredith Ringel Morris, Jazette Johnson, Cynthia L. Bennett, and Edward Cutrell.
Reasoning and Sequence-Level Training. ArXiv abs/1906.02850 (2019). 2018. Rich Representations of Visual Content for Screen Reader Users. Proceedings
[12] Xinlei Chen, Hao Fang, Tsung-Yi Lin, Ramakrishna Vedantam, Saurabh Gupta, of the 2018 CHI Conference on Human Factors in Computing Systems (2018).
Piotr Dollár, and C. Lawrence Zitnick. 2015. Microsoft COCO Captions: Data [36] Azadeh Nazemi and Iain Murray. 2013. A Method to Provide Accessibility for
Collection and Evaluation Server. ArXiv abs/1504.00325 (2015). Visual Components to Vision Impaired.
[13] W. Chisholm, G. Vanderheiden, and Ian Jacobs. 2001. Web content accessibility [37] Mark Neumann, Daniel King, Iz Beltagy, and Waleed Ammar. 2019. ScispaCy:
guidelines 1.0. Interactions 8 (2001), 35–54. Fast and Robust Models for Biomedical Natural Language Processing. ArXiv
[14] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: abs/1902.07669 (2019).
Pre-training of Deep Bidirectional Transformers for Language Understanding. In [38] Aaron Nicolson, Jason Dowling, and Bevan Koopman. 2021. AEHRC CSIRO at
NAACL. ImageCLEFmed Caption 2021. In CLEF.
[15] Christin Engel and Gerhard Weber. 2017. Improve the Accessibility of Tactile [39] Xin Qian, Eunyee Koh, Fan Du, Sungchul Kim, and Joel Chan. 2020. A Formative
Charts. In INTERACT. Study on Designing Accurate and Natural Figure Captioning Systems. In Extended
[16] C. Ailie Fraser, Tricia J. Ngoon, Ariel S. Weingarten, Mira Dontcheva, and Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems
Scott Klemmer. 2017. CritiqueKit: A Mixed-Initiative, Real-Time Interface For (Honolulu, HI, USA) (CHI EA ’20). Association for Computing Machinery, New
Improving Feedback. In Adjunct Publication of the 30th Annual ACM Sympo- York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3382946
sium on User Interface Software and Technology (Québec City, QC, Canada) [40] Xin Qian, Eunyee Koh, Fan Du, Sungchul Kim, Joel Chan, Ryan A. Rossi, Sana
(UIST ’17). Association for Computing Machinery, New York, NY, USA, 7–9. Malik, and Tak Yeon Lee. 2021. Generating Accurate Caption Units for Figure
https://doi.org/10.1145/3131785.3131791 Captioning. Proceedings of the Web Conference 2021 (2021).
[17] Cole Gleason, Amy Pavel, Emma McCamey, Christina Low, Patrick Carrington, [41] Katharina Reinecke, David R. Flatla, and Christopher Brooks. 2016. Enabling
Kris M. Kitani, and Jefrey P. Bigham. 2020. Twitter A11y: A Browser Extension Designers to Foresee Which Colors Users Cannot See. Proceedings of the 2016
to Make Twitter Images Accessible. Proceedings of the 2020 CHI Conference on CHI Conference on Human Factors in Computing Systems (2016).
Human Factors in Computing Systems (2020). [42] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean
[18] Darren Guinness, Edward Cutrell, and Meredith Ringel Morris. 2018. Caption Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein,
Crawler: Enabling Reusable Alternative Text Descriptions using Reverse Image Alexander C. Berg, and Li Fei-Fei. 2015. ImageNet Large Scale Visual Recognition
Search. Proceedings of the 2018 CHI Conference on Human Factors in Computing Challenge. International Journal of Computer Vision 115 (2015), 211–252.
Systems (2018). [43] Elliot Salisbury, Ece Kamar, and Meredith Ringel Morris. 2017. Toward Scalable
[19] Danna Gurari, Qing Li, Abigale Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Social Alt Text: Conversational Crowdsourcing as a Tool for Refning Vision-to-
Jiebo Luo, and Jefrey P. Bigham. 2018. VizWiz Grand Challenge: Answering Language Technology for the Blind. In HCOMP.
Visual Questions from Blind People. 2018 IEEE/CVF Conference on Computer [44] Piyush Sharma, Nan Ding, Sebastian Goodman, and Radu Soricut. 2018. Concep-
Vision and Pattern Recognition (2018), 3608–3617. tual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic
[20] Ting-Yao Hsu, C. Lee Giles, and Ting-Hao Kenneth Huang. 2021. SciCap: Gener- Image Captioning. In ACL.
ating Captions for Scientifc Figures. In EMNLP. [45] Noah Siegel, Zachary Horvitz, Roie Levin, Santosh Kumar Divvala, and Ali
[21] K. V. Jobin, Ajoy Mondal, and C. V. Jawahar. 2019. DocFigure: A Dataset for Scien- Farhadi. 2016. FigureSeer: Parsing Result-Figures in Research Papers. In ECCV.
tifc Document Figure Classifcation. 2019 International Conference on Document [46] Abigale Stangl, Meredith Ringel Morris, and Danna Gurari. 2020. "Person, Shoes,
Analysis and Recognition Workshops (ICDARW) 1 (2019), 74–79. Tree. Is the Person Naked?" What People with Vision Impairments Want in Image
[22] Daekyoung Jung, Wonjae Kim, Hyunjoo Song, Jeongin Hwang, Bongshin Lee, Descriptions. Association for Computing Machinery, New York, NY, USA, 1–13.
Bo Hyoung Kim, and Jinwook Seo. 2017. ChartSense: Interactive Data Extraction https://doi.org/10.1145/3313831.3376404
from Chart Images. Proceedings of the 2017 CHI Conference on Human Factors in [47] Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie (Yu-Yen) Cheng,
Computing Systems (2017). Chelsea Hess Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda M.
[23] Kushal Kafe, Scott D. Cohen, Brian L. Price, and Christopher Kanan. 2018. DVQA: Wagner, and Daniel S. Weld. 2021. Improving the Accessibility of Scientifc Doc-
Understanding Data Visualizations via Question Answering. 2018 IEEE/CVF uments: Current State, User Needs, and a System Solution to Enhance Scientifc
Conference on Computer Vision and Pattern Recognition (2018), 5648–5656. PDF Accessibility for Blind and Low Vision Users. ArXiv abs/2105.00076 (2021).
[24] Samira Ebrahimi Kahou, Adam Atkinson, Vincent Michalski, Ákos Kádár, Adam [48] Shaomei Wu, Jefrey Wieland, Omid Farivar, and Julie Schiller. 2017. Automatic
Trischler, and Yoshua Bengio. 2018. FigureQA: An Annotated Figure Dataset for Alt-text: Computer-generated Image Descriptions for Blind Users on a Social
Visual Reasoning. ArXiv abs/1710.07300 (2018). Network Service. Proceedings of the 2017 ACM Conference on Computer Supported
[25] Alina Kuznetsova, Hassan Rom, Neil Gordon Alldrin, Jasper R. R. Uijlings, Ivan Cooperative Work and Social Computing (2017).
Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander [49] Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, and
Kolesnikov, Tom Duerig, and Vittorio Ferrari. 2020. The Open Images Dataset Jianfeng Gao. 2020. Unifed Vision-Language Pre-Training for Image Captioning
V4. International Journal of Computer Vision 128 (2020), 1956–1981. and VQA. ArXiv abs/1909.11059 (2020).
[26] J. Lazar, E. Churchill, T. Grossman, G. V. D. Veer, Philippe A. Palanque, J. Morris,
and Jennifer Mankof. 2017. Making the feld of computing more inclusive.
Commun. ACM 60 (2017), 50 – 59.
ASSETS ’22, October 23–26, 2022, Athens, Greece S. Chintalapati, J. Bragg, and L.L. Wang

A HEURISTICS FOR IDENTIFYING GRAPHS AND CHARTS
The full list of words and phrases used to identify figures corresponding to graphs and charts includes:
• graph
• chart
• plot
• scatter plot
• scatter
• distribution
• data
• points
• error
• error bar
• trial
• trials
• bar plot
• bar
• venn
• mean
• average
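The matching itself is straightforward; below is a minimal sketch (ours, not the authors’ implementation) of applying this keyword list to figure captions with case-insensitive whole-word matching:

```python
import re

# Keyword list from Appendix A; multi-word phrases are checked as phrases.
GRAPH_KEYWORDS = [
    "graph", "chart", "plot", "scatter plot", "scatter", "distribution",
    "data", "points", "error", "error bar", "trial", "trials",
    "bar plot", "bar", "venn", "mean", "average",
]

def looks_like_graph(caption: str) -> bool:
    """Heuristically flag a figure as a graph/chart if its caption
    contains any keyword as a whole word or phrase."""
    text = caption.lower()
    return any(
        re.search(r"\b" + re.escape(kw) + r"\b", text)
        for kw in GRAPH_KEYWORDS
    )

print(looks_like_graph("Figure 3: Bar plot of mean error across trials"))    # True
print(looks_like_graph("Figure 1: Screenshot of the annotation interface"))  # False
```

Matching on word boundaries rather than substrings avoids false positives such as "barely" matching "bar"; the exact matching rules (case folding, boundaries) are our assumptions, not stated in the paper.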
Low-Cost Tactile Coloring Page Fabrication on a Cutting Machine
Assembly and user experiences of cardstock-layered tangible pictures
Nicole E. Johnson* Tom Yeh Ann Cunningham
University of Colorado, Boulder, USA, University of Colorado, Boulder, USA, Sensational Books, USA,
Nicole.Johnson-1@colorado.edu Tom.Yeh@colorado.edu Ann@acunningham.com

ABSTRACT
Tactile images are an important mode of information for the blind and low vision community, but many of the common methods have high acquisition and material costs. This design inquiry looks at the use of a Cricut cutting machine to layer cardstock for creating tangible pictures and coloring pages. This report discusses the development of the technique followed by the corresponding production considerations, design iterations, pilot feedback, and limitations. We believe that in certain contexts this technique could be a viable method for conveying tactile information, especially in the context of early learning activities where practice is required for gaining life-long tactile graphicacy skills that can translate to reading maps and information graphics later in life. Additionally, the low acquisition cost of this method has value to parents and educators of blind and low vision children as a cheap DIY way to make tactile content.

CCS CONCEPTS
• Human-centered computing; Accessibility technologies; Accessibility systems and tools; • Social and professional topics; People with disabilities.

KEYWORDS
CNC machine, tactile graphic, paper, pictures, touch, blindness, illustrations, images, tactile, early learning, technique

ACM Reference Format:
Nicole E. Johnson*, Tom Yeh, and Ann Cunningham. 2022. Low-Cost Tactile Coloring Page Fabrication on a Cutting Machine: Assembly and user experiences of cardstock-layered tangible pictures. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3551353

This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3551353

1 INTRODUCTION
The proliferation of the internet has opened a landscape of information access to many populations; however, visual imagery remains a challenge for blind and low vision (BLV) communities. Common alternative modes for conveying visual information include audio descriptions and tactile representations, ideally paired together [1, 2]. Tactile images in particular hold benefits regarding spatial cognition [3] but also challenges regarding availability and literacy [4–6]. This work addresses availability and literacy development by leveraging a low-cost technology to make tactile coloring pages for BLV early literacy content.
This report summarizes three years of work investigating a commercially available CNC cutting machine, the Cricut Maker, as a viable medium for creating tactile pictures. We will discuss the development of our cardstock layering technique followed by the corresponding production considerations, design iterations, pilot feedback, and limitations. Cardstock layered pictures are a cost-effective alternative to other traditional tactile image techniques and are ideal for coloring activities, a skill that helps build tactile graphicacy in early learning settings.

2 RELATED WORK
2.1 Production of Tactile Images
In terms of image production, most techniques remain analog, with a high time and resource cost due to the specialty demand of the tools and the need for design mediation. The most common methods used today for tactile images include microcapsule paper (swelled lines), thermoform (vacuum-formed plastic), craft supplies, 3D printing, paper embossing (raised dot matrix on paper), and refreshable pin-matrix devices (matrix of pins on a refreshable tablet) [7–9].
Each of these methods has pros and cons across a spectrum including acquisition cost, production time, production cost, image resolution, ease of use/production, and whether or not the image is dynamic [9], as well as accessibility of production. All methods with the exception of rudimentary craft supplies involve high acquisition costs, between $1,400 for swell, thermoform, and low-volume embossers and over $14,000 for refreshable pin matrix displays and high-volume embossers. Cricut cutting machines have a lower acquisition cost ($200-$400) and have the advantage of being able to use regular cardstock and other craft materials as opposed to swell paper that costs $1.50-$2.50 per sheet.

2.2 Cognition of Tactile Images
In order to consider the cognitive challenges in reading tactile images, one must consider the fundamental differences between the visual and tactile modalities; vision is global and immediate whereas touch is local and sequential [10, 11]. For this reason, tactile design is not a one-to-one process. Visual embellishments must be stripped away, where spacing and textural distinguishability become the most important elements [12, 13]. Understanding communication through tactile images requires tactile graphicacy, as vision requires visual graphicacy. Tactile graphicacy is a learned skill that

Figure 1: (Left) Middle layer of the lobster design with registration holes fed onto dowels. (Right) Top layer of the lobster design fed onto dowels.

takes practice [14–19] and, consequently, access to a wide range of materials and pictures that are not as immediately available as they are in vision [13, 19, 20]. The long-term benefits of tactile reading skill development have been shown widely across areas such as spatial reasoning and working memory [13, 20–23], as well as being an indicator of future employment [24, 25].

3 CARDSTOCK LAYERING TECHNIQUE DEVELOPMENT
The initial idea for this cardstock layering technique came from co-author Ann Cunningham and is centered around the premise of registration holes, which line up the image layers and allow for streamlined duplication. Registration holes are a common method in printmaking to line up multi-block prints. The general cardstock layering technique is outlined in Figure 1.
The idea for layered image components stems from the process of stacking clay layers to create bas-relief images. Bas-relief is a style of sculpture where the subject of the image protrudes from a flat surface. Types of relief include low-relief, high-relief, and sunken relief [26]. The peg board was created using a thick plastic base with holes drilled in for two dowels. The spacing of the dowels corresponds to the outer holes of the standard three-hole punch (7mm holes spaced 70mm from center to center). Each design includes all three holes for easy storage in three-ring binders.

3.1 Early Design Phase
The following research questions guided the early phases of this design inquiry:
1. How can we leverage a Cricut cutting machine to create tactile pictures?
2. How can we create engaging tactile content for kids?
3. How can we make tactile pictures and activities collaborative?
The inquiry started with familiarizing ourselves with the machine and creating prototype designs over a period of months to experiment with various paper thicknesses and textures. Designs were drawn in the vector drawing program Adobe Illustrator and saved as .SVG files. For the designs themselves, the top layer is always the broadest outline of the object, then the middle paper layer(s) include important details of the object.
One draw to this technique was the possibility for duplication and streamlined production. Our first large-scale picture distribution of around 80 pictures was a tiger design by Cunningham. All three authors participated in the first assembly-line style production run, with one gluing the layers and transferring them for assembly at the peg boards. Working together, we were able to complete the 80 pictures, with 160 paper layers, in one hour. We theorized the assembly could go very fast with 5-10 volunteers, each with a specific job such as gluing, transferring paper layers, staffing multiple peg board assembly stations, transferring assembled pictures to the drying rack, etc. The task also works with just two people, one gluing and the other assembling. It is possible to do alone, but switching between gluing and assembling adds time.
The tiger design was distributed in the art room at the National Federation of the Blind Colorado 2019 Regional Conference for pilot feedback. These sessions yielded informal feedback from both blind and sighted children and adults about the tiger picture and coloring activity, as well as brainstorming about subject matter the viewers would be interested in.
At this stage in the inquiry, we learned that the thickness of the paper was more important than a linen texture. The thicker paper we used was 110lb/300gsm weight and created a more distinct edge for the design. The linen paper texture was not very discernible from the regular paper texture and was too thin for cutting, but moving forward we continued to use it with designs as the uncut backing page. The Cricut Maker is able to cut a variety of materials such as fabric, foam, plastics, cardboard, and vinyl. This offers tactile designers many options for adding areal textures to their images. The best adhesive we found is the solvent-based product Super 77 Spray by 3M. Assembly requires an extremely well-ventilated area and aerosol masks.
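The registration-hole geometry described in Section 3 (7mm holes with centers spaced 70mm apart, matching a standard three-hole punch) can also be generated programmatically as a cut layer. A minimal sketch of emitting those three holes as a plain SVG; the page size, margin offset, and function name are our assumptions, not the authors’ tooling:

```python
# Generate the three registration holes of a cardstock layer as an SVG
# fragment. Dimensions follow the paper: 7 mm diameter holes with centers
# 70 mm apart; the left-edge margin is an assumed placement.

PAGE_W_MM, PAGE_H_MM = 216, 279   # US Letter, approximate, in mm
HOLE_D_MM = 7                     # hole diameter
HOLE_PITCH_MM = 70                # center-to-center spacing
MARGIN_MM = 10                    # hole centers' distance from left edge (assumed)

def registration_holes_svg() -> str:
    """Return an SVG document containing three punch-aligned holes."""
    cy0 = PAGE_H_MM / 2 - HOLE_PITCH_MM   # center of the top hole
    circles = []
    for i in range(3):                    # three holes, 70 mm apart
        cy = cy0 + i * HOLE_PITCH_MM
        circles.append(
            f'<circle cx="{MARGIN_MM}" cy="{cy}" r="{HOLE_D_MM / 2}" '
            f'fill="none" stroke="black"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{PAGE_W_MM}mm" height="{PAGE_H_MM}mm" '
        f'viewBox="0 0 {PAGE_W_MM} {PAGE_H_MM}">'
        + "".join(circles) + "</svg>"
    )

print(registration_holes_svg())
```

Because every layer of a design shares the same three holes, stacking the cut sheets on the dowel pegs aligns them automatically, which is the point of the registration-hole premise.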

Figure 2: Ten designs, clockwise from top left: stegosaurus, giraffe, sea turtle, grasshopper, Saturn, rhino, seahorse, rocket, honey bee, and elk.

3.2 Late Design Phase
After establishing a papercut picture workflow and getting preliminary feedback from the BLV community, first author Nicole Johnson engaged in an extended period of design work over a period of months, creating over 80 designs intended for early learning settings. Images include subject matter such as animals, coloring pages, mazes, matching activities, and cut-outs (Figure 2). Cut-outs of shapes and objects have also been shown to help in tactile understanding [7], a by-product of the top outline page of our designs as well as of possibilities for collage cut-out activities made on a Cricut.

4 TECHNIQUE IN PRACTICE
4.1 Cost & Time Breakdown
As mentioned above, this layered cardstock technique is cheaper than traditional tactile picture techniques in terms of acquisition and materials. The material cost breakdown is listed in Table 1.

Table 1: Material costs for each picture
Material | Cost
Super 77 Spray | $0.04 per paper layer
Heavy cardstock | $0.15 per sheet
Regular cardstock | $0.08 per sheet
Colored cardstock | $0.08 per sheet
Newsprint backing | $0.02 per sheet
Linen textured backing | $0.07 per sheet
Total cost per picture | $0.14-$0.64

One of the limitations of this technique is the time required to cut the pages, which varies depending on material thickness and amount of detail. Table 2 includes a breakdown of production time for 6 designs, with production data averaged based on the number of copies. Data includes total cutting time of all layers, gluing time per picture, trimming time per picture, braille label time (including punching with a slate and stylus and adhering to the picture), and the total production time for each picture.
Each cardstock layered picture costs between $0.14-$0.64 depending on the amount and material of the paper layers. This is a mid-range price per picture compared to swell paper at $1.50-$2.50 per picture, thermoform at $0.13-$0.36, and braille paper for embosser graphics at $0.05 [27]. While thermoform and embosser graphics have a lower cost per picture, the acquisition cost is considerably higher at around $1,000-$4,000 for thermoform machines and $1,300-$6,000 for graphic-capable, low-volume embossers [28].

4.2 Feedback
Due to the COVID-19 pandemic there was a period of months where it was difficult to collect feedback from users. The designs in this phase were instead guided largely by production considerations from the sighted designer with an extensive background in tactile design. Ideally there would be more intermittent user feedback directly from the BLV community to guide the designs. We are currently in the process of collecting a range of feedback, discussed in the following sections.

Table 2: Production time to produce one picture.

Design Total cut time Glue Trim Braille labels Total time per pic
Heart Sheet 2:35 min 1:15 min 0:45 sec 1:00 min 5:35 min
Hands with Hearts 1:55 min 1:15 min 0:45 sec 1:00 min 4:55 min
Flower Scene 4:00 min 0:56 sec 0:43 sec 0:53 sec 6:32 min
Sea Turtle 2:55 min 1:05 min 0:44 sec 0:53 sec 5:37 min
Butterfly 2:13 min 1:00 min 0:40 sec 0:53 sec 4:46 min
Unique Hands 2:20 min 1:00 min 0:40 sec 0:53 sec 4:53 min
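Each "Total time per pic" in Table 2 is the sum of the four stage times; for example, Heart Sheet is 2:35 + 1:15 + 0:45 + 1:00 = 5:35. A quick check of this arithmetic over the rows as transcribed from the table:

```python
# Verify that each "Total time per pic" in Table 2 equals the sum of its
# four production stages (cut + glue + trim + braille labels).

def to_sec(t: str) -> int:
    """Parse a 'M:SS' time string into seconds."""
    m, s = t.split(":")
    return int(m) * 60 + int(s)

# (design, cut, glue, trim, braille, total), transcribed from Table 2
ROWS = [
    ("Heart Sheet",       "2:35", "1:15", "0:45", "1:00", "5:35"),
    ("Hands with Hearts", "1:55", "1:15", "0:45", "1:00", "4:55"),
    ("Flower Scene",      "4:00", "0:56", "0:43", "0:53", "6:32"),
    ("Sea Turtle",        "2:55", "1:05", "0:44", "0:53", "5:37"),
    ("Butterfly",         "2:13", "1:00", "0:40", "0:53", "4:46"),
    ("Unique Hands",      "2:20", "1:00", "0:40", "0:53", "4:53"),
]

for design, *stages, total in ROWS:
    assert sum(map(to_sec, stages)) == to_sec(total), design
print("All Table 2 totals check out.")
```

All six rows sum correctly, so the per-picture totals in Table 2 are internally consistent.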

4.2.1 Picture Exchange Program. Our team paired with two separate tactile content creators who provide families and teachers of the visually impaired (TVIs) with monthly subscription boxes. They agreed to share two of our pictures per month in the boxes over a period of 3 and 4 months. Interviews were also conducted with the creators about their backgrounds and experiences producing BLV learning materials. One topic brought up in both interviews was the need for parents to be engaged and informed about how to support their children’s learning outside of the classroom and what DIY tools are available for creating content.
The box for K-5 ages reached around 30-50 families over a period of 4 months, distributing around 300 pictures. The box for the 3-6 age range reached around 25-35 families over a period of 3 months, distributing around 200 pictures. Each month included a short survey link for the pictures, offering more pictures in exchange for feedback. In the process of developing prize incentives, two 10-page themed coloring books were developed, titled “In the Garden” and “In the Ocean”, based on the design library and altered slightly for added braille titles.
Preliminary picture feedback from parents was largely positive, all stating that the tactile sensation was good. One parent suggested colored images for low-vision kids. Another family had fun with the pictures by using them as a backing guide to color over with crayons on another sheet of printer paper. One aspect of these designs that makes them ideal for coloring is the barrier of the papercut that creates a guide for crayons.

4.2.2 Open-Source Design Share. In addition to dissemination of assembled pictures, all design files are shared as open source. File variations for the images include: .SVG for layered cardstock, .PDF of all layers for swell paper and embossers, stencil files for hand embossing, and the original .AI design file. Offering a variety of file types for each design increases availability based on what the user has access to, whether it be a Cricut, swell machine, embosser, or low-tech craft supplies. Files are available in the meta tactile graphic library BTactile [29], which catalogs tactile graphic libraries all over the world, consolidating the resources into a single source. Putting the files into community hands offers new learning experiences and insights about this medium and its practical applications.

4.2.3 Tactile Drawing Club. In collaboration with the BTactile team, we shared the layered cardstock method with TVIs in Colombia and Argentina on two occasions. The first was an hour-long presentation about the cardstock paper layering technique with a question-and-answer session. The second was a more in-depth look at the technique with 9 participants in a Tactile Drawing Club. The technique was replicated in all phases and worked well for the tactile literacy process required to replicate an image through tactile drawing. Four of the drawing club participants were blind TVIs and offered helpful insights about the technique. One of the biggest issues with the technique is the lack of accessibility in the production stages. One positive that was mentioned is the debossed nature of the image, which stores better and does not wear down like a raised embossed image. Another comment that was brought up was the possibility of deconstructing the image by leaving the parts unglued in a binder and flipping through them individually to better understand how they fit together. This method of deconstructed layers also allowed participants to use the layers as stencils when recreating the image, which some participants especially liked. The coloring activity angle was also received positively as a skill development tool to help with the metacognition of tactile image processing.

5 LIMITATIONS
This design inquiry had limitations in terms of feedback opportunities due to COVID-19 restrictions. We recognize the importance of participatory design, would prefer a more interwoven feedback and design system in the future, and are actively working to build one. We are currently working to collect feedback from broad and diverse sources to provide more insights into this new tactile image technique. The feedback we have received indicates this medium is a viable tactile picture alternative, but further testing scenarios are needed.
As mentioned, one of the biggest limitations of the technique is the lack of accessibility in the production process. Other drawbacks in the production process are the time required for each picture, the need for a well-ventilated area when gluing, and the mid-range cost per picture.

6 FUTURE WORK
There are many avenues for possible future work on this topic. One possible direction is more rigorous comparison studies between this technique and other common tactile image methods such as raised-line or embosser graphics. We would also like to better assess the accessibility of the Cricut interface and the broader cardstock layering technique. Another direction that was suggested by BTactile is to use this layering technique with new materials such as acetate to create a master design that can be embossed on
answer session with around 60 participants. regular braille paper through a press.
Low-Cost Tactile Coloring Page Fabrication on a Cutting Machine. ASSETS '22, October 23–26, 2022, Athens, Greece

7 CONCLUSION
Tactile images have value to the blind and low-vision community as a mode of information, helping with spatial reasoning and working memory. Production techniques for these images have many limitations, mostly due to their analog nature. Lack of availability hinders information access as well as tactile graphicacy skills. We propose a cardstock layering technique fabricated on a Cricut Maker cutting machine that uses registration holes for streamlined production. Based on preliminary feedback, we believe this method has value as a tactile image mode, but further testing is needed.

As discussed in this report, the layered cardstock technique carries both benefits and drawbacks. The main benefits include low acquisition and material costs, streamlined assembly, wear-resistant storage, and use as coloring activities in early learning settings and to develop tactile graphicacy. The main drawbacks include the lack of accessibility in the production process, moderate production time, and the need for a well-ventilated area when gluing.

ACKNOWLEDGMENTS
This work was supported by the National Science Foundation under Grants No. IIS-1453771 and DGE-1650115. We thank Maria Zuniga-Zabala and John Guerra-Gomez for collaborating as well as hosting the design files on BTactile. We also thank Stacey Chambers with the ECC and Me, Melisa Matthews with Eye Am LLC, the Tactile Drawing Club participants, and all anonymous feedback sources for their valuable perspectives and insights. Appreciation to Gary Paynter for production assistance.

REFERENCES
[1] Don Parkes. 1998. Tactile Audio Tools for Graphicacy and Mobility: "A circle is either a circle or it is not a circle." The British Journal of Visual Impairment, 16, 3, 99-104. DOI: https://doi.org/10.1177/026461969801600304
[2] Timo Götzelmann. 2018. Visually Augmented Audio-Tactile Graphics for Visually Impaired People. ACM Transactions on Accessible Computing, 11, 2, 1-31. DOI: https://doi.org/10.1145/3186894
[3] Yvette Hatwell, Arlette Streri, and Edouard Gentaz. 2003. Touching for Knowing: Cognitive Psychology of Haptic Manual Perception. John Benjamins, Philadelphia, PA.
[4] Kim Zebehazy and Adam Wilton. 2014a. Charting success: The experience of teachers of students with visual impairments in promoting student use of graphics. Journal of Visual Impairment & Blindness, 108, 4, 263-274. DOI: https://doi.org/10.1177/0145482X1410800402
[5] Kim Zebehazy and Adam Wilton. 2014b. Straight from the source: Perceptions of students with visual impairments about graphic use. Journal of Visual Impairment & Blindness, 108, 4, 275-286. DOI: https://doi.org/10.1177/0145482X1410800403
[6] Penny Rosenblum and Tina S. Herzberg. 2015. Braille and Tactile Graphics: Youths with Visual Impairments Share their Experiences. Journal of Visual Impairment & Blindness, 109, 3, 173-184. DOI: https://doi.org/10.1177/0145482X1510900302
[7] Amy Kalia, Rose Hopkins, David Jin, Lindsay Yazzolino, Svena Verma, Lotfi Merabet, Flip Phillips, and Pawan Sinha. 2014. Perception of tactile graphics: Embossings versus cutouts. Multisensory Research, 27, 2, 111-125. DOI: https://doi.org/10.1163/22134808-00002450
[8] Sandra Jehoel, Simon Ungar, Don McCallum, and Jonathan Rowell. 2005. An evaluation of substrates for tactile maps and user preferences. Journal of Visual Impairment & Blindness, 99, 85-95. DOI: https://doi.org/10.1177/0145482X0509900203
[9] Denise Prescher, Jens Bornschein, and Gerhard Weber. 2017. Consistency of a tactile pattern set. ACM Transactions on Accessible Computing, 10, 2, 1-29. DOI: https://doi.org/10.1145/3053723
[10] Ira Puspitawati, Ahmed Jebrane, and Annie Vinter. 2014. Local and Global Processing in Blind and Sighted Children in a Naming and Drawing Task. Child Development, 85, 1077-1090. DOI: https://doi.org/10.1111/cdev.12158
[11] Richa Gupta, P. V. M. Rao, M. Balakrishnan, and Steve Mannheimer. 2019. Evaluating the use of variable height in tactile graphics. 2019 IEEE World Haptics Conference (WHC), 121-126. DOI: https://doi.org/10.1109/WHC.2019.8816083
[12] Braille Authority of North America (BANA). 2010. Guidelines and Standards for Tactile Graphics. http://www.brailleauthority.org/tg/web-manual/index.html
[13] Megan Lawrence and Amy Lobben. 2011. The design of tactile thematic symbols. Journal of Visual Impairment & Blindness, 105, 10, 681-691. DOI: https://doi.org/10.1177/0145482X1110501014
[14] Krish Sathian. 2000. Practice makes perfect: Sharper tactile perception in the blind. Neurology, 54, 12, 2203. DOI: https://doi.org/10.1212/WNL.54.12.2203
[15] Amedeo D'Angiulli, John Kennedy, and Morton Heller. 1998. Blind children recognizing tactile pictures respond like sighted children given guidance in exploration. Scandinavian Journal of Psychology, 39, 3, 187-190. DOI: https://doi.org/10.1111/1467-9450.393077
[16] Frances Aldrich, Linda Sheppard, and Yvonne Hindle. 2003. First Steps Towards a Model of Tactile Graphicacy. The Cartographic Journal, 40, 3, 283-287. DOI: https://doi.org/10.1179/000870403225013014
[17] Annie Vinter, Viviane Fernandes, Oriana Orlandi, and Pascal Morgan. 2012. Exploratory procedures of tactile images in visually impaired and blindfolded sighted children: How they relate to their consequent performance in drawing. Research in Developmental Disabilities, 33, 6, 1819-1831. DOI: https://doi.org/10.1016/j.ridd.2012.05.001
[18] David Dulin and Yvette Hatwell. 2006. The effects of visual experience and training in raised-line materials on the mental spatial imagery of blind persons. Journal of Visual Impairment & Blindness, 100, 7, 414-424. DOI: https://doi.org/10.1177/0145482X0610000705
[19] Miriam Ittyerah. 2009. Hand ability and practice in congenitally blind children. Journal of Developmental and Physical Disabilities, 21, 5, 329-344. DOI: https://doi.org/10.1007/s10882-009-9146-8
[20] Susanna Millar and Miriam Ittyerah. 1992. Movement imagery in young and congenitally blind children: Mental practice without visuo-spatial information. International Journal of Behavioral Development, 15, 1, 125-146. DOI: https://doi.org/10.1177/016502549201500107
[21] David Dulin and Coline Serrière. 2009. Special needs education in the blind population: Effects of prior expertise in raised line drawings on blind people's cognition. Procedia - Social and Behavioral Sciences, 1, 1, 549-553. DOI: https://doi.org/10.1016/j.sbspro.2009.01.099
[22] Lora T. Likova and Laura Cacciamani. 2018. Transfer of learning in people who are blind: Enhancement of spatial-cognitive abilities through drawing. Journal of Visual Impairment & Blindness, 112, 4, 385-397. DOI: https://doi.org/10.1177/0145482X1811200405
[23] Athina Panotopoulou, Xiaoting Zhang, Tammy Qiu, Xing-Dong Yang, and Emily Whiting. 2020. Tactile line drawings for improved shape understanding in blind and visually impaired users. ACM Transactions on Graphics, 39, 4, 89:1-89:13. DOI: https://doi.org/10.1145/3386569.3392388
[24] Darleen Bogart and Alan J. Koenig. 2005. Selected Findings from the First International Evaluation of the Proposed Unified English Braille Code. Journal of Visual Impairment & Blindness, 99, 4, 233.
[25] Natalie Martiniello and Walter Wittich. 2019. Employment and visual impairment. In The Routledge Handbook of Visual Impairment (1st ed.), J. Ravenscroft (Ed.). Routledge, 415-437.
[26] "Relief Sculpture," Encyclopedia of Sculpture. http://www.visual-arts-cork.com/sculpture/relief.htm. Accessed 22 June 2022.
[27] "Braille Printers, Tactile Graphics & Supplies," American Thermoform. http://www.americanthermoform.com/. Accessed 16 June 2021.
[28] "Braille Embossers," National Library Service for the Blind and Print Disabled (NLS), Library of Congress, 2020. https://www.loc.gov/nls/resources/blindness-and-vision-impairment/devices-aids/braille-embossers/#sources-braille-embossers. Accessed 16 June 2021.
[29] BTactile Graphics MetaLibrary. 2021. https://btactile.com/. Accessed 16 June 2021.
Animations at Your Fingertips: Using a Refreshable Tactile
Display to Convey Motion Graphics for People who are Blind or
have Low Vision
Leona Holloway, Monash University, Australia (leona.holloway@monash.edu)
Swamy Ananthanarayan, Monash University, Australia (swamy.ananthanarayan@monash.edu)
Matthew Butler, Monash University, Australia (matthew.butler@monash.edu)
Madhuka De Silva, Monash University, Australia (madhuka.desilva@monash.edu)
Kirsten Ellis, Monash University, Australia (kirsten.ellis@monash.edu)
Cagatay Goncu, Monash University, Australia (cagatay.goncu@monash.edu)
Kate Stephens, Monash University, Australia (kate.stephens@monash.edu)
Kim Marriott, Monash University, Australia (kim.marriott@monash.edu)

Figure 1: Touch readers exploring images on a Refreshable Tactile Display


ABSTRACT
People who are blind rely on touch and hearing to understand the world around them; however, it is extremely difficult to understand movement through these modes. The advent of refreshable tactile displays (RTDs) offers the potential for blind people to access tactile animations for the very first time. A survey of touch readers and vision accessibility experts revealed a high level of enthusiasm for tactile animations, particularly those relating to education, mapping and concept development. Based on these suggestions, a range of tactile animations were developed and four were presented to 12 touch readers. The RTD held advantages over traditional tactile graphics for conveying movement, depth and height; however, there were trade-offs in terms of resolution and textural properties. This work offers a first glimpse into how refreshable tactile displays can best be utilised to convey animated graphics for people who are blind.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility; Accessibility technologies; Accessibility systems and tools.

KEYWORDS
blind, tactile graphics, refreshable displays, accessible graphics

ACM Reference Format:
Leona Holloway, Swamy Ananthanarayan, Matthew Butler, Madhuka De Silva, Kirsten Ellis, Cagatay Goncu, Kate Stephens, and Kim Marriott. 2022. Animations at Your Fingertips: Using a Refreshable Tactile Display to Convey Motion Graphics for People who are Blind or have Low Vision. In ASSETS '22: ACM SIGACCESS Conference on Computers and Accessibility, October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544797

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544797

1 INTRODUCTION
A recent resurgence in refreshable tactile displays (RTDs) with larger tactile surfaces offers new opportunities for the touch reading community. Along with the obvious advantages of storage space, reduction of hard copy materials and immediate access, RTDs offer the brand new possibility of animated tactile images. Until now, if you were blind or had low vision (BLV), access to dynamic graphics such as an animation showing how a bird flaps its wings was provided by a sequence of tactile graphics, with successive graphics
showing key frames in the animation. Tactile animations offer an exciting alternative.

While there has been considerable research into technologies for building refreshable displays, e.g. [44, 72], there has been little research into the usefulness and effectiveness of animated tactile images for presenting dynamic graphics to BLV people. Prior research has focused on specific applications: a blind soccer match [37, 54], a fireworks display [62, 63] and zooming/panning on a map [75, 76]. Here we investigate the effectiveness and design of animated tactile images in general. Our specific contributions are threefold:

Contribution 1: Identification of potential application areas and diagrams that touch readers would like to see displayed as animated tactile images on RTDs. As a first step we conducted an online survey of 19 touch readers and seven accessibility experts asking what animated tactile images would be of interest to touch readers. This revealed a high level of interest in animated tactile images. Learning support for children and information for adults were ranked as more important than sport and entertainment applications.

Contribution 2: Design considerations for creating animated tactile images for touch readers. The research team, one of whom is blind, collaboratively designed a variety of animated tactile images. These were tested on a Graphiti RTD¹. Design considerations include taking account of the lower resolution of most RTD displays, and tailoring the refresh rate to the complexity of the image. A limitation of the Graphiti display (like many RTDs) is a slow refresh rate of up to 5 seconds for a full page. This meant that the animations were generally experienced as a sequence of discrete frames rather than as fluid animations.

Contribution 3: Comparison of animated tactile images with a sequence of tactile graphics. We conducted a user study with 12 BLV participants comparing animated tactile images shown on the Graphiti versus traditional tactile graphics. We presented four image sequences in both formats. Overall, participants were better able to understand change and gain a sense of movement using the RTD and drew parallels to animation and movies. However, tactile graphics were generally preferred for understanding a static diagram due to higher resolution and superior textural and colour contrast. An unexpected advantage of RTDs over traditional tactile graphics is that images can be immediately modified in response to viewer feedback.

The research presented here is a first step in exploring the potential of tactile animations. We hope that it motivates greater exploration of this new medium for touch readers. Our findings provide guidance for future designers creating content for touch readers on RTDs. They also provide guidance on the design of future RTDs, suggesting that increasing the refresh rate is less important than increasing the resolution.

¹ http://www.orbitresearch.com/product/graphiti/

2 RELATED WORK

2.1 Animated Graphics
There are three standard approaches to visually showing change over time. These rely on creating a sequence of frames, each frame a snapshot of the scene at a particular instant in time. The first approach is to arrange the frames temporally. This results in an animated graphic in which the frames are shown to the viewer, one after another. Typically the frames are shown in rapid succession so that the viewer's visual system does not perceive the individual frames but rather perceives objects to be moving smoothly. In the second approach, called small multiples, selected frames are arranged spatially in a sequence or grid. In the third approach, a few frames may be overlaid and arrows or other annotations used to show the direction of movement.

There have been many studies comparing the effectiveness of instructional materials showing change over time using animation with those using small multiples [7, 11, 33, 70]. The results are mixed. The disadvantages of an animation are that it is difficult to track multiple simultaneous changes, and that if the viewer wishes to compare two frames they must remember one of those frames and mentally compare it with the current frame. On the other hand, it seems to be easier to track selected objects if the surrounds are also changing in an animation [5]. There is also evidence that human body movements are easier to learn from an animation than from small multiples [7]. Another application in which animation has proved effective is animating transitions between different views of the data in data visualisation [30].

2.2 Tactile Graphics
Tactile graphics, also known as raised line drawings, are recommended as the best way for people who are blind or have low vision to access and understand diagrams with spatial information [17, 42, 65]. The first tactile graphics were produced manually using collage and copied using pressed wet paper (e.g. [40]) or thermoforming, and collage remains the most popular technique used by vision specialist teachers in schools [58]. Since the widespread adoption of digital production in the 1980s, swell paper and embossing have become the most popular tactile graphics production techniques [66]. Discriminability of textures and lines are the most important factors determining user preference for tactile graphics, with variable heights and visual contrast as welcome additions [4].

While there are many internationally recognised and detailed guidelines on the design of tactile graphics (e.g. [17, 25, 65]), they provide little or no advice on how to represent moving images. Instead, there is an assumption that tactile graphics will be based on static images, usually from text books. These may use small multiples or annotations like arrows to convey change and movement (Fig. 2a&b). Due to the high storage and/or material costs of some tactile graphics, when a large number of diagrams is required, as with small multiples, they are often reduced in size. This can negatively impact their legibility by touch.

Another approach to movement in tactile graphics, employed mainly for young children, has been the addition of moving parts to a base diagram. For example, a cardboard clock hand may be attached to a tactile graphic of a clock using a split pin (Fig. 2c), or a bead may be attached to a thread placed across the tactile image to denote movement along a route [23, 25]. Whilst providing a high level of engagement, these graphics must be handmade, they are often bulky and they must be stored in hard copy.
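The frame-based presentation strategies discussed above can be made concrete with a small sketch. This is purely illustrative and not from the paper: a frame is modelled as a grid of raised/lowered cells, an animation presents frames one after another, and small multiples tiles selected frames side by side. The helper names (`animation`, `small_multiples`) are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # a frame is a grid of 0/1 cells (lowered/raised)

def animation(frames: List[Frame]) -> List[Frame]:
    """Temporal arrangement: frames are shown one after another."""
    return frames  # presented sequentially at a fixed frame rate

def small_multiples(frames: List[Frame], keep_every: int = 2) -> Frame:
    """Spatial arrangement: selected frames are tiled side by side."""
    selected = frames[::keep_every]
    rows = len(selected[0])
    # Concatenate the selected frames row by row into one wide image.
    return [sum((f[r] for f in selected), []) for r in range(rows)]

# A two-frame "movement" example: a raised dot travels one cell right.
f0 = [[1, 0, 0],
      [0, 0, 0]]
f1 = [[0, 1, 0],
      [0, 0, 0]]

wide = small_multiples([f0, f1], keep_every=1)
assert wide == [[1, 0, 0, 0, 1, 0],
                [0, 0, 0, 0, 0, 0]]
```

The trade-off described in Section 2.1 is visible even here: the animation keeps each frame at full size but requires the reader to remember previous frames, while the small-multiples image shows everything at once at the cost of width (and hence, in tactile form, size or resolution).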
Figure 2: Examples of traditional approaches to conveying movement with tactile graphics: (a) small multiples depicting solar eclipse stages, (b) arrows signalling the movement of melted sulfur during the Frasch process, (c) moving parts (clock hands attached with a split pin) on a tactile graphic, and (d) a manipulative to accompany a static tactile diagram (Fleximan by Hungry Fingers)

2.3 Refreshable Tactile Displays (RTDs)
Refreshable tactile displays date back as far as 1931 [1], and some, such as the Optacon, have even been used to convert images to refreshable pins on a small display [19, 22]. Since the advent of maker technologies and an injection of funding spurred by the DAISY Consortium's Transforming Braille Project [69], a range of competing technologies have come to market or are in prototype phase. The majority of refreshable tactile displays consist of a grid of pins controlled by electro-mechanical actuators [72]. Resulting from the HyperBraille project², Metec have been selling refreshable graphics displays since 2012³. More recently, Orbit Research has released Graphiti, a tactile display of 60×40 refreshable pins that can be raised to four different heights⁴. Looking forward, Humanware is working in partnership with the American Printing House for the Blind to develop a Dynamic Tactile Display (DTD) similar in size to Graphiti and spaced comparably with braille [24]. Meanwhile, Dot Inc is developing Dot Pads, a refreshable pin display with the capability to zoom and pan⁵. Also in development, the Blitab Android Tablet has 14 rows of 23 6-dot braille cells⁶ and is designed for displaying both text and images, and TTPAT are developing the TouchPad Pro with dots raising to varying heights and a colour display⁷.

Fixed grids of movable pins or braille dots are not the only approach to creating refreshable tactile displays. Some researchers have attached a small grid of pins to a mouse; as the mouse is moved, the pins are updated reflecting the mouse's position on a large virtual grid [41]. Disney Research used water jets on a flexible screen [62, 63]. Surface haptic devices such as the TeslaTouch [9] use electrostatic resistance on a flat screen; however, detection is limited to very simple shapes using this method [71].

The touch-feedback feature of tactile displays can also be used to support audio feedback. Researchers have explored the limitations for touch input and made subsequent suggestions regarding the design of refreshable graphics to be used for this purpose [59, 61].

² http://hyperbraille.de/project/
³ https://metec-ag.de/produkte-graphik-display.php
⁴ http://www.orbitresearch.com/product/graphiti/
⁵ https://pad.dotincorp.com/
⁶ https://blitab.com/
⁷ https://tppat.com/the-touchpad-pro/

2.4 Applications of RTDs
The most basic research into RTDs examines perception based on pin density, height and other factors like vibration [27, 37, 54]. Initial research into their use has focused on the display of static images in the fields of art [29], maps [18, 34, 43, 51, 67, 75] and textbook diagrams [36, 52, 57]. Most notably, O'Modhrain and colleagues explored the possibilities and potential pitfalls of refreshable graphics displays from the perspective of vision impaired users [55]. They point out that most refreshable tactile displays have low fidelity, therefore images must be simplified and carefully designed to be readable on a refreshable tactile display.

There has also been considerable interest in using RTDs to give real-time feedback to assist people who are BLV when creating graphics. They have been used for drawing [13, 38, 53], plotting mathematical charts [3, 39] and for 3D modelling [47, 68]. Similarly, Bornschein, Prescher and colleagues have investigated the use of touch-sensitive refreshable displays for collaborative creation of tactile graphics [14–16]. The advantages of being able to update graphics or store multiple versions on an RTD have also been explored [18, 43, 45].

2.5 Animated Tactile Images
There has been much less research into the use of RTDs to show change over time by presenting animated tactile images.

Weber and colleagues have investigated zooming and panning of a map and UML diagrams on an RTD [48, 60, 75, 76]. A pilot evaluation suggested BLV participants found it difficult to explore a map using panning [76].

Ohshima, Kobayashi and colleagues developed and tested a system to represent a football match on an RTD [37, 54]. They reported that of their seven participants "several participants commented that they could distinguish the detailed movement of the players" [54]. Jung and colleagues presented animated games like pong
on a small prototype RTD. Users held their hand flat on the device and were able to locate stimuli within a 13 mm margin of error [35].

A range of other methods have also been trialled to create tactile animations. Disney Research investigated the use of their water-jet display for visualisation of fireworks displays, which was evaluated as enjoyable by most of their BLV participants [62, 63]. The Phantom device has been used to haptically explore 3D objects moving in virtual reality, e.g. [10, 47]. Slide-tone and Tile-tone use motors to move a user's finger along a graph line [26] and Robographics uses motorised robots to indicate key areas on a graphic [28]. However, none of these methods are applied to representational tactile graphics.

To date there has not been a systematic investigation of using animations to show change in tactile graphics over time. Here we address three fundamental questions: When and for what types of graphics are they useful? What are design guidelines for creating them? And how do they compare with traditional tactile graphic presentations?

3 CONSULTATION
We began by consulting with the vision impaired community to determine whether tactile animation is of interest and, if so, what type of animations would be of most value.

3.1 Method
A list of possible animations was derived through multiple brainstorming sessions with the research team, cross-referencing against school textbooks, and in consultation with touch readers. This list, as given in Table 1 but with accompanying examples, was presented in survey format. Respondents were asked to indicate which they would be interested in having made available and to provide further examples of tactile animations that they would recommend.

The survey was distributed online using Google Forms. It was tested for accessibility and keyboard shortcuts were provided. The survey link was shared through a participant pool, social media and listservs.

3.2 Participants
A total of 26 people responded to the survey. Nineteen were touch readers and seven were sighted members of the BLV community: parents of touch readers (n=2), accessible format producers (n=3) and vision specialist teachers (n=3). The average age was 52 years (sd=13.7), ranging from 20 to 75. The vast majority resided in Australia (n=17), with the remainder in North America (n=7) or New Zealand (n=1). There was a reasonable gender balance, with 14 women, 10 men and 1 non-binary person. Education level was high, with 17 respondents (68%) holding a Bachelor's degree or higher, compared with 24% of the Australian adult population in 2016 [6]. Of the touch readers, 16 were totally blind and 2 were legally blind. Twelve were blind from birth and two had acquired their vision impairment at age 15 or over. The majority of the touch readers (n=12) considered themselves up-to-date with new technology and a further five were early adopters. Overall, the self-selected sample are well-educated and technology savvy, potentially representing the future early adopters of RTDs. However, it is acknowledged that the respondents are not a homogeneous group, potentially differing in their priorities.

3.3 Results
As shown in Table 1, there is a high level of interest in moving tactile images. In general, learning support for children and information for adults were seen as more important than sport and entertainment applications. More specifically, the most popular selections were astronomy (e.g. planetary orbits, eclipses, the big bang), geographic movement (e.g. shifting of the continents over time, tectonic plate movement, erosion, waves), maps with movement along a route, rotation (e.g. geometric shapes rotating, the earth spinning), biology concepts (e.g. plant growth, embryo growth, mitosis and meiosis), physics animations (e.g. projectile motion, pulleys, pendulum movement, airplane wings), maps with movement in space (e.g. stage directions, dance positions, etc.) and human movement (e.g. dance moves, exercise moves, sports moves, swimming strokes).

In addition, participants made a number of further suggestions for moving graphics for education (analogue and digital output from oscilloscopes, plotters and curve tracers; CAD/CAM software for schematics and CNC machining work; chemical reactions; dynamic charts for maths and statistics; historical maps; sex education), for orientation and mobility (transport routes with movement), for sports (team sport positions; martial arts), for entertainment (movies, TV or videos), and to assist with content creation (videos; presentation slides). Some further suggestions were given for static images.

4 INITIAL DESIGN EXPLORATION

4.1 Graphiti
Figure 3: Graphiti tactile display consisting of 60×40 refreshable pins, showcasing an ocean wave

The test images were presented using a Graphiti tactile display, commercially available from Orbit Research [56] for USD$25,000.
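To make the display model concrete, an animation frame for a 60×40 pin display with four raised heights can be thought of as a grid of pin levels. The sketch below is a hypothetical illustration only, not the Graphiti's actual programming interface (all function names are our own); it also shows how a single-pin blinking cursor, as used for the route map in Section 4.2.1, amounts to toggling one pin between successive frames.

```python
# Illustrative model of frames for a 60x40 pin display with four
# raised heights (levels 1-4) plus fully lowered (0).
COLS, ROWS, LEVELS = 60, 40, 4

def blank_frame():
    """A frame with every pin fully lowered."""
    return [[0] * COLS for _ in range(ROWS)]

def set_pin(frame, row, col, level):
    """Raise one pin to a given height level."""
    if not (0 <= level <= LEVELS):
        raise ValueError("pin level must be 0 (down) to 4 (highest)")
    frame[row][col] = level

def blink_cursor(base, row, col, n_frames=4, level=4):
    """Alternate a cursor pin up/down across successive frames,
    leaving the base image (e.g. a route map) unchanged."""
    frames = []
    for i in range(n_frames):
        f = [r[:] for r in base]   # copy the base image
        if i % 2 == 0:             # raise the cursor on even frames only
            set_pin(f, row, col, level)
        frames.append(f)
    return frames

frames = blink_cursor(blank_frame(), row=10, col=20)
assert frames[0][10][20] == 4 and frames[1][10][20] == 0
```

Under this model, the slow refresh behaviour reported for the device (only pins that change state are updated) favours animations like the blinking cursor, where successive frames differ in very few pins.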
Table 1: Interest in moving diagrams by survey respondents who are BLV and others (parents, educators and accessible format producers)

Suggested topic                              BLV (n=19)   Others (n=7)   Total (n=26)
Education and science
  Astronomy                                      18            7             25
  Geographic movement                            17            6             23
  Rotation                                       16            5             21
  Physics animations                             15            5             20
  Biology concepts                               15            5             20
Orientation & Mobility
  Map with route                                 17            7             24
  Map with movement                              15            6             21
  Traffic movement                               15            4             19
Concept development
  Modes of flying                                16            5             21
  Human movement                                 15            6             21
  Animals in motion                              14            5             19
News and information
  Weather                                        14            6             20
  Emergency mapping                              14            5             19
  Disease transmission modelling                 11            4             15
  Migration patterns of people or animals        10            4             14
Entertainment and games
  Simple computer games                          10            7             17
  GIFs                                            7            6             13
Sport
  Players and ball on a football field            9            5             14
  Players and ball on a tennis court              8            6             14
  Race on a straight track                        8            4             12
  Race around a circuit                           7            5             12
The Graphiti device measures 29.5×26.9×4.1 cm and provides an array of 60 (horizontal)×40 (vertical) refreshable pins spaced 2.1 mm apart (Fig. 3). Each pin can be raised to one of four heights (0.5 mm, 1 mm, 1.5 mm, and 2.0 mm), allowing images to be displayed in relief. The pins are noticeably larger than standard braille dots. Below the tactile display, the Graphiti has eight input keys, a space bar, and a navigation pad with four directional buttons (up, down, left, right) along with a select button. These controls were not used in our study as the images were loaded through an HDMI connection to a host PC.

The frame rate on the HDMI connection can be configured to display an image every 1 to 15 seconds. Each line of the image refreshes progressively from the top down. Up to five seconds are required for a refresh, but usually less as only the pins being raised or lowered are refreshed.

4.2 Creation of Sample Graphics
We began by creating a broad range of image sequences to experiment with how best to design images for an RTD and confirm which types of graphics might work best as tactile animations (Fig. 4). All graphics were created in accordance with tactile graphics standards (e.g. [17, 65]) by Holloway, an experienced tactile graphics producer. A range of different software programs were used to create and edit the graphics: Graphiti PC Utility, Aseprite, LibreSprite, Excel, Adobe Illustrator and MS Paint. For each sequence of RTD images, corresponding tactile graphics were created using capsule paper or collage, as these techniques are most popular in production houses [66] and schools [58].

These test images were explored by Stephens, who is blind and a competent touch reader. Improvements to the graphics were made as a result of her feedback, for example simplifying or enlarging. As described by O'Modhrain, we found that the low fidelity of the RTD means that graphics must be greatly simplified [55].

4.2.1 Map with Route. One of the first test images was a map of the route to our workplace. A few methods were trialled to dynamically show movement along the route (Fig. 4f). The favoured option was to show the full route on the base map, then move a single-pin blinking cursor alongside the route as it was verbally described. This allows a touch reader to follow the description but also to independently explore the route. Using the blinking cursor to draw attention was preferable to the invasive practice of hand-over-hand direction of touch [21]. We were also able to use the blinking cursor to aid discussion of other landmarks and routes not already marked
ASSETS ’22, October 23–26, 2022, Athens, Greece Holloway et al.

Figure 4: Some of the sample graphics created for presentation on an RTD in the development phase: (a) map of a complex
multi-level train station (b) alignment of the sun, earth and moon prior to a solar eclipse (c) cat twitching its tail (d) plant
growth (e) Elvis Presley dance moves (f) workplace map (g) cat with moving eyes (h) fetal development. With the exception of
the maps, each graphic is part of a sequence of images to show movement or change.

Figure 5: Input images (top row) and RTD images (bottom row) representing a gull in flight. This sequence was very difficult to understand by touch.

on the map. This aligns with prior work finding that blinking pins were the best method of highlighting a static route and important landmarks on a map presented on an RTD [34].

4.2.2 Birds in Flight. Two attempts were made to illustrate a bird in flight. Firstly, a flying gull was depicted in profile view in a sequence of four images (Fig. 5). However, the concept was not conveyed successfully as there was a big change in wing position from one image to the next. This was exacerbated by a slow refresh rate from the top to the bottom of the page, such that the gull may have two sets of wings during the refresh. Moreover, even though the wings were shown at a higher level than the body, it was difficult to distinguish them when held in line with the body.

A second attempt depicted an owl in flight, seen from the front. This eliminated the problem of occlusion, however it was difficult to understand the horizontal spreading of the wings when they were beating down compared with the folding of the wings when they were lifting upwards. The number of images in the sequence was greatly increased to a total of 10 per wing cycle, allowing the sequence to be viewed at a faster rate without the refresh direction causing confusion. The size of the image was also reduced so that the full width of the wings could more easily be felt under the hands and the refresh was faster.

4.2.3 Human Movement. Two sets of diagrams were created to demonstrate human movement. The first illustrated sequential

poses of Elvis Presley dancing (Fig. 4e). It provided a good impression of the dance moves that Stephens had heard about but not understood until seeing it on the RTD.

While Elvis Presley's dance moves were interesting, they served purely as entertainment. We next considered human movement for instructional purposes, creating a series of six diagrams to illustrate the moves in a Tai Chi sequence. After some experimentation, we chose to show the eye(s) and nose on the face to give an indicator of the direction of the head; raised dots (buttons) down the centre front of the body to show its direction; and limbs shown with lower dots if they were further away than the body or higher dots if they were closer than the body (Fig. 9). This representation was successful and Stephens liked being able to look at the pose, try it herself and get verbal feedback regarding any minor adjustments that were required. This was much preferred over having someone try to fully describe the poses or physically guide her into the correct positions.

4.2.4 Solar Eclipse. Astronomy was the most requested application for moving tactile images in our survey. To represent a solar eclipse, we began with a representation of the light from the sun, which was blocked by the path of the moon over a series of 26 steps shown in quick succession. While this sequence gave an adequate representation of the phenomenon of a solar eclipse, it did little to explain the causes. We next produced an 8-image sequence depicting the movement of the earth around the sun (slowly) and the moon around the earth (more quickly) up to the point where the moon blocks the light of the sun in a solar eclipse (Fig. 4b). When used together, the two diagram sequences enabled a good understanding.

4.2.5 Science. Growth and change through time are a common topic of inquiry in science, especially biology and earth sciences. To depict plant growth from a seed, we began with a horizontal (landscape) alignment, however it was difficult to show all of the important details so we turned the image (and RTD) on its side to portrait orientation to use more of the display space. The soil was initially depicted as low dots but it was difficult to distinguish from the seeds and roots, so the diagram was simplified to show only the line of the top of the soil. This greatly improved contrast and understanding.

The formation of a wave was configured for the RTD as a series of seven diagrams with the water represented as the highest level dots and the sand at level 2. A row of blank dots was reserved between the water and the sand to aid tactile distinction. A single wave changed shape and moved from the ocean on the left to the beach on the right in each successive diagram (Fig. 6).

The formation of a waterfall over time was based on a series of five simple print graphics that could be translated quite closely on the RTD. Water was shown as level 1 dots, hard rock as level 4 and soft rock as level 3. Once again, a row of blank dots was reserved between elements to assist with tactile distinction (Fig. 7).

Both the wave and waterfall formation series were considered highly successful – they were simple to understand and the movement was easy to follow, in part because it followed a predictable path in a discrete area of the display.

Stephens requested a representation of human fetal development, thinking that it might be useful for blind parents. However, as the baby curls up in the later months it was impossible to distinguish the limbs (Fig. 4h). Refinement of the images using different dot heights may have helped to some extent.

4.2.6 GIFs. Two of the original test images were animated GIFs from social media – one of a cat looking from side to side (Fig. 4g), and another of a cat twitching its tail (Fig. 4c). They were selected for trial because the movement was simple and fast. However, while the graphics themselves were clear, the purpose of the GIFs to convey emotional states (puzzlement and annoyance) was unsuccessful because touch readers have a more limited visual vocabulary. Very simplified facial expressions may be easier to understand, and therefore more in keeping with the original intention of GIFs.

4.3 Learnings

4.3.1 Capitalising on Height to Convey Information. Several of the diagrams were unsuccessful on the RTD due to a difficulty in distinguishing between different areas of the diagram, such as the wings on the gull, the eyes on the cat and limbs on the fetus. The gull was produced using standard image software then converted automatically by the Graphiti software. Direct editing of the individual pixels and their heights may have given a better result. Likewise, the fetus diagrams could have been improved with greater use of contrasting heights, with level 1 for the base shape and level 3 or 4 to highlight the most important features.

4.3.2 Resolution. The low resolution of the RTD meant that less detail could be included in a single RTD diagram compared with a tactile graphic. For example, more braille labels and an additional feature (stairs) could be included on the tactile graphic version of a city square map. Furthermore, curved shapes were difficult to convey with a fixed pin matrix, and prior research has found that tactile shape recognition is worse for pins compared with continuous lines [27].

4.3.3 Refreshing from the Top of the Page. The Graphiti display refreshes its pins progressively from the top row to the bottom. This affected reading of the graphics. Stephens' favourite graphic sequence depicted the formation of a waterfall, beginning with water running from left to right along the top of the display then dropping further down as the ground erodes. She was able to follow the water as the waterfall dropped downwards. Conversely, she had a lot of difficulty interpreting a side view graphic of a gull in flight because when the wing was raised, the previous wing was still visible while the display was refreshing. Images that changed from side to side, such as a wave forming from left to right, were less impacted by the refresh from top to bottom. When using RTDs that refresh line-by-line, the refresh direction is a limiting factor and design consideration for the display of sequential images.

4.3.4 Procedure for Presenting Moving Images on a Refreshable Tactile Display. A procedure for best presenting moving tactile images was developed through trial and error while exploring prototype image sequences with Stephens. Guidelines for teaching with tactile graphics state that an overview should be given first [4, 23, 31]. Accordingly, we began by giving a brief but precise verbal description of the first graphic in the sequence, beginning with an overview and then giving the location and height of the key features.

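The refresh behaviour and height levels discussed in the learnings above can be made concrete with a small model. The following is a hypothetical sketch (not the Graphiti's actual API): a frame is a 40 × 60 grid of pin heights, and only the pins whose height differs between consecutive frames need to be driven, which is why sequences with small, localised changes can cycle faster.

```python
# Hypothetical model, not the Graphiti API: 0 = pin down,
# 1-4 = the four raised heights (0.5, 1, 1.5 and 2.0 mm).
ROWS, COLS = 40, 60

def blank_frame():
    """All pins lowered."""
    return [[0] * COLS for _ in range(ROWS)]

def changed_pins(prev, curr):
    """Pins that must be raised or lowered for this refresh;
    only these are driven, so localised changes refresh fastest."""
    return [(r, c) for r in range(ROWS) for c in range(COLS)
            if prev[r][c] != curr[r][c]]

# A single-pin blinking cursor (as used for the map route)
# changes only one pin from frame to frame.
still, cursor_on = blank_frame(), blank_frame()
cursor_on[10][20] = 4
print(len(changed_pins(still, cursor_on)))  # 1 pin to drive
```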
Figure 6: Sequence of seven graphics illustrating the movement of a wave for presentation on a refreshable tactile display,
progressing from top left to bottom right, and equivalent collage tactile graphic
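A wave sequence like the one in Figure 6 can be generated rather than drawn frame by frame. The sketch below is our own illustration (widths and height levels chosen arbitrarily, not taken from the paper's designs): a fixed crest profile is shifted across a row of pins, one position per frame, giving the left-to-right movement described above.

```python
WIDTH = 12                  # shortened from the display's 60 columns
CREST = [2, 3, 4, 3, 2]     # wave profile; the peak uses the tallest pins

def wave_frame(offset):
    """One frame: calm water at level 1 with the crest placed at offset."""
    row = [1] * WIDTH
    for i, height in enumerate(CREST):
        if 0 <= offset + i < WIDTH:
            row[offset + i] = height
    return row

# Seven frames, the crest moving from the ocean (left) to the beach (right).
frames = [wave_frame(step * 2) for step in range(7)]
print(frames[0])  # [2, 3, 4, 3, 2, 1, 1, 1, 1, 1, 1, 1]
```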

5 EVALUATION

In order to better understand the relative benefits and considerations for using an RTD for tactile animations, we ran a user study comparing RTDs with traditional tactile diagram representations.

5.1 Materials

Four tactile animations were selected for the user study. They represented the most popular topics from the online survey and showed a range of different applications. Each image sequence was also produced as corresponding tactile graphics, created in a manner that was considered to be typical of standard practice to provide an ecologically valid comparison. Descriptive text to accompany the graphics was scripted and kept as consistent as possible between the two presentation modes (Appendix A).

5.1.1 Wave. The formation and movement of a wave was based on a print graphic showing a cross-section of the ocean and sea bed with several waves on a single diagram. As described in Section 4.2.5, this was translated to a 7-image sequence of a single wave moving towards the beach on the RTD for the sake of simplicity and because each individual image has no associated material costs. The tactile graphic could show more detail in one diagram and was therefore based much more closely on the original print graphic. It was constructed using collage with water represented using blue cardboard covered in a smooth clear plastic, and sand represented with a fine-grained sandpaper. The wave formation test materials (Fig. 6) were chosen for the user study because the RTD animation was thought to give a good sense of movement and animation.

5.1.2 Waterfall. Waterfall formation was chosen for the user study because it was thought to be a successful example that gave a fairly straightforward translation of educational materials. The corresponding tactile graphics were likewise very similar to the original print graphics. They were created on five separate A4 sheets of swell paper with water represented as horizontal lines, hard rock as solid fill and soft rock as a textured fill. The test materials are shown in Fig. 7.

5.1.3 Map. A new map was created to represent Federation Square, a complex and well-known outdoor area. Buildings were shown as high (level 4) dots and the roads were represented as low (level 1) dots. Space allowed for only two braille labels. The route was not shown on the base graphic, but instead represented by a moving cursor. The corresponding tactile graphic was produced on A4 size swell paper with straight lines for roads and solid filled areas for buildings. Due to the higher fidelity of swell paper, there was enough space to include two additional street names and striped areas for stairs. The route was shown as a dotted line created with a spur wheel. The test materials are shown in Fig. 8. A map was included in the user study because it is one of the most common types of graphics used by people who are blind or have low vision and because it allowed us to explore the use of a flashing cursor as a different means of conveying movement on an RTD.

5.1.4 Tai Chi. As described in Section 4.2.3, a series of diagrams were created to represent a series of Tai Chi movements. The diagram was shown sideways (in portrait) on the RTD to allow details such as the facial features and shirt buttons to indicate rotation. The corresponding tactile graphics were created on swell paper, with the addition of spur wheel lines for the shirt buttons. Three figures were able to be placed on each page, albeit at a smaller size than on the RTD, and limbs were shown with a dashed line if they were further away than the body. We began with a sequence of six diagrams, however after showing them to the first two participants, we realised that this number was unnecessary and time-consuming to interpret, so the number of diagrams was reduced to three for the remaining participants. The test materials are shown in Fig. 9.

5.2 Participants

A total of twelve blind adults took part in the user evaluations, as detailed in Appendix B. Aged from 23 to 68 (x̄=46.9), half were totally blind and half were legally blind. Seven had past or present visual experience that they could draw upon to understand visual concepts, including three participants who had enough residual vision to assist in their understanding of the high contrast tactile graphics. All were braille touch readers. While none of the participants used tactile graphics more frequently than monthly, only four considered themselves beginner users needing support in the use of tactile graphics. The remainder considered themselves proficient (n=3) or expert (n=5) in exploring and interpreting tactile graphics. The majority (n=10) considered themselves up-to-date with technology, while one was an early adopter and one needs support or encouragement to try new technologies.
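The study presented each of the four topics in both formats, with presentation order counterbalanced across four participant groups. The paper does not specify its exact ordering scheme; a balanced Latin square is one standard construction for such orders, sketched below purely as an illustration (the group assignment is ours, not the authors').

```python
def balanced_latin_square(n):
    """Orders for n conditions (n even): each condition appears once
    per position and immediately precedes every other condition
    exactly once. Base row is 0, 1, n-1, 2, n-2, ...; later rows
    shift it by the row index."""
    base, low, high = [0], 1, n - 1
    while len(base) < n:
        base.append(low)
        low += 1
        if len(base) < n:
            base.append(high)
            high -= 1
    return [[(b + i) % n for b in base] for i in range(n)]

diagrams = ["Wave", "Waterfall", "Map", "Tai Chi"]
for group, order in enumerate(balanced_latin_square(4), start=1):
    print(f"Group {group}:", [diagrams[i] for i in order])
```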

Figure 7: Five stages in the formation of a waterfall for presentation on a refreshable tactile display (top) and on swell paper
(bottom)

Figure 8: Map of Federation Square for presentation on a refreshable tactile display (left) and on swell paper (right)

Figure 9: Sequence of six Tai Chi poses for presentation on a refreshable tactile display (top) and swell paper (bottom)

5.3 Method

Participants were shown all four tactile animations and the corresponding tactile graphics. The order of presentation was counterbalanced with participants split into four groups to adjust both the order of diagrams and the order of tactile graphic versus RTD.

The diagram description was delivered verbally so that other factors would not influence the results, such as braille reading ability, available space for braille labels, or different mechanisms for audio labels on the refreshable display compared with the tactile graphic.

The diagrams were shown one at a time until the touch reader was ready to go to the next. Once they had understood the static diagrams, the images on the refreshable display were run in sequence with a cycle speed between 2 and 8 seconds, as predetermined by Stephens. After viewing one cycle, the rate was adjusted until the touch reader was satisfied.

After the participant had seen each diagram (either as a tactile graphic or on the RTD), they were asked whether the static image was understandable and whether they could detect what changed. After they had seen both diagrams on a single topic, they were asked which was better in terms of initial understanding, detecting what had changed, getting a sense of motion, and which they preferred overall. They were also asked how the images might be improved, whether they had learned anything, and whether an RTD is useful for that type of diagram.

After the participant had explored all of the diagrams, they were asked a series of open-ended questions about their preferred animations and suggestions for future use. They were also asked eight questions from the System Usability Scale [20]. The term 'system' was replaced with 'refreshable tactile display' as the SUS tool has been found to be robust to such wording substitutions [8].

Sessions were conducted at the University or the participant's home and took around 90-120 minutes to complete. Video and audio recording was conducted. The audio was transcribed with additional notes regarding observed hand positions.

5.4 Results

Participant preferences for the RTD sequences compared with the tactile graphics are given in Table 2. Overall, there was a moderate preference for the tactile graphic for gaining an initial understanding of the static display (RTD=17; same=10; TG=21), whereas the RTD was preferred for detecting change (RTD=24; same=7; TG=17), giving a sense of direction and motion (RTD=20; same=13; TG=10) and overall (RTD=25; same=6; TG=17). However, there was a lot of nuance in these preferences and the findings differed for each diagram.

5.4.1 Representing Static Images on Refreshable Tactile Displays. Comparing the RTD with an equivalent tactile graphic for the presentation of static images, the most successful image was the Tai Chi moves, for which the static image was considered easier to understand (RTD=7, same=1, TG=4). This is because the use of pin heights was intuitive and aided in understanding of the body position (11 comments) much more than the use of solid and dashed lines on the tactile graphics (5 comments). Participant 1 said of the RTD diagram, "This has heights. It makes it more real." This finding was also reflected in responses to the map: "If you're thinking 3D,

the buildings are taller. It's more intuitive" [P5]. Thus, RTDs with variable heights may be particularly well suited to diagrams where height or distance is important.

The Tai Chi graphics were also more successful on the RTD because the larger size enabled a clearer distinction of the nose and eyes (4 comments). While this size difference is due to a design decision made by the researchers, we believe it reflects real-world practicalities – space is conserved on expensive swell paper whereas there is no cost per page for display on an RTD.

By contrast, the static RTD was considered less successful than the tactile diagrams for both the wave and waterfall diagrams (RTD=3, same=3, TG=6). The tactile graphics were preferred because the areas on the diagram were considered more distinct (21 comments) and intuitive (8 comments). The use of a smooth surface to represent water and sandpaper to represent sand on the collage wave diagram was praised for being tactually distinct and meaningful. "This [water] feels gorgeous. This [sandpaper] represents sand well" [P2]. Additionally, the colour contrast on the tactile graphics assisted people with low vision. The RTD provided less contrast, particularly the separation of hard rock (level 4) and soft rock (level 3) on the waterfall diagram. Participant 9 suggested using a pattern such as every second dot to represent the soft rock, and two other touch readers agreed that this provided a much better tactual contrast. It is clear that a greater effort must be made in the design of RTDs to ensure that there is adequate distinction between different regions of the graphic.

Table 2: Preferred format for understanding the static image ('Understanding'), detecting the change from one image to the next ('Change'), giving a sense of direction and motion ('Movement') and preference overall ('Overall').

Preference       Refreshable Display   Same   Tactile Graphic   None
Understanding    17                    10     21                0
  Wave           3                     3      6                 0
  Waterfall      3                     3      6                 0
  Map            4                     3      5                 0
  Tai Chi        7                     1      4                 0
Change           24                    7      17                0
  Wave           12                    0      0                 0
  Waterfall      6                     3      3                 0
  Map            0                     2      10                0
  Tai Chi        6                     2      4                 0
Movement         20                    13     10                5
  Wave           7                     4      1                 0
  Waterfall      4                     6      2                 0
  Map            3                     3      5                 1
  Tai Chi        6                     0      2                 4
Overall          25                    6      17                0
  Wave           8                     1      3                 0
  Waterfall      6                     1      5                 0
  Map            5                     3      4                 0
  Tai Chi        6                     1      5                 0

Table 3: Average desired frame rate on the Graphiti display (secs), and number of participants (n=12) who learned something new from viewing the graphics (in either format) and thought that RTDs are useful for this type of graphic.

            Frame rate   Learned   RTD useful?
Wave        2.25         10        11
Waterfall   7.17         10        10
Map         n/a          9         11
Tai Chi     9.18         6         11

5.4.2 Representing Sequences and Moving Images on Refreshable Tactile Displays. The materials for this study were chosen specifically to examine the representation of change and movement, where change was defined as being able to detect what had changed from one image to the next. The RTD was rated as somewhat better in conveying change (RTD=24, same=7, TG=17).

Movement was defined as giving a sense of the moving object's direction and motion. Again, the RTD performed better than tactile graphics for conveying movement (RTD=20, same=13, TG=10). The wave diagram was the most successful in terms of giving a sense of movement and animation (RTD=7, same=4, TG=1).

"Now this is the power of the Graphiti, because you can feel it coming, it's like looking at a motion picture." [P5]

This success may be attributed to the simplicity of the diagram with changes localised to a small area and minor changes from one diagram to the next, allowing the sequence to be shown quickly, with an average preferred speed of 2.25 seconds per frame. Unlike the formation of a waterfall, waves moving towards the beach was a sequence that could be viewed almost in real time.

The waterfall was also successful in terms of giving a sense of movement on the RTD, although there was only a slight preference over the tactile graphics for this purpose (RTD=4, same=6, TG=2). Because the areas of change were larger and slightly more complex, the waterfall was viewed at a slower frame rate of 7.1 seconds on average. The direction of refresh from top to bottom matched the movement of the water and rocks from the top of the waterfall down to the plunge pool, which added to the sense of animation (1 comment).

The Tai Chi movements were the least successful in terms of giving a sense of movement. Even though the RTD was preferred over the tactile graphic (RTD=6, same=0, TG=2), four participants said that neither format gave a sense of movement and they had not thought about the poses as being part of a sequence: "I didn't think they were connected. It didn't occur to me" [P5]. This problem may have been due to the complexity of the diagram, combined with the large changes from one diagram to the next: "I can't really follow what's going on here, there's too much going on as it changes" [P6]. The sequence was viewed on the RTD at a slow frame rate of 9.2 seconds on average. In response to suggestions from participants, we created two additional diagrams with intermediate poses to be viewed between the original diagrams on the RTD. The two participants who viewed these diagrams agreed that it helped to connect the poses and give a better sense of movement.
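In the same spirit as those intermediate poses, missing in-between frames can be roughed out computationally. The following is a naive sketch of our own (not the authors' tooling): it takes the midpoint of two pin-height grids; a real tween would interpolate limb positions rather than raw heights, so this only illustrates the frame format.

```python
def midpoint_frame(frame_a, frame_b):
    """Halfway frame between two pin-height grids (levels 0-4),
    rounded to the nearest legal level. Note: this interpolates raw
    heights, not limb positions, so it is only a rough in-between."""
    return [[round((a + b) / 2) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

pose_1 = [[0, 4], [2, 0]]   # toy 2x2 grids; a real frame is 40x60
pose_2 = [[4, 4], [0, 2]]
print(midpoint_frame(pose_1, pose_2))  # [[2, 4], [1, 1]]
```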

Movement was defined differently on the map, using a moving cursor on the RTD compared with a static route on the tactile graphic. There was no clear preference for one format over the other to give a sense of movement on the map (RTD=3, same=3, TG=5). The experience of following the cursor was enjoyed for being novel and interactive (5 comments): "I liked following the dot" [P2]; "it was fun" [P1]. However, it was sometimes difficult to find the cursor (10 comments) and following a static route was definitely easier (10 comments).

We asked participants whether the sound of pins refreshing helped them to interpret the diagrams or movement. Only two people agreed, stating that the sound helped them to locate the moving pins. A further five participants suggested that the sound of the pins refreshing served more as an indication that something was changing, "like the sound of a curtain when you switch between scenes in a play" [P9].

5.4.3 Hand Movements. Hand movements were central to the detection of change. Participants were asked about their hand movements and the researchers made independent observations. Seven of the twelve participants reported that they had adjusted their hand movements for accessing graphics on the RTD. When using the RTD they would hold their hands in the region where they expected the change to occur so that they could feel the change under their hands (10 mentions), however this process relied on memory of the first image. By contrast, the tactile graphic provided the opportunity for parallel comparisons by exploring two images simultaneously (4 mentions), with one hand on one image and the other hand on a second image that was placed either beside or underneath the first.

The wave and waterfall were considered best for detecting changes on the RTD because the region of change was predictable and in a relatively small region. When following the cursor for the map route, participants would use the finger pads on both hands, lined up in the direction they expected the cursor to travel based on the verbal description. Detecting changes in the Tai Chi poses was much more difficult on the RTD because the arms and legs were so spread out that they could not all be felt at once (6 comments). Stephens reported that it was easier to use a large area of the hand to detect change on the RTD because the pins are so prominent. She tended to use the finger pads to focus on the detail of what is changing (for simple diagrams) or the key area of interest (for more complex diagrams), combined with the flat of both hands to get an overview.

It was noted that the swell paper moved around while the participants were examining the diagrams; this was particularly problematic for a braille reader with use of only one hand. The RTD's weight was advantageous in this context as it was completely stable.

5.4.4 Practical Considerations. Beyond user experiences, practical considerations are also factors in the likely adoption and use of RTDs. When asked about the disadvantages of RTDs, participants mentioned the limited contrast between different areas (7 participants), the low resolution or wide spacing between the dots (n=5), the high cost (n=5), unreliability with the potential for individual dots to fail or for the whole device to break down (n=3), the weight (n=2), the hard feeling of tall dots on the fingers (n=2), and the slow refresh rate (n=1). While some of these issues may be overcome with advances in the technology, others are likely to remain as limitations inherent to the device. We did not ask explicitly about advantages of the RTD, but participants commented on the fact that each electronic diagram shown on an RTD has zero associated material costs, allowing for access to a greater number of images (5 comments). In terms of usability, the average SUS score was 79.9 (sd=6.2), which can be interpreted as 'good' [8]. However, this result is difficult to interpret as the participants were not operating the RTD themselves. Note also that scores were adjusted for the removal of questions 7 and 8 relating to the system integration and consistency, as several participants were unsure how to interpret or answer these questions. The SUS is robust for the removal of one question [46] but results with two questions removed should be interpreted with caution.

As researchers and accessible format producers, one of the biggest advantages of the RTD for us was the ability to quickly and easily change diagrams while discussing them with the touch reader. For example, we were able to move lines further away from each other on the map, test whether tactile distinction was easier or more difficult with a blank line of separation between areas, and trial different dot patterns. By contrast, it was much more difficult for us to go back to the office to edit and re-print the tactile graphics. Four of the participants likewise commented on the value of quick editing on the RTD: "one of the strengths of refreshable displays is the ability to refresh and show what is needed at the time rather than predetermined" [P4]. This immediacy is of particular value for the blind and low vision community to cater for individualised needs, as there is wide diversity in terms of eye conditions, co-occurrence with other disabilities, and user skills and experience with tactile graphics. RTDs therefore address the difficulty of modifying traditional tactile graphics, which has been identified as problematic in prior work [2, 49, 64, 73].

5.4.5 Potential Application Areas for Refreshable Tactile Displays. Participants in the user study reported learning about the subject matter from viewing the images presented as tactile graphics and on the RTD. As shown in Table 3, the majority of participants agreed that they had learned something about the subject matter from all of the diagrams except Tai Chi, for which only half agreed. Often, what they learned was basic concepts such as the shape of a wave, the structure of a waterfall, and an overview of an area that they had visited but not understood as a whole. For example, "I had sort of imagined a waterfall as a perfectly smooth curve whereas in reality it wouldn't be because of the rocks underneath it" [P8].

As a novel device able to give the first access to tactile animations without input from the user, it is not surprising that the RTD was found to be engaging (15 comments): "I get very excited when there are toys to play with" [P2]. This could help with student engagement and therefore learning in the classroom, particularly as it provides a tangible means for exploring visual narratives [74]. As one participant commented, "I could sit here for ages watching it. So cool" [P6].

When asked what applications they might have for animated tactile graphics in their life, all but one participant was able to give examples. These included maps (n=5); for collaborating with charts and process diagrams at work (n=3); for fun applications such as following live sports (n=2), learning dance steps or exercise moves
ASSETS ’22, October 23–26, 2022, Athens, Greece Holloway et al.

(n=2), playing games (n=1), watching demonstration videos (n=2) or watching the weather radar (n=1); and for concept development such as learning about print notation (n=3), visual communication such as hand gestures and body language (n=2) and traffic movement for orientation and mobility (n=1). While only one of the participants was a current student, many of them also suggested that RTDs would be especially useful for students to gain access to educational materials (n=7).

6 DISCUSSION

6.1 Potential of Tactile Animations on Refreshable Tactile Displays

Feedback from participants in our studies makes it clear that there is a high level of enthusiasm for access to tactile animations. RTDs can be used to show change over time and for some images they hold advantages over traditional tactile graphics for conveying movement, depth and height. However, this was not true for all images and there were trade-offs in terms of resolution and textural properties.

In the best case, the changes to the image were localised and allowed the touch reader to feel the changes under their hand. The wave and waterfall were favoured on the RTD because touch readers knew where to look for the changes, which they could then follow as animations. Conversely, the Tai Chi diagram was the most difficult to follow because it was not possible to follow the changes to the four limbs at the same time.

6.2 Design Considerations for Animated Graphics Shown on Refreshable Tactile Displays

The process of creating, using and sharing graphics on an RTD provided insights into design considerations that are either specific to RTDs or more important when using this format.

• Simplify: As the images need to be read quickly and on a low-fidelity screen, moving tactile images should be very simple, with less detail than on an equivalent tactile graphic and much less detail than the original print graphic. This aligns with O’Modhrain’s guidelines for tactile graphics on refreshable displays [55].
• Use height to distinguish different components within the diagram. Specifically, pin height can signify the height of objects seen from above, or to signify distance by representing closer elements with higher pins. Pin heights of 0.5mm, 1mm and 2mm were clearly distinguishable from one another.
• Texture patterns, such as stripes or alternative pins in a grid, can also be used to clearly distinguish large areas on the diagram.
• Use blinking pins to direct attention to important areas of the graphic. This recommendation is in line with prior work with RTDs and travel route planning [34].
• Ideally, movement should be restricted to one region of the diagram, especially since simultaneous movement in multiple regions can be difficult to follow by touch.
• The ideal display duration depends on the complexity of the diagram and the amount of movement between frames – more complex diagrams require longer display times, and less change between sequential diagrams allows faster display times.
• If the display refreshes one line at a time, consider the direction of the refresh when designing sequences of moving images. Movement in the opposite direction to the refresh can be difficult to interpret.
• Make use of the ability to edit images while they are being used. This study benefited from the ability to immediately adjust diagrams based on feedback from touch readers, either to improve the diagram or to cater to individual needs and preferences. On-the-spot editing could also be used to gradually add features (or remove braille labels) as the learner’s understanding advances. Such an approach has been proposed as best practice [12] but implementation is difficult using traditional tactile graphics.

6.3 Considerations for Refreshable Tactile Display Technology

Our study reinforces that a limitation of current RTDs is their low resolution. Even with 2,400 pins, less detail could be conveyed on the Graphiti display compared with an equivalent sized tactile graphic. Moreover, the wide spacing between pins meant that it was difficult to accurately perceive areas across the display simultaneously. These findings align with a previous study in which shape recognition on RTDs was found to improve when the pin spacing is decreased and pin array size is increased [27].

Labels and accompanying descriptions are always important to support understanding of tactile images due to the bottom-up nature of tactile image consumption [32, 50]. As there is less space available for braille on RTDs compared with tactile graphics, other strategies should be built into the devices, such as touch-triggered audio labels or an accompanying braille display.

Use of high contrast or colour is of great value to touch readers who have some residual vision. White pins against a black background did not provide sufficient contrast because it was difficult to see whether they were raised. Designers of future devices should consider further measures to provide visual cues, such as the coloured lights on the TouchPad Pro.

6.4 Limitations and Future Work

During our study, we wanted participants to focus on exploring the tactile animations frame by frame in order to gather rich and detailed feedback. Consequently, we did not ask participants to operate the RTD themselves or independently access the description of the graphic. When these tasks are added in an individual home setting, touch readers may lose their place in the graphic. A key question is how best to support independent interaction while maintaining relative positioning of the hands and fingers. Future usability studies are required to better understand ease of use in real-world settings.

Another avenue of potential research lies in the drawing and collaboration capabilities of RTDs. This was of interest to several of our participants. Although there is prior work in this area [3, 38, 39, 53], the use of touch sensitive RTDs with larger tactile surfaces,
audio feedback, and network capabilities offers new possibilities and warrants further research.

Lastly, design guidelines for tactile graphics are an integral tool for the transcription process. Even in our brief study, we have already identified a number of ways graphics need to be designed differently for use on an RTD. These guidelines need to be expanded for a wider range of diagrams and users.

6.5 Conclusion

Given the considerable cost of refreshable tactile displays, independent studies like this provide important information to help potential consumers decide whether an RTD is a worthwhile investment. To our knowledge, this is the first study that has sought to gain understanding of the most suitable use cases and efficacy of RTDs for tactile animations.

Our results suggest that RTDs have considerable potential for use in education, orientation and mobility training, and collaboration in the workplace. While their use for entertainment was of some interest, support for education was seen as much more important. Unsurprisingly, RTDs demonstrated a particular strength for conveying motion; however, some previously unconsidered benefits also emerged, such as allowing for on-the-spot editing and customisation for individualised needs. Further work is needed to obtain a nuanced understanding of what diagrams are most suited to the RTD.

This work provided the basis for a series of preliminary guidelines and design considerations relating to both the graphics being presented on RTDs and also the design of RTD technology. We hope that these can be adopted to further explore development of this important technology as well as determine best practice for portraying graphic material on these RTDs.

ACKNOWLEDGMENTS

Thanks are extended to the touch readers who contributed their time and expertise to the user study. Thanks also to Ann Cunningham for sharing her expert advice on diagram design for understanding through touch.

REFERENCES
[1] 1931. Blind Can Read Any Book with Aid of Electric Eye. Popular Science Monthly (1931). https://archive.org/details/blindcanreadanyb0000unse
[2] Dragan Ahmetovic, Niccolò Cantù, Cristian Bernareggi, João Guerreiro, Sergio Mascetti, and Anna Capietto. 2019. Multimodal Exploration of Mathematical Function Graphs with AudioFunctions.web. In W4A ’19: Proceedings of the 16th International Web for All Conference. Article 8, 1–2. https://doi.org/10.1145/3315002.3332438
[3] Peter Albert. 2006. Math Class: An Application for Dynamic Tactile Graphics. In ICCHP: International Conference on Helping People with Special Needs (Linz, Austria). Springer, 1118–1121.
[4] Frances K. Aldrich and Linda Sheppard. 2001. Tactile graphics in school education: perspectives from teachers. British Journal of Visual Impairment 19, 3 (2001), 93–97. https://doi.org/10.1177/026461960101900303
[5] Daniel Archambault and Helen C. Purchase. 2016. Can animation support the visualisation of dynamic graphs? Information Sciences 330 (2016), 495–509.
[6] Australian Bureau of Statistics. 2017. Educational Qualifications in Australia. https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/2071.0~2016~Main%20Features~Educational%20Qualifications%20Data%20Summary%20~65
[7] Paul Ayres, Juan C. Castro-Alonso, Mona Wong, Nadine Marcus, and Fred Paas. 2019. Factors that impact on the effectiveness of instructional animations. Advances in cognitive load theory: Rethinking teaching (2019), 180–193.
[8] Aaron Bangor, Philip T. Kortum, and James T. Miller. 2009. Determining What Individual SUS Scores Mean: Adding an Adjective Rating Scale. Journal of Usability Studies 4, 3 (2009), 114–123.
[9] Olivier Bau, Ivan Poupyrev, Ali Israr, and Chris Harrison. 2010. TeslaTouch: Electrovibration for Touch Surfaces. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology. ACM, New York, NY, USA. https://doi.org/10.1145/1866029.1866074
[10] Cristian Bernareggi, Dragan Ahmetovic, and Sergio Mascetti. 2019. µGraph: Haptic Exploration and Editing of 3D Chemical Diagrams. In ASSETS ’19: The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA). ACM, New York, NY, USA. https://doi.org/10.1145/3308561.3353811
[11] Sandra Berney and Mireille Bétrancourt. 2016. Does animation enhance learning? A meta-analysis. Computers & Education 101 (2016), 150–167.
[12] Donna Bogner, Ben Wentworth, and David Hurd. 2011. Visualizing Science and Adapted Curriculum Enhancements (ACE): Resource Manual. (2011). https://sites.google.com/site/mcrelace
[13] Jens Bornschein, Denise Bornschein, and Gerhard Weber. 2018. Blind Pictionary: Drawing Application for Blind Users. In CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA. https://doi.org/10.1145/3170427.3186487
[14] Jens Bornschein and Denise Prescher. 2014. Collaborative Tactile Graphic Workstation for Touch-Sensitive Pin-Matrix Devices. In TacTT ’14 (Dresden, Germany).
[15] Jens Bornschein, Denise Prescher, and Gerhard Weber. 2015. Collaborative Creation of Digital Tactile Graphics. In ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’15) (Lisbon, Portugal). ACM, New York, NY, USA, 117–126. https://doi.org/10.1145/2700648.2809869
[16] Jens Bornschein, Denise Prescher, and Gerhard Weber. 2015. Inclusive Production of Tactile Graphics. In INTERACT 2015: IFIP Conference on Human-Computer Interaction (Bamberg, Germany). Springer, 80–88.
[17] Braille Authority of North America (BANA). 2010. Guidelines and Standards for Tactile Graphics. The Braille Authority of North America, USA. http://www.brailleauthority.org/tg/
[18] Luca Brayda. 2018. Updated Tactile Feedback with a Pin Array Matrix Helps Blind People to Reduce Self-Location Errors. Micromachines (Basel) 9, 7 (2018), 351. https://doi.org/10.3390/mi9070351
[19] Stephen A. Brewster, Steven A. Wall, Lorna M. Brown, and Eve E. Hoggan. 2008. Tactile Displays. In The Engineering Handbook of Smart Technology for Aging, Disability, and Independence. John Wiley & Sons, Ltd, 339–352. https://doi.org/10.1002/9780470379424.ch18
[20] John Brooke. 1986. SUS: a "quick and dirty" usability scale. Taylor and Francis, London, England.
[21] C. A. Cook Walker. 2015. Hands On? Hands Off! Future Reflections Winter (2015). https://nfb.org//sites/www.nfb.org/files/images/nfb/publications/fr/fr34/1/fr340101.htm
[22] James C. Craig and Carl E. Sherrick. 1982. Dynamic tactile displays. Tactual perception: A sourcebook (1982), 209–233.
[23] Louise Curtin, Leona Holloway, and Debra Lewis. 2019. Documenting Tactile Graphicacy. JSPEVI Journal of the South Pacific Educators in Vision Impairment 12, 1 (2019), 82–98.
[24] Anne Durham. 2021. APH is Ready for a Braille Revolution. American Printing House for the Blind website (2021). https://www.aph.org/aph-is-ready-for-a-braille-revolution/
[25] Polly K. Edman. 1992. Tactile Graphics. AFB Press, Arlington, VA, USA.
[26] Danyang Fan, Alexa F. Siu, Wing-Sum Adrienne Law, Raymond Ruihong Zhen, Sile O’Modhrain, and Sean Follmer. 2022. Slide-Tone and Tilt-Tone: 1-DOF Haptic Techniques for Conveying Shape Characteristics of Graphs to Blind Users. In CHI ’22: CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA. https://doi.org/10.1145/3491102.3517790
[27] Nadia Garcia-Hernandez, N. G. Tsagarakis, and D. G. Caldwell. 2011. Feeling through Tactile Displays: A Study on the Effect of the Array Density and Size on the Discrimination of Tactile Patterns. IEEE Transactions on Haptics 4, 2 (2011).
[28] Darren Guinness, Annika Muehlbradt, Daniel Szafir, and Shaun K. Kane. 2019. RoboGraphics: Dynamic Tactile Graphics Powered by Mobile Robots. In ASSETS ’19: The 21st International ACM SIGACCESS Conference on Computers and Accessibility. ACM, New York, NY, USA, 318–328. https://doi.org/10.1145/3308561.3353804
[29] Stanislav Gyoshev, Dimitar Karastoyanov, Nikolay Stoimenov, Virginio Cantoni, Luca Lombardi, and Alessandra Setti. 2018. Exploiting a Graphical Braille Display for Art Masterpieces. In Computers Helping People with Special Needs: 16th International Conference, ICCHP (Linz, Austria), Klaus Miesenberger and Georgios Kouroupetroglou (Eds.). Springer International Publishing, Switzerland, Part II, 237–245. https://doi.org/10.1007/978-3-319-94274-2_35
[30] Jeffrey Heer and George Robertson. 2007. Animated transitions in statistical data graphics. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1240–1247.
[31] Morton A. Heller, Jeffrey A. Calcaterra, Lynetta L. Burson, and Lisa A. Tyler. 1996. Tactual picture identification by blind and sighted people: Effects of providing
categorical information. Perception & Psychophysics 58 (1996), 310–323. https://doi.org/10.3758/BF03211884
[32] R. A. L. Hinton and D. G. Ayres. 1987. The development of tactile diagrams for blind biology students. Journal of Visual Impairment & Blindness 81, 1 (1987), 24–25.
[33] Tim N. Höffler and Detlev Leutner. 2007. Instructional animation versus static pictures: A meta-analysis. Learning and Instruction 17, 6 (2007), 722–738.
[34] Mihail Ivanchev, Francis Zinke, and Ulrike Lucke. 2014. Pre-journey Visualization of Travel Routes for the Blind on Refreshable Interactive Tactile Displays. In ICCHP: International Conference on Computers Helping People with Special Needs (Paris, France), Klaus Miesenberger, Deborah Fels, Dominique Archambault, Petr Peňáz, and Wolfgang L. Zagler (Eds.), Vol. Part II. Springer International, Switzerland, 81–88.
[35] Jingun Jung, Sunmin Son, Sangyoon Lee, Yeonsu Kim, and Geehyuk Lee. 2021. ThroughHand: 2D Tactile Interaction to Simultaneously Recognize and Touch Multiple Objects. In CHI ’21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA. https://doi.org/10.1145/3411764.3445530
[36] Seondae Kim, Yeongil Ryu, Jinsoo Cho, and Eun-Seok Ryu. 2019. Towards Tangible Vision for the Visually Impaired through 2D Multiarray Braille Display. Sensors 19, 23 (2019). https://doi.org/10.3390/s19235319
[37] Makoto Kobayashi, Yoshiki Fukunaga, and Shigenobu Shimada. 2018. Basic Study of Blind Football Play-by-Play System for Visually Impaired Spectators Using Quasi-Zenith Satellites System. In ICCHP: International Conference on Computers Helping People with Special Needs, Karl Miesenberger and Georgios Kouroupetroglou (Eds.). Springer, Switzerland, 23–27.
[38] Makoto Kobayashi and Tetsuya Watanabe. 2004. Communication System for the Blind Using Tactile Displays and Ultrasonic Pens – MIMIZU. In ICCHP: International Conference on Computers Helping People with Print Disabilities (Paris, France). Springer, Switzerland, 731–738.
[39] Christopher Alexander Kopel. 2021. Accessible SVG Charts with AChart Creator and AChart Interpreter. Thesis.
[40] Martin Kunz. c.1890. Abbildungen für Blinde (Pictures for the Blind). Blind Institute, Illzach, Germany.
[41] Ki-Uk Kyung, Seung-Chan Kim, and Dong-Soo Kwon. 2007. Texture display mouse: vibrotactile pattern and roughness display. IEEE/ASME Transactions on Mechatronics 12, 3 (2007), 356–360. https://doi.org/10.1109/TMECH.2007.897283
[42] Steve Landau. 2013. An Interactive web-based tool for sorting textbook images prior to adaptation to accessible format: Year 1 Final Report. (2013). http://diagramcenter.org/decision-tree.html
[43] Fabrizio Leo, Tania Violin, Alberto Inuggi, Angelo Raspagliesi, Elisabetta Capris, Elena Cocchi, and Luca Brayda. 2019. Blind Persons Get Improved Sense of Orientation and Mobility in Large Outdoor Spaces by Means of a Tactile Pin-Array Matrix. In CHI ’19 Workshop on Hacking Blind Navigation (Glasgow, Scotland). ACM, New York, NY, USA.
[44] Daniele Leonardis, Claudio Loconsole, and Antonio Frisoli. 2017. A survey on innovative refreshable braille display technologies. In International Conference on Applied Human Factors and Ergonomics. Springer, 488–498.
[45] Vincent Lévesque, Grégory Petit, Aude Dufresne, and Vincent Hayward. 2012. Adaptive level of detail in dynamic, refreshable tactile graphics. In IEEE Haptics Symposium (HAPTICS) (Vancouver, BC, Canada). IEEE, 1–5. https://doi.org/10.1109/HAPTIC.2012.6183752
[46] James R. (Jim) Lewis and Jeff Sauro. 2017. Can I Leave This One Out? The Effect of Dropping an Item From the SUS. Journal of Usability Studies 13, 1 (2017), 38–46.
[47] Sebastian Lieb, Benjamin Rosemeier, Thorsten Thormählen, and Knut Buettner. 2020. Haptic and Auditive Mesh Inspection for Blind 3D Modelers. In ASSETS International SIGACCESS Conference on Computers and Accessibility (Greece (virtual event)). ACM, New York, NY, USA. https://doi.org/10.1145/3373625.3417007
[48] Claudia Loitsch and Gerhard Weber. 2012. Viable Haptic UML for Blind People. In ICCHP: International Conference on Helping People with Special Needs (Linz, Austria), Klaus Miesenberger, Arthur Karshmer, Petr Penaz, and Wolfgang L. Zagler (Eds.). Springer, Switzerland, Part II, 509–516.
[49] David McGookin, Euan Robertson, and Stephen A. Brewster. 2010. Clutching at straws: using tangible interaction to provide non-visual access to graphs. In CHI ’10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 1715–1724. https://doi.org/10.1145/1753326.1753583
[50] S. Millar. 1999. Memory in touch. Psicothema 11, 4 (1999), 747–767.
[51] Tatsuo Motoyoshi, Sota Mizushima, Kei Sawai, Takumi Tamamoto, Hiroyuki Masuta, Ken’ichi Koyanagi, and Toru Oshima. 2018. Prototype Development of a Shape Presentation System Using Linear Actuators. In ICCHP: International Conference on Computers Helping People with Special Needs (Linz, Austria), Karl Miesenberger and Georgios Kouroupetroglou (Eds.). Springer International, Switzerland, Part II, 226–230. https://doi.org/10.1007/978-3-319-94274-2_31
[52] Rahul Kumar Namdev and Pattie Maes. 2015. An interactive and intuitive stem accessibility system for the blind and visually impaired. In PETRA: International Conference on PErvasive Technologies Related to Assistive Environments (Corfu, Greece). ACM, 1–7. https://doi.org/10.1145/2769493.2769502
[53] Atsushi Nishi and Ryoji Fukuda. 2006. Graphic Editor for Visually Impaired Users. In ICCHP: International Conference on Computers Helping People with Special Needs (Linz, Austria), Klaus Miesenberger, Joachim Klaus, Wolfgang L. Zagler, and Arthur I. Karshmer (Eds.). Springer, Switzerland, 1139–1146.
[54] Hiroyuki Ohshima, Makoto Kobayashi, and Shigenobu Shimada. 2021. Development of Blind Football Play-by-play System for Visually Impaired Spectators: Tangible Sports. In 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan (virtual event)). ACM, New York, NY, USA, 1–6. https://doi.org/10.1145/3411763.3451737
[55] Sile O’Modhrain, Nicholas A. Giudice, John A. Gardner, and Gordon E. Legge. 2015. Designing Media for Visually-Impaired Users of Refreshable Touch Displays: Possibilities and Pitfalls. IEEE Transactions on Haptics 8, 3 (2015), 248–257. https://doi.org/10.1109/TOH.2015.2466231
[56] Orbit Research. 2016. Graphiti® – a Breakthrough in Non-Visual Access to All Forms of Graphical Information. http://www.orbitresearch.com/product/graphiti/
[57] Grégory Petit, Aude Dufresne, Vincent Lévesque, Vincent Hayward, and Nicole Trudeau. 2008. Refreshable Tactile Graphics Applied to Schoolbook Illustrations for Students with Visual Impairment. In SIGACCESS Conference on Computers and Accessibility – ASSETS ’08 (Nova Scotia, Canada). ACM, 89–96.
[58] Mahika Phutane, Julie Wright, Brenda Veronica Castro, Lei Shi, Simone R. Stern, Holly M. Lawson, and Shiri Azenkot. 2021. Tactile Materials in Practice: Understanding the Experiences of Teachers of the Visually Impaired. ACM Transactions on Accessible Computing (TACCESS) (2021). https://doi.org/10.1145/3508364
[59] Christopher Power. 2006. On the Accuracy of Tactile Displays. In ICCHP: International Conference on Computers Helping People with Special Needs (Linz, Austria), Klaus Miesenberger, Joachim Klaus, Wolfgang L. Zagler, and Arthur I. Karshmer (Eds.). Springer, Switzerland, 1155–1162.
[60] Denise Prescher and Gerhard Weber. 2017. Comparing Two Approaches of Tactile Zooming on a Large Pin-Matrix Device. In INTERACT 2017: Human-Computer Interaction – INTERACT 2017, Vol. 10513. Springer, 173–186. https://doi.org/10.1007/978-3-319-67744-6_11
[61] Hrishikesh Rao and Sile O’Modhrain. 2019. Multimodal Representations of Complex Spatial Data. In CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland). ACM.
[62] Dorothea Reusser, Espen Knoop, Roland Siegwart, and Paul Beardsley. 2017. Feeling Fireworks. In UIST ’17 Symposium on User Interface Software and Technology (Québec City, Canada). ACM.
[63] Dorothea Reusser, Espen Knoop, Roland Siegwart, and Paul Beardsley. 2019. Feeling Fireworks: An Inclusive Tactile Firework Display. In CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland). ACM, New York, NY, USA.
[64] Patrick Roth, Hesham Kamel, Lori Petrucci, and Thierry Pun. 2002. A Comparison of Three Nonvisual Methods for Presenting Scientific Graphs. Journal of Visual Impairment & Blindness 96, 6 (2002), 420–428. https://doi.org/10.1177/0145482X0209600605
[65] Round Table on Information Access for People with Print Disabilities Inc. 2005. Guidelines on Conveying Visual Information. Round Table on Information Access for People with Print Disabilities Inc., Lindisfarne, Tasmania, Australia. https://printdisability.org/guidelines/guidelines-on-conveying-visual-information-2005/
[66] Jonathan Rowell and Simon Ungar. 2003. The world of touch: An international survey of tactile maps. Part 1: production. British Journal of Visual Impairment 21, 3 (2003), 98–104. https://doi.org/10.1177/026461960302100303
[67] Bernhard Schmitz and Thomas Ertl. 2012. Interactively Displaying Maps on a Tactile Graphics Display. In SKALID 2012: Spatial Knowledge Acquisition with Limited Information Displays (Germany). 13–18.
[68] Alexa F. Siu, Son Kim, Joshua A. Miele, and Sean Follmer. 2019. shapeCAD: An Accessible 3D Modelling Workflow for the Blind and Visually-Impaired Via 2.5D Shape Displays. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, USA). ACM, New York, NY, USA, 342–354. https://doi.org/10.1145/3308561.3353782
[69] Larry Skutchan. 2016. Transforming Braille. http://transformingbraille.org/blog/
[70] Barbara Tversky, Julie Bauer Morrison, and Mireille Bétrancourt. 2002. Animation: can it facilitate? International Journal of Human-Computer Studies 57, 4 (2002), 247–262.
[71] Cheng Xu, Ali Israr, Ivan Poupyrev, Olivier Bau, and Chris Harrison. 2011. Tactile display for the visually impaired using TeslaTouch. In CHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada). ACM, New York, NY, USA, 317–322. https://doi.org/10.1145/1979742.1979705
[72] Wenzhen Yang, Jinpen Huang, Ruirui Wang, Wen Zhang, Haitao Liu, and Jianliang Xiao. 2021. A Survey on Tactile Displays For Visually Impaired People. IEEE Transactions on Haptics 14, 4 (2021), 712–721.
[73] Koji Yatani, Nikola Banovic, and Khai N. Truong. 2012. SpaceSense: representing geographical information to visually impaired people using spatial tactile feedback. In CHI ’12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, 415–424. https://doi.org/10.1145/2207676.2207734
[74] Bilal Yousuf and Owen Conlan. 2018. Supporting Student Engagement Through Explorable Visual Narratives. IEEE Transactions on Learning Technologies 11, 3 (2018), 307–320. https://doi.org/10.1109/TLT.2017.2722416
[75] Limin Zeng, Mei Miao, and Gerhard Weber. 2015. Interactive Audio-haptic Map Explorer on a Tactile Display. Interacting with Computers 27, 4 (2015), 413–429. https://doi.org/10.1093/iwc/iwu006
[76] Limin Zeng, Gerhard Weber, and Ulrich Baumann. 2012. Audio-haptic you-are-here maps on a mobile touch-enabled pin-matrix display. In IEEE International Workshop on Haptic Audio Visual Environments and Games (HAVE 2012) (Munich, Germany). IEEE. https://doi.org/10.1109/HAVE.2012.6374428

A VERBALISED DESCRIPTIONS OF SAMPLE GRAPHICS

These descriptions were provided verbally to accompany the sample graphics as they were explored tactually by touch readers in the Evaluation phase. Pauses were inserted while the graphic was being explored, and additional guidance was given if needed to ensure that the key features had been identified.

A.1 Wave

Collage: This diagram shows how waves form as they come in towards the beach. The diagram is a cross section showing the water (raised smooth area) and sand (sandpaper on the lower right).

RTD: This is the first of seven diagrams showing a wave as it forms and comes in towards the beach. The diagram is a cross section showing the water (highest dots) and sand (lower dots on the lower right, in a low triangular shape).

Both formats: The wind is blowing from the ocean (left) to the beach (right). Starting from the left, the first waves are beginning to form as low lumps. The air pressure creates the waves, with the upper layer of air sinking on either side of the wave. The wave gets higher as the sea floor gets higher. Friction from the beach slows the lower part of the wave but the upper part continues to move forward, forming a curve. After the wave breaks, swash moves up the beach and backwash moves down.

A.2 Waterfall

Both formats: This is the first of five diagrams showing the creation of a waterfall over time.

Swell paper: The diagram shows a cross section of the earth, with water at the top (horizontal stripes). The water is flowing down from left to right. There is a triangle of hard rock on the left, just below the water. It is shown as solid fill. Below the hard rock there is a large area of soft rock, shown with textured fill.

RTD: The diagram shows a cross section of the earth, with water at the top (low dots). The water is flowing down from left to right. There is a triangle of hard rock on the left, just below the water. It is shown with high dots. Below the hard rock there is a large area of soft rock, shown with slightly lower dots.

Both formats: Waterfalls are often formed where a layer of harder rock overlays a layer of softer rock. Diagram 2: As the river passes over the softer rock, it is able to erode it at a faster rate, forming a step in the river bed. The water is lower on the right now, where the soft rock has eroded. Diagram 3: Erosion continues, and cuts underneath the hard rock. Diagram 4: The tip of the harder rock has collapsed and fallen into the plunge pool, because there was not enough support underneath it. Diagram 5: The rocks and boulders under the plunge pool and the hydraulic action have further eroded the plunge pool and notch.

A.3 Map

Note that all participants were familiar with the area depicted on the map, but did not know all of the buildings.

Both formats: This is a map of Federation Square, seen from above.

Swell paper: Roads are shown as solid lines. Buildings are shown as filled shapes. Stairs are shown as striped blocks. A walking route is shown with a dotted spur wheel line.

RTD: Roads are shown as low lines. Buildings are shown as taller blocks. The position I am describing will be indicated by a blinking pin.

Both formats: Flinders street runs along the top of the map, Swanston street is to the left, Russell street is on the right, and the Yarra River is at the bottom.

Young and Jackson’s pub is at the top left corner. This is where the famous painting, Chloe, hangs. Moving right across Swanston street, we find St Paul’s Cathedral. Further along Flinders street we come to Hosier Lane, famous for its street art. After Hosier Lane is the Forum Theatre. Moving down across Flinders street, there is NGV Australia. Below the NGV is ZINC, a venue for events like weddings. Moving left from ZINC we come to BMW Edge, a multi-purpose theatre that is open to the atrium (the space to the left of the NGV). Moving left from the atrium we come to ACMI, the Australian Centre for the Moving Image. On the corner is the Melbourne Visitor Centre, which is currently undergoing construction to form an entrance to the new Town Hall underground train station. Crossing over Swanston street, we come to Flinders street station. In the bottom left corner, there is the Transport Hotel.

When you are ready, let’s trace the route on the map. The route goes from Flinders Street Station to the replacement buses for the Frankston, Pakenham and Cranbourne lines. Let’s start at the corner steps of Flinders street station. First we head to the pedestrian crossing at the tram stop on Swanston street. We cross the road from Flinders street station to Federation square. The Information centre is on our left. We follow around the information centre to the footpath on Flinders street. We pass by ACMI and the atrium on our right. We turn right when we reach Russell street. The bus stop is behind NGV Australia.

A.4 Tai Chi

Both formats: This series of three diagrams shows a sequence of movements for Tai Chi. There are three figures per page, progressing from left to right. Each figure has a rectangular body with arms, legs, feet, and a head with eyes and a nose. There may be one or two eyes, depending on the angle of the head. There is a dotted line on the body, representing shirt buttons down the centre front of the body.

Swell paper: Solid lines for the arms and legs indicate that they are positioned in front of the body. Dotted lines for the arms and legs indicate that they are positioned behind the body.

RTD: The body in the centre is at height 2. Lower pins indicate that the arms or legs are further away than the body. Higher pins indicate that the arms or legs are closer than the body.
ASSETS ’22, October 23–26, 2022, Athens, Greece Holloway et al.
B EVALUATION PARTICIPANT PROFILES
Table 4: Profiles for participants in the evaluation study, giving their age range, level of blindness (totally blind or legally blind), onset of vision impairment, and self-rated level of competency using tactile graphics.

ID  age  blindness  onset       TG competency
 1  40s  total      acquired    proficient
 2  40s  legal      acquired    beginner
 3  50s  total      congenital  beginner
 4  50s  total      congenital  expert
 5  50s  legal      congenital  beginner
 6  20s  legal      acquired    proficient
 7  20s  legal      acquired    expert
 8  60s  total      congenital  expert
 9  60s  total      congenital  proficient
10  30s  legal      acquired    proficient
11  40s  total      congenital  proficient
12  50s  legal      acquired    beginner
Quantifying Touch: New Metrics for Characterizing What Happens During a Touch

Junhan Kong (junhank@uw.edu), The Information School, University of Washington, Seattle, WA, USA
Mingyuan Zhong (myzhong@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
James Fogarty (jfogarty@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
Jacob O. Wobbrock (wobbrock@uw.edu), The Information School, DUB Group, University of Washington, Seattle, WA, USA
ABSTRACT

Measures of human performance for touch-based systems have focused mainly on overall metrics like touch accuracy and target acquisition speed. But touches are not atomic—they unfold over time and space, especially for users with limited fine motor function, for whom it can be difficult to perform quick, accurate touches. To gain insight into what happens during a touch, we offer 15 target-agnostic touch metrics, most of which have not been mathematically formalized in the literature. They are touch direction, variability, drift, duration, extent, absolute/signed area change, area variability, area deviation, area extent, absolute/signed angle change, angle variability, angle deviation, and angle extent. These metrics regard a touch as a time series of ovals instead of a mere (x, y) coordinate. We provide mathematical definitions and visual depictions of our metrics, and consider policies for calculating our metrics when multiple fingers perform coincident touches. To exercise our metrics, we collected touch data from 27 participants, 15 of whom reported having limited fine motor function. Our results show that our metrics effectively characterize touch behaviors including fine-motor challenges. Our metrics can be useful for both understanding users and for evaluating touch-based systems to inform their design.

CCS CONCEPTS

• Human-centered computing → Touch screens.

KEYWORDS

Touch input, touch screens, touch metrics, human performance, limited fine motor function.

ACM Reference Format:
Junhan Kong, Mingyuan Zhong, James Fogarty, and Jacob O. Wobbrock. 2022. Quantifying Touch: New Metrics for Characterizing What Happens During a Touch. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3517428.3544804

This work is licensed under a Creative Commons Attribution 4.0 International License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544804

1 INTRODUCTION

With the proliferation of modern touch-enabled devices such as smartphones, tablets, watches, and other surfaces, touch input has become perhaps the most prevalent form of input to computer systems. As with mouse pointing and text entry [37, 48, 49, 59, 60], understanding human performance with touch-based systems can reveal opportunities for improved device and interface design.

To date, most measures of human performance with touch-based systems have mainly focused on overall performance like accuracy and target acquisition speed (e.g., [3, 4, 19, 23, 24, 44]). However, a touch is not an atomic event; it unfolds over space and time [43]. Therefore, examining what happens during a touch might yield insights into users’ touch behaviors and any underlying causes of touch inaccuracy. It might also afford designers the chance to improve device or interface designs, or enable software to usefully adapt to users’ touch behaviors at runtime. Particularly for accessibility, measuring what happens during a touch could help designers make touch interactions more accessible. For example, if a system measures how much a user “drifts” on the screen while their finger is down before lifting, the system can enlarge their widgets at runtime to accommodate.

A similar idea motivated prior work by MacKenzie et al. [37], who formulated measures for what happens during a mouse pointing movement, rather than just relying on overall speed and accuracy to understand pointing performance (Figure 1). MacKenzie et al. formulated seven new accuracy measures that captured properties of the cursor’s path of movement. Building on this work, Keates et al. [33] used MacKenzie et al.’s metrics with people with motor impairments, introducing six new path measures along the way. Hwang et al. [26] built upon this work yet further, conducting submovement analyses of motion-impaired users with 12 additional metrics. In each case, examining the path of movement by formalizing new quantitative metrics revealed why overall speed and accuracy differences emerged.

Inspired by this prior work, we aim to understand why human performance with touch screens emerges the way it does. Certainly, some of our touch metrics have been previously used in individual studies or in ad hoc ways, as was the case for some of the aforementioned cursor measures. But few of our metrics have been formalized mathematically so as to be operationalized in a reusable, generalizable way. Furthermore, our metrics have not, to the best of our knowledge, been applied to the study of touch behaviors by people with and without limited fine motor function.
Figure 1: (a) MacKenzie et al. [37] defined metrics for quantifying what happens during a mouse pointing movement, revealing
underlying causes of speed and accuracy outcomes. For example, the variability ("wiggliness") of a movement can be quantified.
(b) In an analogous fashion, we formulate 15 metrics for quantifying what happens during a touch, which is not atomic but
unfolds over time and space, even if only briefly. For example, a user might land outside a target, then slide into the target and
lift, creating touch ovals with various areas, orientations, and dimensions along the way.
To these ends, we formalized 15 target-agnostic touch metrics, many of which are analogues inspired by MacKenzie et al. [37]. They are: touch direction, variability, drift, duration, extent, absolute/signed area change, area variability, area deviation, area extent, absolute/signed angle change, angle variability, angle deviation, and angle extent. Instead of treating a touch as an atomic event or single (x, y) coordinate, our metrics regard a touch as a time series of ovals approximating a finger’s contact area from finger-down to finger-up. Thus, a “touch process” [43] can contain movement of the touch oval centroid and change of the touch oval size, orientation, and shape. For each of our 15 metrics, we provide a mathematical formula and intuitive description. We also include visual depictions of some of our metrics to convey their purposes.

In addition, ambiguity arises when multiple fingers touch concurrently. Although our metrics are defined on a per-finger basis, we address the issue of how to handle multiple concurrent touches. Specifically, we discuss three policies for defining the touch process in multi-finger situations. The three policies are first-down, longest-lived, and sum-of-all. These three policies converge to the same definition in the case of a single-finger touch. Our study results also show that all three policies are viable and result in similar conclusions, even if their specific values change.

To put our metrics through their paces, we conducted a user study to collect touch data on a Microsoft Perceptive Pixel (PPI) interactive tabletop display. Our study collected complete touch oval information from 27 participants, 15 of whom reported having limited fine motor function with different specific fine motor challenges. We then computed all 15 of our touch metrics on our data and ran both descriptive and inferential statistics. Our results show that our metrics effectively characterize users’ touch behaviors. Specifically, we found that our metrics uncover differences among people with no, moderate, and severe fine motor challenges. We also found that each metric correlates with a different subset of specific fine motor challenges. Additionally, we found that for multi-finger touches, the longest-lived policy and the sum-of-all policy yielded the same conclusions in characterizing users’ touch processes, while the sum-of-all policy magnifies the differences between user groups.

Importantly, none of our 15 metrics are “target-aware” [2, 11]; rather, our metrics are “target-agnostic” [56], calculable using only the touch data itself without knowledge of target locations or dimensions. Being target-agnostic is an immensely useful property because, in practice, it is generally infeasible for software to know what a target is and whether a user was intending to touch [11, 12]. (This challenge becomes trivial in an artificial testbed, but not at runtime with software “in the wild.”) Despite being target-agnostic, our touch metrics can nonetheless explain why overall touch accuracy is low or high. Thus, our metrics can help understand users’ touch abilities and inform better designs of touch-based systems that can adapt to these abilities.

The major contributions of this work are:
• Formalizing 15 target-agnostic touch metrics to characterize what happens during a touch;
• Proposing and investigating three policies for handling inadvertent multi-finger situations during a touch;
• Exercising our metrics in a formal study of 15 participants with and 12 participants without limited fine motor function.

¹ The current work also builds upon an ACM ASSETS 2021 poster [34]. This offered only a subset of metrics and exercised only four of them on a previously collected data set [14] that lacked complete touch-oval information (e.g., no oval major or minor axes, no areas or orientations). The current work formulates more metrics, collects an original data set, and exercises all metrics on complete touch-oval information.

2 RELATED WORK

Our work attempts to “pry the cover” off a touch to characterize what happens “inside” it. As noted above, prior work by MacKenzie et al. [37] pried the cover off mouse pointing to understand more than just speed and accuracy, but how pointing unfolds over time and space. To this end, MacKenzie et al. devised accuracy measures that include movement variability, offset, error, axis crossings, and more for evaluating mouse pointing, revealing differences among input devices. Keates et al. [33] extended MacKenzie et al.’s measures to better understand pointing by people with limited fine motor function. Through a study of mouse movements by users with motor impairments, Keates et al. characterized movement distributions and submovement characteristics, finding significant differences between people with and without upper-body motor impairments. Hwang et al. [26] further analyzed submovements by people with motor impairments. These papers inspired the current work¹ and point to the value of characterizing input processes, not just outcomes. That said, none of the metrics from MacKenzie et al. [37], Keates et al. [33], or Hwang et al. [26] can be applied directly to touch. For example, most of their metrics require a known target and a task axis between the mouse position and
target center; moreover, a mouse cursor only occupies a single (x, y) position at one time, whereas a touch occupies an area, abstracted as an oval. Therefore, we developed custom metrics pertaining to touch; furthermore, our metrics are target-agnostic, making them practical to deploy in actual systems [2, 11, 12, 56].

Holz and Baudisch [23, 24] sought to understand factors affecting touch performance by modeling how users perceive their own touch input, specifically their intended versus actual point of contact. They did not formulate a set of metrics for quantifying touch, as we do here, but they devised the generalized perceived input point model [23] and the projected center model [24] to help characterize perceptual aspects of the touch process. Their work examined users’ mental models through user studies, interviews, and analyses based on a series of pointing studies, identifying sources of inaccuracy in touch devices as a result of the parallax between the top and bottom of the finger. Bi and Zhai [6], using a different approach, conceptualized touch input in target selection as an uncertain process, and proposed a Bayesian touch criterion to statistically model the probabilistic distribution of touch selections. In other work, Bi et al. [5] proposed FFitts law to model finger-touch based inputs on touch screens, which reflects relative precision due to the speed-accuracy tradeoff and captures the absolute precision of finger touch.

Beyond understanding, characterizing, and modeling touch input, researchers have sought to improve touch performance through the invention and evaluation of new touch input techniques. In early work, Potter et al. [45] evaluated three strategies for target selection on touch screens, finding lift-off with an offset cursor to be most accurate. Sears and Shneiderman [47] evaluated a stabilization technique for improving touch accuracy. More recently, Vogel and Baudisch presented Shift [50], which uses an offset lens to magnify touch targets when multiple targets are beneath the finger. Cao et al. proposed ShapeTouch [7] to utilize touch regions and motion to infer virtual contact forces, enabling pseudo-force-based interaction techniques. Benko et al. [3] designed high-precision touch selection techniques by utilizing two fingers that together dynamically adjust the control-display (C-D) gain. Harrison et al. [21] used machine learning classification to distinguish the sounds made by finger tips, pads, nails, and knuckles, extending the “interaction vocabulary” of touch-based systems. Holz and Baudisch [25] went even further with Fiberio to detect fingerprints on a touch table with a custom fiber optic plate. On the output side, Wigdor et al. [54] created Ripples, which are visualizations accompanying users’ touches to provide vital feedback and reduce touch errors. These are just some of the many interaction techniques developed to improve human performance with touch-based systems. By contrast, our metrics contribute to understanding human performance at a fine-grained level. Indeed, they could be used to explain why and how some of these touch-improvement techniques succeed.

Some work has also attempted to improve touch accuracy, not through novel interaction techniques, but through advanced machine learning. For example, Weir et al. [52, 53] used Gaussian process regression to improve touch accuracy. Kumar et al. [35] collected capacitive touch images and trained convolutional neural networks (CNNs) to improve touch accuracy. Mayer et al. [39] used CNNs to infer finger orientation from capacitive touch images.

The accessibility of touch-based systems for people with disabilities has also been a focus of prior work, particularly because touch-based systems are often inaccessible [8]. For example, Findlater et al. [13] studied users’ target acquisition performance on touch screens, finding that touch accuracy and usability issues still persist for people with limited fine motor function. To improve systems’ accessibility, Kane et al. [28] used accessible touch and gesture interactions in SlideRule to create the first finger-driven touch screen reader. Kane et al. [30] also developed Access Overlays for interactive tabletops—software-based “layers” that impose various interaction techniques on displayed content to make it more accessible to blind users. This approach was extended to physical documents using computer vision in Access Lens [29]. Similarly, Guo et al. [20] used computer vision and crowdsourcing in StateLens to model dynamic touch screen interfaces, and provide guidance for blind users while using existing inaccessible touch screens. Using hardware, Kane et al. [31] created Touchplates to provide physical-tactile landmarks when interacting with tabletop displays. In a related fashion, Zhang et al. [61] created Interactiles to provide physical-tactile interactions on smartphones. Wacharamanotham et al. [51] presented Swabbing for improving target selection accuracy for users with tremor. A similar technique proposed by Mertens et al. [40] called TRABING enables users to continuously move over desired targets, improving accuracy for people with tremor. Guerreiro et al. [19] studied characteristics of mobile touch screen interfaces to provide tools for better interface design for motor-impaired users. Montague et al. [41, 42] investigated shared user models and user interface adaptivity for making touch applications more accessible, and examined motor-impaired touchscreen interactions in the wild. Sarcar et al. [46] proposed the Touch-WLM model for making predictions of how users with given abilities enter text, then optimized touch screen layouts for users with tremor and dyslexia. Mott et al. created both Smart Touch [43] and Cluster Touch [44] to resolve intended touch locations by people with motor impairments through template matching of segmented “frames” of the touch process and clustering, respectively. Cluster Touch was also directed towards people incurring situational impairments [55] due to walking; similarly, Goel et al. created WalkType [17] and ContextType [18] for improving touch accuracy on smartphone keyboards while walking. Although some of our metrics (e.g., touch drift) were used in an ad hoc fashion in prior work, they were not formalized mathematically, leaving them open to variations of implementation and interpretation.

Relatedly, researchers have analyzed touch behaviors to detect specific diseases that relate to impaired motor function. Mastoras et al. [38] extracted statistical features from keystroke sequences to detect depressive disorder. Giancardo et al. [16] and Arroyo-Gallego et al. [1] used hold time (e.g., the interval between key press and release) and flight time (e.g., the interval between one key’s release and the next key’s press) to distinguish early-stage Parkinson’s disease (PD). Iakovakis et al. [27] used normalized flight time and normalized pressure to detect declines in fine motor function in early PD patients. Kay et al. [32] created PVT-Touch to implement the Psychomotor Vigilance Task [10] as a robust clinical tool for assessing issues related to sleep loss. These projects demonstrate the ability to use touch behaviors for disease- or condition-specific diagnoses and tracking. In contrast, our touch metrics are general, capable of illuminating touch behaviors that could be useful in a variety of applications (e.g., we provide examples in Section 7).
Table 1: Example traces of touch oval centroids and their touch variability, drift, and extent. This table was inspired by Figure 5 from MacKenzie et al. [37]. (The five example traces are images; their metric levels are:)

Variability: Low, High, High, High, Very High
Drift: Medium, Low, High, Low, High
Extent: Medium, Medium, High, High, High

Figure 2: (a) A sequence of n touch ovals P_0, ..., P_{n−1} from finger-down to finger-up along a time axis. (b) The first touch oval in the sequence has centroid (x_0, y_0), angle θ_0, major axis a_0, and minor axis b_0. The last touch oval has centroid (x_{n−1}, y_{n−1}), angle θ_{n−1}, major axis a_{n−1}, and minor axis b_{n−1}.

3 OUR TOUCH METRICS

In this section, we formalize 15 touch metrics for characterizing what happens during a touch. Although some of our metrics are admittedly not new (e.g., touch duration), most have never been previously conceptualized or given mathematical formalism. As a general principle, our metrics result in values closer to zero when touches are closer to “perfect” (i.e., when a single finger lands and lifts instantaneously from the same place). Similarly, our metrics generally increase in magnitude as touches are further from “perfect” (i.e., as a finger moves, rotates, changes area, or persists on the screen). We begin by conceptualizing the “anatomy of a touch” as a time series of ovals.

3.1 Anatomy of a Touch

Our touch metrics are designed to capture what happens during a touch process, which, in the very least, consists of a finger-down oval and a finger-up oval (Figure 2). Additional ovals might be created if the finger moves during the touch or if the touch software samples the finger position at regular intervals. Our metrics are defined assuming only one finger is in contact with the screen, an assumption we revisit in Section 3.5.

Let n be the total number of touch input events captured from finger-down to finger-up, inclusive. A touch process can be defined as a sequence of touch ovals P_0, ..., P_{n−1}, where finger-down is P_0 and finger-up is P_{n−1}. Then, for the i-th oval:

• (x_i, y_i) is the location of its centroid;
• a_i, b_i are the lengths of its major and minor axes, respectively;
• θ_i is the angle, in radians, of the major axis relative to the +x axis (i.e., straight right on the screen);
• S_i is the oval’s size, calculated as π·a_i·b_i / 4;
• T_i is the oval’s timestamp.

3.2 Metrics Based on Touch Location and Time

In our first of three categories, we formalize five touch metrics based on location and time. These metrics are touch direction, variability, drift, duration, and extent.

3.2.1 Touch Direction. Touch direction is the angle formed between the centroids of the finger-down and finger-up ovals. Intuitively, this metric indicates the overall direction that a touch moved. Conceptually, we define straight right (+x) from the finger-down centroid as 0°, straight left (−x) as 180°, straight up (−y) as 90°, and straight down (+y) as 270°, although our mathematical definition uses radians.

Direction =
  tan⁻¹[(y_{n−1} − y_0) / (x_{n−1} − x_0)],  if x_0 < x_{n−1}
  π/2,        if x_0 = x_{n−1} and y_0 > y_{n−1} (straight up)
  3π/2,       if x_0 = x_{n−1} and y_0 < y_{n−1} (straight down)
  undefined,  if x_0 = x_{n−1} and y_0 = y_{n−1} (no movement)
with Direction ∈ [0, 2π)

3.2.2 Touch Variability. Touch variability is the total distance covered by successive touch inputs from finger-down to finger-up. Intuitively, this metric indicates how much distance was covered during a touch (e.g., its “jitter”).

Variability = Σ_{i=1..n−1} √((x_i − x_{i−1})² + (y_i − y_{i−1})²) ∈ [0, ∞)

3.2.3 Touch Drift. Touch drift is the Euclidean distance from finger-down to finger-up. Intuitively, this metric indicates how far away a touch finished from where it started.

Drift = √((x_{n−1} − x_0)² + (y_{n−1} − y_0)²) ∈ [0, ∞)

3.2.4 Touch Duration. Touch duration is the time elapsed from finger-down to finger-up. Intuitively, this metric indicates how long a touch persisted.

Duration = (T_{n−1} − T_0) ∈ [0, ∞)

3.2.5 Touch Extent. Touch extent is the Euclidean distance between the most distant two oval centroids. Intuitively, this metric indicates the spatial range of the touch process.

Extent = max_{i,j ∈ 0..n−1} √((x_j − x_i)² + (y_j − y_i)²) ∈ [0, ∞)

3.3 Metrics Based on Touch Area

In our second of three categories, we formalize five touch metrics based on touch area. The metrics are absolute area change, (signed) area change, area variability, area deviation, and area extent.

3.3.1 Touch Absolute Area Change. The absolute area change of a touch is the change in area between finger-up and finger-down ovals. Intuitively, this metric indicates the difference in area between the start and end of a touch.

Absolute Area Change = |S_{n−1} − S_0| ∈ [0, ∞)

3.3.2 Touch Area Change. The (signed) area change of a touch is the same as above, but positive if finger-up is larger than finger-down; negative or zero otherwise. Intuitively, this metric indicates whether a touch grew (+) or shrank (−), and by how much.

Area Change = (S_{n−1} − S_0) ∈ (−∞, ∞)
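The location- and time-based definitions above, together with the area-change metrics, translate almost directly into code. The Python sketch below is illustrative only: the six-element event tuple and all function names are our own assumptions, not the paper's implementation, and touch direction is computed with atan2 rather than the paper's piecewise arctangent (y is negated so that "straight up" on screen maps to π/2, per the convention stated above).

```python
import math

# Assumed event layout (ours, not the paper's): each touch event is a tuple
# (x, y, a, b, theta, t) holding the oval centroid, major/minor axis lengths,
# major-axis angle in radians, and a timestamp in seconds.

def oval_size(event):
    """Oval area S_i = pi * a_i * b_i / 4, per Section 3.1."""
    _, _, a, b, _, _ = event
    return math.pi * a * b / 4

def direction(ovals):
    """Overall movement angle in [0, 2*pi), or None when there is no movement."""
    x0, y0 = ovals[0][0], ovals[0][1]
    xn, yn = ovals[-1][0], ovals[-1][1]
    if (x0, y0) == (xn, yn):
        return None  # undefined: finger lifted where it landed
    # Negate dy because screen coordinates grow downward.
    return math.atan2(-(yn - y0), xn - x0) % (2 * math.pi)

def variability(ovals):
    """Total distance covered by successive centroids (the touch's 'jitter')."""
    return sum(math.hypot(ovals[i][0] - ovals[i - 1][0],
                          ovals[i][1] - ovals[i - 1][1])
               for i in range(1, len(ovals)))

def drift(ovals):
    """Euclidean distance from finger-down to finger-up centroid."""
    return math.hypot(ovals[-1][0] - ovals[0][0], ovals[-1][1] - ovals[0][1])

def duration(ovals):
    """Elapsed time from finger-down to finger-up."""
    return ovals[-1][5] - ovals[0][5]

def extent(ovals):
    """Largest distance between any two centroids (simple O(n^2) scan)."""
    return max(math.hypot(q[0] - p[0], q[1] - p[1])
               for p in ovals for q in ovals)

def area_change(ovals):
    """Signed area change S_{n-1} - S_0; its abs() is the absolute area change."""
    return oval_size(ovals[-1]) - oval_size(ovals[0])
```

For a "perfect" touch consisting of a single oval, every function above returns zero except direction, which returns None, matching the undefined case in the definitions.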
Figure 3: Depictions of (a) touch variability, (b) touch drift, and (c) touch extent. The blue ovals centered at the black and red
dots represent the first and the last touch ovals of the sequence, respectively. The dark-blue line represents the centroid trace,
and the yellow line represents the respective metric.
3.3.3 Touch Area Variability. The variability in touch area is the cumulative change in area over all touch ovals from finger-down to finger-up. Intuitively, this metric indicates how “stable” the size of a touch was during its lifetime.

Area Variability = Σ_{i=1..n−1} |S_i − S_{i−1}| ∈ [0, ∞)

3.3.4 Touch Area Deviation. The deviation in touch area is the standard deviation of touch oval area over the duration of the touch. Intuitively, this metric indicates how much the area changed from one oval to the next during the touch process.

Area Deviation = √( Σ_{i=0..n−1} (S_i − S̄)² / (n − 1) ) ∈ [0, ∞), where S̄ = (Σ_{i=0..n−1} S_i) / n

3.3.5 Touch Area Extent. The extent of a touch’s area is the difference between the largest and smallest touch areas from finger-down to finger-up. Intuitively, this metric indicates the range of touch areas that occurred during the touch process.

Area Extent = max_{i ∈ 0..n−1} S_i − min_{j ∈ 0..n−1} S_j ∈ [0, ∞)

3.4 Metrics Based on Touch Angle

In our third of three categories, we formalize five touch metrics based on oval angles, that is, the angle of the major axis relative to the +x axis (straight right), which is defined as 0°. These metrics are absolute angle change, (signed) angle change, angle variability, angle deviation, and angle extent.

Note that two touch angles with greatly different values can, in fact, be quite close to each other (e.g., 359° and 1° are only 2° apart). Therefore, to make our metrics accurately reflect the amount of angle change during a touch, we define the angle change from θ_1 to θ_2 as

θ_2 − θ_1 = θ_{1,2}

where θ_{1,2} is the angle between θ_1 and θ_2 in (−π, π]. Also, counter-clockwise is considered positive and clockwise is considered negative.

3.4.1 Touch Absolute Angle Change. The absolute angle change of a touch is the change in the major axis angle from finger-down to finger-up. Intuitively, this metric indicates how much the finger orientation changed between the start and end of a touch.

Absolute Angle Change = |θ_{n−1} − θ_0| ∈ [0, π]

3.4.2 Touch Angle Change. The (signed) angle change of a touch is the same as above, but positive when counterclockwise and negative when clockwise. Intuitively, this metric indicates whether the finger rotated counterclockwise or clockwise, and by how much.

Angle Change = (θ_{n−1} − θ_0) ∈ (−π, π]

3.4.3 Touch Angle Variability. The variability in touch angle is the cumulative change of the major axis angle over the duration of the touch. Intuitively, this metric indicates how “jittery” the orientation of the finger was during a touch.

Angle Variability = Σ_{i=1..n−1} |θ_i − θ_{i−1}| ∈ [0, ∞)

3.4.4 Touch Angle Deviation. The deviation in touch angle is the standard deviation of the major axis angle over the duration of the touch. Intuitively, this metric indicates how much the finger orientation changed from one oval to the next during a touch.

Angle Deviation = √( Σ_{i=0..n−1} (θ_i − θ̄)² / (n − 1) ) ∈ [0, ∞), where θ̄ = (Σ_{i=0..n−1} θ_i) / n

3.4.5 Touch Angle Extent. The extent of a touch’s angle is the angle between the two major axes that are farthest apart in orientation. Intuitively, this metric indicates the range of angles that the major axis covered throughout the touch.

Angle Extent = max_{i,j ∈ 0..n−1} |θ_j − θ_i| ∈ [0, π]

For a theoretical “perfect touch,” where the finger-down oval exactly coincides with the finger-up oval and the touch is instantaneous, 14 metrics will equal zero and touch direction will be undefined. In practice, of course, there should be some elapsed time between finger-down and finger-up, with distinct ovals for each captured event. Of course, the number of ovals and their properties one receives depend upon the software and hardware comprising the sensing platform one is using. For example, some Android smartphones and tablets do not reliably report major and minor axis information. In our study, described below, we used a Microsoft Perceptive Pixel (PPI) display with custom software that we wrote to ensure all necessary oval information was reported for calculation of our 15 touch metrics.

Table 1 offers a few example touch centroid traces to illustrate the intuitive differences among three of our touch metrics. Recall that touch variability is the total distance covered by successive touch ovals; touch drift is the distance from finger-down to finger-up; and
Figure 4: Depictions of (a) touch angle change, (b) touch angle variability, and (c) touch angle extent. The blue ovals represent the touch oval sequence, the orange and yellow lines represent the major axes of ovals, and the green arcs represent measured angles.
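The series-based metrics in Sections 3.3 and 3.4 reduce to a few lines over the per-oval sizes S_i and angles θ_i. The Python sketch below is illustrative only (list and function names are ours, not the authors'): angle_diff implements the wrap-around convention above, so that, e.g., 359° and 1° come out 2° apart. Angle deviation is omitted here because a faithful version would need a circular mean for θ̄, which the definitions above do not spell out.

```python
import math

def angle_diff(theta1, theta2):
    """Signed angle from theta1 to theta2, wrapped into (-pi, pi], so that
    e.g. 359 degrees and 1 degree are treated as 2 degrees apart."""
    d = (theta2 - theta1) % (2 * math.pi)     # in [0, 2*pi)
    return d - 2 * math.pi if d > math.pi else d

# --- Section 3.3: metrics over the list of oval sizes S_0..S_{n-1} ---

def area_variability(sizes):
    """Cumulative absolute change in oval area from finger-down to finger-up."""
    return sum(abs(sizes[i] - sizes[i - 1]) for i in range(1, len(sizes)))

def area_deviation(sizes):
    """Standard deviation of oval area, using the paper's n-1 denominator."""
    n = len(sizes)
    mean = sum(sizes) / n
    return math.sqrt(sum((s - mean) ** 2 for s in sizes) / (n - 1))

def area_extent(sizes):
    """Range between the largest and smallest oval areas during the touch."""
    return max(sizes) - min(sizes)

# --- Section 3.4: metrics over the major-axis angles theta_0..theta_{n-1} ---

def angle_change(thetas):
    """Signed angle change from finger-down to finger-up, in (-pi, pi]."""
    return angle_diff(thetas[0], thetas[-1])

def angle_variability(thetas):
    """Cumulative absolute (wrapped) angle change between successive ovals."""
    return sum(abs(angle_diff(thetas[i - 1], thetas[i]))
               for i in range(1, len(thetas)))

def angle_extent(thetas):
    """Largest wrapped angular separation between any two ovals, in [0, pi]."""
    return max(abs(angle_diff(t1, t2)) for t1 in thetas for t2 in thetas)
```

Absolute angle change is simply abs(angle_change(thetas)).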
touch extent is the distance between the two most distant touch ovals that occurred during the touch process.

3.5 Handling Multiple Concurrent Fingers

Our 15 metrics assume that a single finger is performing a touch. Indeed, future research could explore additional metrics specifically meant to characterize multi-touch behaviors. But our metrics can nonetheless be used in multi-touch situations, and inadvertent multi-touch events can and do occur, especially for people with limited fine motor function or for people in impairing situations. Therefore, we address the question of which “policy” to adopt for our metrics when multiple concurrent touches occur. To this end, we explore three policies: first-down, longest-lived, and sum-of-all. In discussing these three policies, we regard a touch trace as the successive touch ovals associated with a single finger, and a touch process as the collection of touch ovals from the first finger-down to the last finger-up, even if they are different fingers. Under this view, a single touch process can contain multiple touch traces if multiple fingers touch the screen.

3.5.1 First-Down Policy. As a baseline, we examined the first-down policy, which defines a touch process by whichever finger is first to land on the touch-sensitive display. If other fingers land on the display while the first finger is still present, they are simply ignored. This policy is obviously naïve and sometimes incorrect, but serves as a useful point of comparison.

3.5.2 Longest-Lived Policy. With the longest-lived policy, a touch process is defined by whichever finger persists on the touch-sensitive display the longest. Thus, all fingers are tracked during the touch process, and once the process has ended, the finger with the greatest duration is used to calculate our 15 touch metrics. This policy has the benefit that short-lived fingers are often the result of unintentional contact with the display surface.

3.5.3 Sum-Of-All Policy. With the sum-of-all policy, we avoid trying to discern which finger is most “indicative” of the user’s intended touch process, and instead include all touch traces in the calculation of our metrics. As unintentional finger-touches are … appear, and then simply sum these over the entire touch process.

The above three policies provide a pragmatic approach to handling cases where multiple fingers appear in a touch process that only requires a single finger. As we show below, all three policies resulted in similar empirical conclusions in our user study, which only required single-finger touches and no swipe or pinch gestures to be performed. As for metrics to specifically characterize multi-touch behaviors, we leave their formulation to future work.

4 STUDY METHOD

In this section, we describe a formal user study to put our 15 metrics through their paces. We conducted this study using custom software we built for the Microsoft Perceptive Pixel (PPI) display, which is capable of giving complete oval information for touch down, move, and up events. The goal of our study was to understand whether and how our metrics might characterize users’ touch behaviors, particularly any differences between users with and without limited fine motor function.

4.1 Participants

We recruited 27 participants, 15 of whom reported having limited fine motor function (LFMF), using email solicitations, phone calls, convenience sampling, and snowball sampling (Table 2). On average, participants with LFMF were 63.8 years old (SD = 18.8), with 8 women and 7 men. Reported health conditions included essential tremor (5), arthritis (4), Charcot-Marie-Tooth disease (1), enhanced physiological tremor (1), idiopathic tremor (1), referral nerve pain (1), spinal cord injury (1), tendinitis (1), and traumatic brain injury (1). Reported specific fine motor challenges included tremor (53.3%), spasm (40.0%), numbness (6.7%), stiffness (53.3%), pain (46.7%), rapid fatigue (26.7%), poor coordination (13.3%), low strength (33.3%), slow movements (20.0%), difficulty gripping (33.3%), difficulty lifting (20.0%), difficulty holding (20.0%), difficulty holding still (26.7%), difficulty forming hand postures (26.7%), and difficulty controlling movement direction (13.3%) and distance (13.3%). Participants without LFMF were, on average, 34.8 years old
often short-lived and produce relatively few touch events, they (( = 22.0), with 8 women and 4 men.
naturally contribute little to the sum of touch oval data than the
finger indicative of the user’s actual intentions. For people with 4.2 Apparatus
limited fine motor function, multiple fingers contacting the screen To collect touch data, we developed a custom testbed program for
can even be an indicator of their fine motor abilities and can be use on a Microsoft Perceptive Pixel (PPI) 55-inch tabletop display
reflective of their challenges using a device. Thus, for this policy, we connected to a PC running Windows 10. The testbed ran a Universal
first calculate our 15 metrics for the touch traces of all fingers that Windows Platform (UWP) application we wrote in C#. The purpose
of the testbed was to present participants with a series of crosshairs
Quantifying Touch: New Metrics for Characterizing What Happens During a Touch ASSETS ’22, October 23–26, 2022, Athens, Greece
Participants with Limited Fine Motor Function (LFMF):

| ID  | Gender | Age | Diagnosis | Fine Motor Challenges |
|-----|--------|-----|-----------|-----------------------|
| A1  | woman | 22 | Tendinitis in wrist | Stiffness, rapid fatigue |
| A2  | man | 68 | N/A | Low strength |
| A3  | man | 77 | Arthritis | Stiffness, pain, slow movements, difficulty gripping, difficulty holding, difficulty holding still, difficulty forming hand postures |
| A4  | man | 59 | Mild referral nerve pain | Pain, poor coordination, difficulty forming hand postures, difficulty controlling movement distance |
| A5  | woman | 75 | Arthritis, Charcot-Marie-Tooth disease | Tremor, spasm, stiffness, pain, rapid fatigue, poor coordination, low strength, slow movements, difficulty gripping, difficulty lifting, difficulty holding still, difficulty forming hand postures |
| A6  | man | 71 | Essential tremor | Tremor, slow movements |
| A7  | woman | 67 | Essential tremor | Tremor, spasm, pain |
| A8  | woman | 68 | Essential tremor | Tremor, spasm, stiffness, low strength, difficulty gripping, difficulty holding |
| A9  | woman | 81 | Arthritis | Tremor, stiffness, pain, difficulty gripping, difficulty lifting |
| A10 | woman | 71 | Essential tremor | Tremor, stiffness, pain, rapid fatigue, low strength, difficulty holding still, difficulty controlling movement direction, difficulty controlling movement distance |
| A11 | man | 78 | Essential tremor | Tremor |
| A12 | man | 34 | Enhanced physiological tremor | Spasm, pain, rapid fatigue |
| A13 | woman | 78 | Arthritis, familial idiopathic tremor | Tremor, spasm, stiffness, low strength, difficulty lifting, difficulty holding still, difficulty controlling movement direction |
| A14 | man | 32 | Spinal cord injury | Spasm, difficulty gripping, difficulty holding, difficulty forming hand postures |
| A15 | woman | 76 | Traumatic brain injury | Numbness, stiffness |

Participants without LFMF:

| ID  | Gender | Age |
|-----|--------|-----|
| B1  | woman | 21 |
| B2  | woman | 23 |
| B3  | man | 23 |
| B4  | woman | 25 |
| B5  | woman | 26 |
| B6  | woman | 29 |
| B7  | man | 23 |
| B8  | woman | 37 |
| B9  | woman | 80 |
| B10 | woman | 24 |
| B11 | man | 24 |
| B12 | man | 82 |

Table 2: Demographics of participants with and without limited fine motor function (LFMF).
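As a consistency check (our own illustration, not part of the paper), tallying the per-participant challenge lists transcribed from Table 2 reproduces the prevalence percentages reported in Section 4.1. A short Python sketch, with challenge names abbreviated:

```python
# Per-participant fine motor challenges, transcribed from Table 2 (LFMF group).
# Challenge names are abbreviated (e.g., "gripping" = difficulty gripping).
challenges = {
    "A1": {"stiffness", "rapid fatigue"},
    "A2": {"low strength"},
    "A3": {"stiffness", "pain", "slow movements", "gripping", "holding",
           "holding still", "hand postures"},
    "A4": {"pain", "poor coordination", "hand postures", "movement distance"},
    "A5": {"tremor", "spasm", "stiffness", "pain", "rapid fatigue",
           "poor coordination", "low strength", "slow movements", "gripping",
           "lifting", "holding still", "hand postures"},
    "A6": {"tremor", "slow movements"},
    "A7": {"tremor", "spasm", "pain"},
    "A8": {"tremor", "spasm", "stiffness", "low strength", "gripping", "holding"},
    "A9": {"tremor", "stiffness", "pain", "gripping", "lifting"},
    "A10": {"tremor", "stiffness", "pain", "rapid fatigue", "low strength",
            "holding still", "movement direction", "movement distance"},
    "A11": {"tremor"},
    "A12": {"spasm", "pain", "rapid fatigue"},
    "A13": {"tremor", "spasm", "stiffness", "low strength", "lifting",
            "holding still", "movement direction"},
    "A14": {"spasm", "gripping", "holding", "hand postures"},
    "A15": {"numbness", "stiffness"},
}

def prevalence(challenge):
    """Percentage of the 15 LFMF participants reporting the challenge."""
    n = len(challenges)
    return round(100 * sum(challenge in c for c in challenges.values()) / n, 1)

# These match the percentages reported in Section 4.1.
assert prevalence("tremor") == 53.3
assert prevalence("spasm") == 40.0
assert prevalence("stiffness") == 53.3
assert prevalence("pain") == 46.7
```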
Figure 5: Our custom tabletop testbed. Four buttons in the top-left enabled the experimenter to control the study: start session, resume session, restart session, and undo. The participant ID and trial number were displayed at the top. The large white rectangular region was defined by the participant to indicate their comfortable reach area. Crosshairs were randomly generated within this region.

Figure 6: (a) A participant using a finger to tap on a crosshairs target. (b) The sitting posture for participants who preferred it. (c) The standing posture for participants who preferred it.

that they touched as accurately as they could. All touch events (finger down, move, and up) were logged as touch ovals. Each oval was reported with its centroid location, orientation, major and minor axis lengths, and a timestamp.

To ensure participants could reach the crosshairs presented to them, before the trials began, participants were asked to indicate the area of the tabletop that they could comfortably reach by touching the four corners of a rectangle of their choosing (Figure 5). During trials, crosshairs were generated at random (x, y) locations only within this rectangle. The crosshairs themselves had a line thickness of 1 pixel and a radius of 10 pixels, and participants were told to touch the crosshairs' centers as accurately as they could. The PPI display was placed on a desk for participants who preferred to stand, or on a coffee table for participants who preferred to sit (Figure 6).

4.3 Procedure

Study sessions were conducted in-person following all COVID-19 health and safety protocols, with approval from our university's Institutional Review Board (IRB). After a brief introduction to the study and the collection of demographic information via a questionnaire (see Table 2), participants were asked to draw a rectangular area on the screen indicating the region in which they were comfortable reaching. Each participant then completed 10 practice trials to familiarize themselves with the study. A single
ASSETS ’22, October 23–26, 2022, Athens, Greece Kong, et al.
trial consisted of touching a crosshairs that was drawn at a random (x, y) location within the comfortable reach area. After the practice trials, each participant completed 10 blocks of 20 trials for a total of 200 trials. (With 27 participants, our data set consisted of 200 × 27 = 5400 touch trials in all.) Short breaks were offered between each block of trials. Upon completing all pointing trials, participants were asked to rate how challenging they thought the trials were on a Likert-type scale ranging from "1=very easy" to "7=very difficult." They were also asked about anything they found challenging during the trials, and any difficulties they have using touch screen devices in their daily lives. The study session took ∼45 minutes for participants with limited fine motor function (LFMF), and <30 minutes for participants without LFMF. Participants with LFMF were compensated $40 USD and participants without LFMF were compensated $15 USD.

4.4 Design and Analysis

The purpose of our study was to exercise our touch metrics and see how they differ for participants with and without limited fine motor function (LFMF). In formal experiment terms, our study involved a single between-subjects factor indicating whether someone reported having LFMF or not. That said, during our study sessions, we observed that the touch abilities of participants with LFMF varied a great deal, as some participants experienced significant difficulty touching crosshairs while others had almost no difficulty. To ensure that our analysis took into account such differences, we further grouped our participants into three levels having different degrees of fine motor challenge: None, Moderate, and Severe. The None group included participants who did not report having LFMF and who did not show any observable signs of having limited fine motor function during the study. The Moderate group included participants who (1) reported having LFMF, but reported that they were having a "good day" on the day of their study session, or (2) reported having LFMF, but reported that they found ways of accommodating their fine motor challenges such that their touch screen use was not affected, or (3) did not report having LFMF, but showed some observable signs of having limited fine motor function during the study. Finally, the Severe group included the participants who both reported having LFMF and showed observable signs of having fine motor challenges. These participants also described having difficulty operating touch screen devices in their everyday lives. In total, we had 11 participants in the None group, 10 participants in the Moderate group, and 6 participants in the Severe group.

For inferential statistics, because most of our metrics were conditionally non-normal, we used the rank transform procedure of Conover and Iman [9] to apply midranks to our 15 touch metrics before conducting our analyses of variance using linear mixed models [15, 36], resulting in nonparametric analyses. Our single factor was Impairment with three levels: None, Moderate, and Severe, as described above. In addition, we ran separate analyses with each specific fine motor challenge (e.g., "tremor") as a dichotomous fixed effect. As is customary, all models included Participant as a random factor to account for repeated measures [36]. Any post hoc pairwise comparisons following significant or marginal omnibus tests were corrected with Holm's sequential Bonferroni procedure [22].

5 RESULTS

In this section, we present the results of our study. These results include an examination of the effects of Impairment and of each specific fine motor challenge on our 15 touch metrics. The boxplots in Appendix A show each of our 15 touch metrics by the three levels of Impairment. We also show associations between specific self-reported fine motor challenges and the 15 touch metrics (see Table 3). But first, we discuss our choice of touch policy for resolving inadvertent multiple touches (see Section 3.5).

5.1 Choosing a Touch Policy

Overall, 1.2% of trials had more than one finger touch the surface. By level of Impairment, multiple fingers occurred in 0.1% (None), 0.5% (Moderate), and 4.4% (Severe) of trials.

We calculated our 15 touch metrics using the policies discussed in Section 3.5. In conducting our analyses, we found that the results from all three touch policies generally agreed. Specifically, our analyses revealed that in 77.2% of touch processes that had multiple fingers, the finger to land first on the table was also the longest-lived finger, explaining the convergence of results for the first-down and longest-lived policies. The sum-of-all policy also resulted in the same conclusions, but with the differences magnified, owing to the inclusion of all fingers that touched the surface during a touch process. Given the convergence of the three policies, we proceed using the sum-of-all policy for the remainder of our analyses.

5.2 How Our Metrics Reveal Touch Behaviors

Appendix A shows means and standard deviations for all 15 metrics by level of Impairment. Visually, it is apparent that the Severe group tended to have somewhat higher values for many of our touch metrics. Non-parametric analysis indicates a significant main effect of Impairment on touch variability (χ²(2, N=5400) = 15.55, p < .001), drift (χ²(2, N=5400) = 11.58, p < .01), duration (χ²(2, N=5400) = 10.85, p < .01), extent (χ²(2, N=5400) = 12.86, p < .01), and area change (χ²(2, N=5400) = 6.46, p < .05). Furthermore, Impairment had a marginal effect on absolute area change (χ²(2, N=5400) = 4.78, p = .092).

Post hoc pairwise comparisons among levels of Impairment revealed that the Severe group had significantly higher touch variability (t(24) = −3.76, p < .01), drift (t(24) = −3.27, p < .01), duration (t(24) = −3.28, p < .01), and extent (t(24) = −3.43, p < .01) than the None group, and marginally higher area change (t(24) = −2.51, p = .057) than the None group. The Severe group also had significantly higher touch variability (t(24) = −3.31, p < .01), drift (t(24) = −2.80, p < .05), and extent (t(24) = −2.98, p < .05) than the Moderate group. Thus, it seems that some of our metrics are indeed capable of revealing differences in touch behaviors among our three groups.

Along with examining the effects of Impairment on our 15 touch metrics, we also examined correlations between specific fine motor challenges and our metrics. Recall that at the start of our study, participants indicated whether or not they experienced various fine motor challenges (see Section 4.1), namely tremor, spasm, numbness, stiffness, pain, rapid fatigue, poor coordination, low strength, slow movements, difficulty gripping, difficulty lifting, difficulty holding, difficulty holding still, difficulty forming hand postures, difficulty controlling movement direction, and difficulty
| Metric | Tremor | Spasm | Stiffness | Pain | Poor Coordination | Difficulty Lifting | Difficulty Holding Still | Difficulty Forming Hand Postures | Difficulty Controlling Movement Distance |
|--------|--------|-------|-----------|------|-------------------|--------------------|--------------------------|----------------------------------|------------------------------------------|
| Direction | – | – | 3.08 · | – | – | 5.66 ∗ | 3.91 ∗ | – | – |
| Variability | 8.07 ∗∗ | – | 3.38 · | 4.95 ∗ | – | 3.89 ∗ | – | – | – |
| Drift | 6.39 ∗ | – | 2.74 · | 4.45 ∗ | – | 3.46 · | – | – | – |
| Duration | 8.27 ∗∗ | 3.57 · | – | – | – | – | – | – | 3.11 · |
| Extent | 7.04 ∗∗ | – | 3.02 · | 4.64 ∗ | – | 3.52 · | – | – | – |
| Absolute Area Change | 3.94 ∗ | – | – | – | – | – | – | – | – |
| Area Change | 3.90 ∗ | – | – | – | – | – | – | – | – |
| Area Variability | 2.73 · | – | – | – | – | – | – | – | – |
| Area Deviation | 2.71 · | – | – | – | – | – | – | – | – |
| Area Extent | 3.04 · | – | – | – | – | – | – | – | – |
| Absolute Angle Change | – | – | – | 3.72 · | – | – | – | 4.01 ∗ | – |
| Angle Change | 2.86 · | – | – | – | 6.00 ∗ | – | – | – | 2.77 · |
| Angle Variability | – | – | – | 3.51 · | – | – | – | 3.08 · | – |
| Angle Deviation | – | – | – | – | – | – | – | 3.50 · | – |
| Angle Extent | – | – | – | 3.30 · | – | – | – | 3.23 · | – |

Table 3: Chi-square (χ²) statistics for all significant (p < .05) and marginal (p < .10) effects of each fine motor challenge on the touch metrics for people with vs. people without the given challenge. All χ²-statistics are for χ²(2, N=5400). Significance indicators: · p < .10, ∗ p < .05, ∗∗ p < .01.
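The χ² values above come from the nonparametric procedure described in Section 4.4: each metric is converted to midranks over the whole data set (Conover & Iman [9]) before linear mixed models are fit on the ranks. The midrank step can be sketched as follows; this is our own minimal illustration, not the authors' analysis code, and in practice a library routine such as scipy.stats.rankdata(values, method="average") performs the same transform:

```python
def midranks(values):
    """Assign ranks 1..n, giving tied values the mean (mid) rank of their run."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in order[i:j + 1]:
            ranks[k] = mid
        i = j + 1
    return ranks

# Skewed touch-duration values (ms): the two 85s share the mean of ranks 1 and 2.
assert midranks([120, 85, 85, 300, 90]) == [4.0, 1.5, 1.5, 5.0, 3.0]
```

The mixed-model fit itself (Impairment as a fixed effect, Participant as a random factor) would then be run on these ranks with a statistics package rather than on the raw metric values.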
controlling movement distance. These challenges were indicated in a dichotomous fashion ("yes" or "no") at participants' discretion. All significant (p < .05) and marginal (p < .10) correlations between these specific fine motor challenges and our touch metrics are presented in Table 3.² This table reveals, for each specific fine motor challenge, the touch metrics with which it significantly (or marginally) correlates compared to people without that challenge. For example, people who self-reported experiencing tremor showed a significant correlation with touch variability, drift, duration, extent, and absolute area change; and marginal correlations with area variability, area deviation, area extent, and angle change compared to people who did not report experiencing tremor. Taken as a whole, then, we can use these results to understand how people experiencing different fine motor challenges tended to exhibit certain corresponding touch behaviors, as revealed by our touch metrics.

²Fine motor challenges mentioned in Section 4.1 but not shown in Table 3 exhibited no detectable correlations with our touch metrics; they are omitted for clarity of presentation.

6 DISCUSSION

Overall, touch variability, drift, duration, extent, and (signed) area change distinguished well among our three Impairment groups, especially between the None and Severe groups. To a lesser extent, the same can be said of absolute area change. Our results match our expectations that people with limited fine motor function might experience more challenges while using touch screens, and therefore would have higher values for some metrics. Our findings also comport with those of Findlater et al. [13], which showed increased pointing errors on touch screens and a high frequency of spurious touches for people with limited fine motor function. Moreover, our results also align with what we observed during our study sessions, namely that participants in the Severe group experienced greater difficulty touching accurately than those in either the None or Moderate groups. Furthermore, the touch behaviors of participants in the None and Moderate groups were often difficult to distinguish during study sessions, and so we are not surprised to find few statistically detectable differences between them.

Many or all of touch variability, drift, duration, and extent were associated with tremor, stiffness, pain, and difficulty lifting. Touch duration was also associated with spasm and difficulty controlling movement distance. Touch direction was associated with stiffness, difficulty lifting, and difficulty holding still. Touch extent was associated with tremor, stiffness, pain, and difficulty lifting. Touch angle change was associated with tremor, poor coordination, and difficulty controlling movement distance. These associations, once revealed by our analyses, are generally quite intuitive.

Although most of the area and angle metrics did not show significant differences between Impairment groups, they were nonetheless useful in characterizing different specific fine motor challenges. The touch area metrics were positively correlated with tremor (absolute area change, signed area change, area variability, area deviation, and area extent). Certain touch angle metrics were also positively correlated with tremor (signed angle change), pain (absolute angle change, angle variability, angle extent), poor coordination (signed angle change), difficulty forming hand postures (absolute angle change, angle variability, angle deviation, and angle extent), and difficulty controlling movement distance (signed angle change).

Among all specific fine motor challenges, tremor had an impact on the most individual touch metrics. This finding is intuitive, as tremor directly affects people's finger movements and use of touch screen devices. Also, stiffness was correlated with touch direction, variability, drift, and extent, which also comports with our expectations, as stiffness might result in reduced finger movement. Although challenges like pain, difficulty lifting, and difficulty forming hand postures also had associations with a few of our metrics, their expected effects on our touch metrics are less directly predictable. Regardless, our findings on the whole show that our
metrics are useful in characterizing a number of different fine motor challenges, and can be helpful for understanding users' different behaviors and challenges while using touch screens.

Our findings also match those of the earlier ACM ASSETS work that analyzed a subset of these metrics [34], namely that people who self-reported having limited fine motor function tended to have significantly larger touch variability, drift, duration, and extent than people who did not. We believe that our metrics can benefit future accessibility research by providing additional information for understanding motor-impaired touch behaviors in empirical studies, as well as by enabling the invention of accessible systems, for example, ability-aware systems that dynamically adapt to users' touch abilities (Section 7).

6.1 Limitations

As with any study, ours had limitations. To acquire full touch-oval information, our study was run on a Microsoft Perceptive Pixel (PPI) tabletop display, which required us to be co-located with our participants in a lab setting. A more ecologically valid touch data set could be obtained from people's everyday smartphone use, although based on our testing, many Android smartphones fail to report complete touch-oval data, making this approach challenging. Future smartphone models might remedy this limitation.

Although we were able to recruit 15 participants who reported having limited fine motor function, and 12 participants otherwise, our sample of 27 participants was still relatively small considering the large variety and severity of fine motor challenges. With more participants, we would probably be able to identify stronger and more numerous correlations between fine motor challenges and our 15 touch metrics, likely resolving many of our marginal results into statistically significant ones. Furthermore, many of our participant volunteers were quite tech-savvy, and might have found ways to adapt their touch behaviors to accommodate their fine motor challenges. Our distinguishing of participants with moderate and severe fine motor challenges enabled us to detect performance differences between them, but having more participants in each group would enable some statistical trends to further emerge.

Finally, our goal was to isolate participant touches using a testbed showing crosshairs as, effectively, single-pixel targets. It remains to be seen whether participants' touch behaviors would be the same on "real" user interfaces with, for example, buttons, hyperlinks, scroll bars, and menus, to name a few. A future version of our study could include not only crosshairs as targets, but also genuine user interfaces whose targets are typical of touch-based applications.

7 FUTURE WORK

In addition to addressing the limitations above, strategic directions for future work are numerous. This work provided an initial scientific exploration into 15 touch metrics and their association with limited fine motor function. Going forward, systems could be implemented that observe a user's touch behaviors and offer possible adaptations for improving the usability or accessibility of their user interfaces, even at runtime. For example, if a system detects high touch variability, drift, extent, area change, area variability, and angle change, it might be reasonable for it to hypothesize that its user experiences tremor and could benefit from larger target sizes, an idea consistent with ability-based design [57, 58]. Note that such a hypothesis could be drawn without knowing anything about the specific targets in the user interface or the particular user's intention, as all 15 of our touch metrics are target-agnostic [56]. This property of our metrics makes them easy to implement in a way that avoids the vast complications of having to identify targets in a user interface [2, 11, 12, 56]. Besides adaptive or personalized user interfaces, our metrics might also be used by software designed to diagnose or track human motor functioning. For example, a personal health application of our metrics might allow a user to self-track their touch performance with our 15 metrics to see whether they are affected by medication, time of day, fatigue, therapy, or progressive symptoms. Yet another future application of this work is in studies comparing touch screens, user groups, or user situations. We anticipate that numerous future research and development projects might benefit from measuring more than just touch accuracy and speed, using our metrics to understand why overall touch performance is what it is.

8 CONCLUSION

Current human performance measures for use with touch-based systems have focused mainly on overall performance like touch accuracy and target acquisition speed. But to understand what happens during a touch, we formalized 15 target-agnostic touch metrics: touch direction, variability, drift, duration, extent, absolute / signed area change, area variability, area deviation, area extent, absolute / signed angle change, angle variability, angle deviation, and angle extent. Instead of treating a touch as an atomic event, our metrics regard a touch as a time series of touch ovals approximating a finger's contact area from finger-down to finger-up. For each of our 15 metrics, we provided the mathematical formula and an intuitive description of what the metric means. We also included visual depictions of some of our metrics to aid understanding. And we described three policies to handle cases where multiple fingers inadvertently contact the screen during a touch.

To exercise our 15 metrics, we built a custom testbed for an interactive tabletop and collected complete touch-oval data from 27 participants, 15 of whom reported having limited fine motor function. Our analysis showed that our metrics could effectively distinguish unimpaired from impaired touch behaviors. Our metrics were also significantly associated with different self-reported fine motor challenges. Thus, our metrics can shed light on the underlying causes of touch inaccuracy, and can further help understand users' touch abilities. Conceivably, our metrics might inform the design of touch-based systems. Touch devices and touch accuracy will remain important for many years to come; our metrics can contribute to a better understanding of both.

ACKNOWLEDGMENTS

This work was supported in part by Google, by the University of Washington Center for Research and Education on Accessible Technology and Experiences (CREATE), and by National Science Foundation grant #IIS-1702751. Any opinions, findings, conclusions or recommendations expressed in our work are those of the authors and do not necessarily reflect those of any supporter.
REFERENCES
[1] Teresa Arroyo-Gallego, María Jesus Ledesma-Carbayo, Alvaro Sánchez-Ferro, Ian Butterworth, Carlos S Mendoza, Michele Matarazzo, Paloma Montero, Roberto López-Blanco, Veronica Puertas-Martin, Rocio Trincado, et al. 2017. Detection of motor impairment in Parkinson's disease via mobile touchscreen typing. IEEE Transactions on Biomedical Engineering 64, 9 (2017), 1994–2002.
[2] Ravin Balakrishnan. 2004. "Beating" Fitts' law: virtual enhancements for pointing facilitation. International Journal of Human-Computer Studies 61, 6 (2004), 857–874.
[3] Hrvoje Benko, Andrew D Wilson, and Patrick Baudisch. 2006. Precise selection techniques for multi-touch screens. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1263–1272.
[4] Joanna Bergstrom-Lehtovirta, Antti Oulasvirta, and Stephen Brewster. 2011. The effects of walking speed on target acquisition on a touchscreen interface. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services. 143–146.
[5] Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts law: modeling finger touch with Fitts' law. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1363–1372.
[6] Xiaojun Bi and Shumin Zhai. 2013. Bayesian touch: a statistical criterion of target selection with finger touch. In Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology. 51–60.
[7] Xiang Cao, Andrew D Wilson, Ravin Balakrishnan, Ken Hinckley, and Scott E Hudson. 2008. ShapeTouch: Leveraging contact shape on interactive surfaces. In 2008 3rd IEEE International Workshop on Horizontal Interactive Human Computer Systems. IEEE, 129–136.
[8] Sinead Carew. 2009. Touch-screen gadgets alienate blind. Reuters, January 8 (2009).
[9] William J Conover and Ronald L Iman. 1981. Rank transformations as a bridge between parametric and nonparametric statistics. The American Statistician 35, 3 (1981), 124–129.
[10] David F Dinges and John W Powell. 1985. Microcomputer analyses of performance on a portable, simple visual RT task during sustained operations. Behavior Research Methods, Instruments, & Computers 17, 6 (1985), 652–655.
[11] Morgan Dixon, James Fogarty, and Jacob Wobbrock. 2012. A general-purpose target-aware pointing enhancement using pixel-level analysis of graphical interfaces. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 3167–3176.
[12] Abigail Evans and Jacob Wobbrock. 2012. Taming wild behavior: The input observer for obtaining text entry and mouse pointing measures from everyday computer use. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1947–1956.
[13] Leah Findlater, Karyn Moffatt, Jon E Froehlich, Meethu Malu, and Joan Zhang. 2017. Comparing touchscreen and mouse input performance by people with and without upper body motor impairments. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 6056–6061.
[14] Leah Findlater and Lotus Zhang. 2020. Input Accessibility: A Large Dataset and Summary Analysis of Age, Motor Ability and Input Performance. In Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–6.
[15] Brigitte N Frederick. 1999. Fixed-, Random-, and Mixed-Effects ANOVA Models: A User-Friendly Guide for Increasing the Generalizability of ANOVA Results. (1999).
[16] Luca Giancardo, Alvaro Sanchez-Ferro, Teresa Arroyo-Gallego, Ian Butterworth, Carlos S Mendoza, Paloma Montero, Michele Matarazzo, José A Obeso, Martha L Gray, and R San José Estépar. 2016. Computer keyboard interaction as an indicator of early Parkinson's disease. Scientific Reports 6, 1 (2016), 1–10.
[23] Christian Holz and Patrick Baudisch. 2010. The generalized perceived input point model and how to double touch accuracy by extracting fingerprints. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 581–590.
[24] Christian Holz and Patrick Baudisch. 2011. Understanding touch. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 2501–2510.
[25] Christian Holz and Patrick Baudisch. 2013. Fiberio: a touchscreen that senses fingerprints. In Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology. 41–50.
[26] Faustina Hwang, Simeon Keates, Patrick Langdon, and John Clarkson. 2004. Mouse movements of motion-impaired users: A submovement analysis. In Proceedings of the 5th International ACM SIGACCESS Conference on Computers and Accessibility. 102–109.
[27] Dimitrios Iakovakis, Stelios Hadjidimitriou, Vasileios Charisis, Sevasti Bostantzopoulou, Zoe Katsarou, and Leontios J Hadjileontiadis. 2018. Touchscreen typing-pattern analysis for detecting fine motor skills decline in early-stage Parkinson's disease. Scientific Reports 8, 1 (2018), 1–13.
[28] Shaun K Kane, Jeffrey P Bigham, and Jacob O Wobbrock. 2008. Slide Rule: making mobile touch screens accessible to blind people using multi-touch interaction techniques. In Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility. 73–80.
[29] Shaun K Kane, Brian Frey, and Jacob O Wobbrock. 2013. Access lens: a gesture-based screen reader for real-world documents. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 347–350.
[30] Shaun K Kane, Meredith Ringel Morris, Annuska Z Perkins, Daniel Wigdor, Richard E Ladner, and Jacob O Wobbrock. 2011. Access overlays: improving non-visual access to large touch screens for blind users. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. 273–282.
[31] Shaun K Kane, Meredith Ringel Morris, and Jacob O Wobbrock. 2013. Touchplates: low-cost tactile overlays for visually impaired touch screen users. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility. 1–8.
[32] Matthew Kay, Kyle Rector, Sunny Consolvo, Ben Greenstein, Jacob O Wobbrock, Nathaniel F Watson, and Julie A Kientz. 2013. PVT-touch: adapting a reaction time test for touchscreen devices. In 2013 7th International Conference on Pervasive Computing Technologies for Healthcare and Workshops. IEEE, 248–251.
[33] Simeon Keates, Faustina Hwang, Patrick Langdon, P John Clarkson, and Peter Robinson. 2002. Cursor measures for motion-impaired computer users. In Proceedings of the 4th International ACM SIGACCESS Conference on Computers and Accessibility. 135–142.
[34] Junhan Kong, Mingyuan Zhong, James Fogarty, and Jacob O Wobbrock. 2021. New Metrics for Understanding Touch by People with and without Limited Fine Motor Function. In Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility. (To appear).
[35] Abinaya Kumar, Aishwarya Radjesh, Sven Mayer, and Huy Viet Le. 2019. Improving the input accuracy of touchscreens using deep learning. In Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1–6.
[36] Ramon C Littell, PR Henry, and Clarence B Ammerman. 1998. Statistical analysis of repeated measures data using SAS procedures. Journal of Animal Science 76, 4 (1998), 1216–1231.
[37] I Scott MacKenzie, Tatu Kauppinen, and Miika Silfverberg. 2001. Accuracy measures for evaluating computer pointing devices. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 9–16.
[38] Rafail-Evangelos Mastoras, Dimitrios Iakovakis, Stelios Hadjidimitriou, Vasileios Charisis, Seada Kassie, Taoufik Alsaadi, Ahsan Khandoker, and Leontios J Hadjileontiadis. 2019. Touchscreen typing pattern analysis for remote detection of the depressive tendency. Scientific Reports 9, 1 (2019), 1–12.
[17] Mayank Goel, Leah Findlater, and Jacob Wobbrock. 2012. WalkType: using [39] Sven Mayer, Huy Viet Le, and Niels Henze. 2017. Estimating the finger orientation
accelerometer data to accomodate situational impairments in mobile touch screen on capacitive touchscreens using convolutional neural networks. In Proceedings
text entry. In Proceedings of the ACM SIGCHI Conference on Human Factors in of the ACM International Conference on Interactive Surfaces and Spaces. 220–229.
Computing Systems. 2687–2696. [40] Alexander Mertens, Nicole Jochems, Christopher M Schlick, Daniel Dünnebacke,
[18] Mayank Goel, Alex Jansen, Travis Mandel, Shwetak N Patel, and Jacob O and Jan Henrik Dornberg. 2010. Design pattern TRABING: touchscreen-based
Wobbrock. 2013. ContextType: using hand posture information to improve input technique for people affected by intention tremor. In Proceedings of the
mobile touch screen text entry. In Proceedings of the ACM SIGCHI Conference on ACM SIGCHI Symposium on Engineering Interactive Computing Systems. 267–272.
Human Factors in Computing Systems. 2795–2798. [41] Kyle Montague, Vicki L Hanson, and Andy Cobley. 2012. Designing for
[19] Tiago Guerreiro, Hugo Nicolau, Joaquim Jorge, and Daniel Gonçalves. 2010. individuals: usable touch-screen interaction through shared user models. In
Towards accessible touch interfaces. In Proceedings of the 12th International ACM Proceedings of the 14th International ACM SIGACCESS Conference on Computers
SIGACCESS Conference on Computers and Accessibility. 19–26. and Accessibility. 151–158.
[20] Anhong Guo, Junhan Kong, Michael Rivera, Frank F Xu, and Jeffrey P Bigham. [42] Kyle Montague, Hugo Nicolau, and Vicki L Hanson. 2014. Motor-impaired
2019. Statelens: A reverse engineering solution for making existing dynamic touchscreen interactions in the wild. In Proceedings of the 16th international ACM
touchscreens accessible. In Proceedings of the 32nd Annual ACM Symposium on SIGACCESS conference on Computers & accessibility. 123–130.
User Interface Software and Technology. 371–385. [43] Martez E Mott, Radu-Daniel Vatavu, Shaun K Kane, and Jacob O Wobbrock. 2016.
[21] Chris Harrison, Julia Schwarz, and Scott E Hudson. 2011. TapSense: enhancing Smart touch: Improving touch accuracy for people with motor impairments with
finger interaction on touch surfaces. In Proceedings of the 24th Annual ACM template matching. In Proceedings of the ACM SIGCHI Conference on Human
Symposium on User Interface Software and Technology. 627–636. Factors in Computing Systems. 1934–1946.
[22] Sture Holm. 1979. A simple sequentially rejective multiple test procedure. [44] Martez E Mott and Jacob O Wobbrock. 2019. Cluster Touch: Improving touch
Scandinavian Journal of Statistics (1979), 65–70. accuracy on smartphones for people with motor and situational impairments.
In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing
Systems. 1–14.
ASSETS ’22, October 23–26, 2022, Athens, Greece Kong, et al.

A DISTRIBUTION OF VALUES OF TOUCH METRICS BY LEVEL OF IMPAIRMENT

The 15 boxplots below show the distribution of values of all 15 touch metrics by level of impairment. The plots are organized such that the first column shows metrics based on touch location and time (Section 3.2), the second column shows metrics based on touch area (Section 3.3), and the third column shows metrics based on touch angle (Section 3.4).
Quantifying Touch: New Metrics for Characterizing What Happens During a Touch ASSETS ’22, October 23–26, 2022, Athens, Greece

[Figure: 15 boxplots, one per touch metric, arranged in five rows of three columns; each shows the distribution of one metric's values at each level of impairment (None, Moderate, Severe).]
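The boxplots in this appendix summarize each metric per impairment level with the usual five-number summary. For readers who want to reproduce this kind of summary, the sketch below computes standard Tukey boxplot statistics (median, quartiles, 1.5 × IQR whiskers, outliers) for one metric grouped by impairment level. The `boxplot_stats` helper and the sample values are illustrative assumptions, not the paper's code or data.

```python
# Tukey boxplot statistics per impairment level, as used to draw one
# boxplot panel per touch metric. The sample values below are made up
# for illustration; they are NOT data from the study.

def quartile(sorted_vals, q):
    """Linear-interpolation quantile of a pre-sorted list (0 <= q <= 1)."""
    n = len(sorted_vals)
    pos = q * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def boxplot_stats(values):
    """Median, quartiles, and Tukey whiskers (1.5 * IQR rule) for one group."""
    s = sorted(values)
    q1, med, q3 = (quartile(s, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    whisker_lo = min(v for v in s if v >= lo_fence)   # lowest non-outlier
    whisker_hi = max(v for v in s if v <= hi_fence)   # highest non-outlier
    outliers = [v for v in s if v < lo_fence or v > hi_fence]
    return {"q1": q1, "median": med, "q3": q3,
            "whiskers": (whisker_lo, whisker_hi), "outliers": outliers}

# Hypothetical per-trial values of one touch metric, grouped by level.
metric_by_level = {
    "None":     [4.1, 4.8, 5.0, 5.2, 5.9, 6.3, 12.0],
    "Moderate": [5.5, 6.1, 7.0, 7.4, 8.2, 9.0],
    "Severe":   [7.9, 8.8, 9.5, 10.1, 11.4, 14.2],
}
stats = {level: boxplot_stats(vals) for level, vals in metric_by_level.items()}
```

Points beyond 1.5 × IQR from the box edges are drawn as individual outlier dots, matching the convention in the figure above.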
Creating 3D Printed Assistive Technology Through Design
Shortcuts: Leveraging Digital Fabrication Services to Incorporate
3D Printing into the Physical Therapy Classroom
Erin L. Higgins∗
University of Maryland, Baltimore County
erinh2@umbc.edu

William Easley
University of Maryland, Baltimore County
williameasley3@gmail.com

Karen Gordes
University of Maryland Baltimore, Graduate School
karen.gordes@umaryland.edu

Amy Hurst
New York University
amyhurst@nyu.edu

Foad Hamidi
University of Maryland, Baltimore County
foadhamidi@umbc.edu
ABSTRACT
Digital fabrication methods have been shown to be an effective method for producing customized assistive technology (AT). However, utilizing these tools currently requires a high level of technical skill. Previous research showed that integration of these skills within physical therapy training is appropriate, but that the level of technical difficulty required can be an issue. We worked to address these issues by introducing a group of PT students to maker concepts and having them develop custom AT for real end users with the help of makers. We present three considerations when integrating making into PT curriculum: 1) including all stakeholders, 2) developing interdisciplinary competencies for PTs and makers, and 3) leveraging academic training programs to connect makers and PT students. In this paper, we contribute to knowledge on how to facilitate the 3D printing of customized ATs for PT students by connecting them with a community organization that provides digital fabrication services and technical expertise. By connecting multiple stakeholders (i.e., PT students, digital fabricators, and AT users), we offer an approach that overcomes the time and capacity constraints PT students face in utilizing advanced fabrication technologies to create customized ATs by connecting them to professional makers.

CCS CONCEPTS
• Human-centered computing → Accessibility.

KEYWORDS
3D Printing, Assistive Technology, Physical Therapy, Makerspaces, Education, Digital Fabrication

∗ Corresponding author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544816

ACM Reference Format:
Erin L. Higgins, William Easley, Karen Gordes, Amy Hurst, and Foad Hamidi. 2022. Creating 3D Printed Assistive Technology Through Design Shortcuts: Leveraging Digital Fabrication Services to Incorporate 3D Printing into the Physical Therapy Classroom. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544816

1 INTRODUCTION
Previous research has shown that consumer-grade fabrication methods, such as 3D printing and laser cutting, can improve Assistive Technology (AT) development and production [1-3]. In a recent article, Mankoff et al. provided an overview of the possibilities and challenges of using consumer-grade fabrication methods to develop customized AT solutions [1]. They found that these fabrication methods have great potential for creating small-batch customized devices that may better meet the needs of a user and for involving users more closely in the design and creation of their own technology. Despite this, the research also identified that current fabrication tools and processes are not inclusive of people without prior technical expertise or knowledge and that, without stakeholder involvement at all levels of fabrication design and implementation, important challenges in integrating these techniques into therapy and medicine remain [1]. In the last few decades, several online communities of makers interested in creating customized ATs have formed [2, 4]. While successful outcomes from these communities are documented, previous research has also shown a tension between the priorities of hobbyists and makers and those of clinicians and therapists [5]. The tension, manifesting in concerns about the safety and practicality of ATs customized or created by makers, may be due to the difference between the clinical ethos of "do not harm" and the making ethos of "help where you can" [5].

Previous research has explored how to better connect digital fabrication and physical therapy skill sets. McDonald et al. conducted a study on how to introduce Physical Therapy (PT) students to 3D printing [3, 6]. The study was motivated by the potential of enhancing clinicians' ability to use digital fabrication skills to develop customized, usable, and safe devices by providing them with the needed maker skills and experience. While the research
ASSETS ’22, October 23–26, 2022, Athens, Greece Erin Higgins et al.

showed that 3D printing can be a viable and useful addition to PT students' current curriculum, it also identified several issues with integrating it. Specifically, it identified the significant amount of time required for clinicians to learn the Computer-Aided Design (CAD) skills necessary to model a 3D printable object or specific and detailed features of an AT device. Furthermore, the study identified the lack of consistent access to fabrication tools, and the challenge of getting end users to return for the multiple visits required to tweak and evaluate the designs, as barriers for PTs using digital fabrication for AT design and creation [3]. Overall, this previous research shows that while digital fabrication tools have great potential for supporting PTs to create and customize ATs for their clients, it is unclear how to make them accessible without requiring PTs to become experts in 3D printing or CAD.

To investigate how to make digital fabrication processes accessible to PT students without the need for them to become experts at using them, we conducted a study that connected PT students to staff at a community makerspace that provides 3D printing and scanning services to local organizations and individuals. By leveraging the services and skills of expert makers, rather than learning complex and time-consuming 3D modeling and printing skills, PT students could focus on understanding and translating their clients' needs to a physical design to be communicated for realization. Furthermore, by including real end users in the design and fabrication process, we investigated how to form small stakeholder groups, each consisting of a group of clinicians, an end user, and a community maker, who would communicate and work together to create a customized AT. This approach allowed us to investigate the communication possibilities and challenges involved in this process to design future communication protocols that may alleviate the need for PTs to learn digital fabrication skills and instead use external fabrication services and resources to create customized ATs.

In this study, we seek to understand (1) if consumer-grade fabrication with 3D printers can be used to produce customized and usable ATs for real end users in the context of a graduate PT classroom, (2) if resulting AT designs could be accurately communicated by PT students to a youth-staffed community makerspace for fabrication, and (3) what information would be needed by makers and PTs to successfully communicate with each other about this process.

We investigated these questions by integrating six 3D printing class sessions into the curriculum of 58 PT graduate students over two semesters. The students were first given an introduction to 3D printing; then they designed AT devices for 5 simulated end users, leveraging the resources of a community makerspace; and then repeated the process to create AT devices for 12 real end users.

In this paper, we contribute to knowledge on how to facilitate the 3D printing of customized ATs for PT students by connecting them with a community organization that provides digital fabrication services and technical expertise. By connecting multiple stakeholders (i.e., PT students, digital fabricators, and AT users), we offer an approach that overcomes the time and capacity constraints PT students face in utilizing advanced fabrication technologies to create customized ATs by connecting them to professional makers. We describe our observations, including feedback and reactions from PT students who worked with real end users in a classroom setting to design and realize customized AT products. Our findings provide insights on how the AT design process can be improved for PT students; how the communication between PT students and professional makers can be strengthened, including when to use clay modeling and when measurements and sketching might be a better option; and how future PT training programs can better integrate interdisciplinary skills and competencies related to digital fabrication, including those related to 3D scanning and printing practice.

We will contextualize our study using previous research into AT abandonment, the use of consumer-grade fabrication tools by people with disabilities, the practices of online DIY communities specifically focused on designing ATs, and previous efforts to use 3D printing in the context of PT education. We then describe how we designed and implemented class sessions on 3D printing and fabrication into a graduate PT course and describe our data collection and analysis procedures. Next, we report findings from our study and describe each end user's resulting customized AT device. We conclude with a discussion of PT students' experiences with respect to the different aspects of the project and suggest directions for future work.

2 RELATED WORK
Our work builds on the existing body of research that has explored the possibilities and challenges of using digital fabrication processes, such as 3D printing, for AT development and customization. In this section, we provide an overview of this previous research, including work on AT abandonment that motivates efforts to customize ATs to better meet the needs of end users, studies of Do-It-Yourself (DIY) and bespoke approaches to AT design, and the role of AT development in physical therapy education.

2.1 AT Abandonment and Fit Issues
According to the Centers for Disease Control and Prevention, 61 million adults in the United States live with a disability, equating to 26% or 1 in 4 adults [7]. Worldwide, an estimated one billion individuals need ATs. This number is expected to increase to 2 billion by 2030 [8]. Without appropriate AT, individuals are often unable to fully participate in society and live active, independent lives. Therefore, it is increasingly important for AT procurement to become accessible to a large number of people.

AT-focused projects tend to have a very high social impact but a low economic impact [9] and related profit margins. Because of these relatively small profit margins, it is uncommon for companies to fund research and development that advances the field of AT to meet the growing need. This leads to many issues.

One issue is the high abandonment rate of ATs. Of all ATs that are prescribed, 20 to 30 percent are abandoned by end users [10]. Poorly designed or fitted devices that do not meet the needs of the end users are a key factor related to abandonment [11, 12]. Though improved service techniques have been shown to reduce abandonment, there is a growing concern that the quality and reliability of ATs are decreasing over time [13-16]. Reduced quality may be due in part to reduced reimbursements for ATs and related services [14, 15, 17], which leads to a second major issue in AT – a lack of financial incentive for companies to provide post-launch support.

These issues have motivated efforts to develop ways for end users to be more involved in the design and customization of ATs
to reduce poor fit and increase user technology buy-in. Our study contributes to this space by investigating how to connect PTs, makers, and end users to allow for the creation of assistive devices that are customized based on user needs and are, therefore, less likely to be abandoned.

2.2 Consumer Grade Fabrication Tools for People with Disabilities
Over the past decades, much research has investigated the possibilities of using consumer grade fabrication tools and prototyping materials for creating customized and Do-It-Yourself Assistive Technologies (DIY-ATs) [4, 18-20]. Much of this effort has focused on understanding how creating their own technologies can lead to empowerment for people with disabilities [18, 19, 21]. Previous studies have shown that AT users find it empowering to create their own AT, which could lead to fewer cases of abandonment. Hurst and Tobias showed that individuals with disabilities were motivated by increased control over design elements, passion, and cost to create and use their own DIY-ATs [18]. Meissner et al. found that participants with disabilities in DIY-AT workshops reported several empowerment-related outcomes, including viewing maker skills as extensions of their own accessibility hacking abilities and as tools to gain recognition in their community. Other efforts have shown that creating DIY-ATs can lead to community building both offline (e.g., [22]) and online (see next subsection).

A number of barriers have been identified in relation to DIY-ATs. For example, many of the tools that are used in makerspaces, along with makerspaces themselves, are not accessible. This greatly limits the ability of individuals with disabilities to work in these spaces independently. Recommendations have been made for special education makerspaces [23, 24], but this has yet to become the norm for these settings. The design of CAD tools, informed by HCI research, could greatly increase the inclusiveness of consumer-grade fabricating [25] but has yet to produce meaningful changes. Furthermore, Hook et al. found that considerable skill and time are required in creating DIY-ATs, putting additional pressures on parents or teachers [26].

Recognizing the need for diverse expertise in this space, research efforts have brought together multiple stakeholders (e.g., end users, digital fabricators/makers, therapists) [5, 27]. For example, Aflatoony and Lee conducted a study that brought together an end user with physical disabilities, four occupational therapists, and four industrial designers in a series of workshops to co-design ATs [27, 28]. They found that working together resulted in knowledge exchanges and mutual learning, and in applying combined expertise to co-designing novel and advanced AT solutions. They further identified the lack of tools, methods, and materials to co-design ATs.

Our work builds on these efforts to leverage the potential of consumer grade fabrication tools to create ATs with input from end users. While it is important to continue making digital fabrication tools accessible, creating processes that bring together stakeholders with complementary skills and lived experiences can enrich the space of AT development and reduce the burden of learning new skills. While some individuals will enjoy the process of design, there should not be a requirement to also be an engineer to create or customize appropriate AT devices for users in a PT setting.

2.3 DIY Online Communities for AT
In addition to the hands-on, localized efforts to create DIY-AT described above, many studies have looked at how to create or leverage online communities to connect makers interested in developing ATs, end users, and other stakeholders. For example, Buehler et al. studied technologies related to AT posted on the online community Thingiverse [2]. While they found that the platform housed a rich collection of AT designs for free usage and customization, they also found that a majority of the AT designers using it had neither disabilities nor any training in AT.

While Thingiverse is a general-purpose fabrication design repository that houses AT designs in addition to many other designs, other communities have formed that focus specifically on DIY-ATs. For example, the e-NABLE community was created specifically to support makers focused on prosthetic devices [4]. Their open-sourced designs are shared online and can be printed by volunteers for those who need them. Despite its promise and potential to offer customized ATs to a wide range of users across the world, previous research has identified several challenges in this community. For example, Parry-Hill et al. showed that while there are a number of e-NABLE chapters around the world, there are only a few members who can design devices as opposed to those who can fabricate and deliver them. Furthermore, when someone receives a device, there is the issue of not being able to get back in contact with the volunteer if something goes wrong [4]. At a workshop that brought together various stakeholders of this community, including makers and clinicians, serious tensions were identified between the "do no harm" culture of clinicians and the "help where you can" culture of makers [5]. Furthermore, the workshop surfaced concerns about the ability of consumer grade materials and design tools to produce artifacts and customizations specific enough for clinical practice.

Previous research into online DIY-AT communities provides motivation to develop better communication protocols and mechanisms that bring together the expertise of digital fabricators, professional PTs, and the lived experiences of end users in the design and customization of ATs. These approaches may ensure device safety and medical appropriateness by including feedback from PTs, while strengthening device acceptance by incorporating feedback from end users.

2.4 Digital Fabrication for AT Development in Clinical Education
Until recently, there were few projects that studied how to bring digital fabrication, including 3D printing, into the physical and occupational therapy classrooms. A recent systematic review of using 3D printing in biological education found only one paper focused on PT [29]. In this pioneering project, McDonald et al. developed a course to introduce PT students to 3D printing [3]. They identified several opportunities for connecting PTs and makers in the future, including that (1) PTs already perform making tasks and presenting them with new DIY tools could expand this existing practice, (2) PTs bring important and complementary medical expertise to the
table when engaging with DIY practice and could ensure that DIY-AT products are created to be safe and appropriate for extended use, and (3) PTs had access to actual end users who could inform the DIY maker process. They also identified multiple barriers in introducing PTs to digital fabrication. These included PTs having limited time to learn new digital fabrication skills, the availability of easily purchased devices that most PTs default to because of their limited time rather than creating new artifacts, and the lack of tools to ensure standardized reliability of DIY-AT products, leading to concerns about their liability.

More recently, several efforts have explored interdisciplinary approaches for integrating digital fabrication in PT education. For example, Wagner et al. described a project which connected PT and OT students and faculty with librarians operating 3D printers on a college campus [30]. The librarians were able to connect OT and PT students to a group of biomedical engineering students to help with 3D modeling tasks. School faculty collaborated with the librarians to develop assistive technology assignments for students that were then printed at the library. Over three years, 78 students collaborated with librarians and faculty to 3D print ATs. The stakeholders used in-person and virtual meetings as well as email and shared spreadsheets to keep track of projects [30]. This case study shows that leveraging services (e.g., librarians trained in 3D printing) can be an effective way to introduce 3D printing in OT and PT education through the creation of ATs. In another recent project, Davis and Gurney assessed the impact of using 3D printing in project-based OT learning and found that, compared with a control group, students who used 3D printing in their projects had increased technology self-efficacy [31]. While students were asked to consider a "real-world" user of their designs, they did not interact with that user or work iteratively on their designs; following a month-long module in 3D printing, they were asked to design and print devices for their projects. While both of these projects provide evidence that 3D printing can be successfully introduced in the PT and OT classroom, they did not provide details on the challenges faced in implementing the programs or the communication strategies that proved successful or otherwise with respect to specific student projects.

Other work by Chen et al. elicited positive attitudes in OT clinicians towards integrating 3D printing into their practice, especially with respect to the potential for reducing the cost of customized items and the potential for the customizability of digital fabrication to fill a gap in creating ATs [32]. The clinicians stressed that the time to learn how to use 3D printing effectively can be a barrier to adoption. While most previous efforts have focused on general applications of 3D printing for PT and OT, Paterson et al. developed and evaluated computer-aided design software specifically for creating wrist splints [33]. The software and associated workflow allow practitioners to customize a base model to match a client's 3D scanned wrist model and create a customized model of an aesthetically pleasing splint. In interviews with 10 clinicians, they found that users were able to navigate and use the software to create models but were also concerned about capturing patient scan data (needed as input into the software tool) and the material and cost suitability of the designs (since the models need to be created on higher-end fabrication devices) [33].

Finally, several studies have focused on the existing practices of OTs and PTs when adapting ATs. For example, in an interview study with 17 OTs, Afatoony and Shenai investigated how clinicians customize and adapt ATs using low-tech materials to better fit the needs of their clients [34]. They found that while most OTs tried to use off-the-shelf ATs, adaptations were often necessary to fit a client's needs. Most OTs described using low-cost craft materials (e.g., foam, duct tape) and techniques (e.g., gluing, or cutting to enlarge or reduce size, or to attach items together) to make adaptations. They described a process where they would assess a client's need, make an adaptation for them, and then iterate on it with the client to troubleshoot and refine it over time. Additionally, OTs mentioned having concerns around the liability and safety of modified devices, which led them to make minimal mechanical adaptations and, for some, to avoid adapting ATs with electronic components altogether. In another study, where the authors worked with four OTs as digital fabricators over a period of four months, Hofmann et al. found a misalignment between the priorities of OTs and digital fabricators/makers: specifically, OTs prioritized client safety over taking risks in fabrication and aimed to come up with effective adaptations for their clients in one shot (rather than using iterations) [35]. Furthermore, they identified a need to make clear to clients what resources are available and what is feasible, so they are not overreaching when using digital fabrication methods. These findings show that adapting and customizing ATs is an important part of OT practice and could be supported by utilizing emerging technological tools and processes. Furthermore, clinicians have concerns and priorities that may differ from those of makers and digital fabricators.

Our work builds on this previous research by both recognizing the potential of 3D printing to enhance the learning of OT and PT students and by studying an approach that limits the demands on PT students' time and technical expertise when using digital fabrication for AT development, by connecting them to professional services provided by staff at a community makerspace.

3 METHODS

3.1 Educational Series Design
We designed and implemented a digital fabrication educational series for a single cohort of PT students that consisted of six face-to-face learning sessions. This course series was modeled after an earlier course designed by McDonald et al. [3], but there were three major differences. First, in contrast to the earlier course, CAD was not taught in this course but simply introduced, to give the PTs the ability to communicate with makers effectively. Second, after designing with simulated end users, PT students in our course designed with real end users. Finally, instead of the research team 3D printing the designs, they were fabricated by a local community makerspace.

The first session was a 1.5-hour introduction to AT, with the subsequent five sessions involving AT development (modeling, revising, reviewing) and each spanning 3 hours in length. All six sessions were conducted with the same cohort of PT students over a total of 2 semesters within the PT educational curriculum. During the initial sessions (2, 3, and 4), students designed AT for simulated end users, and during the later sessions (5 and 6) they worked with real end users on AT devices. We will describe the format and content of each phase next.
Creating 3D Printed Assistive Technology Through Design Shortcuts: Leveraging Digital Fabrication Services to
Incorporate 3D Printing into the Physical Therapy Classroom ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 1: Simulated AT Design Cases (adapted from McDonald et al. 2016)

Case Number | Age | Gender | Diagnosis | Specifications
1 | 62 | Male | Left hemorrhagic stroke | Requires use of hemiwalker; limited ability to grasp with right hand
2 | 45 | Female | Traumatic Brain Injury (TBI) | Requires use of a quad cane; limited by right wrist flexor synergy
3 | 32 | Male | Humeroradial fracture | Requires use of axillary crutches for recent ankle sprain; limited by right elbow flexion range
4 | 60 | Female | Right adhesive capsulitis | Requires use of straight cane for balance; limited by shoulder flexion range
5 | 8 | Male | Cerebral palsy | Balance issues; limited by fixed trunk flexion to 20 degrees; fixed trunk right side bend/rotation to 10 degrees

The fabrication for the educational learning series took place at a local community makerspace staffed by youth employees with expertise in 3D modeling and CAD, 3D scanning, 3D printing, and other digital fabrication methods (e.g., laser cutting). The makerspace houses a 3D print shop with multiple consumer-grade mid-range printers (e.g., LulzBots, Prusas, and Ultimakers) that use common plastic filaments including PLA and NinjaFlex flexible filament.

3.1.1 Phase 1: Building Foundational Skills with Simulated End Users. The six-session, in-person educational series introduced PT students to 3D printing and designing custom assistive technologies across two semesters of the PT curriculum, with the first four sessions incorporated into a foundational PT course and the last two sessions embedded into a clinically focused PT course.

The first four sessions were embedded into a PT course designed to bridge foundational sciences with introductory clinical skills. For instance, in this course PT learners are exposed to principles of biomechanics and the application of these concepts to assistive devices. Session One began with a definition of additive manufacturing and a description of current applications of 3D printing, with specific examples of PT products (such as handgrips). The students were then given an overview of the common types of 3D printers (including SLA, SLS, and FDM) and their pros and cons. The students then went through a detailed description of the process of designing for 3D printing and of additive 3D printers and how they work. Students were finally asked to complete a thought exercise in which they came up with an assistive device that could be augmented by 3D printing. The intended outcomes of this session were an understanding of the process and the limitations of 3D printing and a set of brainstormed practical uses.

In the second session, the PT students were separated into 5 teams and designed assistive devices for imaginary end user cases (Table 1). In these cases, the PT students were presented with scenarios in which an end user needed an assistive technology device that was no larger than 5 by 5 by 5 inches. This limitation was to ensure that it could be printed on the consumer-grade 3D printers that were available at the makerspace.

The imaginary design cases were presented as shown in Table 1 and adapted from McDonald et al. They were created by the PT instructor and focused on designing augmentations to pre-existing devices. The scenarios reflected the students' learning objectives in the course and, therefore, centered on walking aids such as canes, crutches, and walkers. The cases reflect common pathologies and frequently used assistive devices. They provided an opportunity for practical application of the students' existing knowledge and skills in the new domain of 3D printing. By utilizing familiar content, we reduced cognitive overload.

In the second session, the teams used the specifications provided in the table to fill out order forms (Table 2) that were sent to the local makerspace, where staff printed prototypes based on those specifications. Questions five and nine were added to this form based on feedback after phase 1. In the third session, PT students presented the first version of their designs to the class and had an opportunity to make final revisions by resubmitting revised order forms. In the fourth session, the 3D printed models were delivered and evaluated by the PT students.

3.1.2 Phase 2: Designing with End Users. For sessions five and six, PT learners were asked to apply knowledge from the first phase of the course to design AT devices for 12 volunteer end users (Table 3). Sessions 5 and 6 were incorporated into a course designed to teach advanced concepts of clinical care across patient populations, specifically medically-complex patient populations. In session five, the PT students were broken into 12 teams and assigned to do design and modeling sessions for a particular end user.

For these sessions, we kept the constraint of the maximum 5 by 5 by 5-inch dimensions. Based on our observations from the first sessions, we reevaluated and refined the order form. The main changes were to add two questions: one regarding the physical properties and how the object should feel, and one clarifying whether the object was to be attached to another object. This final version of the order form (Table 2) was used to provide specifications to the community makerspace.

During this phase, student groups had the option of using clay models to specify AT designs. If a group used clay models, they were left to dry and then brought back to the print shop for scanning using a NextEngine 3D Laser Scanner. Once the 3D model was captured and refined by the staff, it was 3D printed. Some groups had their designs printed based off of sketches and dimensions given through the order form. Before the final review session, all 12 teams were allowed an opportunity to make modifications to

Table 2: Questions from the Final Order Form Used in the Course

Question Number | Question Text
1 | Client description: age, diagnosis, expected use of device, expected impact of device use
2 | Object Drawing: Please provide a drawing of your final object design (with dimensions in mm).
3 | Surface Details: Are there any decorative aspects (surface details) that are not important to your finished print?
4 | Material (circle one): STIFF FLEXIBLE. a. Please provide a detailed description for why your object should be stiff or flexible? b. Will this 3D printed object come in contact with your client's skin? (Circle one): Yes/No. c. If yes, please describe where the client will touch it, and how much weight you think they will put on it.
5 (not included in phase 1) | Will the 3D printed object attach to an existing object? (Circle one): Yes/No. a. If yes, please describe that object and provide a drawing showing how the 3D printed part will touch the other part.
6 | Density (circle one): HOLLOW PARTIALLY FILLED SOLID. b. How much weight will your object hold (e.g., in pounds)? How much stress will it be under?
7 | Size: Should your printed object be the same size as your clay model? (Circle one): YES/NO. c. If no, should your model be scaled up or down? (Maximum dimensions of 139mm x 139mm x 139mm)
8 | Color: What color is your object? (Circle one): Black/White
9 (not included in phase 1) | Physical Properties: How should your printed object feel? Describe another object that has similar physical properties (strength/flexibility/texture)
10 | Additional notes

Table 3: End User Information and Descriptions of Diagnosis, AT Design, and Use

Case Number | Age | Need for AT | Expected Use | Expected Impact
6 | 56 | Grip issues when writing with pencil | Increase size of pen/pencil | Decrease effort exerted while writing
7 | 74 | Flexor synergy | Increase left wrist extension | Increase wrist extension and decrease flexor synergy
8 | 68 | No diagnosis | To stabilize wrist | Help her type, eat, shower, use stairs
9 | 61 | Cerebrovascular accident (CVA) | Finger extension | Improve left finger and wrist extension
10 | 66 | Right stroke and left hemiparesis | To allow left hand to open and supinate, restoring some function | Would allow user to restore some functional use with grip strength
11 | 78 | Right CVA | Holding eating utensil | Achievement of goal of using left hand more
12 | 68 | Decreased grip strength | To open caps and lids | Able to open jars and cans independently
13 | 72 | Severe bilateral collapsed arch | Daily in-shoe | Reduce pain by straightening ankle and supination in his foot; relieve pressure on bone spur callus
14 | 70 | Extension of right metacarpophalangeal joints and proximal interphalangeal joints | Assist with opening of right hand | Improve door opening and cooking
15 | 66 | Lower grip strength and control due to stroke | Wear on wrist to improve grip on left side | Improved grip and steadiness
16 | 63 | Stroke; flexor synergy | Extend fingers | More functional use of arms
17 | 61 | Stroke | Help cut vegetables when cooking | Helps with activities of daily living

their models through the order forms again. When necessary, prints were brought back to the makerspace and modified or reprinted. Almost all of the communication was through the order forms or face-to-face, with a minimal amount of communication happening over email.

The final session of the second phase consisted of the final review and AT delivery session with therapists and end users. During this session, 3Doodler pens were provided to student teams for any final customizations. These pens allow users to draw a raised graphic using PLA filament on an object's surface. This allowed the students to quickly add small additional changes to their devices without any modeling or measuring. They could directly "draw" new additions onto their devices. Only a small number of groups (between 1 and 3) utilized this tool. They were also provided with extra components such as Velcro, which is common practice to include with AT devices in the PT profession.

3.2 Participants
58 Physical Therapy (PT) students participated in the educational series. Sessions 1-4 were conducted during the Spring semester of their first year of the PT curriculum and sessions 5-6 were completed during the following Fall semester, correlating to the second year of the PT program.

Twelve end users also participated in sessions 5 and 6 of the series (Table 3). Finally, 12 youth (6 female, age range 14-18) staff members at the community makerspace participated in the study. The youth had received training in 3D printing, 3D modeling, 3D scanning, and other aspects of digital fabrication (e.g., laser cutting) prior to this project and had worked collaboratively on a part-time basis in the community makerspace, primarily on 3D printing tasks, for between 3-12 months prior to this project. In this paper, we focus on data collected from the PT students and will discuss data from makers in the future.

3.3 Data Collection and Analysis
Our data consisted of PT student surveys (to gain feedback about their feelings on 3D printing being integrated into their profession from both phases of the course), AT model order forms, AT design documentation, and observations of the courses by researchers. We have described the order forms and AT design documentation previously and, in this subsection, will describe the surveys collected from the students.

During the first phase, at the end of each session, the PT students filled out a survey. Each survey served to evaluate how each student felt about the different aspects of the sessions. It helped to ascertain their reservations and expectations about the 3D printing process. It also contained their evaluations of the final products.

During the second phase of the course, because the PT students had already completed a similar process with simulated end users, only two sets of surveys were collected for this study. One was used to collect data about the students' experience in comparison to simulated end users as well as how they felt about 3D printing in their field overall. The other survey was used to garner feedback from the PT students and end users about the final products. Observations were utilized throughout both phases to capture verbal questions that the students raised throughout the course that were not reflected in the survey responses.

The data was analyzed using qualitative content analysis with an inductive approach, as well as thematic analysis [36]. All collected data was stored in a spreadsheet and coded according to repeated themes brought up by PT students. The themes were abstracted and labeled with a code. Fourteen different codes were identified and used. These 14 categories were used to identify six themes that addressed most of the topics brought up by the PT students.

4 FINDINGS
Our findings consisted of device design outcomes and PT students' feedback on their experience with the learning material presented in the sessions. We will first present the end products that were designed by the PT students throughout the course, followed by qualitative findings from student surveys and observations. It should be noted that in direct participant quotes throughout the paper, PT students refer to the end users as "patients", which is consistent with their educational context. However, we will use the term "end user" in our writing as it better reflects a social model of disability.

4.1 Device Outcomes
At the conclusion of each course, all 5 student teams from sessions 2-4 for simulated patients (case numbers 1 through 5) and all 12 student teams from sessions 5-6 for end users (case numbers 6 through 17) had successfully created customized AT designs (Table 4). Small changes were required in almost every case, mostly relating to the attachment mechanisms or smoothness of the devices, as seen in Table 4.

Overall, the designs were successfully printed in large part due to the sketching skill of the PT students, as seen in Figure 1. The PT students were able to provide incredibly detailed technical sketches with precise measurements to assist the makers in developing their designs. This skill from the PT profession allows makers to easily understand the specifications needed in the end product. Figure 1 shows the sketch for case 2 that was the first step in the development of the printed device shown in Figure 2.

However, even with their sketching talents, PT students learned the value of iteration when some of the devices were unusable after their first order form. For example, the first print of the sketch in Figure 1 was entirely unusable, as seen in the left-hand image of Figure 2.

This failure was caused by attempting to use Cheetah filament for printing. The students originally thought this would be an ideal material because of its flexibility, but it led to the failed print. They had to re-evaluate their techniques to better align with the skills of the makers and the use case of their design and, by changing the filament type, ended up with a successful product at the end of phase 1, as seen in the right-side image of Figure 2.

Despite all end devices being determined as successful, some devices, for example as seen in the top right image of Figure 3, were printed in such a way that they could not fully be used by the end user due to lack of fit. However, the devices in the top left and bottom images of Figure 3 were all more appropriate and fitted properly.

Table 4: Final product description and PT student comments on the 3D prints for imaginary (phase 1) and real (phase 2) users

Case Number | Product Created | PT Student Comments | End User Comments
1 | Custom fitting hand grip for a hemiwalker | Worried gravity affected their clay model and that the device wouldn't fit on the hand of a hemiwalker | N/A – simulated case
2 | A bowl for someone to use a cane with their elbow | Worried it is not strong enough to be weight bearing | N/A – simulated case
3 | Custom elbow splint that can attach to crutches | The overall size would need to be bigger to fit an adult patient | N/A – simulated case
4 | Shoe wrap to improve balance | Side loops for the Velcro aren't big enough | N/A – simulated case
5 | Forearm rest on a walker | It is only 70% of the size of their model so the dimensions are off, and the clay model captured many imperfections | N/A – simulated case
6 | Custom grip for pencil | Would round edges more if given time; would want a "squishier" material | He is hopeful that this will relieve the tiredness and aching in his hand, allowing him to write better
7 | Curved splint with holes for straps | Round the edges and cut more room for the thumb | N/A – did not share feedback
8 | Wrist splint | Smooth rough edges; design change to hollow if given time | This will give her more confidence in holding the railing while going up stairs; needs smoothing for comfort
9 | Finger straps to help increase flexibility | The fingers were not wide enough; wrong color | The stretch felt so good; needed to be smoothed for comfort
10 | Wrist splint with holes for straps | Slightly too small | Too small and does not believe it can sustain enough pressure to pull him out of pronation; generally very curious about 3D printing, however, and willing to try again
11 | Larger handle that attaches to fork for eating | The angles did not match the design so it wasn't properly ergonomic | If the design is not correct in 8 weeks, he would be uninterested
12 | Y-shaped handle that attaches to a jar; requires lateral force instead of twisting to open | The material needs to be more flexible and a hole needs to be drilled to attach a handle | She is hopeful that it will allow her to open all jars and bottles easily, because that is tough for her
13 | A shoe insert for heel | The material was too rough and there were concerns about breakdown with continuous use | Looking forward to the potential pain relief; needs smoothing for comfort
14 | Prints that fit over joints on the hand to allow for training to open hand wider | Smoothness was great, just needed bigger holes for attachments | Really excited; exactly what she wanted
15 | Cylinder weights that allow patient to strength-build throughout their day | Better print than they were expecting | Comfortable and stable enough to wear and not interfere with his activities; ideas are worth printing as a prototype to see if the idea is feasible
16 | Splint to assist in extending fingers | Need a hole for attachments | N/A – did not share feedback
17 | Customized extended grip holder that can fit silverware | Need to adjust hole size for attachments | Exactly what was expected and she is hopeful that it will help her effectively cut fruits and vegetables safely

Figure 1: Example of sketching skills from Group 2 in phase 1 depicting a bowl designed to be comfortable for an elbow and
attachable to a cane.

Figure 2: (Left) A failed print in Cheetah filament of an elbow bowl for case 2 in phase 1. (Right) The successfully printed version of the same design.

4.2 PT Student Reactions to Working with Real End Users
After completing designs for simulated and real end users, the students filled out a survey asking for their opinions on the process. When asked to compare the experience to designing for simulated patients, most PT students said that designing with a real end user was helpful to their design process and led to more impactful AT device creation. For example, Student 24 stressed the importance of the relationship with the end user, stating, "I am more motivated to make an object that will really make a difference for my patient because I am forming a relationship with him." Furthermore, working with real end users made the process more meaningful for the PT students in comparison to simulated users and made them more motivated to produce successful results. Student 20 stated, however, that this added some stress, saying, "There's more pressure! [You] don't want to disappoint. An emotional aspect has been added". While many felt this emotional pressure, real end users helped the learning process. Student 33 stated, "It is definitely more challenging working with a real patient but it allows us to get real feedback on things that will benefit the patient", showing how feedback from end users was found to be very important. Many PT students also found

Figure 3: (Top Left) The pen grip designed for and being used by end user 1. (Top Right) The hand splint designed for and being used by end user 5. (Bottom Left) The strength-building arm device designed for and being used by end user 10. (Bottom Right) The device to train users to open their hand wider, being used by end user 9.

that designing for real end users helped them to form a more realistic understanding of what designs could actually be created utilizing the methods that they were learning. Student 49 stated, "Some things that seem functional or well designed in simulation don't work out or show flaws when applied to a real patient. Working with a patient shows how to create a more realistic design".

4.3 Challenges Experienced by PT Students Communicating with Makers
PT students identified a series of challenges in using both paper forms (Figure 4) and clay models (Figure 5) to communicate with makers. Many of these comments point to a need for face-to-face meeting and discussion with makers. While CAD was introduced to the PT students in initial sessions of the project, PT students found it incredibly difficult to learn how to effectively communicate about CAD designs in a short time. Therefore, all PT students relied entirely on drawings and clay models for the development of AT with end users.

With respect to the paper forms, most of the feedback pointed to not knowing the best way to communicate what needed to be done to the makers because of a lack of personal connection and shared language. Student 2 stated after phase 1, "It really seems like some groups benefited from having their design tweaked by a CAD program. So without a good working relationship with a computer engineer or a personal knowledge of CAD program this seems like a difficult and clumsy venture". As seen in Figure 4, the forms contained places for drawings, measurements, materials, etc., but the PT students felt that the drawings and precise measurements were the most important. Student 2 stated, "I don't think the designers were able to use our drawings, measurements, and written descriptions. I would like to have seen what could be produced by our written plans. If I had to guess, more views of the designed device may have been useful". PT students also felt that face time with the makers would have greatly benefited them, with Student 19 stating, "I think you just need to be as specific as possible. Maybe it would help to talk directly to the individual printing".

A specific example of a communication breakdown can be seen in Group 4. This group's printed project had finger holes that were too small. A possible reason for this was that in their order form they simply gave a list of diameters in centimeters. Not only were dimensions asked for in millimeters, but they also never specified whether this was to be the inner or outer diameter of the finger holes. Their sketch is shown with these specifications in Figure 4 below. With this lack of understanding, the makers made assumptions that were incorrect.

We found that using clay for 3D modeling was a great way to engage the end users and provided an easy way to get exact measurements in the moment. However, there were major issues with digitizing the clay model and turning it into a 3D printed object. A major issue was the shrinking and deforming of clay models while drying. For example, Student 6 explained, "I found that the clay was not super helpful in the transition to the 3D printer. The clay tends to lose its shape during the drying process". There were also complaints about the surface issues created by scanning clay. When using

Figure 4: The sketch depicting a drawing of a hand with measurements for each finger hole that was provided by Group 4 to guide the makers in printing
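The mix-up behind Figure 4 is easy to quantify: a value written in centimeters but read against a millimeter field is off by a factor of ten, and an unspecified inner versus outer diameter differs by twice the wall thickness. A small illustrative sketch (ours; the 2.5 cm hole and 2 mm wall are hypothetical values, not Group 4's actual numbers):

```python
# Illustrative only: hypothetical numbers showing the two ambiguities in the
# Group 4 order form (units, and inner versus outer diameter).
def cm_to_mm(value_cm):
    """Convert centimeters to millimeters."""
    return value_cm * 10.0

hole_cm = 2.5                    # a diameter written in centimeters
hole_mm = cm_to_mm(hole_cm)      # 25.0 mm: what the form actually asked for

# If the number is misread as millimeters, the hole is printed 10x too small.
misread_mm = hole_cm             # 2.5 mm instead of 25.0 mm

# If 25.0 mm is the INNER diameter, the OUTER diameter also depends on the
# (unstated) wall thickness of the print.
wall_mm = 2.0
outer_mm = hole_mm + 2 * wall_mm  # 29.0 mm outer for a 25.0 mm inner hole

print(hole_mm, misread_mm, outer_mm)  # 25.0 2.5 29.0
```

Either ambiguity alone is enough to make a finger hole unusable, which is consistent with the makers' incorrect assumptions described above.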

Figure 5: (Left) Clay model of an attachable device for a walker that has deformed slightly because of the method used for
modeling an object that is heavy. (Right) A clay model of a pencil grip that has held its shape well because of the small
dimensions and uncomplicated nature of the item.

clay, modeling smaller devices, such as the pen grip seen in the right image of Figure 5, was very successful because they did not suffer from these deforming issues, whereas bigger objects, such as the walker addition seen in the left image of Figure 5, suffered from some of the issues with shrinking while drying. These issues led to measurements and sketches being the most reliable way to communicate their device needs.

Overall, we saw that while the paper forms and clay models each offered different possibilities for engaging PT students, end users, and makers, they had limitations that ultimately led to a lack of a robust shared language and communication approach. Because of the lack of face-to-face time and shared language, communication between the PTs and the makers required a large number of specifics communicated through the forms and clay objects. Some of the most successful products utilized multiple communication methods to develop exact specifications for the makers. The clay model seen in Figure 5 on the left-hand side can be seen in Figure 6 being used to generate exact measurements for a sketch on the paper form.

This combination of sketching, measuring, and clay modeling allowed the students to create a very smooth and successful final product, as shown in the right-hand images of Figure 6. While the clay model deformation led to a product that was unusable, the sketching and modeling together helped to form a usable final product.

Overall, the PT students felt there was a disconnect between what they thought they were communicating to the 3D printer operators and what the operators interpreted. PT student suggestions to address this were to have more face-to-face time with makers, more frequent communication, and a better way to communicate. Another suggestion was that more knowledge of CAD might be helpful for
ASSETS ’22, October 23–26, 2022, Athens, Greece Erin Higgins et al.

Figure 6: (Left) The failed clay model from Figure 5 is being measured and used to create a detailed sketch, also shown in the image. (Right) The final 3D printed item, which is smooth and attached to a walker device.

communication. This lack of competency with technology also led to mismatches between what the PTs were imagining and what was possible to achieve with 3D printers.

4.4 Observations on Design Process

Several PT students identified the mismatch between the engineering design process and the PT clinical process because, in practice, getting feedback for each design iteration would require repeated end user visits. PT students observed that they felt it would be difficult to bill for multiple iterations of the same product or repeat visits for iterating on their design. Student 16 summarized the issues stating, "The design process is very similar to the traditional engineering design process...time constraints and the number of available patient visits (insurance-wise) could be a challenging aspect of actually using 3D printing for devices for real patients". The iterative nature of the making process is at odds with the time-pressured and insurance-run nature of healthcare. It led to some frustrations amongst the groups when they were not able to actually achieve their full vision. Some PT students offered suggestions about what might help ease this tension, including ensuring that end users were more communicative in the short time they spent together. Student 7 stated, "I think having a patient that explicitly states what they want help in will help in the creative process". The lack of end user availability means that the communication between the PTs and the end users needs to be informed and accurate the first time so that they are able to fully execute their design with the makers.

After being introduced to the concepts of 3D modeling in phase 1 of the course, many PT students had a moment of realization that CAD modeling might be too difficult to learn on top of their other curriculum, but that there are experts out there whose services can be leveraged. Student 18 stressed the importance of these experts stating, "I've felt like 3D technology was incredibly useful for PT throughout all workshops. Seeing our failed attempts made me realize how detail-oriented 3D printing is and how difficult it can be. I've also learned that even though there's a lot of obstacles it most definitely is possible to perfect." Perfecting technology literacy to facilitate communication with makers, as well as ensuring that PT students are aware of the possibilities with 3D printing, could be an important next step in furthering the potential of 3D printing in AT. Student 5 expressed that the knowledge of what is possible is an essential next step for the relationships between AT and making to flourish, stating, "Yes, it is a cool tool that we can hopefully use to help patients in the future. I would like to see examples of more products and become more familiar with items that are possible to see what works. The failed prints allowed us to see what doesn't work or what should be reconsidered."

After the course, we found an interesting difference between PT students in their knowledge of the prospects of 3D printing and how many materials were available. For example, Student 17 stated that, "There are many materials available, which allows us to tailor our device to the likes of our patient". However, Student 30 stated "3D printing has limitations and I wish there were more options to incorporate other materials". This mismatch in understanding of the availability of materials is an indication that there might be some additional information about materials that would be helpful to impart to the PT students so that they all have a full understanding of what can currently be done with this process.

4.5 Final Products and Appropriate Use Cases

PT students provided detailed feedback on the process of using digital fabrication to develop and create AT devices. Many of them saw the potential but just didn't feel that the digital technology was where they needed it to be to make exactly what they wanted, due to the difficulty of the software and the time needed to iterate
Creating 3D Printed Assistive Technology Through Design Shortcuts: Leveraging Digital Fabrication Services to
Incorporate 3D Printing into the Physical Therapy Classroom ASSETS ’22, October 23–26, 2022, Athens, Greece

and create projects. Some acknowledged the limitations of the technology but instead offered some solutions for them. Nine PT students had specific suggestions for appropriate contexts and how 3D printing could be used in the meantime. For example, Student 3 was very specific in stating, "I think that 3D technology is most applicable for patients that require customization for the grip and fine motor control but not for general adaptive technology. For example, I think our elbow cup was not a good device to 3D print whereas the customized walker grip on case number 1 was ideal and patient centered".

Multiple PT students brought up how the printing might have made more sense as an addition to already created devices. For example, Student 2 stated, "Perhaps attaching the device to the cane would be more effective using some sort of commercially available hardware". This could mean having AT on hand to build off of and specify from, or printing something similar and then adding these customized components, as Student 3 stated would be helpful, "I think our device could benefit from a non-3D printed component to affix it to the quad cane. I would have liked to modify the AT in addition to just affixing the device – we really wanted to screw into the cane itself". All of them said it was beneficial to have the old device with them for reference, usually along the lines of how Student 18 stated, "I think [makerspace staff] having access to the assistive devices we used to make our design [would be] incredibly helpful". Considering this feedback, it might be helpful to encourage an understanding of off-the-shelf devices for the makers so that they can just add additions to standard devices instead of having to create a device entirely by themselves.

Despite the challenges, 37 of the PT students expressed that they were excited to see where the use of digital fabrication in AT was going, but not as excited about their current outcomes, with Student 49 expressing, "[3D printing] will be more and more useful as printers and materials improve". They seemed to believe that only significant improvements in technology could get them where they wanted to be in terms of designing items with 3D printing, but saw the potential. For example, Student 56 stated, "Your imagination is your limit". Overall, following the course the PT students were able to identify what the shortcomings for this type of device development were and offered suggestions of how to make more appropriate projects.

4.6 Expected Material and Time Costs

The PT students were asked to share what they would be willing to pay for 3D printing services and the amount of time that they were willing to wait for the fabrication process to be completed. This question was not asking about the amount of time the PTs themselves would spend, but the amount of time they were comfortable waiting for a device to be printed. Twelve PT students said they would be willing to pay any cost for a device that was perfectly suited to the needs of their end user. Otherwise, they said that they would be willing to pay 40.81 dollars on average, with the median response being 25 dollars. Student 56, who said there was no limit, stated, "It could change our patients' lives. She said within minutes that the mid/stretch felt 'so good'. That is invaluable". Of the PT students included, on average they said they would be willing to spend 345.34 hours, with a median of 168 hours, to develop these products. PT students generally do not have much formal education about the cost of customization, and it is something they learn from real-world clinical practice. Therefore, the PT students did not have much practical knowledge on which to base these estimates.

End users also gave feedback on the amount they would pay and the time they were comfortable spending. Four end users stated that they would spend any amount of time necessary to have a custom device. However, most end users had a cost limit that was based off of what they had previously paid for AT.

4.7 Liability Concerns

During observations, several PT students expressed concerns about liability issues stemming from the types of materials available to 3D print with. However, no students brought up liability in their written responses, possibly due to the format of our surveys. The questions brought up in the classroom were in regards to concerns about who is liable if these products were to break or injure an end user. In one survey, Student 46 also expressed fears about device failures due to materials and ill-fitting devices, stating that in regards to 3D printing in PT, "[Their] opinion has grown, but I feel specifics are very important to ensure a reliable print with limited failures. Failures could become costly for patients". There was an apparent fear that the materials available were not exactly appropriate in the medical context. Though not much data was collected in regards to overall student opinions about liability, the concerns brought up by the students are valid and need to be addressed in order to fully integrate 3D printing into PT practice.

5 DISCUSSION

Conducting the course provided insights into the possibilities and challenges that exist when PT students, makers, and end users work together to use digital fabrication tools to create customized AT. We found that working with end users and outsourcing the fabrication was overall successful at providing PT students with an opportunity to utilize digital fabrication tools and techniques in their learning. Compared to previous work, where PT students learned how to use 3D modeling and printing tools [3], outsourcing the digital design and fabrication to a community makerspace addressed many of the issues PT students faced. By creating this connection between makers and PTs, our approach eliminated the need for either group to become experts in each other's fields. Instead, they developed a shared language to communicate their needs to each other. While our study showed that developing and effectively using this shared language is non-trivial, it is a promising direction and can equip all stakeholders with relevant interdisciplinary competencies. Outsourcing the 3D modeling adds an overhead of communication, however, and training clinicians in fabrication skills has value that needs to be balanced with other factors when outsourcing fabrication. Based on these findings, a promising future direction to focus on is developing training programs that provide both PT students and makers with a shared interdisciplinary knowledge and language that can be leveraged for effective and efficient communication protocols. In the following subsections, we will discuss lessons learned from the educational series in more detail.

5.1 Involving All Stakeholders

A key aspect of our approach was including real end users in the second phase of the project. We found including end users effective at motivating PT students in creating functional and safe AT devices. After iterations, all AT devices designed in the sessions incorporating end users were evaluated as successful. Furthermore, PT students described working with actual clients as valuable, meaningful, and "real". While previous research has shown that clinicians have concerns about device safety and liability if injury occurs and if the materials used in digital fabrication are inappropriate for therapeutic or medical use [3, 5], we saw that these considerations were heightened for our participants in the later sessions of the educational series when they were working with real users.

To address concerns about safety and liability, AT project selection needs to be carefully considered. If a design case has the potential, if broken or used over time, to cause injury, the appropriateness of using 3D printing to create it needs to be approached with caution. In this study, the project that was the most successful also happened to have the lowest liability (e.g., the pen grip). Working with real users can emphasize the importance of AT device safety, not only in the context of the PT classroom but in any context, such as online communities or community makerspaces, where digital fabrication methods may be used to create ATs. In the future, more input from PT experts on the types of ATs that are appropriate for digital fabrication could help inform future iterations of similar courses and programs, and also inform community DIY-AT efforts.

5.2 Developing Interdisciplinary Competencies

One of the key findings from our study was the need to develop an interdisciplinary shared language between makers and PTs. While in the final sessions of the course all AT devices were evaluated as successfully designed, throughout the process PT students expressed challenges in communicating the characteristics of the devices they designed to makers. Compared to previous research, which required PT students to use CAD software and engage with 3D printing directly [3], our study facilitated the process of digital fabrication for PT students and allowed them to focus more on AT specifications through methods they are already familiar with (e.g., sketching and drawing on paper). PT students were very skilled and able to provide very detailed sketches to the makers. PT leaders are very hands-on, and much of their training focuses on the development of psychomotor skills in manipulating objects with their hands. They were able to utilize these skills to make use of the clay for modeling as well. However, effectively communicating the design with clay in precise enough language to produce a perfect product remained challenging. This paper identifies smaller objects that are molded directly to a patient as more appropriate for clay modeling, while large objects, especially those with unequal weight distributions, were less successful and might be better suited for sketching and measurements.

These challenges can be addressed in the future in several ways. First, more detailed information about the fabrication and design process can be provided to PT students as part of the training. For example, a set of sample 3D printed ATs, including failed ones, can be used to demonstrate the possibilities and challenges of using these techniques for AT development. Conversely, an overview of common PT terminology used to describe and evaluate ATs can be provided to makers to help develop a common language. Makers would need to have an understanding of what off-the-shelf AT looks like and the basics of PT practice in order to more quickly understand what they are being asked to create.

Second, a combination of digital and paper forms, combined with a mechanism for continuous asynchronous communication (e.g., a Slack channel [37]) between makers and PT students, can be established to provide detailed and frequent feedback on designs and fabrication iterations. These forms can provide a template with explicit fields for the most important measurements and descriptions. Another possible facilitator of communication could be creating a shared VR workspace that would allow the makers and the PTs to send CAD files to each other and interact with them in a more tangible way before printing.

Third, in line with what McDonald et al. also recommended previously, creating a base set of 3D printed AT designs to start from and build off, in combination with detailed documentation about each design's purpose, considerations, and possible variations, would be incredibly beneficial [3]. In addition to capturing existing knowledge, this document can be informed by expert clinician and maker perspectives.

5.3 Leveraging Academic Training Programs to Connect Makers and Clinical Students

An important aspect of our project was to connect university PT training programs to community organizations. Our project provides an example of how community resources, such as makerspaces or other youth technical learning programs, can connect with university programs to form mutually beneficial relationships. Our research team comprises experts in physical therapy, digital fabrication, and community engagement, and the interdisciplinary nature of our collaboration, as well as the range of resources and relationships available to us, facilitated the design and implementation of the project. Several co-authors had a long-term working relationship with the community makerspace, and another research team member was a core faculty member with lead teaching responsibility for the PT training program. Researchers and practitioners considering setting up similar future programs should consider the need for long-term relationships between all organizations involved and identify clear roles, timelines, and expected outcomes for everyone involved. In our case, working out these details was essential to project success.

While community engagement was central to our project, an alternative configuration can leverage the increasing number of universities that are creating makerspaces on their campuses [38]. These spaces provide ample opportunities for interdisciplinary collaboration across different academic programs, such as PT and engineering, and can be sites of AT innovation and development. Introducing engineers to the AT design process can offer meaningful and motivating experiences for different students in these programs and lead to more sophisticated designs that draw on skills from multiple engineering disciplines, for example to also add sensors to AT devices. Such collaborative programs can benefit

by ongoing input from existing medical professionals in interdisciplinary roles, such as Assistive Technology Professionals (ATPs) or experts who generally have both engineering and medical skills. Further collaborations can include connections with hospitals and therapy centers.

6 LIMITATIONS AND FUTURE WORK

This course was run with one group of PT students over one academic year and needs to be verified with more students in the future. The specific characteristics of the community makerspace, which was staffed by trained youth, may have impacted some of the project outcomes, and in future work we plan to study how these factors may be different if collaborations are with other types of community organizations or digital fabrication services. The community makerspace we worked with was staffed by youth makers who, while trained and knowledgeable in fabrication, were not as experienced as professional industrial designers, engineers, or other professional fabricators.

Future work can explore how working with other organizations, for example makerspaces at universities or colleges or online professional services, such as Shapeways, would impact the process and outcomes. Given the PT student feedback on wanting to work with more materials and fabrication techniques, in the future it would be helpful to collaborate with multiple makerspaces or fabrication facilities that may provide a wider range of options for the students to work with. For example, the use of thermoplastics or two-part modeling materials could alleviate some issues with clay shrinking.

Furthermore, this study took place before the COVID-19 pandemic, and most design communication between makers and PT students was done through order forms or face-to-face. In future iterations of the course, the learned familiarity with online conferencing tools (such as Zoom) can potentially help to facilitate these interactions. For example, video conferencing tools can be leveraged to communicate about designs. This will be especially helpful in the future if individuals choose to design projects at a larger scale or that contain external components or perform mechanical functions.

A challenge of 3D printing in the PT classroom is that it is difficult to get end users back for multiple iterative cycles, especially in this study, as the end users were volunteers, many of whom had limited access to transportation and technology. Increasing the frequency of in-person visits would have created an undue burden on the volunteers, and digital communication as an alternative would have been a challenge. In the future, we would like to explore facilitating more collaboration between makers, PT students, and end users. We also plan to gather more feedback from all three stakeholders in order to get a fuller picture of the whole design process.

Finally, in this study we worked with PT students rather than expert clinicians, and while student perspectives provide valuable insights into what may be relevant in PT practice, inquiring into the perspectives of expert clinicians in the future can further enrich this research area.

7 CONCLUSIONS

This study has continued investigating the effectiveness of customized assistive technology developed through a collaboration between PT students and makers. It has demonstrated the potential for this collaboration as one way to address many issues encountered when attempting to teach PTs to design and fabricate these devices independently [3]. This study further worked to identify specific barriers to 3D-printing adoption in PT and some acceptable use cases. Specifically, we highlight the need to involve all stakeholders in the process of custom AT development and the need to develop interdisciplinary competencies to further facilitate this relationship. We also discussed an important finding in regards to the best communication methods for this design process. We highlight some cases in which clay modeling is the most efficient method of communication and others in which paper forms and measurements are the most efficient. In all cases, however, we point to the importance of face-to-face communication for makers and PTs and how these connections might be facilitated by universities. In the future, we hope to continue teaching 3D printing classes to PT students, as well as expand to a variety of medical professionals. We also hope to develop the tools and competencies needed to make 3D printing a seamless part of medical practice.

ACKNOWLEDGMENTS

This work is supported by the National Science Foundation under Grants DRL-2005502, DRL-2005484, and EEC-1623490. We would like to thank the PT students, instructors, and clinicians from UMB who participated in our study. We would like to thank the Digital Harbor Foundation for providing the space, materials, machines, and student workers to facilitate printing the devices for this study. We would also like to thank the Adaptive Design Association, the Shield, and the Ability Project, Anita Perr, Shawn Grimes, and Darius McCoy.

REFERENCES
[1] Mankoff, J., et al., Consumer-Grade Fabrication and Its Potential to Revolutionize Accessibility. Communications of the ACM, 2019. 62(10): p. 64-75.
[2] Buehler, E., Branham, S., Ali, A., Chang, J.J., Hofmann, M.K., Hurst, A., and Kane, S.K. Sharing is caring: Assistive technology designs on Thingiverse. in 33rd Annual ACM Conf. on Human Factors in Computing Systems. 2015.
[3] McDonald, S., Comrie, N., Buehler, E., Carter, N., Dubin, B., Gordes, K., McCombe-Waller, S., and Hurst, A. Uncovering challenges and opportunities for 3D printing assistive technology with physical therapists. in 18th Intern. ACM SIGACCESS Conf. Computers and Accessibility. 2016.
[4] Parry-Hill, J., Shih, P.C., Mankoff, J., and Ashbrook, D. Understanding volunteer AT fabricators: Opportunities and challenges in DIY-AT for others in e-NABLE. in 2017 Conf. Human Factors in Computing Systems. 2017. ACM.
[5] Hofmann, M., Burke, J., Pearlman, J., Fiedler, G., Hess, A., Schull, J., Hudson, S.E., and Mankoff, J. Clinical and maker perspectives on the design of assistive technology with rapid prototyping technologies. in 18th Intern. ACM SIGACCESS Conf. Computers and Accessibility. 2016. ACM.
[6] Gordes, K.L. and McCombe-Waller, S., Novel partnerships for interprofessional education: A pilot education program in 3D technologies for human centered computing students and physical therapy students. Journal of Interprofessional Education & Practice, 2019. 15: p. 15-18.
[7] Okoro, C.A., et al., Prevalence of Disabilities and Health Care Access by Disability Status and Type Among Adults - United States, 2016. MMWR Morb Mortal Wkly Rep, 2018. 67(32): p. 882-887.
[8] Assistive Technology. 2018; Available from: https://www.who.int/news-room/fact-sheets/detail/assistive-technology.
[9] Investing in Innovation – Government Funding R&D. 2010; Available from: https://www.oecd.org/site/innovationstrategy/45188215.pdf.
[10] Phillips, B. and H. Zhao, Predictors of assistive technology abandonment. Assist Technol, 1993. 5(1): p. 36-45.
[11] Hocking, C., Function or feelings: factors in abandonment of assistive devices. Technology and Disability, 1999. 11(1-2): p. 3-11.
[12] Sugawara, A.T., et al., Abandonment of assistive products: assessing abandonment levels and factors that impact on it. Disabil Rehabil Assist Technol, 2018. 13(7): p.

716-723.
[13] Presti, A.L., Scherer, M.J., and Corradi, F., Measuring the assistive technology match, in Assistive Technology Handbook, Second Edition. 2017, CRC Press. p. 53-70.
[14] Toro, M.L., et al., Type and Frequency of Reported Wheelchair Repairs and Related Adverse Consequences Among People With Spinal Cord Injury. Arch Phys Med Rehabil, 2016. 97(10): p. 1753-60.
[15] Worobey, L., et al., Increases in wheelchair breakdowns, repairs, and adverse consequences for people with traumatic spinal cord injury. Am J Phys Med Rehabil, 2012. 91(6): p. 463-9.
[16] Worobey, L., et al., Differences between manufacturers in reported power wheelchair repairs and adverse consequences among people with spinal cord injury. Arch Phys Med Rehabil, 2014. 95(4): p. 597-603.
[17] McClure, L.A., et al., Wheelchair repairs, breakdown, and adverse consequences for people with traumatic spinal cord injury. Arch Phys Med Rehabil, 2009. 90(12): p. 2034-8.
[18] Hurst, A. and Tobias, J. Empowering individuals with do-it-yourself assistive technology. in 13th international ACM SIGACCESS conference on Computers and accessibility (ASSETS '11). 2011. Association for Computing Machinery.
[19] Meissner, L.J., Vines, J., McLaughlin, J., Nappey, T., Maksimova, J., and Wright, P. Do-It-Yourself Empowerment as Experienced by Novice Makers with Disabilities. in Proceedings of the 2017 Conference on Designing Interactive Systems. 2017.
[20] Hamidi, F., et al., TalkBox: a DIY communication board case study. Journal of Assistive Technologies, 2015. 9(4): p. 187-198.
[21] Profita, H.P., Stangl, A., Matuszewska, L., Sky, S., and Kane, S.K. Nothing to Hide: Aesthetic Customization of Hearing Aids and Cochlear Implants in an Online Community. in Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '16). 2016.
[22] Hamidi, F., Mbullo, P., Onyango, D., Hynie, M., McGrath, S., and Baljko, M. Participatory design of DIY digital assistive technology in Western Kenya. in Second African Conference for Human Computer Interaction: Thriving Communities (AfriCHI '18). 2018. Association for Computing Machinery.
[23] Buehler, E., Kane, S.K., and Hurst, A. ABC and 3D: Opportunities and obstacles to 3D printing in special education environments. in 16th Intern. ACM SIGACCESS Conf. Computers & Accessibility. 2014. ACM.
[24] Buehler, E., et al., Investigating the Implications of 3D Printing in Special Education. ACM Transactions on Accessible Computing, 2016. 8(3): p. 1-11.
[25] Lindtner, S., Bardzell, S., and Bardzell, J. Reconstituting the utopian vision of making: HCI after technosolutionism. in 2016 Conf. Human Factors in Computing Systems. 2016.
[26] Hook, J., Verbaan, S., Durrant, A., Olivier, P., and Wright, P. A study of the challenges related to DIY assistive technology in the context of children with disabilities. in DIS '14. 2014.
[27] Aflatoony, L., and Lee, S. Codea: a Framework for Co-Designing Assistive Technologies With Occupational Therapists, Industrial Designers, and End-Users With Mobility Impairments. in Design Society: DESIGN Conference. 2020.
[28] Aflatoony, L. and Lee, S. AT makers: a multidisciplinary approach to co-designing assistive technologies by co-optimizing expert knowledge. in 16th Participatory Design Conference 2020. 2020.
[29] Hansen, A.K., et al., Exploring the Potential of 3D-printing in Biological Education: A Review of the Literature. Integr Comp Biol, 2020. 60(4): p. 896-905.
[30] Wagner, J.B., et al., Three professions come together for an interdisciplinary approach to 3D printing: occupational therapy, biomedical engineering, and medical librarianship. J Med Libr Assoc, 2018. 106(3): p. 370-376.
[31] Davis, K. and L. Gurney, Impact of 3D Printing on Occupational Therapy Student Technology Efficacy. International Journal of Technology in Education and Science, 2021. 5(4): p. 571-586.
[32] Chen, D., et al., Three-Dimensional Printing in OT: Increasing Integration Into Practice by Countering Barriers. American Journal of Occupational Therapy, 2021. 75.
[33] Paterson, A.M., et al., Computer-aided design to support fabrication of wrist splints using 3D printing: A feasibility study. Hand Therapy, 2014. 19(4): p. 102-113.
[34] Aflatoony, L. and S. Shenai, Unpacking the Challenges and Future of Assistive Technology Adaptation by Occupational Therapists, in CHItaly 2021: 14th Biannual Conference of the Italian SIGCHI Chapter. 2021, Association for Computing Machinery. p. 1-8.
[35] Hofmann, M., Williams, K., Kaplan, T., Valencia, S., Hann, G., Hudson, S.E., Mankoff, J., and Carrington, P. "Occupational Therapy is Making" Clinical Rapid Prototyping and Digital Fabrication. in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019.
[36] Miles, M., Huberman, A., & Saldana, J., Qualitative Data Analysis: A Methods Sourcebook. 2014, Thousand Oaks, CA: Sage.
[37] Easley, W., Hamidi, F., Lutters, W. G., & Hurst, A. Shifting Expectations: Understanding Youth Employees' Handoffs in a 3D Print Shop. in ACM Human Computer Interaction CSCW. 2018.
[38] Barrett, T., Pizzico, M., Levy, B. D., Nagel, R. L., Linsey, J. S., Talley, K. G., Forest, C. R., and Newstetter, W. C. A Review of University Maker Spaces. in ASEE Annual Conference. 2015. Seattle, WA.
BentoMuseum: 3D and Layered Interactive Museum Map for
Blind Visitors
Xiyue Wang Seita Kayukawa
Miraikan – The National Museum of Miraikan – The National Museum of
Emerging Science and Innovation Emerging Science and Innovation
Tokyo, Japan Tokyo, Japan
wang.xiyue@lab.miraikan.jst.go.jp seita.kayukawa@lab.miraikan.jst.go.jp

Hironobu Takagi Chieko Asakawa


Miraikan – The National Museum of Miraikan – The National Museum of
Emerging Science and Innovation Emerging Science and Innovation
Tokyo, Japan Tokyo, Japan
hironobu.takagi@miraikan.jst.go.jp chieko.asakawa@miraikan.jst.go.jp

Figure 1: BentoMuseum, a 3D and layered design of a museum map that makes information accessible to visually impaired visitors. (a) All floors can be stacked or separated. (b) A user taps the interactive label, which responds with an audio guide when the floor is overlaid on an iPad app. (c) A user explores a structural attraction with fingers (a circular walkway named Oval Bridge which goes around a “globe-like” display named Geo-Cosmos). (d) The Oval Bridge and Geo-Cosmos in the museum.
ABSTRACT
Obtaining information before a visit is one of the priority needs and challenges for blind museum visitors. We propose BentoMuseum, a layered, stackable, and three-dimensional museum map that makes complex structural information accessible by allowing explorations on a floor and between floors. Touchpoints are embedded to provide audio-tactile interactions that allow a user to learn the museum’s exhibits and navigation when one floor is placed on a touch screen. Using a tour design task, we invited 12 first-time blind visitors to explore the museum building, choose exhibits that attracted them, and build a mental map with exhibit names and directions. The results show that the system is useful in obtaining information that links geometric shapes, contents, and locations to then build a rough mental map. The connected floors and spatial structures motivated users to explore. Moreover, having a rough mental map enhanced orientation and confidence when traveling through the museum.

CCS CONCEPTS
• Human-centered computing → Accessibility systems and tools; • Hardware → Tactile and hand-based interfaces; • Social and professional topics → People with disabilities.

KEYWORDS
information access, 3D structure, audio-tactile, touch screen

ACM Reference Format:
Xiyue Wang, Seita Kayukawa, Hironobu Takagi, and Chieko Asakawa. 2022. BentoMuseum: 3D and Layered Interactive Museum Map for Blind Visitors. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544811

1 INTRODUCTION
As audience-centered institutions with a range of educational and social roles, museums are, more than ever, aware of the importance of delivering equality, diversity, and inclusion, as well as their power to bring about positive change in society [13, 33, 34]. Increased attention is being paid to the development of equal access and multidimensional access, which refers not only to physical access but also to multi-sensory, intellectual, financial, emotional, cultural, and educational as well as information access [2, 12, 24], all of which are important to the visually impaired community. At present, many barriers impede the attempts of people with visual
impairments to visit a museum. Alongside exhibition accessibility and mobility assistance [2, 3], the limited provision of information and orientation before a visit can lead to a negative overall experience and emotional isolation [2, 10]. Nevertheless, making detailed and usable information available would create a welcoming and encouraging environment for blind visitors [30].

The museum’s unique environment brings challenges for blind people in accessing information, different from other public spaces such as neighborhoods, parks, and hallways in a building. First, contemporary museums have distinctive architecture, internal design, and “inter-floor structures” (i.e., stairs and walkways connecting the floors; see an example in Fig. 1d) as a part of their exhibitions [31, 43]. Blind visitors have difficulty grasping the museum’s complex “multidimensional information”, such as the overall geometric shape of the building, the inter-floor structures, and each exhibit section’s name, description, approximate shape and size, and location. Second, even though some museums have a minimal structure with a pre-defined route, more museums contain freely arranged exhibits that may or may not have clear route indication [14]. In such museums, visitors explore the museum and choose the exhibits they examine based on their interests. By encouraging such “free explorations”, these museums effectively trigger a sighted visitor’s curiosity, but conventional posted information may cause blind visitors access difficulties and orientation frustrations.

Accessible maps are the means for visually impaired visitors to learn about a site. Tactile maps are often available in public spaces and institutions to provide information and assist in navigation, with the aim of helping the user to build a mental map before going to a new place [45, 46]. Since the effectiveness and understandability of a tactile map largely depend on the user’s tactile skills and abilities [38], three-dimensional (3D) maps with volumetric symbols have been developed for ease of understanding. Moreover, audio-tactile labels have been proposed for seamless and autonomous operation. The user is thus largely freed from either shifting attention between the map and the braille legend [25] or asking others for explanations [16]. The current 3D-printed audio-tactile maps show thrilling possibilities, but limitations persist. These maps usually present a simple one-floor layout, which is insufficient to support a structural mental map of a multidimensional museum.

Because a museum contains a large amount of multidimensional information, and it is not a frequently visited place, blind visitors might find it particularly challenging to obtain information, orient themselves, and build a mental map. Consequently, they may give up the idea of a self-reliant visit. How can museums that have a complex 3D structure and freely arranged exhibitions provide information access to blind people before a visit? To bridge the gap between museums and blind visitors in terms of information access, and to investigate a suitable format for an accessible and inclusive museum map, the following research questions emerge:

• RQ1. How can we make the vast amount of needed information (e.g., architecture and interior structures, exhibits, facilities, locations, and route-finding) accessible and understandable on a museum map?
• RQ2. Is building a mental map possible and significant in the museum context?

Previous literature either developed floor plans in isolation [20, 29] or reproduced external structures [26, 50]; complex multi-floor structures such as museums were rarely explored. Our core concept and innovation is the design of stackable 3D floors to capture the complex multidimensional nature of a museum. The multidimensional information includes the external and internal structures, exhibits and facilities, and their locations as well as route-finding. Using a participatory and user-centered approach, we designed BentoMuseum, a 3D and layered museum map with audio-tactile interactions, to support blind users in obtaining information and understanding the 3D attractions through tactile explorations (Fig. 1). The system contains two main elements: the 3D and layered floors (Fig. 1a), which can be either interlocked to allow vertical exploration between floors (Fig. 1c) or separated to support horizontal exploration of a single floor; and interactive touchpoints on the floor that provide audio feedback by touch (Fig. 1b). When one floor is placed on a touch screen, different levels of information and tactile navigation with audio support can be triggered by tapping. Novel designs we propose include stackable floors, 3D and 2D attributes that represent different types of contents, and simulated navigation by tracing paths and intersections.

We invited 12 participants with severe vision impairment to be museum tour designers and instructed them to use the system as part of an authentic museum tour. We let them explore as much as possible, obtain information, choose exhibits of their interest, and try to build a mental map. Participants expressed their map exploration styles and elaborated on their needs for information access before, during, and after the tour. Our results suggest: (1) Using the system, the participants were able to actively obtain information that links shape, location, and content. Consequently, they were able to choose exhibits of interest and build a rough mental map. (2) Touching inter-floor structures motivated blind users to explore the museum map. Along with the navigation, it supported them in building a 3D mental map. (3) Building a rough mental map beforehand was beneficial for the subsequent visit. It provided orientation, enhanced the sense of safety and confidence that they would not get lost while traveling through the museum, and led to a positive and inclusive museum experience.

2 BACKGROUND AND RELATED WORK

2.1 Museum Accessibility
Museums are not only institutions for the collection, preservation, and display of valued objects but also audience-centered spaces with a wide range of social roles and responsibilities [2, 6, 27, 48]. Sandell suggested that museums should contribute to social inclusion on individual, community, and society levels by supporting creativity and confidence; empowering independence, decision-making processes, and democratic structures; fostering acceptance and respect; and challenging stereotypes [41].

However, barriers exist when people with visual impairments attempt to visit a museum. It was noted that a blind person’s visit experience begins even before entering the museum site [48]. The provision of information has been a priority need in terms of service accessibility [11, 23, 48]. A United Kingdom survey concluded that the basic accessibility information provided by a majority of museum websites could not address the access needs of blind
or partially sighted persons who want to plan an independent visit [10]. Argyropoulos et al. found that complex museum architecture/interior design also hindered access to a museum. The overall inaccessibility led to a lack of motivation and negative emotions [2]. It was noted that providing haptic, touchable, and multi-sensory objects is one of the most anticipated and effective ways of improving museum accessibility [2, 9, 51]. In terms of accessing information and orienting visually impaired visitors, Vas et al. suggested presenting the museum space and exhibition through audio at the beginning of the visit [49]. Tactile maps and 3D models have also been developed to meet the needs of visually impaired visitors. Urbas et al. explored 3D printing techniques to develop physical floor plans to be mounted on the museum walls for touch [47]. Holloway et al. created 3D maps with distinct icons for an event, and they suggested that large maps require time to explore and thus should be made available before an event or at the entrance in a comfortable environment [26]. Leporini et al. pointed out that being able to explore and familiarize themselves with the structure and details of a large cultural site is crucial for orienting visually impaired people. This allows them to gain a global understanding and an overall impression [29].

2.2 Accessible Maps for the Visually Impaired
Maps that enable tactile explorations are designed for the visually impaired to learn the environment. One of the most common methods is tactile maps created with a set of accessibility guidelines [4, 35, 36]. Such maps represent features with raised lines, symbols, keys, and orientation [38, 55], often with the same elevation (around 0.5mm) on swell paper or a greater height difference (up to 2–3cm) when thermoformed [25, 39]. Most tactile maps are created for information provision in a digestible form that can be easily understood [39, 40, 55] and for mobility and navigation [25, 40, 46]. Although they are commonly used, tactile maps pose a set of challenges to visually impaired users. They are limited in creating 3D structures and a variety of heights [25]. The understandability of a tactile map also largely depends on the user’s tactile ability [38, 40] and training in the skill of reading tactile graphics [1, 45, 46]. These maps also contain a relatively small amount of information, and normally a large area needs to be divided into different sections [38, 55]. Furthermore, tactile maps need supplementary means to access the information, for example, a symbols glossary or legend [38].

To mitigate the understandability issues, research has explored maps with more distinct 3D structures. Voigt and Marten suggested the use of 3D models of buildings to facilitate spatial orientation and build a mental map [50]. Leporini et al. developed floor plans of indoor monuments of a cultural site to help both visually impaired and sighted people explore and familiarize themselves with elements before a visit [29]. Gual et al. found it was easier to memorize 3D volumetric symbols than 2D symbols [18]. Comparing two tactile maps, one with only 2D elements and the other also including volumetric symbols, Gual et al. found the use of 3D volumetric symbols significantly reduced both location-finding time and discrimination errors [19]. Holloway further compared the readability of different 3D icons on a map and proposed guidelines for icon design [26]. Pistofidis et al. tested a number of parameters related to 3D shape and haptic performance [37]. Holloway et al. compared 3D printed maps with tactile maps and found that 3D maps were preferred. The use of more easily understood icons and relative heights of map elements facilitated improved memory in short-term recall [25]. Gual et al. designed urban maps containing both volumetric attributes and relief attributes and demonstrated the value of the maps in terms of interpreting, memorizing, and understanding. However, they also found that the maps required verbal support to be used autonomously [20, 21].

2.3 Touch Sensing and Audio-Tactile Labels
Audio-tactile maps using touch sensing or buttons have been proposed to further support understanding in addition to the tactile sensations. Brock et al. designed an interactive map composed of a multitouch screen, a raised-line overlay, and audio output. Compared with tactile maps, they showed that replacing braille with audio-tactile interaction significantly improved efficiency and satisfaction [7]. Comparing tactile maps with interactive small-scale models (SSMs) for learning, Giraud et al. showed that the interactive SSMs improved both space and text memorization and were also adaptable to different situations and needs [16]. Several studies demonstrated that perceptible buttons that invoked different levels of audio content promoted an interactive, autonomous exploration [25, 29] and increased emotional engagement [52]. Utilizing printing technologies, automatic creation of 3D printable tactile graphics, and touch screens, research has developed the ability to instantly produce tactile-audio representations on a printed map by implementing the map on a touch screen [17, 44, 53].

The previous literature has built a strong foundation for developing 3D maps with audio-tactile labels. However, those works focused on one-floor settings with relatively small amounts of information. To the best of our knowledge, few research works have explored accessible maps for complex multi-floor structures. We fill this research gap by proposing stackable floor maps to access both the internal and external structures of an entire museum.

3 PARTICIPATORY SYSTEM DESIGN
The design concept is implemented in a science museum, Miraikan – The National Museum of Emerging Science and Innovation¹, with a distinctive structure and symbolic interior attractions. It is a seven-floor building with a large-area atrium (the 2nd, 4th, and 6th floors are mainly atrium space) and structural attractions such as a series of escalators that directly connect all the floors (Fig. 3a), a walkway called the Oval Bridge that goes around a “globe-like” display named the Geo-Cosmos (Fig. 1d), and a Dome Theater with half of it inside the building and the other half extended into the exterior (Fig. 3b). It also lacks maps that can be perceived by touch.

¹ https://www.miraikan.jst.go.jp/en/

We employed a participatory and user-driven methodology to design a map adapted to the museum. The design sessions included seven interviews with the blind designer (once for prototype 1, three times for prototype 2, and three times in preparing for the final design), one event that involved twenty blind museum visitors and three staff members, and one group meeting with those staff members.
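The audio-tactile labeling pattern surveyed in Section 2.3, a touched map region that speaks progressively more detail on repeated taps [25, 29], reduces to a small amount of interaction logic. The following Python sketch is purely illustrative: the region name, coordinates, and descriptions are invented, and it stands in for, rather than reproduces, any system discussed here.

```python
# Illustrative sketch only: hit-testing a double-tap against labeled map
# regions and cycling through levels of audio detail. Region names,
# coordinates, and descriptions are invented; this is not the authors'
# implementation.

from dataclasses import dataclass


@dataclass
class Region:
    name: str
    bounds: tuple          # (x_min, y_min, x_max, y_max) on the touch screen
    levels: list           # audio descriptions, from brief to detailed
    next_level: int = 0

    def contains(self, x: float, y: float) -> bool:
        x0, y0, x1, y1 = self.bounds
        return x0 <= x <= x1 and y0 <= y <= y1

    def speak(self) -> str:
        # Successive taps cycle through the detail levels, then wrap around.
        text = self.levels[self.next_level]
        self.next_level = (self.next_level + 1) % len(self.levels)
        return text


def on_double_tap(regions: list, x: float, y: float):
    """Return the audio text for the tapped region, or None if the tap
    landed outside every labeled area."""
    for region in regions:
        if region.contains(x, y):
            return region.speak()
    return None


# Hypothetical single-region map: a first tap speaks a brief label,
# a second tap a longer description.
demo = [Region("Exhibit A", (0, 0, 100, 80),
               ["Exhibit A. Keyword: universe.",
                "A longer description of Exhibit A."])]
print(on_double_tap(demo, 50, 40))
print(on_double_tap(demo, 50, 40))
```

A real map would hit-test polygonal outlines rather than rectangles, but the cycling of detail levels is the part that enables autonomous, staff-free exploration.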
Figure 2: The printout of an early iteration and the final design. (a) The realistic model of the 3rd floor that was used in Prototype 1 to start the discussion (Section 3.2). (b) The 3rd floor of the final model, with simplified designs, audio-tactile labels painted in black, and control buttons on a touch screen. (c) The fully stacked final model that shows the external structure.

3.1 Motivation: “What if I can open the model and get information about the floors?”
One of the designers, P0, is an adult female who has complete blindness, as well as being an interaction designer and researcher. After being presented with a 3D model of the museum, she expressed the need to understand the interior: “I have heard about the symbolic globe-like display and the Oval Bridge around it. But it’s so hard to imagine them just through descriptions. I wish I could open the model and touch them”². This was the initial attribute of the map we hoped to investigate: a 3D model that contains internal structures.

² All communication with the participants was in their native language. In this paper, we present any translated content in the form of “translated content.”

3.2 Prototype 1: Feedback on a Realistic Model
An early version of the map was a realistic 3D print of the museum floor (Fig. 2a). We sliced the 3D model into floors and encapsulated the detailed information, such as the walls and tables at each exhibit, into a 24cm×14.5cm×1.2cm miniature floor. A tactile map resembling the 3D map’s layout was developed and printed on swell paper for comparison. During a two-day event called Inclusion Week, the two maps, along with other 3D prints, were explored in the wild by 20 blind visitors, for 5 to 10 minutes per person. From their comments, we learned the following needs to satisfy in developing an understandable map:

• Content: Simplified and categorized forms were needed. Users highly praised the understandable form of the Oval Bridge on the 3D map but also pointed out that the detailed depictions of exhibits were not digestible.
• Tactile exploration: A relatively smooth surface without acute edges was preferred. Small and pointy objects (i.e., walls and tables) on the 3D map hindered hand scanning.
• Explanations: Automated audio-tactile interactions were desired. Both maps were not understandable unless the museum staff gave explanations.

The two maps were then tested by P0 during a 30-minute interview. Further requirements were confirmed based on her knowledge of the museum and expertise in design:

• Facilities: While the current maps were highly focused on the exhibits, basic map elements such as restrooms, elevators, and escalators also needed to be included.
• Orientation and navigation: The map should support identifying the entrance, the main route, each exhibit, and how to move around the space. These elements support the development of a “mental map,” which is crucial for blind people.

The feedback revealed that the 3D map was good at delivering structural impressions while the tactile map preserved scanning, thus confirming the previous research [25]. Previous literature suggested that 3D maps with audio-tactile labels offer clear advantages, including understandability, memorization, and effectiveness, all of which are important for museums. Nevertheless, we did not find a clear advantage to using the tactile map for the museum, so we were motivated to focus on 3D maps. We utilize the advantages of both volumetric attributes and relief attributes, learning from previous practices [18, 19, 25, 26, 29] while making our own design innovations and adaptations.

3.3 Prototype 2: 3D Floors and Audio-tactile Interactions
Based on the feedback, we categorized the museum’s multidimensional information into the following three types, and we provided design criteria for each of them:

• Structural attractions include inter-floor structures and symbolic spatial structures (Fig. 1c, 1d, 3a, 3b). Our design choices include: (1) simplifying structures into primary forms with understandable relative scales; for example, the parallel escalators and stairs were simplified into one slope with textures (Fig. 3a); (2) simplifying prominent walls into 1mm tall and 3mm wide cuboids; (3) embedding magnets to support easy stacking and lining up of floors, which has proved to be effective in developing 3D objects for the blind [15]. Floors can be partially stacked to simulate how to walk between them (Fig. 3a) or fully stacked to show a facade (Fig. 2c).
• The exhibits included booths, wall-divided spaces, and artifacts placed in open spaces. Our design choice was to simplify them into outlines that were proportional to the real space
Figure 3: Designs for different types of information. (a) The escalators and stairs run parallel in the museum (left) and their representations on several floors (right). (b) The Dome Theater (left) and its representations on several floors (right). (c) One exhibit area (left) and its representation on one floor (right). (d) Eight symbols that represent museum facilities.

they took (Fig. 3c). This design supports clear separation, differentiation, and scanning. The outlined shape was hollowed to enable audio-tactile interaction (described later).
• The facilities in the museum were summarized into eight frequent items. We represented them using volumetric symbols (Fig. 3d), with design guidelines from previous work [18, 19, 26]. For those facilities that take up a large space (e.g., lobby and restaurant), their outlines were hollowed out to show the area and enable touch interaction.

To support orientation and navigation on the map, we further defined paths and intersections to indicate how the user can travel.

• The path is similar to tactile paving [32], which suggests a route on open ground. According to the actual layout and flow, we defined a main path in the center of the exhibition space and sub-paths as routes that connect each exhibit’s entrance to the main path. All of the paths are represented by 1mm wide embossed lines (Fig. 2b).
• The intersection is represented as a 10mm × 10mm hollowed square located at the crossing of paths, which is distinguishably smaller than the exhibit areas (Fig. 2b).

To automate the explanations with different levels of detail, we implemented audio-tactile labels using capacitive sensing on a touch screen. A 12.9-inch iPad Pro was used as the platform to sense touch. When a floor is placed on it, a touch can be sensed directly on the hollowed exhibits. On a structural attraction with a geometric shape (Oval Bridge, Geo-Cosmos, Dome Theater), the audio-tactile label was implemented by redirecting touch from the screen to the surface of the shape using conductive ink, following touch screen redirection technical guidelines [42, 54]. A 3.5mm wide tube was cut out in the geometric shape, filled with the conductive ink, and had its top and bottom painted with conductive ink. We also pasted a 4mm wide circular tactile sticker at the center to indicate the touchpoint (see an example in Fig. 1c). To tactually distinguish hollowed interactable exhibits from the atrium, we attached a paper with textures on the back of the floor (Fig. 2b). An app that processes touch and provides voiceover information was developed in Unity. As learned from previous work [7] and through our own testing, we mainly adopted a double tap as the recognized touch to prevent accidental triggering during exploration. Two modes were developed to serve the needs of free exploration and route-finding:

• In the Exploration mode, double-tapping a touchable area triggers the audio explanations.
• In the Navigation mode, the user double-taps two exhibits to select the destination and the start. A route with a start, a destination, and a number of intersections in between is generated. Next, the user is instructed to move a finger to the entrance of the start place, which is the location in the exhibit reached by the path. Once the user moves there (without any tapping), she is directed to trace the path to the next location of the route until she reaches the destination.

The final floors stack to approximately 32cm × 20cm × 13cm, with each floor 2.5cm tall and 1.5mm thick (Fig. 2c), at a 1/400 scale of the actual museum (see specific sizes and details in Fig. 4). This is the largest size that can fit onto an iPad to support audio-tactile exploration. The model was designed in Autodesk Fusion 360 and printed with a Formlabs Form 3L SLA 3D printer, using Clear Resin material.

3.4 Final Design: Content and Customization
We then conducted a 1.5-hour group meeting with 3 museum staff members, who are not only proficient as museum guides but also experienced in guiding blind users. We decided to include the following contents:
Figure 4: The layout of all the 3D floors. The x and y axes are in cm. Different colors mark the floors, facilities, and paths.

• The audio guide for the 3D structure or the exhibit, which speaks at one of two levels of detail in turn when tapped. The first level contains the name, a keyword (e.g., universe, earth, life), and accessibility info (e.g., “Over there, you can touch a 3D model of the rocket engine.”). The second level contains a 15-second description about it.
• The audio guide for an intersection, which speaks the surrounding information when tapped (e.g., “This intersection is connected to an earth-type exhibit on the top and a universe-type exhibit on the bottom. Eight exhibits are on the left. Five are on the right.”).

The following updates were made to enable user customization:

• Three physical buttons (stop the voiceover, modify the speed, and change Exploration/Navigation mode) were developed. They are clipped onto the iPad and can be triggered at any time using a double-tap (Fig. 2b).
• In the Navigation mode, the route explanation style can be switched between turn-by-turn (default) and north-up navigation.

In summary, all elements in our proposed map are as follows: (1) the 3D and layered floors, which include inter-floor structural attractions, the outlined exhibits, and facilities shown by volumetric symbols; (2) the audio-tactile interactions, which include the two-level audio guide of exhibits, an audio guide at the intersections, and navigation by tracing the paths and intersections.

4 USER STUDY
We conducted a user study at the science museum to investigate our research questions and evaluate the effectiveness of our proposed system. The staff who joined the final design process (Section 3.4) stressed that visitors come with different interests and expectations. A fixed task and a rigorous evaluation of performance might discourage the participants, who are also important stakeholders. We came to agree that a tour design task should be flexible to reflect different user styles and support curiosity and autonomy, which are among the museum’s important social roles. We included a tour after the map exploration to help visitors generate feedback towards a real museum visit. Each individual study took two to three hours, in the order of tour design, conducting the tour, and a post-tour interview.

4.1 Participants
We recruited 12 blind participants (male = 5, female = 7) with ages ranging from 24 to 71 years old (mean = 53.8, SD = 13.1), as listed in Table 1. They were recruited via an e-newsletter for people with visual impairments, and compensated $75 plus travel expenses for their time. All of them were first-time visitors who held minimal preset knowledge about this museum where the study took place. Six participants were frequent museum visitors who visited other museums more than once a year. Three visited other museums every two to three years, and three rarely visited a museum. All of them had experience with tactile materials, including tactile graphics (P2, P4–P12) and 3D models (P1–P11).

Table 1: Demographic information of the participants.

ID   Age   Blind since   Navigation Aid   Visiting other Museums
P1   58    41            Guide Dog        Once every 2–3 years
P2   49    45            White Cane       2–3 times/year
P3   63    50            Guide Dog        Once every 2–3 years
P4   60    45            White Cane       Once every 2–3 years
P5   42    32            White Cane       2–3 times/year
P6   49    16            White Cane       Never
P7   57    53            White Cane       2–3 times/year
P8   24    3             White Cane       Once/year
P9   71    60            White Cane       2–3 times/year
P10  43    3             White Cane       A few times
P11  61    56            White Cane       4 times/year
P12  68    35            Guide Dog        Never

4.2 Task and Procedure
4.2.1 Pre-Interview. The first part of the study, tour design, took place in a guest room located on the first floor of the museum. Before presenting the system to them, we conducted a roughly 10-minute pre-tour interview, hearing about their tactile experience and previous preparations before going to other museums.

4.2.2 Structural Exploration. Next, we presented the fully stacked BentoMuseum 3D model and allowed the participant to explore freely by touch as we introduced the basic external structures. We informed participants that the museum had the shape of a boat, and the front “bow” should be kept on the left-hand side for a consistent orientation. We then introduced the “Bento Box” characteristics and encouraged the participant to take the floors apart one by one. Next, the floors were stacked back one by one, and we encouraged the participant to touch the inter-floor structures (e.g., escalators
and Oval Bridge) to learn how floors are connected. Finally, the participants were primed with a list of 10 icons (the eight in Fig. 3d, plus escalator and wall), which was prepared separately on a sheet. This phase took roughly 10 minutes.

4.2.3 Training Phase. A training phase was conducted to familiarize participants with the audio-tactile interactions. The participant was shown the 1st floor map on the touch screen in Exploration mode. The following steps were taken: (1) double-tap the special exhibit zone to listen to the name; (2) double-tap it again to hear the details; (3) double-tap the intersection to hear information about the surroundings; (4) find the guest room by double-tapping; (5) double-tap the speed buttons to adjust voiceover speed; (6) double-tap the cancel button to stop the audio; (7) double-tap the navigation button to change to the Navigation mode; (8) double-tap to set the special exhibit zone as the goal and the guest room as the start, and trace the route following the audio guide. This entire process took around 10 minutes.

4.2.4 Tour Design Task. A loosely structured tour design task was conducted. Each participant was asked to imagine the following real-world scenario: The system is placed at the entrance of the museum, and they are using this system to select the exhibits of interest and design a unique tour for themselves. With the help of the staff, they can place any floor on the touch screen. From a total of 28 exhibits, they were asked to select 6 (equivalent to a two-hour tour) and to try to build a mental map with routes connecting the spots within 45 minutes. Considering the real-world scenario, they were also free to take notes. During the task, a researcher took notes of the selected spots for later evaluation. We video- and audio-recorded the session and saved app log data for later analysis.

When time was up or the participant was finished, they were asked to orally explain the (1) name and (2) orientation and location of each spot. Based on their explanations after the task and during the tour, we determined which level of mental map they possessed. In this study, we defined five levels of the mental map:

• Level 1: Hardly remember any exhibits they chose.
• Level 2: Remember some of the exhibits they chose.
• Level 3: Remember all of the exhibits they chose.
• Level 4: Remember all of the exhibits they chose and which floor each exhibit is on.
• Level 5: Remember all of the exhibits they chose and the location of each exhibit.

We also asked the participants to give a self-evaluation of what level of mental map was needed.

4.2.5 Conducting the Tour. To validate their mental map in a real-world setting and gather feedback on the important factors before visiting the exhibitions, we invited the participant to experience the designed tour. We shortened the two-hour tour to approximately 15 minutes, the time allotted for walking over the designed tour with a museum guide and listening to elaborated guidance on a chosen exhibit to get a taste of the museum. We encouraged participants to concentrate on validating their mental map, and they were allowed to fully explore the museum after the study.

4.2.6 Post-Tour Interview. A roughly 30-minute interview was conducted after the participant was settled back in the guest room. The interview included two forms: a seven-point Likert rating, from strongly disagree (score = 1) to strongly agree (score = 7), and free responses. Four sections composed the interview: (1) rating the overall experience of using the system (Q1–Q3 in Fig. 5); (2) rating the overall system usability (Q4–Q6 in Fig. 8); (3) rating the 3D floors and audio-tactile interactions, which were further divided into eight specific elements in terms of A. understandability and B. usefulness (Q7.A–Q14.B in Fig. 9); and (4) free responses about using the map prior to the visit, the strengths and limitations of the system, applications, and other findings after the tour. For all of the ratings, we asked participants to consider or imagine accessing the information by audio means, such as reading a homepage when preparing for a visit, as a baseline (score = 4).

5 RESULTS

5.1 Preparations before Going to the Museums
Eight out of the nine participants who had visited museums said they would gather information ahead of time through web pages and other means. The information they hoped to collect included exhibit information (all), floor information (P2, P3, P7, P8, P10, P11), the detailed route to get to the exhibits (P7, P8, P11), and facility information such as restaurants (P4, P5) and opening hours (P5). Two participants said it was difficult to acquire the information they needed through web pages, so they made phone calls instead (P5, P7).

5.2 Performance of Information Access
5.2.1 Overall Experience. All participants successfully finished the tour design task within the allowed time (45 minutes). The ratings related to information access through the task are summarized in Fig. 5 (Q1–Q3). The participants strongly agreed that by using the system they could get an overall image of the museum (median = 7). They also agreed that they were able to grasp the details of the museum (median = 6). All participants agreed (median = 7) that it is important to decide on their own where to go. Participants were excited about having good control of information and being able to design the tour independently based on their own interests:

A1: “The museum visits are precious parts of my life. I really don’t want to miss anything interesting. Thus I want both independent exploration and recommendations.” P9
A2: “It might be nice if there were a recommended course, but I would still like to explore it myself. There is a sense of security to control where I go.” P11

5.2.2 User Exploration Styles. By analyzing the double-tap log data during the Exploration mode, we identified several hand movement styles when the participants were exploring the floor. Research has found that touch readers are taught to first scan systematically in a circular pattern, and that this effective scanning strategy is used with both tactile maps and 3D models [25]. We were interested in whether participants would naturally perform this exploration strategy. Although all participants had tactile experience, we identified that three participants (P8, P9, P12) under-explored the floor (see example in Fig. 6a). They only touched some of the exhibits, and no circular pattern was formed. Three participants (P3, P5, P11) over-explored the floor (see example in Fig. 6b). Their tapping covered most exhibits, but their fingers traveled randomly by long distances
Figure 5: Questionnaire results of the overall experience of our system (Q1–Q3).

(a) Under-exploration (P8, 3rd floor) (b) Over-exploration (P3, 3rd floor)

(c) Typical (P4, 3rd floor) (d) Ideal (P4, 5th floor)

Figure 6: Different exploration styles. The locations of first-level touches and their sequences during the Exploration mode are plotted on a 2D floor map. The arrow indicates the sequence, and the color shows the touch order.

on the map. A clear circular pattern was hardly observed. The rest of the participants showed a typical style (see example in Fig. 6c). Even though their finger travel was somewhat long, they were able to build a relatively circular and systematic pattern. In later exploration, one participant (P4) also showed an ideal style (see example in Fig. 6d) with a very systematic circular pattern: after one exhibit was explored, he moved to the closest exhibit. This style exhibited the “circular and complete” scanning strategy, which has been noted as one of the most efficient strategies for exploring a map [25].

We next analyzed the relationship between the identified styles and the quantitative performance; the latter has been used to identify exploration effectiveness on tactile graphics [5]. These quantitative measures include coverage percentage of the exhibitions, tap counts (perceived as a more reliable indication of time in our case compared to the clock time), and exploration distance. We computed a univariate ANOVA with a Bonferroni pairwise post-hoc test to compare the results. Mean coverage for under-exploration was significantly lower than those for typical (p < .01) and over-exploration (p < .001) (Fig. 7a). There was no significant difference among the three styles in terms of tap count (Fig. 7b). Mean exploration distance for over-exploration was significantly higher than those for typical (p < .01) and under-exploration (p < .01) (Fig. 7c). The results show
Figure 7: Mean (a) coverage, (b) tap count, and (c) exploration distance for the typical, over-explored, and under-explored groups. p: p-value of the univariate ANOVA with a Bonferroni pairwise post-hoc test (*** and ** indicate the 0.001 and 0.01 levels of significance, respectively).

that among all the styles, the participants with a typical style had high coverage and short exploration distance, which indicated relatively efficient exploration.

5.3 Performance of Mental Map Building
Among the twelve participants, nine (75%) remembered all the exhibits they chose and their locations (level 5), two (16.7%) remembered the exhibits and their corresponding floors (level 4), and one (8.3%) remembered a part of the chosen exhibits (level 2). Level 5 participants could explain the general location of each exhibit (e.g., “Exhibit A is on the upper-left side of the 5th floor” or “Exhibit B is located to the left as you exit Exhibit C”), and we determined that they had built a “rough” mental map.

All participants noted that there could be a clear difference between having and not having a mental map, and some commented that the 3D and layered floors and the Navigation mode of the proposed system were effective for building a 3D mental map (see Section 5.4.2 and Section 5.4.3). The participants noted that there was an improvement in orientation (P2, P6, P9, P10) and confidence that they were safe and would not get lost (P7–P9, P11) with a rough mental map during the visit, compared to their previous museum experiences without a mental map.

A3: “Without the mental map, I didn’t understand where I was walking. Controlling where I was going using the map, I walked with a sense of accomplishment, and everywhere I went became a lot more fun.” P2
A4: “Without a rough mental map, it just feels like being pulled around and it’s simply boring and tiring.” P9
A5: “If I don’t have a mental map [before following a course], I don’t remember where I went, I don’t know how long I will walk. Now when I notice where I am, I can calculate back from the mental map, and I feel a completely different level of security.” P8
A6: “It feels safe to decide where I go, understand the relative locations, and have a structure of the museum in my mind. It makes the tour and the discussion easier. I might make mistakes about the route, but I can soon integrate the new information and easily correct my map.” P7

On the other hand, all participants were content with the current level of the mental map they built. The participants thought building a higher level of mental map, which means remembering the route clearly, would be unnecessary for the following reasons: (1) the museum is not a frequently visited place (P4); (2) there is too much intellectual information to remember (P9, P11, P12); and (3) they felt the complex environment is not yet ready to allow them to travel alone (P1, P5, P6).

A7: “It was easier to remember the routes than the exhibit names. I wish the names could be replaced with another format, such as numbers.” P9

5.4 System Usability
5.4.1 Overall Usability. The results of the three ratings related to usability are summarized in Fig. 8 (Q4–Q6). All participants agreed that the system was useful (Q4, median = 6.5) and enjoyable compared with getting information from the homepage (Q6, median = 6.5). In particular, seven participants (P1, P3, P4, P5, P7, P10, P12) commented that linking geometric or outlined shape, location, and content using the 3D model and audio-tactile labels was both effective and enjoyable.

A8: “I’m not good at building mental maps, but I somehow managed to do it by touching and listening.” P1
A9: “Processing the audio-tactile information, I could understand the content and arrangement of the exhibit booths.” P10
A10: “Compared to merely reading the homepage, touching and listening made me excited about the following trip.” P4

Three participants (P2, P11, P12) were also excited about the independence they obtained in the exploration.

A11: “The best thing is that I could explore independently without asking for help.” P11

The participants other than P12 leaned toward rating the system as easy to use (score equal to or greater than 5). Six participants commented that the double-tap was not easy at first (P2, P4, P6, P8, P9, P12), which influenced their score on Q5. The participants pointed out that there was a learning curve, and that it largely depended on the user’s proficiency with mobile devices and voiceover controls.

A12: “I wasn’t used to double-tapping and didn’t know where I could tap at first. But I got proficient after I spent more time with it.” P6

Three participants hoped for a more explicit and seamless touch to trigger the audio (P2, P6, P8), although they acknowledged that a single-tap would trigger unnecessary sounds (P2, P8).

A13: “The double-tap also reacted to other fingers during exploration. I think touch with a stronger force is better than double-tap. It would respond to more conscious movements.” P6
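Comments A12 and A13 describe the trade-off behind the double-tap gate: deliberate taps should trigger audio while exploring fingers stay silent. A minimal sketch of such a gate, using purely hypothetical time and distance thresholds (the paper does not specify the recognizer at this level of detail), might look like:

```python
DOUBLE_TAP_WINDOW = 0.35   # seconds between taps; hypothetical threshold
DOUBLE_TAP_RADIUS = 20.0   # pixels between taps; hypothetical threshold

def detect_double_taps(taps):
    """Return positions of double-taps in a tap log.

    taps: list of (t, x, y) tuples in time order. Two consecutive taps
    count as a double-tap only if they are close in both time and space,
    which filters out unrelated touches made while exploring the map.
    """
    doubles = []
    for (t0, x0, y0), (t1, x1, y1) in zip(taps, taps[1:]):
        close_in_time = (t1 - t0) <= DOUBLE_TAP_WINDOW
        close_in_space = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 <= DOUBLE_TAP_RADIUS
        if close_in_time and close_in_space:
            doubles.append((x1, y1))
    return doubles
```

For instance, `detect_double_taps([(0.0, 100, 100), (0.2, 105, 102), (2.0, 300, 300)])` flags only the pair of nearby taps; the later distant touch is ignored. The force-based alternative suggested in A13 would replace the time window with a pressure threshold.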
Figure 8: Questionnaire results of the overall usability of our system (Q4–Q6).

Figure 9: Questionnaire results of the usability of the 3D and layered floor-related elements (Q7–Q10) and the audio-tactile interface-related elements (Q11–Q14). A: [The element] is easy to understand; B: [It] is useful in exploration and tour design.
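The ratings reported in Figs. 5, 8, and 9 are medians of seven-point Likert scores, judged against the baseline of 4 set in the interview protocol (Section 4.2.6). A minimal sketch of that summarization, with made-up question names and scores:

```python
from statistics import median

BASELINE = 4  # "accessing the same information by audio/web", per Section 4.2.6

def summarize(ratings):
    """Map each question to the median of its 1-7 Likert scores."""
    return {q: median(scores) for q, scores in ratings.items()}

# Illustrative scores only, not the study's data.
summary = summarize({"Q7.A": [7, 6, 7, 7, 5], "Q7.B": [6, 6, 7, 4, 7]})
above_baseline = {q: m for q, m in summary.items() if m > BASELINE}
```

A rating "outperforms the baseline" in the paper's sense when its median exceeds 4, as the `above_baseline` filter expresses.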

5.4.2 Usability of 3D and Layered Floors. The results of the ratings related to the 3D and layered floors are summarized in Fig. 9 (Q7.A–Q10.B). In general, all 3D and layered floor-related elements were rated as understandable and useful (median >= 6). The fully stacked 3D building (Q7) and the partially stacked 3D interlocking floors (Q8) received especially positive ratings (median = 7). The participants felt it was especially beneficial and enjoyable to be able to stack and touch structural attractions in the building, such as the Oval Bridge, the Dome Theater, the atrium, and the series of long escalators (P2, P4–P8, P10). They noted that they were attracted by the “Bento” characteristics, which motivated them to learn structural details.

A14: “When building a mental map before, I could only make a flat map for one floor. But using this system, I had a stronger impression of 3D movement. I was so excited to walk [with a finger] in the 3D space.” P4
A15: “By exploring the structures in order, such as the entrance and escalator, I feel like I was walking in the museum. It gave me the sense of being immersed into the museum.” P6
A16: “The building can be separated, and it is easy to understand the details. In the case of a tactile map, it is difficult to understand how to move from floor to floor.” P8

Participants commented that touching the structural attraction’s geometric shape on the model was the best way to understand its actual structure (P2, P12). As a result, all of the participants chose the Oval Bridge to be a part of the tour they designed.

A17: “It would be impossible to understand the structure of the Oval Bridge without the 3D model.” P7

Regarding the outlined exhibits, not every participant associated them with the actual size and outline of the exhibition area. However, when they noticed the association, they were very positive about this kind of information being provided.

A18: “One exhibit was a narrow and long chamber. When I went in, I was like ‘That’s it!’ I remembered the shape clearly with my fingers. The impression would not be that strong if I had only heard from the voice guide that it was narrow and long.” P4

The volumetric symbols were understandable but not perceived as especially useful, given that the task was designing a tour. Participants acknowledged that even though the symbols were not much used here, they would be absolutely important in the actual visit (P4). Participants also noted that facilities, especially the restroom and front desk, possibly needed touch interactions (P4, P10, P11).
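The elements discussed in this subsection, structural attractions rendered as 3D shapes, exhibits as 2D outlined reliefs, and facilities as volumetric symbols, each carrying an audio label, suggest a simple data model for a point of interest on such a map. All identifiers below are illustrative assumptions, not the system's actual schema:

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    STRUCTURAL = "3D element"        # structural attractions (e.g., a bridge, an atrium)
    EXHIBIT = "2D relief element"    # informative attractions with outlined shapes
    FACILITY = "volumetric symbol"   # restrooms, front desk, restaurants, ...

@dataclass
class PointOfInterest:
    name: str
    kind: Kind
    floor: int
    audio_label: str  # text spoken when the point is double-tapped

# Hypothetical example entry.
poi = PointOfInterest("Oval Bridge", Kind.STRUCTURAL, 3,
                      "Oval Bridge. A structural attraction connecting floors.")
```

Keeping the presentation type (`kind`) separate from the content (`audio_label`) mirrors design consideration (1) in Section 6.3: the same inventory can drive both the printed geometry and the touch-screen audio layer.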
A19: “I want to know more about the ticketing information and where the flush button is in the restroom.” P11

The study did not find particular challenges in stacking and orienting the floors, thanks to the model’s irregular shape, the magnets’ support of a specific lockup, and the maintained model orientation. However, some participants reported difficulties in recognizing separable 3D structures. They noted that even though the inter-floor structures (e.g., Oval Bridge and escalators) were self-explanatory on the bottom floor, it was difficult to locate them on the top floor alone (P1, P3, P6).

5.4.3 Usability of Audio-tactile Interface. The results of the ratings related to the audio-tactile interactions are summarized in Fig. 9 (Q11.A–Q14.B). The exhibition’s two-level audio information (Q11 and Q12) and the Navigation mode (Q14) were rated as understandable and useful (median >= 6.5). After obtaining information, all participants included at least one exhibit with accessible content in their tour. Two participants (P4, P11) chose the north-up navigation style, and the rest used the default turn-by-turn style. Participants mentioned that the navigation was helpful for developing the route in their minds (P1–P4, P7, P11) and that using a finger to trace the route was an enjoyable experience (P1, P3, P4, P7, P10).

A20: “Thanks to the Navigation mode, I could learn relative locations and build a mental map.” P1
A21: “It was fun to grasp the location relationship by tracing the route with a finger.” P3

Nevertheless, the need for the Navigation mode might depend on tactile and orientation ability:

A22: “I don’t need the Navigation mode now because the layout is easy. I can understand and remember it just by touch.” P8

Participants somewhat agreed that double-tapping the intersection was easy (Q13.A, median = 5) and useful (Q13.B, median = 4.5). Four participants reported that the intersection was too small for a double-tap (P5, P7, P9, P10). In terms of usefulness, we received a variety of comments and suggestions.

A23: “I can touch the exhibit to learn the needed information, thus I didn’t use the intersection.” P8
A24: “I might have touched a number of exhibits, but I don’t know which are untouched. I hope the system tells me what area hasn’t been explored. Probably it is best at an intersection.” P6

5.4.4 Free Comments and Suggestions. Participants freely expressed where they wanted to use the system. The answers are categorized as follows: locations containing many points of interest, such as museums (P2, P5, P6, P7, P9, P11), amusement parks (P5, P11, P12), and department stores (P5, P9, P11); large and complex places, such as convention halls (P6) and airports (P2, P10); and frequently visited places, such as train stations (P3, P8, P10, P11), hospitals (P1, P11), city halls (P4, P11), schools (P4, P6), and on the train (P4). Participants also raised a variety of hopes and suggestions about what the system could offer.

A25: “I want to bring the system with me, and let it explain things to me just like a tour guide. Like ‘We arrived here. It is about...’ I hope we can hold a discussion in-depth about the exhibition. It would also be nice to tell me where it is crowded and where it isn’t.” P5
A26: “The museum has a shop, right? I want to learn what are popular souvenirs. I also want to know the restaurant menu. I want to hear a lot of options in this museum!” P11
A27: “I want to use it at the entrance when I come again. But it’s not only for the visually impaired. Foreigners, children, it can be useful for everybody!” P12

6 DISCUSSION

6.1 RQs: Effectiveness in Information Provision and Mental Map Building
6.1.1 RQ1. How can we make the vast amount of needed information accessible and understandable on a museum map? While contemporary museums contain a massive amount of multidimensional information, through the participatory design with stakeholders we categorized the needed information into structural information (layered and stackable), exhibitions (name, detailed description, and accessibility), facilities, and their locations on the map.

Our proposed method, BentoMuseum, proved effective in obtaining the above types of information (Section 5.4.1, Q1 and Q2). It helped the participants to build knowledge that integrates the structural information, location, and contents. All of the ratings of the proposed system outperformed the baseline of accessing information by reading a web page, the currently prevalent method. The multi-sensory method was noted to be more helpful than using either a tactile map or an audio guide alone (A8 and A9). Participants especially praised the innovative “Bento Box” characteristics of the map as curiosity-arousing, enjoyable, and understandable (Section 5.4.2, Q7 and Q8). Enabling the users to travel from one floor to another by touch when the map is stacked gave them a sense of immersion (A14 and A15). It also made the structural attractions’ geometric shapes, especially those crossing several floors, accessible and understandable in a concrete way (A16 and A17).

Aside from the provision of information, we also found a set of advantages that BentoMuseum contributes toward the social inclusion of the museum. It empowers users to achieve independent exploration and decision-making, which are reported to be valuable (Section 5.2.1, A1 and A2) [41]. At the end of the study, participants showed their gratitude for the museum becoming more accessible and inclusive. Frequent museum visitors (P5, P11) ideated further customization (Section 5.4.4, A25 and A26), and those who rarely visited a museum (P12) expressed the desire to come again for an in-depth exploration (A27). The findings indicate that the system can help to bridge the understanding between blind visitors and the information provider and contribute to the social role of museums, that is, to welcome blind visitors and challenge stereotypes.

6.1.2 RQ2. Is building a mental map possible and significant in the museum context? The results from a non-rigorous five-level evaluation show that most participants could build a rough mental map using our system (level 5 in Section 5.3). The proposed system was helpful in the following ways: In addition to the other elements that allowed them to explore the exhibits of interest, the Navigation mode was beneficial for drawing location relationships on a floor (A20 and A21), and the touchable inter-floor structures helped them to connect floors and build a 3D mental map (A14–A16). Different from frequently visited places where Orientation and Mobility (O&M)
training has been conducted, the visitors exhibited resistance to building a detailed mental map of the museum (Section 5.3). The results demonstrate the ability of our system to help build the current type of rough mental map (A20 and A21) and indicate its ability to support route memorization (A7). Even though creating and maintaining a finer mental map imposes a cognitive load in addition to information access and tour design, it is necessary for visually impaired people’s independence [22]. A future work is to rigorously measure spatial memory to support O&M training.

On the other hand, when we asked the participants’ thoughts and impressions about the mental map after the museum tour, all of them explicitly stated that the current rough mental map was beneficial for their tour. First, it supported their orientation, which made their tour meaningful (A3 and A4). Second, it gave them a sense of safety and the confidence that they would not get lost (A5 and A6). This confidence is an objective of navigation for the visually impaired in an unfamiliar environment, which supports autonomy and self-reliance [8, 28]. Through such comments, we infer that the rough mental map is significant in improving the museum experience.

6.2 Limitations and Next Steps
Some usability issues are related to the system design and can be addressed by further improvement: (1) Enlarge the small touch areas (e.g., the intersections) to fit different finger sizes. (2) Refine 3D details to make clear inter-floor structures available on both top and bottom floors. (3) Refine the audio content. The audio-tactile information at the intersection was noted as not useful (A23); instead, it can be used to report the exhibit coverage, as P6 suggested (A24). Some participants also found the double-tap difficult. This input method was used to prevent unintentional touches but might be unnatural when all fingers are used for exploration. Seamless touch interactions with less learning effort (e.g., the force recognition suggested by P6 in A13) need to be further investigated.

To make the system a part of the museum facility, the display and communication methods need to be further examined: (1) Self-serve floor changing. The participants did not confuse the order of the floors because we handed them each floor upon request. We also instructed the participants to keep the same orientation. For users to order and orient the floors in the wild by themselves, clear labels, verbal instructions, and tactile indications need to be tested. (2) Automated instructions. Our study showed that the instructions delivered by museum staff (Section 4.2.2 and Section 4.2.3) were beneficial to understanding the external structure and learning the system. Automating those instructions is needed to reduce the staff expertise required for system operation. (3) Time- and interest-based instructions for efficiency. The tour design task took approximately 45 minutes, a considerable amount of time that not every visitor can afford. Our log data show that the movement styles were linked to efficiency, and the most efficient style was not performed naturally (Section 5.2.2). Since museum visitors might have varied tactile skills and needs, customizable instructions that support efficient exploration need to be investigated.

6.3 Generalizability and Lessons Learned
Many participants hoped that the method would become available in museums. Interestingly, they also suggested applying the methods to other locations that troubled them, attracted them, or required their confidence (Section 5.4.4).

Even though the design method is currently implemented in one particular museum, it could be generalized to a variety of locations, especially buildings with irregular external or internal structures and complex information. We suggest the following design considerations in applying our method: (1) Categorize the complex information of a building into three types: structural attractions, exhibits (or informative attractions), and facilities. Simplify the shape of each type into 3D elements, 2D relief elements, and volumetric symbols on the floor, respectively. (2) Make the floors stackable and ensure the inter-floor 3D elements can be understood by touch when the floors are stacked. (3) Add audio labels to the points of interest and ensure they can be recognized by touch. Such a design can be implemented with relatively inexpensive materials: 3D printers, touch screens, and conductive ink.

Through our study of introducing an entire museum to first-time visitors, another important lesson we learned was that instruction, guidance, and encouragement can motivate the users and promote effective information provision when presenting a complex map. In our study, we instructed the users to travel through the inter-floor structures using their fingers and consequently learned that they especially enjoyed these structures. Had we failed to do so, some participants might have overlooked them due to the large number of 3D and 2D elements on each floor. The methods to communicate the map and motivate the users should be a part of the map design.

6.4 Toward Universal Access
Participants showed an interest in having the BentoMuseum as an “ask-me-anything” box (A25 and A26). Further investigations (e.g., a conversational agent) should be made to fulfill individual needs while maintaining simple operation. Some participants also expected it to support the needs of those beyond themselves (A27). Indeed, it can potentially share information between different stakeholders: the blind visitors, museum staff members and service providers, domestic visitors, foreign visitors, and other visitors with disabilities, to create universal access. For example, the information that is now decoded to audio might be presented in sign language to support hearing-impaired visitors in gaining information access. Different services can be connected through the map to extend access throughout the visit.

7 CONCLUSION
This work investigated how a museum with a massive amount of multidimensional information could provide accessible maps to blind visitors. We designed 3D and layered museum maps for each floor of a science museum, which can be stacked or placed on a touch screen to learn different levels of detail. An authentic tour design task with 12 blind first-time museum visitors showed our system’s effectiveness in obtaining information and building a rough mental map. Through user feedback, we learned the potential of our system to contribute to a positive and inclusive museum experience. Our next steps include expanding this design method to other museums and attractions, making it smarter to support different needs, and making it available along with other means of assistive technologies to support autonomous museum exploration.
ACKNOWLEDGMENTS
We thank all of the participants who took part in our user study. We also thank Sakiko Tanaka, Bunsuke Kawasaki, and Kotaro Osawa, the Science Communicators at Miraikan – The National Museum of Emerging Science and Innovation, for their contributions to the system design and field study.

REFERENCES
[1] Frances Aldrich, Linda Sheppard, and Yvonne Hindle. 2002. First steps towards a model of tactile graphicacy. British Journal of Visual Impairment 20, 2 (2002), 62–67.
[2] Vassilios S Argyropoulos and Charikleia Kanari. 2015. Re-imagining the museum through “touch”: reflections of individuals with visual disability on their experience of museum-visiting in Greece. Alter 9, 2 (2015), 130–143.
[3] Saki Asakawa, João Guerreiro, Dragan Ahmetovic, Kris M. Kitani, and Chieko Asakawa. 2018. The Present and Future of Museum Accessibility for People with Visual Impairments. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (Galway, Ireland) (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 382–384. https://doi.org/10.1145/3234695.3240997
[4] Japanese Standards Association. 2021. JIS T 0922:2007 Guidelines for older persons and persons with disabilities – Information content, shapes and display methods of tactile guide maps [In Japanese]. https://webdesk.jsa.or.jp/books/W11M0090/index/?bunsyo_id=JIS+T+0922%3A2007
[5] Sandra Bardot, Marcos Serrano, Bernard Oriola, and Christophe Jouffrais. 2017. Identifying How Visually Impaired People Explore Raised-Line Diagrams to Improve the Design of Touch Interfaces. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 550–555. https://doi.org/10.1145/3025453.3025582
[6] Graham Black. 2012. The engaging museum: Developing museums for visitor involvement. Routledge, London.
[7] Anke M Brock, Philippe Truillet, Bernard Oriola, Delphine Picard, and Christophe Jouffrais. 2015. Interactivity improves usability of geographic maps for visually impaired people. Human–Computer Interaction 30, 2 (2015), 156–194.
[8] Kangwei Chen, Victoria Plaza-Leiva, Byung-Cheol Min, Aaron Steinfeld, and

tactile map key. British Journal of Visual Impairment 32, 3 (2014), 263–278.
[19] Jaume Gual, Marina Puyuelo, and Joaquim Lloveras. 2015. The effect of volumetric (3D) tactile symbols within inclusive tactile maps. Applied Ergonomics 48 (2015), 1–10.
[20] Jaume Gual, Marina Puyuelo, Joaquim Lloveras, et al. 2011. Universal design and visual impairment: Tactile products for heritage access. In Proceedings of the 18th International Conference on Engineering Design (ICED 11). Lyngby/Copenhagen, Denmark, 155–164.
[21] Jaume Gual, Marina Puyuelo, Joaquim Lloverás, and Lola Merino. 2012. Visual impairment and urban orientation. Pilot study with tactile maps produced through 3D printing. Psyecology 3, 2 (2012), 239–250.
[22] João Guerreiro, Dragan Ahmetovic, Kris M. Kitani, and Chieko Asakawa. 2017. Virtual Navigation for Blind People: Building Sequential Representations of the Real-World. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (Baltimore, Maryland, USA) (ASSETS ’17). Association for Computing Machinery, New York, NY, USA, 280–289. https://doi.org/10.1145/3132525.3132545
[23] Kozue Handa, Hitoshi Dairoku, and Yoshiko Toriyama. 2010. Investigation of priority needs in terms of museum service accessibility for visually impaired visitors. British Journal of Visual Impairment 28, 3 (2010), 221–234.
[24] Kevin Hetherington. 2000. Museums and the visually impaired: the spatial politics of access. The Sociological Review 48, 3 (2000), 444–463.
[25] Leona Holloway, Kim Marriott, and Matthew Butler. 2018. Accessible Maps for the Blind: Comparing 3D Printed Models with Tactile Graphics. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3173772
[26] Leona Holloway, Kim Marriott, Matthew Butler, and Samuel Reinders. 2019. 3D Printed Maps and Icons for Inclusion: Testing in the Wild by People Who Are Blind or Have Low Vision. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 183–195. https://doi.org/10.1145/3308561.3353790
[27] Eilean Hooper-Greenhill, Richard Sandell, Theano Moussouri, Helen O’Riain, et al. 2000. Museums and social inclusion: The GLLAM report. University of Leicester, Leicester, UK.
[28] Sulaiman Khan, Shah Nazir, and Habib Ullah Khan. 2021. Analysis of Navigation Assistants for Blind and Visually Impaired People: A Systematic Review. IEEE Access 9 (2021), 26712–26734. https://doi.org/10.1109/ACCESS.2021.3052415
[29] Barbara Leporini, Valentina Rossetti, Francesco Furfari, Susanna Pelagatti, and
Mary Bernardine Dias. 2016. NavCue: Context Immersive Navigation Assistance Andrea Quarta. 2020. Design Guidelines for an Interactive 3D Model as a Sup-
for Blind Travelers. In The Eleventh ACM/IEEE International Conference on Human porting Tool for Exploring a Cultural Site by Visually Impaired and Sighted
Robot Interaction (Christchurch, New Zealand) (HRI ’16). IEEE Press, 559. People. ACM Trans. Access. Comput. 13, 3, Article 9 (aug 2020), 39 pages.
[9] Anne Chick et al. 2017. Co-creating an accessible, multisensory exhibition https://doi.org/10.1145/3399679
with the National Centre for Craft & Design and blind and partially sighted [30] Nina Levent and Christines Reich. 2012. How Can My Museum Help Visitors With
participants. REDO Cumulus Conference Proceedings (2017). Vision Loss? Museum. American Association of Museums July-August (2012), 21–
[10] Matthew Cock, Molly Bretton, Anna Fineman, Richard France, Claire Madge, 22. http://ww2.aam-us.org/docs/default-source/museum/visitors-with-vision-
and Melanie Sharpe. 2018. State of Museum Access 2018: does your museum loss.pdf?sfvrsn=0
website welcome and inform disabled visitors? http://vocaleyes.co.uk/state-of- [31] Georgia Lindsay. 2020. Contemporary Museum Architecture and Design. Routledge,
museum-access-2018/ New York.
[11] Eugenia Devile and Elisabeth Kastenholz. 2018. Accessible tourism experiences: [32] Marianne Loo-Morrey. 2005. Tactile Paving Survey. https://www.hse.gov.uk/
the voice of people with visual disabilities. Journal of Policy Research in Tourism, research/hsl_pdf/2005/hsl0507.pdf
Leisure and Events 10, 3 (2018), 265–285. [33] ICOM The International Council of Museums. 2020. International Museum Day
[12] Jocelyn Dodd, Richard Sandell, et al. 2001. Including museums: perspectives on 2020 - Museums for Equality: Diversity and Inclusion. https://imd.icom.museum/
museums, galleries and social inclusion. University of Leicester, Leicester, UK. past-editions/2020-museums-for-equality-diversity-and-inclusion/
[13] Fatma Faheem and Mohammad Irfan. 2021. Museums for Equality: Diversity and [34] ICOM The International Council of Museums. 2022. International Museum Day
Inclusion–A New Concept of Future Museums. IAR Journal of Humanities and 2022: The Power of Museums. https://imd.icom.museum/international-museum-
Cultural Studies 2, 1 (2021), 12–13. day-2022-the-power-of-museums/
[14] Natália Filová, Lea Rollová, and Zuzana Čerešňová. 2022. Route options in [35] The Braille Authority of North America. 2010. Guidelines and Standards for
inclusive museums: Case studies from Central Europe. Architecture Papers of the Tactile Graphics. http://www.brailleauthority.org/tg/web-manual/index.html
Faculty of Architecture and Design STU 27, 1 (2022), 12–24. [36] Round Table on Information Access for People with Print Disabilities Inc. 2005.
[15] Uttara Ghodke, Lena Yusim, Sowmya Somanath, and Peter Coppin. 2019. The Guidelines on Conveying Visual Information. https://printdisability.org/
Cross-Sensory Globe: Participatory Design of a 3D Audio-Tactile Globe Prototype guidelines/guidelines-on-conveying-visual-information-2005/
for Blind and Low-Vision Users to Learn Geography. In Proceedings of the 2019 [37] Petros Pistofdis, George Ioannakis, Fotis Arnaoutoglou, Natasa Michailidou,
on Designing Interactive Systems Conference (San Diego, CA, USA) (DIS ’19). Melpomeni Karta, Chairi Kiourt, George Pavlidis, Spyridon G Mouroutsos, De-
Association for Computing Machinery, New York, NY, USA, 399–412. https: spoina Tsiafaki, and Anestis Koutsoudis. 2021. Composing smart museum exhibit
//doi.org/10.1145/3322276.3323686 specifcations for the visually impaired. Journal of Cultural Heritage 52 (2021),
[16] Stéphanie Giraud, Anke M Brock, Marc J-M Macé, and Christophe Joufrais. 2017. 1–10.
Map learning with a 3D printed interactive small-scale model: Improvement of [38] Jonathan Rowell and Simon Ongar. 2003. The world of touch: an international
space and text memorization in visually impaired students. Frontiers in psychology survey of tactile maps. Part 2: design. British Journal of Visual Impairment 21, 3
8 (2017), 930. (2003), 105–110.
[17] Timo Götzelmann. 2016. LucentMaps: 3D Printed Audiovisual Tactile Maps [39] Jonathan Rowell and Simon Ungar. 2003. The world of touch: an international
for Blind and Visually Impaired People. In Proceedings of the 18th International survey of tactile maps. Part 1: production. British Journal of Visual Impairment
ACM SIGACCESS Conference on Computers and Accessibility (Reno, Nevada, USA) 21, 3 (2003), 98–104.
(ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 81–90. [40] Jonathan Rowell and Simon Ungar. 2005. Feeling our way: tactile map user
https://doi.org/10.1145/2982142.2982163 requirements-a survey. In International Cartographic Conference, La Coruna. 652–
[18] Jaume Gual, Marina Puyuelo, and Joaquim Lloveras. 2014. Three-dimensional 659.
tactile symbols produced by 3D Printing: Improving the process of memorizing a [41] Richard Sandell. 2002. Museums, society, inequality. Routledge, London.
ASSETS ’22, October 23–26, 2022, Athens, Greece Wang, et al.

[42] Martin Schmitz, Mohammadreza Khalilbeigi, Matthias Balwierz, Roman Lisser- on Software Development and Technologies for Enhancing Accessibility and Fight-
mann, Max Mühlhäuser, and Jürgen Steimle. 2015. Capricate: A Fabrication ing Info-Exclusion (Online, Portugal) (DSAI 2020). Association for Computing
Pipeline to Design and 3D Print Capacitive Touch Sensors for Interactive Objects. Machinery, New York, NY, USA, 17–21. https://doi.org/10.1145/3439231.3439272
In Proceedings of the 28th Annual ACM Symposium on User Interface Software & [50] Andreas Voigt and Bob Martens. 2006. Development of 3D tactile models for
Technology (Charlotte, NC, USA) (UIST ’15). Association for Computing Machin- the partially sighted to facilitate spatial orientation. In 24th eCAADe Conference
ery, New York, NY, USA, 253–258. https://doi.org/10.1145/2807442.2807503 Proceedings. CUMINCAD.
[43] Jonathan Sweet. 2007. Museum architecture and visitor experience. Museum [51] Diana Walters. 2009. Approaches in museums towards disability in the United
Marketing: Competing in the global marketplace (2007), 226–237. Kingdom and the United States. Museum management and curatorship 24, 1
[44] Brandon Taylor, Anind Dey, Dan Siewiorek, and Asim Smailagic. 2016. Customiz- (2009), 29–46.
able 3D Printed Tactile Maps as Interactive Overlays. In Proceedings of the 18th [52] Xi Wang, Danny Crookes, Sue-Ann Harding, and David Johnston. 2022. Stories,
International ACM SIGACCESS Conference on Computers and Accessibility (Reno, journeys and smart maps: an approach to universal access. Universal Access in
Nevada, USA) (ASSETS ’16). Association for Computing Machinery, New York, the Information Society 21, 2 (2022), 419–435.
NY, USA, 71–79. https://doi.org/10.1145/2982142.2982167 [53] Zheshen Wang, Baoxin Li, Terri Hedgpeth, and Teresa Haven. 2009. Instant
[45] Simon Ungar. 2018. Cognitive mapping without visual experience. In Cognitive Tactile-Audio Map: Enabling Access to Digital Maps for People with Visual
Mapping: Past Present and Future. Routledge, London, 221–248. Impairment. In Proceedings of the 11th International ACM SIGACCESS Con-
[46] Simon Ungar, Mark Blades, and Christopher Spencer. 1993. The role of tactile ference on Computers and Accessibility (Pittsburgh, Pennsylvania, USA) (As-
maps in mobility training. British Journal of Visual Impairment 11, 2 (1993), sets ’09). Association for Computing Machinery, New York, NY, USA, 43–50.
59–61. https://doi.org/10.1145/1639642.1639652
[47] Raša Urbas, Matej Pivar, and Urška Stankovič Elesini. 2016. Development of [54] Neng-Hao Yu, Sung-Sheng Tsai, I-Chun Hsiao, Dian-Je Tsai, Meng-Han Lee,
tactile foor plan for the blind and the visually impaired by 3D printing technique. Mike Y. Chen, and Yi-Ping Hung. 2011. Clip-on Gadgets: Expanding Multi-Touch
Journal of graphic engineering and design 7, 1 (2016), 19–26. Interaction Area with Unpowered Tactile Controls. In Proceedings of the 24th
[48] Roberto Vaz, Diamantino Freitas, and António Coelho. 2020. Blind and Visually Annual ACM Symposium on User Interface Software and Technology (Santa Barbara,
Impaired Visitors’ Experiences in Museums: Increasing Accessibility through California, USA) (UIST ’11). Association for Computing Machinery, New York,
Assistive Technologies. International Journal of the Inclusive Museum 13, 2 (2020). NY, USA, 367–372. https://doi.org/10.1145/2047196.2047243
[49] Roberto Vaz, Diamantino Freitas, and António Coelho. 2020. Perspectives of [55] Limin Zeng, Gerhard Weber, et al. 2011. Accessible maps for the visually impaired.
Visually Impaired Visitors on Museums: Towards an Integrative and Multisensory In Proceedings of IFIP INTERACT 2011 Workshop on ADDW, CEUR, Vol. 792. 54–60.
Framework to Enhance the Museum Experience. In 9th International Conference
Depending on Independence
An Autoethnographic Account of Daily Use of Assistive
Technologies
Felix Fussenegger
Katta Spiel
felix.fussenegger@igw.tuwien.ac.at
katta.spiel@tuwien.ac.at
HCI Group – TU Wien
Vienna, Austria
ABSTRACT

Assistive technologies (AT) are a necessity for a person with a severe disability to be able to lead a self-determined life within modern societies. Thus, these technologies fulfil an important societal role and have a significant impact on the lives of those affected. This experience report provides insights on the lived experiences surrounding the necessary use of AT from the point of view of a disabled person. Using an autoethnographic approach, we determine the function and relevance of AT in everyday life and illustrate the intended and unintended effects of AT as well as the subsequently arising socio-technical dependencies through representative examples. The results show how the resulting dependencies on AT pose a risk for users, especially in the event of a technological failure. Furthermore, a deployment of AT without the necessary reflection and preparation of backup strategies in case of failure may lead to unexpected and inadvertent, potentially harmful, side-effects. Based on these observations, we elaborate on the implications for different stakeholder groups involved with the design, development, deployment and daily use of AT. We deem the key factors for success to lie in a deeper understanding of the application context, the integration of affected people in the development process as well as a fundamentally reflective approach by everyone involved with AT.

CCS CONCEPTS

• Social and professional topics → People with disabilities; • Human-centered computing → Accessibility design and evaluation methods; Empirical studies in accessibility; Accessibility technologies.

KEYWORDS

assistive technologies, dependence, interdependence, independence, disability studies, autoethnography

ACM Reference Format:
Felix Fussenegger and Katta Spiel. 2022. Depending on Independence: An Autoethnographic Account of Daily Use of Assistive Technologies. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3517428.3551354

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3551354

1 INTRODUCTION

Assistive technologies (AT) have a significant place in the lives of people with disabilities and are paramount to their participation in modern societies. For my¹ personal condition as an impaired person with a high spinal cord injury, I can confirm that I would not be able to survive on my own without AT, or at least I would be permanently and constantly dependent on the help of other people. Thus, I am glad to live in this day and age in which we can rely on a fast-advancing variety of technologies at our disposal that support me in leading a self-determined life [2]. But precisely because AT are gaining traction and becoming so fundamentally important in terms of enabling people with disabilities, the impacts they have on their lives must be scrutinised and well understood. Otherwise, the disposition of AT with the resulting dependencies can and will lead to unexpected and inadvertent, potentially harmful, side-effects. With this experience report, I want to provide an insight into my personal use of AT and how they influence me, my health, my social interactions and my surroundings; how they are shaped by the needs of my specific embodiment and shape these needs in return [17]. Or, more generally speaking, I detail their impact on my interdependence. Interdependence can be seen as a complementary frame for AT researchers, previously introduced to Human Computer Interaction (HCI) by Bennett et al. The concept considers access to be relational and simultaneous, highlights the contributions of people with disabilities, and challenges traditional hierarchies of abilities [3]. My report stems from such a relational understanding of disabilities, self-determination and access.

I will describe my individual experience with AT from my perspective as a person living with a severe disability, but also the drawbacks of the lived experiences, from which I draw when thinking through issues of dependence in AT. The chosen examples are intended to guide an understanding of the importance of AT for disabled persons, the impacts AT have on their everyday lives

¹ This experience report is predominantly written from the perspective of Felix as the first author, with Katta, the second author, guiding and supporting in the development and writing processes for this report. Hence, whenever there is a reference from a first person singular perspective, it refers to Felix if not stated otherwise.
and what it is like to depend on them from a personal perspective. Further, I elaborate on the implications of the insights I make in observing my own practices and involvements with AT to make them relevant for different stakeholders (such as researchers, medical professionals, policy makers, and disabled people themselves). In that, I operate loosely from within the notions of first-person research methods [4].

In this context, I deem it necessary to briefly explain my own impairment and my perception of disability. I live with my physical impairment since a snowboarding accident in 2003, in which I suffered a spinal cord injury to the sixth cervical vertebra. This resulted in a complete sensory and motor paralysis of my body approximately from half of the shoulder down. Besides these medical limitations, there are also physical and social barriers in society that have a substantial influence on my disability. Therefore, I see disability not only defined by my own impairment, but also by the lack of opportunities that society offers me to employ my abilities, which means I operate from a predominantly social model of disability also in my use of AT [10].

I advise that this text is based on my subjective experiences and describes purely my personal insights and convictions. I do not represent the position of any organisation or other people with disabilities – either similar or dissimilar to mine – and operate from a specific locality and situatedness [7]. Their experiences and opinions may well differ from mine and result in different implications for research.

2 ASSISTIVE TECHNOLOGIES AND ME

This section describes my personal experiences with AT – the good as well as the not so pleasant. I provide a rough overview of the most important technologies that support me in my everyday life. I describe how I use them and what I need them for. In the first section, I list the specific AT that I need to master a typical day and explain for which activities I use them. In the following sections, I illustrate how AT has influenced me and my life through three specific examples. These examples are chosen in such a way that they 1) clearly describe the contexts and effects of the use of AT, 2) are easily comprehensible and 3) have a representative potential as they relate to other AT.

2.1 A Day in the Life

Table 1 contains an overview of some of the AT I use and what I need them for. The enumeration of the AT outlines a common daily routine of a common day in my life. I identified them by tracking my daily interactions with objects and technologies for several days and reflecting on my notes, a common procedure within autoethnographic approaches [20]. The mentioned AT are things, products or systems that support and help me with my impairments to perform functions that might otherwise be difficult or impossible for me to achieve. These devices support me to improve or maintain my daily quality of life by easing or compensating for my disability and/or the ways my environments are built without having my disability in mind. This can be a very simple object like an eating aid or a more complex application like speech recognition on a computer. The technical complexity, however, says nothing about how important or useful an artefact or application is for me. The third column of Table 1 lists the activities that the AT enable me to carry out independently. On the one hand, this shows what the AT empower me to do, but on the other, it also illustrates how dependent I am on them and what activities I lose or have to manage drastically differently in case of failure.

2.2 Example 1: Early Voice Recognition vs Pen

The accident that caused my disability happened in my final year of high school. After about a year of hospitalisation and intensive therapy in a rehabilitation centre, I returned to school with the aim of graduating the same year. However, there turned out to be one problem: it was unclear how I could perform my written exams and exercises. With my limited hand function, I could not write fast and well enough, neither by hand nor on the computer, to get my work done in what would have been deemed a reasonable amount of time. Despite the long and intensive rehabilitation, my handwriting was like that of someone just starting their schooling and learning to write. The recommended solution from my therapists and from school was that I should use a speech recognition software and dictate my work and exams from now on. Following their advice, I bought (very expensive) software and started working with it. But in 2003, such software was not yet very sophisticated and I was deeply unsatisfied with the results. But the therapists and teachers in their positions as “experts” explained to me that the software had to get to know me, and I had to get to know the software in return before I could expect it to function more seamlessly. And if I just trained regularly, it would get better and better with time (and patience).

So I trained – for hours and hours, but the progress was sluggish. During the training, I began to scribble my spoken sentences on a piece of paper with a pen. This helped me to formulate the spoken language more in line with what would be expected from a text in written language. At the beginning, the scribble was nothing more but a wobbly line. But after many hours, during which I trained with the software, I noticed that the scribbles developed more and more into a recognisable script. While I intended to train the software to recognise my voice better and perform more in line with my expectations, the process had, in return, unexpectedly trained my own body to perform the task I meant to relegate to an AT. After a while, my progress in handwriting was greater than with the software. Hence, at some point, I only concentrated on my handwriting skills. And lo and behold – after several fully written notebooks, my handwriting skills were sufficient for me to take the final graduation exams on a regular basis without the support of any software. This baffled both my therapists and my teachers. Until this day, I am grateful for the poorly functioning language software that inadvertently returned my handwriting skills back to me. I am convinced that this would not have happened with today’s much more sophisticated software and that I still would not be able to write a text by hand or fill out a form on my own. This example shows me how much the use of AT can promote or hinder one’s own abilities – even if this might be contrary to the initially intended purpose of use.
Device: holding bar over bed
  Function/Explanation: hold on without finger function; pull up torso
  Enablement/Dependency: sit up in bed; turn myself over; get dressed and undressed; get out of the bed

Device: wheelchair fixing device
  Function/Explanation: holding mechanism to fixate the wheelchair to the bed
  Enablement/Dependency: facilitates transfer between bed and wheelchair; reduces the risk of falling

Device: manual wheelchair
  Function/Explanation: ultra lightweight; foldable; extra strong grip on push rims
  Enablement/Dependency: covering medium distances on level ground; loading into and out of the car

Device: shower wheelchair
  Function/Explanation: waterproof; cutouts in the seat area
  Enablement/Dependency: body wash / shower; using the toilet

Device: elevator
  Function/Explanation: big enough to fit a wheelchair; reachable buttons
  Enablement/Dependency: get to and from my flat (2nd floor); get to the garage

Device: applications in the car
  Function/Explanation: transfer device; hand throttle; hand brake; steering aid
  Enablement/Dependency: perform a safe transfer; load the wheelchair on/off the passenger seat; steering the car safely

Device: voice recognition software (for PC; for smartphone)
  Function/Explanation: dictate long texts; dictate input for chats and mails; execute frequently used commands
  Enablement/Dependency: writing documents for work and university; participate in digital communication

Device: eating aids
  Function/Explanation: specially formed fork, knife and spoon
  Enablement/Dependency: eating without help

Device: mobile phone with assistance and emergency numbers
  Function/Explanation: easy call when I unexpectedly need assistance; easy call in case of emergency
  Enablement/Dependency: get help when needed

Device: credit card with NFC payment
  Function/Explanation: easy payment of small charges
  Enablement/Dependency: handling small purchases, e.g. at the grocery store

Device: hybrid manual wheelchair
  Function/Explanation: senses my activity on the push rim; provides adequate support with an electric drive
  Enablement/Dependency: covering long distances on semi-level ground; locomotion that is easy on the joints

Device: smart thermostat
  Function/Explanation: digitally programmable, network-compatible thermostats
  Enablement/Dependency: regulation of temperature at home

Table 1: List of AT I use during a normal day, what it does and which activities it enables

2.3 Example 2: A Small Hook with a Serious Catch

With my degree of disability, transferring between a wheelchair and my bed comprises a big challenge. At the same time, the ability to master the transfer on my own means a great degree of independence. That is why I trained specifically in the course of my rehabilitation to tackle this hurdle. I’m glad I succeeded, even though I have to keep myself fit all the time so that I don’t lose this ability. Moreover, it is difficult for me to carry out the transfer when I am not at the peak level of my fitness, for example, due to an illness. Therefore, a few years ago, I constructed an individually fitted mechanism together with my therapist, with which I can firmly connect the wheelchair to the bed. This makes the transfer easier and safer and I can manage it even if I am not entirely fit. In the beginning, I only used the mechanism when I really needed it. But as time went by, I used it for every transfer, because it makes the process so much easier and is therefore a relief in everyday life, allowing me to use my energy for other tasks [15]. When the holding mechanism was once unavailable due to damage incurred,
I realised that I had lost the ability to transfer between bed and wheelchair without using the mechanism. The device could be repaired quite easily, but the fact of having lost such a hard-earned function without noticing until it was too late hit me hard. For me, this means that if I spend the night in a bed different from the one at home (e.g. when travelling), I now need help to get in or out of the bed, when previously I could just more easily plan for these aspects of travel. I have decided to specifically train this skill again during my next stay at the rehabilitation centre to counteract the loss of function due to the AT. After that, I plan to only use the mechanism when I absolutely need it.

This example should demonstrate how quickly you may become unintentionally dependent on AT. As in this case, it can lead to the loss of important functions and, thus, to a certain loss of independence when the AT become unavailable – potentially without anyone noticing that this loss has occurred until it becomes a significant issue.

2.4 Example 3: A Hot Winter Night

Having a high spinal cord injury, my body’s own temperature regulation is significantly reduced. This means that I get too cold or too hot quite quickly, depending on the ambient temperature of the environment I am in. Some of this can be countered with appropriate clothing, but at home I try to keep the room temperature in an optimised range for me. To achieve this, I installed digitally programmable, network-compatible thermostats (so-called “smart thermostats”) on my radiators. This enables me to set the desired temperature very precisely and individually for each room. If the temperature still fails to suit me, I can quickly and easily make the desired change via my smartphone. In general, the system works quite well for me and I am happy with the usability and the overall improved energy balance for my body.

But last winter, for no obviously apparent reason, two of the thermostats encountered a problem and stopped working. The heating valves were completely open and the thermostats could not be adjusted either via the buttons on the unit or via the smartphone. My flat quickly became very hot and since it was already late in the evening, I had no option available to organise anyone to reset or dismantle the thermostats (both can only be performed with specific tools and full finger function). This led to me having to endure a sleepless night, during which I tried to keep the temperature within a bearable range by constantly closing and opening the windows (thankfully, it was cold enough outside for this to work), until I finally could get help the next morning.

After this experience, I constructed my own personal device with which I can reset the thermostats on my own, echoing practices of empowerment for disabled makers more generally [13]. Additionally, on the radiators that are out of reach for me, I reinstalled the old-fashioned but reliable mechanical thermostats. This means that the temperature control in the flat is no longer as effective and pleasant, but it is safer and more consistently within my control. The example is intended to illustrate how important the reliability of AT is, even when a technology is not recognised as assistive in all circumstances of use, and that the exit strategy (e.g. reset) must similarly be accessible without barriers. Furthermore, I should have thought about the consequences of a malfunction in advance and only perform the change when I have an exit strategy or a backup plan that works for me. However, the question remains whose responsibility it is to account for alternative modes in the case of malfunctioning or nonfunctioning of technological artefacts in assistive contexts.

3 PERSONAL REFLECTION

It was my intent that the explanations in the previous section provide a comprehensible insight into my life experience with AT as a person with a disability. It is important to me to emphasise the extent and importance of AT, but also to create awareness of the interconnections, the arising dependencies as well as the scope of possible consequences of use, functioning, bad functioning, malfunctioning and nonfunctioning [17].

Without AT, my life would look very different and I would never have access to some of the experiences that I treasure in my everyday life, such as:
• living on my own in the area of my choice (regardless of the location of my family or the regional supply situation of caregivers),
• having a job (as a product developer of medical devices) that I choose based on my interests and talents and not defined by my impairment,
• enjoying hobbies like cycling, bird watching, etc., and
• being social with friends, family and community.

All in all, I am very grateful (and a little proud) to be able to live a self-determined, satisfying and fulfilling life, despite my severe disability. Table 1 provides practical insight into how much AT supports me in my everyday life. At the same time, it also shows how much I depend on it and what fundamental activities I lose if one of the ATs fails. In addition, the examples described have shown that the application of AT can also be associated with unintended advantages, making the AT at least partially obsolete, as well as veritable disadvantages.

4 DISCUSSION

In this section, I discuss what these observations may mean for dealing with AT and disabilities and how the insights can be used by the different types of people involved with AT, be they researchers, designers, policy makers, therapists or disabled people themselves. At first, the fundamental objectives of AT for people with disabilities should be elaborated, which requires us to challenge existing convictions and the general focus on “doing good” [5] or a singular evaluation oriented solely on the ‘usefulness’ and potential of AT, but also on possible side effects, as has been argued for general purpose technological research previously [8, 22]. The goal is not just to compensate the impaired functions of a disabled person, but the objectives have to be thought of in a more holistic way.

4.1 Implications for Researchers and Designers

It is important to have a basic knowledge of different models of disability and to position specific works within those – at the very least with a basic understanding of the big two, namely the medical and the social models of disability [10]. Findings from disability studies can help to develop a better understanding for these [9, 12, 18, 19, 21]. Such a position requires the proper comprehension of
Depending on Independence ASSETS ’22, October 23–26, 2022, Athens, Greece

the context of the application and the fundamental objectives of disabled people might have with the assumed potential of use often
an AT that is to be researched or designed. One way to achieve persuasively communicated by technologists.
this is to involve disabled stakeholders in the whole development
process. This means not only taking part in user surveys, but also 4.3 Implications for Disabled People
employing people from the “target population” as project members Disabled people have a nuanced understanding of their own life sit-
to help shape the developments. Mankof et al. already stated how uation as long as they are allowed to share it on their own terms and
relevant for good AT it is to include disabled people directly in along their individual communicative preferences (see, for example,
technology research about them [11], drawing on a long tradition for autism contexts [14]). Therefore, it is especially important for
of disability rights activism also echoed in a notion of “rights not them to become aware of the diferent kinds of impact specifc AT
charity” [16] – or adapted for technology research: “access not can have on their lives. An open and honest communication from
benevolence”. The list of fundamental activities enabled by AT all involved can make the diference, even and particularly if these
shown in Table 1 together with the in-depth examples I provided conversations might be fraught with power dimensions that make
reveal how serious a failure of one of the used AT can be. Therefore, the refusal of AT difcult to communicate and follow through. An
the reliability and quality of AT is a fundamental requirement. already widely used form of experience exchange wit AT is to pro-
The scenario for a failure should be considered and the possible vide self-organized peer counseling through social networks. I hope
consequences analysed. If necessary, exit strategies or a backup that more and more people will participate in this type of commu-
plan must be developed along with the product itself. One strategy nication and develop collective strategies in having conversations
achieving this can be to have diferent levels of support included in around AT. In my opinion, users have a certain responsibility in
the technologies so that people might be able to choose between the interaction with AT as well. We need to actively take on our
comfort and training where applicable. Besides the intended efects expert roles and refect on the consequences and possible side ef-
of AT, possible side efects should be considered equally – including fects before we integrate a new technology or technique into our
those that may make technological support unnecessary. Again lives. I would like to further encourage anyone who is interested
it is helpful to have a clear understanding of the context of the in AT to get involved in research, development, design, supply or
application and to work in close cooperation with the intended communication - be it as a regular collaborator, as a testimonial or
(and even unintended) users. in providing constructive feedback. In line with Mankof et al. [11],
Due to the rapid development of technologies and the intrinsic it is my conviction that the involvement of people with disabilities
motivation of researchers and developers who want to improve to represent their own interests is an essential contribution to the
the lives of people with disabilities with all their commitment and development of AT.
inventive spirit, there is a danger that the use of AT becomes an
end in itself. Despite good intentions, the original goals of a cer-
5 CONCLUSION
tain AT are sometimes not defned clearly enough or abandoned
during the process [6] and a sober consideration of the advantages This experience report was written with the motivation to con-
and disadvantages (or efects and side-efects) may not be feasible tribute to the understanding and discourse about Assistive Tech-
anymore, if overridden by other constraints. nologies in the context of disabilities. I intended to accomplish
this by communicating my personal experiences and insights to a
broad audience of AT stakeholders. Therefore, I described my own
impairment and explained my understanding of disability. Subse-
4.2 Implications for Educators, Carers, Medical quently, I listed the AT most relevant to my everyday life, illustrated
Professionals and Policy Makers in which form I use them and which activities they enable me to
For people who work in one of these areas as professionals, it is do. By discussing some representative examples, I could illustrate
equally benefcial to be aware of the fundamental objectives of spe- the potential infuences and efects the application of AT have for
cifc AT and the previously referred to models of disability. Before me specifcally. On the basis of this understanding, I identifed
providing specifc AT to people with disabilities, providers need relevant implications for diferent stakeholder groups concerned
to be aware of the intended efect and the range of potential side with assistive technologies. In conclusion, the implementation and
efect of these technologies as they relate to the situated contexts advancement of AT for people with disabilities holds huge opportu-
of the disabled person they are implementing AT for. Therefore, it nities but we need to turn more honestly towards their risks as well,
is necessary to have a good knowledge of the technologies and to particularly to their risks of failure or malfunctioning. We expect
practise open communication to ensure that identifed advantages that a deeper understanding of the socio-technical aspects of the
and disadvantages are communicated to all stakeholders accord- intended application context, the analysis of diferent dependencies
ing to their communicative preferences. In the course of providing AT might introduce or amplify, the integration of afected people in
certain technologies to people with disabilities, it is essential to the development process as well as a responsible approach by ev-
consider their individual living situations and to understand the eryone involved could help to utilize the opportunities AT promises
individual application contexts and socio-technical ecologies of and minimize the risks that might come along with them.
care [1]. Further, policy makers have to provide the framework
conditions that enable individualised care along the preferences ACKNOWLEDGMENTS
and desires of the disabled person themselves. We identify here the Part of this work has been funded by the Austrian Science Fund
risk of overriding personally established strategies or use intentions (FWF) project T 1146-G.
ASSETS ’22, October 23–26, 2022, Athens, Greece Felix Fussenegger and Kata Spiel
REFERENCES
[1] Mark S Ackerman, Ayşe G Büyüktür, Pei-Yao Hung, Michelle A Meade, and Mark W Newman. 2018. Socio-technical design for the care of people with spinal cord injuries. In Designing Healthcare That Works. Elsevier, 1–18.
[2] Valéria Baldassin, Helena Eri Shimizu, and Emerson Fachin-Martins. 2018. Computer assistive technology and associations with quality of life for individuals with spinal cord injury: a systematic review. Quality of Life Research 27, 3 (2018), 597–607.
[3] Cynthia L. Bennett, Erin Brady, and Stacy M. Branham. 2018. Interdependence as a Frame for Assistive Technology Research and Design. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’18). Association for Computing Machinery, New York, NY, USA, 161–173. https://doi.org/10.1145/3234695.3236348
[4] Audrey Desjardins, Oscar Tomico, Andrés Lucero, Marta E. Cecchinato, and Carman Neustaedter. 2021. Introduction to the Special Issue on First-Person Methods in HCI. ACM Trans. Comput.-Hum. Interact. 28, 6, Article 37 (Dec. 2021), 12 pages. https://doi.org/10.1145/3492342
[5] Murray Edelman. 1974. The political language of the helping professions. Politics & Society 4, 3 (1974), 295–310.
[6] Jean D Hallewell Haslwanter and Geraldine Fitzpatrick. 2017. Why do few assistive technology systems make it to market? The case of the HandyHelper project. Universal Access in the Information Society 16, 3 (2017), 755–773.
[7] Donna Haraway. 1988. Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective. Feminist Studies 14, 3 (1988), 575–599.
[8] Brent Hecht, Lauren Wilcox, Jeffrey P Bigham, Johannes Schöning, Ehsan Hoque, Jason Ernst, Yonatan Bisk, Luigi De Russis, Lana Yarosh, Bushra Anjum, et al. 2018. It’s time to do something: Mitigating the negative impacts of computing through a change to the peer review process. arXiv preprint arXiv:2112.09544 (2018).
[9] Eve Lacey. 2014. Alison Kafer, Feminist Queer Crip (Indiana: Indiana University Press, 2013), pp. 258, ISBN: 9780253009340, £16.99, paperback. Studies in the Maternal 6 (Jan. 2014). https://doi.org/10.16995/sim.11
[10] Richard E Ladner. 2011. Accessible technology and models of disability. In Design and use of assistive technology. Springer, 25–31.
[11] Jennifer Mankoff, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability Studies as a Source of Critical Inquiry for the Field of Assistive Technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (Orlando, Florida, USA) (ASSETS ’10). Association for Computing Machinery, New York, NY, USA, 3–10. https://doi.org/10.1145/1878803.1878807
[12] Deborah Marks. 1997. Models of disability. Disability and Rehabilitation 19, 3 (Jan. 1997), 85–91. https://doi.org/10.3109/09638289709166831
[13] Janis Lena Meissner, John Vines, Janice McLaughlin, Thomas Nappey, Jekaterina Maksimova, and Peter Wright. 2017. Do-It-Yourself Empowerment as Experienced by Novice Makers with Disabilities. In Proceedings of the 2017 Conference on Designing Interactive Systems (Edinburgh, United Kingdom) (DIS ’17). Association for Computing Machinery, New York, NY, USA, 1053–1065. https://doi.org/10.1145/3064663.3064674
[14] Damian EM Milton. 2014. Autistic expertise: A critical reflection on the production of knowledge in autism studies. Autism 18, 7 (2014), 794–802.
[15] Christine Miserandino. 2017. The spoon theory. In Beginning with Disability. Routledge, 174–178.
[16] Jenny Morris. 2014. Pride against prejudice: Transforming attitudes to disability. The Women’s Press.
[17] Ingunn Moser. 2006. Disability and the promises of technology: Technology, subjectivity and embodiment within an order of the normal. Information, Communication & Society 9, 3 (2006), 373–395.
[18] Mike Oliver. 1990. The Individual and Social Models of Disability. (1990), 7.
[19] Kevin Paterson and Bill Hughes. 1999. Disability Studies and Phenomenology: The carnal politics of everyday life. Disability & Society 14, 5 (Sept. 1999), 597–610. https://doi.org/10.1080/09687599925966
[20] Amon Rapp. 2018. Autoethnography in human-computer interaction: Theory and practice. In New directions in third wave human-computer interaction: Volume 2 – Methodologies. Springer, 25–42.
[21] Tom Shakespeare. 2006. Disability Rights and Wrongs. Routledge, London. https://doi.org/10.4324/9780203640098
[22] Rua M. Williams and Juan E. Gilbert. 2019. Cyborg Perspectives on Computing Research Reform. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI EA ’19). Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3290607.3310421
“I Used To Carry A Wallet, Now I Just Need To Carry My Phone”:
Understanding Current Banking Practices and Challenges
Among Older Adults in China
Xiaofu Jin
IIP (Computational Media and Arts Thrust)
The Hong Kong University of Science and Technology
Hong Kong SAR, China
xjinao@connect.ust.hk

Mingming Fan∗
Computational Media and Arts Thrust
The Hong Kong University of Science and Technology (Guangzhou)
Guangzhou, China
Division of Integrative Systems and Design
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Hong Kong SAR, China
mingmingfan@ust.hk
ABSTRACT

Managing finances is crucial for older adults who are retired and may rely on savings to ensure their lives’ quality. As digital banking platforms (e.g., mobile apps, electronic payment) gradually replace physical ones, it is critical to understand how older adults adapt to digital banking and the potential frictions they experience. We conducted semi-structured interviews with 16 older adults in China, where the aging population is the largest and digital banking grows fast. We also interviewed bank employees to gain complementary perspectives of these help-givers. Our findings show that older adults used both physical and digital platforms as an ecosystem based on perceived pros and cons. Perceived usefulness, self-confidence, and social influence were key motivators for learning digital banking. They experienced app-related (e.g., insufficient error-recovery support) and user-related challenges (e.g., trust, security and privacy concerns, low perceived self-efficacy) and developed coping strategies. We discuss design considerations to improve their banking experiences.

CCS CONCEPTS

• Human-centered computing → Accessibility; • Social and professional topics → Seniors.

KEYWORDS

Older adults, elderly, seniors, aging, banking, virtual bank, electronic payment, mobile banking, accessibility, technology use, digital inclusion, digital equity

∗Corresponding Author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10...$15.00
https://doi.org/10.1145/3517428.3544820

ACM Reference Format:
Xiaofu Jin and Mingming Fan. 2022. “I Used To Carry A Wallet, Now I Just Need To Carry My Phone”: Understanding Current Banking Practices and Challenges Among Older Adults in China. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544820

1 INTRODUCTION

Banking, as an indispensable service of daily life, is undergoing unprecedented changes under the trend of the digital economy. With the increasing penetration of high-speed internet and electronic devices (e.g., tablets, smartphones), digital banking has become increasingly pervasive [3, 21, 115, 126]. In addition to banking websites and mobile banking apps, electronic payment services (e.g., AliPay [4], ApplePay [10], Google Pay [93], PayPal [94], Venmo [113], WeChat Pay [119], Yunshanfu [131], ZellePay [134]) and virtual (i.e., online-only) banks (e.g., Netbank [84], Webank [118], Monzo [80]) recently emerged as alternative digital banking platforms. Moreover, this trend has been accelerated by the COVID-19 pandemic. Many banks have shut down their physical branches and replaced them with more digital banking platforms [12, 52, 100, 109].

As face-to-face interactions in physical banks have been gradually replaced by digital user interfaces on websites and mobile apps, researchers have investigated people’s experiences and attitudes toward online and mobile banking [46, 57, 128] and digital payment [66, 135]. However, such studies primarily focused on young adults. Compared to young adults, older adults tend to use technologies to a lesser extent and feel more reluctant to adopt new technologies [25, 88, 111]. They also tend to encounter more difficulties when using new technologies in general due to factors such as age-related declines (e.g., cognitive, physical) [44, 78, 132], generation/cohort effects [132], digital divide/barriers [20, 90], and fewer educational opportunities to keep up with the technology [98]. Consequently, older adults might face more challenges when adopting digital banking.

On the other hand, compared to young adults, older adults have accumulated more experiences with a variety of technologies over decades and may have different criteria for “good technology”. As a
result, understanding what older adults have to say about technology would be beneficial for designing more inclusive technology, not just for themselves but also for everyone. After all, aging is a process that everyone is experiencing [53]. Thus, it is critical to understand older adults’ banking experiences to improve the accessibility of emerging digital banking platforms.

Researchers have already begun to explore older adults’ digital banking experiences recently [20, 38, 81, 89, 95, 99, 136]. While informative, these studies primarily focused on specific aspects of digital banking, such as the user experience of bank websites on desktop computers [38], strategies to build trust in internet banking [99], and the effect of self-efficacy and anxiety on internet banking [95]. A recent survey study took a step further to understand older adults’ overall banking practices in light of recent technological development and found that although the majority of older adults still used physical banks, a small percentage of them started to adopt digital banking and digital payments [48]. However, this survey study mainly provided a quantitative overview of older adults’ banking practices, and it remains largely unknown why and how older adults choose to use different banking platforms and the challenges they encounter.

Inspired by this line of work, we took a step further to explore the following research questions (RQs):

• RQ1: How do older adults use physical and digital banking platforms? What are the perceived pros and cons of these banking platforms?
• RQ2: What are older adults’ motivations for learning digital banking?
• RQ3: What are the challenges that older adults encounter when using digital banking platforms?

To answer these RQs, we conducted in-depth semi-structured interviews with 16 older adults living in five cities in China. China has the largest older adult population [82, 91] and has been undergoing a fast growth in digital banking. For example, China is a world leader in the adoption of contactless mobile payments with 81.1% usage penetration [30, 46] and has the highest growth rate of electronic-payment transactions among all countries [21]. Under this rapid transition to digital banking, older adults in China have opportunities to experience both physical and digital banking platforms and may encounter challenges associated with various digital banking platforms.

Furthermore, we also conducted semi-structured interviews with bank employees to gain an understanding of older adults’ banking experiences from their complementary perspectives as help-givers. These bank employees had direct interactions with older adults and accumulated experiences when assisting older adults with their banking needs.

Our findings show that older adults used both physical (e.g., bank counters and ATMs) and digital banking (e.g., websites, mobile bank apps, electronic payment, virtual bank apps) platforms. They performed different types of transactions on different platforms based on their perceived pros and cons.

Moreover, our study uncovered three motivations for learning digital banking, which were perceived usefulness, self-confidence, and social influence (e.g., being motivated by people in social circles). Furthermore, our study found that older adults encountered app-related challenges (e.g., ambiguous affordance, low information scent, insufficient error-recovery support, lack of feedback) and user-related challenges (e.g., anxiety caused by intangibility, trustworthiness, security, and privacy, low perceived self-efficacy and memory concerns) when using digital banking platforms.

We found that although older adults encountered challenges, they showed a strong willingness to learn and to apply their accumulated experiences and knowledge to tackle problems before seeking help from those they trust. Taken all together, we further discuss the implications of our key findings and propose design considerations and future work for assisting older adults with overcoming challenges in using digital banking and for supporting help-givers to better assist older adults.

In sum, we make the following contributions based on both older adults’ and bank employees’ perspectives:

• An understanding of how older adults use physical and digital banking platforms and their perceived pros and cons of these platforms;
• An understanding of the motivations to learn digital banking;
• An understanding of banking challenges and design considerations to improve older adults’ banking experiences.

2 BACKGROUND AND RELATED WORK

2.1 Banking Trend and Technology Adoption for Older Adults

Banking services are increasingly digitized with the development of information and communication technology (ICT) [5]. At the same time, physical banks also constantly reduce the number of branches and staff to save money [74]. While digital banking could bring convenience to people who are adept at digital technology, such a rapid technological shift may pose challenges to people who are accustomed to traditional physical banking, such as older adults.

Prior research suggested that older adults tend to adopt new technologies (e.g., computers, internet, tablets) more slowly and may be less likely to use technologies in general than young adults [25, 88, 111]. Olson et al. conducted a survey study with 430 younger adults and 251 older adults and found that older adults tend to be frequent users of long-standing technologies (e.g., telephone) and less frequent users of more recent technologies (e.g., Internet, ATMs) [88]. Digital banking was also found to be not as popular among older adults as it was among the young in Asia [73].

Possible reasons for this age difference could be age-related health declines and the digital divide among older adults. Age-related perception declines like vision impairments may make it more difficult for older adults to perceive small icons when using technology devices [26, 79]; age-related physical declines like a reduction in fine motor skills can cause older adults to encounter more motor issues such as tapping and scrolling when using mobile apps [129]; and age-related cognitive issues, such as the reduced speed of learning and memory difficulties, may also slow down their learning of digital technologies [28]. Moreover, older adults were exposed to a different set of technological products when they were in the workforce compared to younger generations [20, 90],
which could potentially contribute to the digital divide between older and young adults.

As a result, as an emerging new technology, digital banking might impose new challenges for older adults, especially under the current COVID-19 pandemic, when people in many countries are advised to avoid in-person banking. To ensure the inclusiveness of digital banking for older adults, it is critical to understand how older adults perform banking activities on digital banking platforms and the associated challenges.

In this work, we seek to explore this question with older adults in the context of China. China has the world’s largest and fastest-growing aging population [82, 91]. It is estimated to have 280 million older adults by 2025, which will represent about one-fifth of its total population [70, 91]. At the same time, the digital banking and e-commerce market in China is among the fastest-growing in the world with a volume of 1.94 trillion USD in 2019 [29, 46], and China has almost become a cashless economy with the fastest-growing electronic payment [21, 54, 68]. As a result, older adults in China have been experiencing the fastest shift from traditional physical banking to digital banking and may experience more challenges when learning and using new banking platforms. By studying the banking practices and challenges that older adults in China experience, we hope to reveal design opportunities to better improve the banking experience for the aging population.

2.2 Older Adults’ Banking Practices

Physical banks (e.g., bank counters) are commonly used by older adults [1, 2, 44, 89], and older adults were found to prefer visiting bank branches over conducting banking transactions online [14, 72, 89]. In-person customer service offered in physical banks was rated as one of the top desired services from financial institutions among older adults [1]. Despite the familiarity with physical banks, older adults complained about the inconvenience associated with visiting bank branches, such as the long wait time [48, 89].

The automated teller machine (ATM) was a technological innovation to traditional banking products [20, 85], and the adoption rate of ATMs among older adults has been slowly increasing over the past decades [27, 41, 92, 102, 133]. Nonetheless, many older adults still do not use ATMs [48]. One critical reason was that older adults felt uncomfortable and less in control of their finances when using an ATM [27]. O’Brien et al. further identified more factors influencing ATM adoption among older adults, which include usefulness, compatibility, complexity, technology generation, and relative advantage of a technology [92]. Compared to traditional ATMs, newer versions of self-service banking machines, such as Cash Recycling System (CRS) machines, typically have a big touchable screen and can handle many more banking transactions. In this paper, we treat all of them as ATMs.

Digital banking platforms, such as bank websites and mobile banking apps, have recently emerged as alternatives to physical banks and ATMs [5]. Researchers investigated whether and to what extent older adults adopted digital banking platforms (e.g., [20, 38, 81, 89, 95, 99, 136]). Compared to young adults, older adults tend to use digital banking to a lesser extent and prefer traditional banking platforms (e.g., telephones, physical banks) [73, 88, 89]. Furthermore, researchers investigated the factors affecting older adults’ adoption of digital banking platforms, including the user experience of banking websites [38], trust [99], fear [136], and self-efficacy and anxiety [95].

In sum, prior research primarily focused on the adoption of a particular type of banking platform (e.g., physical banks [1], ATMs [27, 41, 92, 102, 133], or digital banking platforms [17, 19, 38, 72, 81, 88, 89, 95, 99]) among older adults. However, it remains unknown what practices and challenges older adults have when banking on both physical and digital platforms (e.g., banking activities on each platform) and how they make trade-offs between these platforms. Furthermore, electronic payment services (e.g., AliPay, ApplePay, Google Pay, PayPal, Venmo, WeChat Pay, Yunshanfu, ZellePay) and virtual banks (e.g., Netbank, WeBank, Monzo) are the latest technological developments in banking. Although a few studies began to explore the use of electronic payment services and virtual banks among young adults [66, 135], few focused on older adults. Recently, a survey study showed that electronic payment had a higher adoption rate than mobile banking and even ATMs among older adults in China [48]. This has motivated us to understand the practices and challenges of using electronic payment and virtual banks among older adults as well as other banking methods. Such an understanding may potentially explain why electronic payment, despite coming around the same time as mobile banking and virtual banks [124, 125, 127] and about 30 years later than ATMs [85], has gained popularity in such a short time.

In this work, we seek to understand how older adults use physical and digital banking platforms, how they learn to use digital banking platforms, and the challenges encountered under current hybrid digital and physical banking services.

3 METHOD

To answer the RQs, we conducted IRB-approved semi-structured interviews with older adults to better understand their banking experiences. Moreover, we also conducted semi-structured interviews with bank employees who had experience helping older adults to gain a complementary understanding of the difficulties older adults encountered.

3.1 Participants

Older Adults. Sixteen (N=16) participants aged 60 or older were recruited through our social network and snowball sampling. Table 1 shows their demographic information. Nine self-identified as female and seven as male. They resided in different tiered cities: nine were in first-tier cities, five in second-tier cities, one in a third-tier city, and one in a fourth-tier city. The tier system is roughly based on the level of economic development, and the first-tier cities are considered to have the highest levels of economic development in China, such as Beijing and Shanghai [121]. All participants (Md=65, SD=7) used smartphones. Five also had a computer and one also used an iPad.

Bank Employees. After interviewing older adults, we found that although they wished to be assisted by their children, they tended to go to banks in person for reasons like worrying about their children not having the time or relevant knowledge, not living with their children, and not wanting to reveal their financial info
to their children. In contrast, they felt bank employees had worked with many older adults and had more experience helping them. Thus, we also interviewed bank employees to gain their perspectives. Seven (N=7) bank employees who had experience helping older adults were recruited through our social network and snowball sampling. Table 1 shows banking employees’ demographic information. Five worked as lobby managers, who were responsible for guiding and answering customer inquiries. Two worked as financial managers, who were responsible for financial management transactions. While lobby managers could assist customers with most of the transactions, they would hand over financial management requests to financial managers. In China, lobby managers

Table 1: Older adults’ demographic information

Id   Age  Sex  Location   Devices used
P1   61   F    Shenyang   computer, Android phone
P2   61   M    Shenyang   computer, Android phone
P3   60   M    Huainan    Android phone
P4   63   M    Shenyang   Android phone
P5   62   M    Jining     Android phone
P6   65   F    Shanghai   Android phone
P7   60   F    Zhengzhou  Android phone
P8   76   M    Shenyang   Android phone
P9   60   F    Shanghai   Android phone
P10  63   F    Shanghai   Android phone
P11  60   F    Shanghai   Android phone
P12  77   M    Shenzhen   computer, iPad, iPhone
P13  79   F    Shanghai   Android phone
P14  71   F    Shanghai   Android phone
P15  62   M    Shanghai   computer, iPhone
P16  60   F    Shanghai   computer, iPhone

3.2 Procedure

We obtained approval to conduct this research from the Institutional Review Board of our university. Interviews were held through phone calls, video calls, or in person at participants’ places of choice by one researcher. All local COVID prevention protocols were strictly followed. All participants spoke Mandarin, and we also conducted the interviews in Mandarin. While some participants voluntarily showed us their apps to explain their points, our studies did not ask for or record any personally identifiable information. The interviews lasted about an hour, ranging from 47 minutes to 76 minutes, and were audio-recorded. Each participant was compensated with 70 CNY.

Older adult participants were first asked to share their banking practices, including the activities they performed on various banking platforms (e.g., physical banks, ATMs, mobile banking apps, virtual banks, and electronic payments) and the reasons for their choices. Following a semi-structured format, we asked them to freely talk about their usage patterns among these platforms and the challenges they encountered as well as the strategies they used to overcome the challenges. Specifically, we asked participants to explain their problems by showing the issues with their apps for both in-person and video-call interviews. For phone-call-only interviews, we installed the same app as the participants to follow their descriptions of the problems. Lastly, we asked about their expectations of banking platforms.

Banking employee participants were first asked to estimate the proportion of older adult customers and their queuing status (e.g., whether they need to queue, how long they line up), and to share their observations and experiences regarding older adults’ practices and challenges of using physical counters and digital banking methods. They were also asked to talk about the types of transactions that they often help older adult customers with. After that, they were asked to share the difficulties they encountered when helping older adult customers and their banks’ policies for
are the ones who have the most contact with customers. Generally older adults. Finally, they were asked to share how they felt banking
speaking, they are the frst front to serve customers and then guide systems could be improved to better help older adults.
customers to queue or suggest trying digital methods without wait-
ing. They are also responsible for providing help on how to use
3.3 Data Analysis
digital banking platforms. Unlike bank tellers, who are only respon-
sible for transactions at the counters with a fxed process, lobby The interview recordings were auto transcribed, and one native
managers have more diverse and extensive experience in helping Mandarin-speaking author reviewed and corrected the transcripts.
older adults, which is the reason that we choose to interview lobby Two Mandarin-speaking authors frst familiarized themselves with
managers instead of bank tellers. the transcripts and then coded them independently using an open
coding method [22]. They met to discuss their codes and rationales,
revise their codes, and resolve disagreements to gain consensus on
Table 2: Bank employees’ demographic information the codes. After that, all the codes were translated into English, and
all the researchers performed afnity diagramming [13] to cluster
Id Sex Location Title the codes and categorize emerging themes related to our research
S1 M Tianjin Lobby manager questions with an inductive approach. We followed an iterative
S2 M Shanghai Lobby manager process to constantly challenge the groupings until we reached a
S3 F Tianjin Lobby manager consensus on the fnal groupings and their themes. re
S4 M Beijing Lobby manager
S5 M Shenyang Financial manager 4 FINDINGS
S6 M Dalian Lobby manager Our analysis revealed three main themes about practices and chal-
S7 F Beijing Financial manager lenges of banking among older adults: 1) Activities and Practices
on Diferent Banking Platforms, 2) Motivations to Learning Digital
Banking Platforms, and 3) Challenges in Using Digital Banking
Understanding Banking Practices and Challenges Among Older Adults in China ASSETS ’22, October 23–26, 2022, Athens, Greece

Platforms. We refer to older adult participants as "participants" and bank employee participants as "bank employees" in the rest of the section.

4.1 Activities and Practices on Different Banking Platforms

4.1.1 Banking Transactions on Physical and Digital Platforms. Figure 1 shows participants' banking platforms and transactions. Physical banking platforms include both bank counters (i.e., physical banks) and ATMs. Digital banking platforms include mobile banking apps, virtual banks, and electronic payment. Next, we present the details of the transactions on each platform.

Physical Platforms. Participants primarily conducted the following transactions at physical bank counters: depositing money, withdrawing money, managing wealth, and transferring money. Likewise, participants conducted a similar set of transactions at ATMs, though they tended to use ATMs more often to check the balances and details of their bank accounts.

Bank employees estimated that older adult customers made up 30% to 90% of all the customers visiting physical banks, depending on the bank branch's location, the date (e.g., the day of pension release), and the time of day. Our analysis found that this wide range (30%–90%) was mainly related to banks' locations and types. Banks on the higher end of the range tended to be large state-owned banks with a long, trustworthy history, located in residential areas. The ones on the lower end tended to be relatively small banks or located in business districts. Moreover, bank employees reported that older adult customers tended to flood into the banks around the time when pension payments were released, which crowded the banks and overburdened the bank employees. Furthermore, bank employees reported that older adults who did not use smartphones faced extra challenges during the COVID-19 pandemic, about which they complained a lot, because banks required their customers to scan a QR code to reveal the health code on their phones and prove they were healthy before being allowed to enter the banks.

Digital Platforms. Participants used three main forms of digital banking platforms: mobile banking apps, virtual banks, and electronic payment. Participants used two types of mobile banking apps to conduct financial transactions remotely. One type was provided by traditional banks (see the left UI in the middle of Fig. 1), and they used this type of mobile app to transfer money, manage wealth, and conduct transactions related to a credit card. Bank employees confirmed that mobile banking apps provided by traditional banks handled the majority of the non-cash banking transactions.

The other type was provided by or in collaboration with stock exchanges (see the right UI in the middle of Fig. 1), such as Eastmoney and Dazhihui [103, 105], and participants used such apps mostly to perform wealth management, such as buying stocks or funds.

Virtual (online-only) banks are banks that do not have physical branches but offer banking services remotely [123]. Participants used virtual banks' apps to deposit money into their electronic payment accounts (e.g., WeChat Pay [119], an electronic payment app), pay bills, and manage wealth. Furthermore, participants also used the virtual bank as a bridge to transfer money between accounts in different physical banks. They would first transfer money from one bank account into the virtual bank and then transfer the money out again to another bank account. In this way, they could avoid the transaction fee associated with transferring money directly between two bank accounts, because virtual banks have the benefit of waiving online transfer fees.

Electronic payment options include digital wallets (e.g., Alipay Wallet, WeChat Wallet), contactless payment methods (e.g., Alipay, WeChat Pay), etc. To use electronic payment, participants needed to create a virtual account and bind a bank account to the virtual account so that they could transfer money from their bank account into the virtual account. The two most commonly used electronic payment services among participants were WeChat Pay [119] (Fig. 1 right shows two of its UIs) and Alipay [4], and participants used them regularly for daily activities, such as buying groceries, shopping online, paying for meals, taking a bus or taxi, and transferring money.

4.1.2 Perceived Pros and Cons of Banking Platforms. Physical bank counters. Safety was the major perceived pro of physical bank counters. In addition to a safe physical environment, being able to correct errors with the assistance of bank employees was another reason for perceived safety. "I would run into big trouble if I accidentally typed 10 instead of 5 years for the deposit time on my phone if I'm alone. But I could correct it right away with bank employees." -P3.

On the other hand, the long wait time was a frequently mentioned disadvantage of physical bank counters. Bank employees reported that wait time was influenced by the bank's location, the date, and the time of day. For example, wait time would be longer if a bank branch was the only one in the local area, or if it was the pension release day. On the day of pension release especially, people had often already formed a long line at the bank gate even before the branch opened. Moreover, they mentioned that to comply with evolving regulatory requirements, the transaction processes became longer than before, which further increased wait time.

ATMs. ATMs were perceived to be convenient and time-saving for certain banking transactions, such as withdrawing money. However, participants complained that they were unable to find out whether ATMs had enough cash until they had inserted their cards and were several steps into the process. What's more, participants were concerned about the legibility of the ATM's screen and often had to deliberately bring reading glasses to overcome this problem. "Who would always remember to bring the reading glasses when going out?" -P1. Indeed, bank employees also shared similar feedback from older adult customers and pointed out that some bank branches had started to provide reading glasses to customers. Moreover, participants were also concerned that others could spot their passwords over their shoulders when they used ATMs.

Furthermore, bank employees mentioned that the latest versions of ATMs allowed customers to sign on the screen using a digital pen. However, many older adults had hand tremors, which made it difficult for them to sign on the screen.

Digital Platforms. One common advantage of all digital banking platforms was convenience. First, with digital platforms, participants could save physical effort while still satisfying their banking needs at home. "I could pay utility bills using AliPay (a virtual bank) with a few clicks and don't have to go out." -P2; "Those unnecessary

Figure 1: Participants' banking platforms and transactions. An arrow from "traditional bank account" points to both physical banking platforms and digital banking platforms, which indicates that these platforms can be bound to traditional bank accounts. Another arrow from "virtual bank account" points to virtual banks and electronic payments, which indicates that these services can also be bound to a virtual bank account. Different colors and types of lines indicate the connections between the banking platforms and the corresponding transactions that older adult participants performed.

human labors or human errors are saved by the information technology." -P5. Second, participants did not need to carry and deal with banknotes, which were susceptible to both hygiene and counterfeiting issues. In particular, participants appreciated that they did not have to carry purses or wallets to store cash but just needed to carry their smartphones, which almost all of them would carry anyway when going out. "I used to carry a wallet and need another bigger bag to store it along with my phone. Now, I only need to carry a small bag to store my phone." -P6; "It helps to avoid touching paper bills and is more hygienic; this is especially important during the pandemic." -P5. Third, participants could transfer money instantaneously on digital platforms. P5 preferred to use pocket money (i.e., a function of WeChat Pay [119]) to transfer money to his daughter, who studied far away, over traditional money transfers in banks, because the money would arrive instantaneously and meet her financial needs without delay.

On the other hand, participants expressed privacy and security concerns about digital banking platforms. Because of security concerns about electronic payment (e.g., WeChat Pay), P7 did not directly bind her bank account to WeChat Pay. Instead, she usually withdrew more cash from physical banks than she needed and exchanged the extra cash with her trusted relatives so that they could transfer money to her WeChat Pay. Moreover, participants also had security concerns about offline transactions. When paying a person offline (i.e., in person) using an electronic payment app, the app typically generates a barcode and displays it on the phone's screen for the other person to scan. However, participants worried that passers-by might be able to scan the barcode easily from a distance and steal their money. "A passer-by could easily use their phone to scan the barcode on my phone from several meters away without my awareness, so I never use it offline." -P7

4.2 Motivations for Learning Digital Banking Platforms

We identified three primary motivations for learning digital banking platforms: perceived usefulness, self-confidence, and social influence.

Perceived Usefulness. Participants felt that it was convenient to complete transactions on their smartphones, especially for daily activities such as paying for groceries and transportation. Bank employees also reported that many older adults were willing to try digital banking platforms even if they had to wait in lines for a long time (e.g., over an hour).

Moreover, they were also motivated by economic benefits, such as discounts on online shopping platforms. "After hearing from others that fruits on Pinduoduo (an e-commerce platform) were cheap, I started to learn to use WeChat Pay." -P16. Similarly, they were also motivated by the fact that many offline shops offered discounts when customers paid with electronic payment. "Yunshanfu

(an electronic payment app) collaborates with local shops to offer discounts on their groceries, such as tofu, so I was motivated to learn to use it." -P8. Additionally, one participant was motivated by the higher interest rates of wealth management products offered through mobile banking apps. "Some wealth management products have higher interest rates, but they are only available in mobile banking apps, so it is worth learning." -P14.

Another perceived usefulness of learning digital banking platforms was to keep their brains sharp. "I'm getting old, and if I don't use it, I'll lose it." -P14.

Self-Confidence. Overall, participants expressed confidence in their ability to learn digital banking. "I might be slower than others, but just because I'm slow doesn't mean I can't learn it." -P6; "Although I am slower than young people, I could still manage to learn it." -P14.

Moreover, their prior experiences with technology seemed to help them gain confidence in learning digital banking platforms. P12 was used to learning things by himself because he had not lived with his children for a long time. "Many years ago I bought a computer and started to learn to type, and I got used to this learning process over time, so it's not too hard for me to learn these operations on my phone." -P12. Similarly, P15 learned how to do wealth management on a website using a computer and felt it was relatively easy to transition to conducting it on mobile banking apps.

Besides, they tended to describe learning digital banking as troublesome rather than difficult. "It was troublesome to use mobile banking for wealth management because I only did it once per year but still had to remember the steps." -P4. Last but not least, they were able to afford the learning effort, particularly if they could receive some help along the way. "Someone taught me how to use it, and I felt it was not that hard to learn." -P1.

Social Influence. All participants mentioned that they were motivated by people in their social circles, such as their children, grandchildren, and friends, almost all of whom used digital banking platforms. P15 mentioned that he decided to learn to use digital banking because he envied his colleagues for being able to use digital banking apps to buy breakfast much faster. "I was reluctant and resistant to learn it, but later I noticed that my younger colleagues used their phones to pay for breakfast by simply scanning a code. In contrast, I had to carry change to pay for my breakfast, which was much slower. I really envied them and wanted to learn. Later, they taught me how to use it." -P15. Bank employees also reported that few older adults proactively proposed to use digital banking methods in place of manual counters, although most banking transactions could be conducted via digital banking platforms. Therefore, bank employees provided active guidance and promotion to encourage and lead older adults to use digital banking tools like mobile banking apps. Still, a large proportion of older adults would rather wait a long time for the manual counters, which is consistent with the findings of [48]. Some older adults were indeed motivated by the employees' words and tried these new methods with their guidance.

Moreover, participants were motivated to learn because they were involuntarily connected with digital banking platforms through their social connections. P16 started to learn digital banking because her friend transferred digital cash to her when paying her back. She thought that she had to find a way to spend the money in the digital bank, so she gradually learned to use electronic payment and now she can use it for online shopping.

Furthermore, participants felt that digital banking represented a societal trend and that they needed to keep up with the fast-changing world and not become obsolete. "Society is moving forward, and new technology is beneficial for societal development." -P5; "Now everything is online, so if you don't know how to do it online, you will be outdated." -P6.

Lastly, participants also felt proud of being able to use digital banking among their peers. P8 and P14 regarded being able to use digital banking as fashionable. "I think learning digital banking is a kind of pride and fashionable." -P8.

4.3 Challenges in Using Digital Banking Platforms

When using digital banking platforms, participants encountered app-related challenges and user-related challenges.

4.3.1 App-Related Challenges. As for app-related challenges, P6's complaint may voice many older adults' thoughts: "It is not designed for us older adults, it is designed for the young." -P6. Besides participants, bank employees confirmed that many older adult customers complained about the poor design and defective accessibility of the banking apps. There were four types of app-related challenges: ambiguous affordance, low information scent, insufficient error recovery support, and lack of feedback or confirmation.

Figure 2: UIs for checking bills in a virtual bank (WeBank [118]): (a) a clickable option beside "All types of transactions"; once clicked, a pop-up window shows up; (b) the pop-up window allows users to select and view the transactions of a specific type.

Ambiguous Affordance. Participants did not realize certain functions existed due to the ambiguous affordance of UI elements. For example, Fig. 2a shows a list of "all types of transactions" in March 2021. There was a gray triangle icon on the right side of the title. When clicked, the interface pops up a window (Fig. 2b), which includes an "All" option for users to check their spending on all types of transactions, and other options, such as "Red packet", "Transfer", and "Pay credit card", for them to check the spending of a specific type. However, the triangle icon was not perceived as clickable. As a result, P1 believed that there was no way to check

her total spending: "WeBank shows all transactions together. When I had too many transactions, I needed to check my spending on a specific category. But I could not do it."

Low Information Scent. One symptom of low information scent was that participants had no idea certain functions existed in the app. P5 was frustrated with not receiving real-time notifications about his spending when using the credit function of a virtual bank. When the credit bill was due after a month, he often found he had spent more money than he realized: "It wasn't transparent and didn't send any notification in time. I only realized the bill was higher than I expected after a month." This confusion arose because P5 did not realize that real-time notifications needed to be enabled in the settings.

Moreover, participants were also frustrated about being redirected from page to page. "The ATM has clear steps, but the app does not. You have to click here to enter a page, then click here to go to another page, and so on and so forth." -P1. Bank employees reflected that this happened commonly during the use of banking apps, and that it was also a burden for them to lead older adults to the exact page.

Insufficient Error Recovery Support. The digital banking apps did not provide sufficient guidance for participants to recover from a mistaken operation. "I once clicked on a wrong thing and was taken to a completely different place. I had no idea how to return to the previous step and became panicked. In other situations, I would have restarted my phone, but this was related to money and I was afraid of losing money if I restarted my phone." -P6. Thus, P6 went to the bank to ask for help instead, but she felt this approach was too costly to be a regular solution.

Moreover, insufficient error recovery support could also occur at the operating system level. P8 reported that he once could not find the electronic payment app after he accidentally switched to other apps with an unintended touch gesture. "I didn't know how I switched to a different app [from the WeChat Pay app]. I tried to swipe in different directions to get it back but I couldn't." -P8

Actually, participants expressed that they would like to do trial and error by themselves because they could learn without troubling others. Moreover, they felt that learning independently would help them gain a deeper understanding and remember it longer. "If I learn and do it myself, I would remember the process much better. My wife always asked our kid to teach her and she still couldn't remember the steps. Thus, getting help from others is no better than learning it by myself." -P2. After receiving a text message with a link from his bank, P2 clicked the link on his phone, downloaded the banking app, and tried to bind his bank card to it. However, without sufficient error recovery support, he felt reluctant to try the unfamiliar functions in the app.

Lack of Feedback or Confirmation. Participants also mentioned that their apps did not provide confirmation or feedback when they finished typing their bank account number and submitted it. Consequently, they were unsure whether they had been successful and therefore were hesitant to continue with the following steps. "The app has no confirmation after I enter the amount to be transferred. As a result, I have to check many times to make sure I have entered the correct information. Because of this, I am afraid of transferring any large amount." -P9

4.3.2 User-Related Challenges. Participants encountered the following user-related challenges: digital banking's intangibility, trustworthiness, security and privacy, overspending concerns, low perceived self-efficacy, and memory concerns.

Intangibility. Unlike physical banks, which had a physical location and could provide printed receipts for transactions, digital banks were perceived to be intangible. "I could not see what I buy in virtual banks." -P6. However, the worry about abstraction seemed to go away as they gained more experience with digital banking over time. "I was hesitant to use digital banks in the beginning, but I gradually accepted it after using it for a while." -P3.

Trustworthiness. The trustworthiness of digital banking platforms also made participants worry. Unlike physical banks, in particular national banks (state-owned capital holding banks), which had been there their entire lives, digital banks were new. They tended to think of digital banks the same as small local banks and worried about their trustworthiness. They mentioned stories about the bankruptcy of small banks. "Small local banks (whose controlling shareholder is the local government or private capital) may run into cash flow issues and could even go bankrupt like Haifa bank, so although I use digital banks, I still worry about not being able to get my money out." -P5. Participants' doubts about small banks were reasonable. While no nationwide bank has gone bankrupt in China, one local bank (Hainan Development Bank), though a rare case, did go bankrupt in 1998 [60]. What's more, digital platforms tend to be riskier than physical ones. For example, hundreds of digital peer-to-peer lending platforms were found to be fraudulent in 2018 [122]. The rampant fraud targeting older adults also made older adults more cautious and anxious about digital banking.

Security and Privacy. The security and privacy of digital banking also caused anxiety. Participants mentioned that they had heard about online hackers and data breaches in the news. Moreover, they also worried about losing money due to mistakes in electronic payment. "What if there was a mistake in electronic payment and others swiped my card?" -P3. Lastly, unlike physical banks that offer printed receipts, they felt virtual banks did not provide receipts that they could use to go and get their money back if unexpected things happened. "I don't have any physical receipts. What if the digital bank disappears all of a sudden, like the app is gone; what do I do with my money?" -P6. Bank employees echoed that some older adults did not even trust the receipts printed by ATMs and only trusted the ones from a bank counter.

Participants also mentioned privacy breaches, though they were unsure whether these were due to physical or digital banks. "I received a lot of spam calls from all over the country. I didn't know who leaked my phone number but I suspected it was the banks since many calls are for loaning stuff." -P2

Overspending Concerns. Participants felt that it was harder to keep track of their spending when using digital banking and consequently they tended to overspend. "Sometimes I was surprised how much I had spent when I was reviewing my monthly bill." -P6; "It (digital banking) was indeed convenient. But I ran out of budget much more easily [using it] than using cash. With cash bills, I was prompted to think again about how much I would spend." -P8

Low Perceived Self-Efficacy. Another user-related issue was caused by low perceived self-efficacy. One recurring symptom was being afraid of making mistakes. This was because making mistakes would likely cause them to lose money. "When buying stocks in mobile apps, I am extremely worried about making mistakes. For

example, it is very easy to type an additional zero by accident." -P1. The concern about making mistakes was even more severe when the mistakes were perceived as nonrecoverable. Reasons for low perceived self-efficacy included a lack of confidence in their literacy level and their declining physical conditions, such as poor eyesight. "I don't have much education, and it is common for me to press the wrong button." -P3; "I recommended WeChat Pay to my sister. However, she resisted it because she was worried about making mistakes and losing money due to her poor eyesight." -P2. In this case, they preferred to learn to use digital banking with assistance from their close peers (e.g., colleagues, classmates, friends), their family members (e.g., children, relatives), bank employees, and classes offered by local communities and volunteers, which made them feel reassured about learning.

When deciding from whom to seek help, participants reported considering both the trustworthiness and the expertise of the help givers. Regarding trustworthiness, participants preferred to be assisted by people who knew them well. "My children know my personality and knowledge level. Thus, they would teach me at my level. But others might start from a higher level. They may be kind and helpful, but I won't be able to understand it." -P8. Moreover, participants also worried that help givers might lose patience and assist them less carefully, especially when there were other people waiting to get help from the same help giver.

Indeed, bank employee participants also reported that in peak hours they could only tell customers where to click, with little time for an explanation, so as not to keep other customers waiting for too long. In some cases, they would even operate directly on older adults' phones instead of guiding them, even though this was discouraged by bank regulations due to privacy concerns. "The pressure of serving all customers in line is high, so it is challenging to be always patient and meticulous. In fact, more often than not, the regulation of never operating directly on customers' devices was not well followed." -S6. From older adults' point of view, many were willing to let bank employees operate on their phones, except for inputting the password, to save time and effort. "Many older adults mentioned that they had a bad memory and might forget it easily after being taught, so it was better to operate it for them directly." -S1.

Furthermore, some participants preferred help givers to monitor their operation process and only give feedback when they ran into trouble. "I'd like the bank employee to stand nearby to monitor my progress while I'm trying to do it on my phone. I hope that he would help me only when I run into trouble. Otherwise, I won't be able to do it next time." -P6. Bank employee participants confirmed that many older adult customers wished to have them oversee their operations to spot mistakes and provide answers, such as "I am going to tap on for help. I don't need to go to banks to ask their employees. I don't want to bother others." -P7. Meanwhile, bank employees expressed their concerns that they might get impatient during busy hours with older adults asking repetitive questions, as S6 reflected: "If some older adult customers keep asking me the same question while there are so many people waiting in lines, my voice may sound impatient". This might also explain why older adults feel bad about troubling others and want to be as independent as possible.

Besides learning with others' assistance, participants developed four strategies during their use of digital banking platforms to manage the risk of losing money. First, they set an upper limit on the amount of money, one they would feel comfortable with even if they lost it. "I only put in 1000 CNY. Even if I became a victim of a fraud, I would only have that much to lose, which is manageable for me." -P3. Second, they would not directly bind their regular bank accounts to digital platforms. Instead, they bound a dedicated non-primary bank account to digital banking platforms. In this way, they could avoid the risk of losing money in their primary bank accounts. The third strategy was to practice and get familiar with digital banking platforms with a small amount of money first. After several successful trials, they would gradually increase the amount. For example, when P6 started to use her virtual bank, she was concerned about whether it was easy to withdraw money from it. After successfully withdrawing money several times, she became more confident about it and started to put more money into it. "I just wanted to withdraw money from AliPay (a virtual bank) to test if it worked. After realizing it worked well, I would deposit the money back." -P6. Nonetheless, participants still tended to set a maximum amount for digital platforms. Lastly, they believed that "avoiding petty discounts" could also help them avoid falling for online fraud.

Memory Concerns. Another category of user-related issues was about forgetting various types of information related to digital banking. First, they were concerned about forgetting the steps of completing a bank transaction. "I followed the bank employee's instructions to walk through the process step by step. However, if I didn't write it in my notebook, I would forget about the steps." -P11. In addition, they also worried about not being able to remember how to use many functions at the same time. Two of them mentioned that after mastering basic functions, such as how to use electronic payment to pay bills, they worried that they might forget how to use the basic functions if they continued to learn more advanced functions. "I can't put too much information in my head. Otherwise, I would mess it up and forget about it altogether." -P13

Second, they were also concerned about forgetting passwords. P11 reported that she forgot her password and did not know how to recover it. The mobile banking app asked her to set the password
this button. Are you sure I tap on the right button?” -S2. Moreover, to be a combination of numbers and alphabets to ensure its security.
bank employees reported that many older adults had the ability to However, such requirements increased the burden of memorization
conduct transactions in banking apps by themselves, “The majority for older adults. For example, after many failed attempts, P11 fnally
of them (older adults) did the right operations all the time, but they had to work with a bank employee to get her money back and never
just wanted me there to confrm each operation.”-S2. used the mobile banking app afterward.
Eventually, they still hoped to become as independent as possible
to avoid troubling others. “Young people are busy with their jobs. I try
to avoid bothering them and learn by myself”-P6. To help themselves 5 DISCUSSION
become independent, participants felt it helpful to have a voice Prior research investigated how older adults use physical banks [1],
assistant whom they could ask for help. “It would be very helpful to ATMs [83], and online banks [89]. Although some researchers
have a voice assistant. Whenever I didn’t understand it, I could just ask studied how older adults use diferent banking platforms, they
ASSETS ’22, October 23–26, 2022, Athens, Greece Jin and Fan
primarily offered a quantitative overview of their usage via surveys [48, 72, 88]. Moreover, electronic payment services (e.g., AliPay, ApplePay, and WeChat Pay) have been increasingly integrated into our societies in recent years. In particular, China has the highest growth rate of electronic transactions [21] and has become the current world leader in the usage of proximity mobile payments, with 81.1% of smartphone users adopting proximity mobile payments [31]. Although a few recent studies explored how electronic payment services were used, they mainly focused on young adults [66, 67, 135]. While a recent study provided a quantitative understanding of how older adults use electronic payment through a survey study [48], it did not explain why older adults chose to use it over other banking methods. Building on top of prior work, our work presents an in-depth qualitative understanding of why and how older adults use both physical and digital banking platforms in today’s technology landscape, how they learned to use digital banking platforms, and the challenges that they encounter in a country that has been experiencing one of the fastest growth rates of electronic transactions. Furthermore, our work also provides complementary perspectives from bank employees to better understand older adults’ banking practices and challenges. We discuss our key findings, the design implications, and potential future work in this section.

5.1 Banking Activities and Practices
Front-line bank employees reported that almost all older adults came to the banks by themselves, showing their intention to be independent. Older adults also had their own usage patterns based on their consideration of different platforms’ pros and cons. Although their understandings of different platforms varied in some aspects, they generally agreed that security and timely in-person assistance are two advantages of physical banks, with their series of visible security checks and professional staff. On the other hand, the complexity of the security process, combined with other factors like the bank branch’s location and the date (e.g., the pension releasing day), also caused inconvenience, such as the long wait time in physical banks. In contrast, using digital banking platforms could mitigate this issue and bring convenience for either conducting banking transactions online or paying offline without cash. However, it remains an open question how to design a hybrid online and offline ecosystem that integrates the advantages of the two.

Bank employees mentioned that almost all banking transactions can be conducted through digital banking platforms except cash-related ones. Prior studies pointed out that the two most common transactions conducted by older adults were money transfers and account/transaction inquiries [15, 48, 89]. This is consistent with what we observed in our study. Participants frequently mentioned their experiences of transferring money and checking balances or transaction details. They also reflected on challenges related to these two types of transactions (Sec. 4.3), which suggests that even the most common transactions still need to be improved. Future work could investigate ways to improve the accessibility of older adults’ commonly used transactions as a stepping stone to increase their adoption of digital banking platforms.

Furthermore, both older adults and bank employees characterized the banking transactions that older adults conducted as being relatively simple and having relatively low risk. Reasons for not conducting more complex and risky transactions among our participants include the concern of losing money and the lack of financial knowledge. Thus, future work should investigate ways to help older adults confidently learn to perform more complex transactions.

5.2 Potential Cultural Effects
Our study was conducted with older adults living in China, and their banking practices may have been influenced by the Chinese culture, which differs from those of western countries. Our participants reported some practices that were not found in western countries. For example, older adults in China widely use digital payment functions in WeChat, a social media app, to pay for daily expenses (e.g., groceries and transportation). In contrast, we have not found a study reporting a similar trend in western contexts. During Chinese traditional festivals, older generations often gift younger generations a physical “red packet”, which is a red envelope with cash in it. Interestingly, WeChat allows them to continue practicing this tradition by conveniently sending each other a digital “red packet” with a similar visual effect.

On the other hand, there are common practices among older adults in western countries which are uncommon in China. For example, older adults in western countries (e.g., the UK, the US, Spain, and Canada) regularly use personal cheques [1, 58, 114]. However, people in China in general rarely use them. Moreover, while it is common for older adults in western countries to use credit cards, many major banks in China impose age restrictions on credit card applications and exclude older adults [117]. This restriction might explain why older adults in China have less experience with credit cards [48]. One reason for rejecting older adult applicants is that banks lack confidence in their ability to pay credit bills [117]. However, such a concern was not reported for western contexts in prior studies. Furthermore, older adults in Canada were reported to be comfortable with sharing online banking credentials with their close ties [58]. In contrast, our participants tended not to want to reveal their credentials to their close ties, who would then know their financial status. Instead, they were comfortable sharing their credentials with bank employees when seeking help because they felt that bank employees’ work ethics would not allow them to abuse their credentials. In sum, although our research only scratched the surface of the potential effects of culture on older adults’ banking practices and challenges, more systematic and substantial follow-up research is needed to further investigate these cultural effects.

5.3 Motivations to Learn Digital Banking
We found that older adults learned digital banking with three primary motivations: perceived usefulness, social influence, and self-confidence. These three motivations relate to several dimensions in the UTAUT2 (Unified Theory of Acceptance and Use of Technology) model [112]. Specifically, “perceived usefulness” relates to both “expectation of performance” and “price value” in UTAUT2, “social influence” relates to “social influence”, and “self-confidence” relates to “expectation of effort”. Unlike the motivation of learning social technologies for communication, such as reducing loneliness,
Understanding Banking Practices and Challenges Among Older Adults in China ASSETS ’22, October 23–26, 2022, Athens, Greece
which happens more prevalently in older adults [87, 106], participants learned digital banking because they felt that digital banking was convenient and offered economic benefits. These motivations aligned with younger adults’ acceptance of online banking and electronic payments, which included “perceived ease of use, perceived values, and perception of utility” [46, 66]. In addition, older adults felt learning digital banking could keep their brains sharp. This echoes the findings of previous studies that older adults play digital games to help treat age-related cognitive disorders [23] and learn computer programming to keep their brains challenged [43]. Compared to the literature, the motivation to learn new technologies such as digital banking platforms for keeping brains sharp is perhaps a unique motivation among older adults, while young adults rarely mention it as a motivation.

Social influence was recently found to affect younger adults’ adoption and continued use of mobile payment services [67]. Our work extends this finding and confirms that it also affects older adults’ adoption of digital banking platforms. Participants were motivated by the adoption of digital banking among people in their close social circles and felt they should learn it to keep up with the social norms. This motivation echoes previous research showing that older adults are motivated and encouraged by people around them, such as their children and grandchildren, to learn to use mobile phones and computer devices [69] and to do online shopping [65].

However, in contrast, previous research showed that, unlike other ICTs, the intention to learn internet banking among older adults is not significantly impacted by social influence since users tend to access banking services alone considering security issues [11, 130]. One potential reason for this difference might be that the new generation of digital banking services offers more social opportunities that were unavailable decades ago, such as sending digital money (e.g., a red packet) to their social circles as financial and emotional support or seeing others pay for goods with a single contactless tap. As our participants mentioned, they were motivated to learn digital banking after they felt envious when seeing people around them use it or received electronic money transferred from others. Although “social influence” mostly played a positive role in increasing the adoption of digital banking among older adults, a few participants also mentioned that their family members tried to steer them away from digital banking to avoid potential financial loss. This raises an open question of how to help older adults adopt digital banking while minimizing potential financial loss.

In general, our participants felt that they might be slow in learning or need some help along the way, just as previous research pointed out [25, 50, 79, 88, 108, 110, 129], but they believed they could still do it. They also felt proud of being able to use digital banking.

5.4 Challenges in Using Digital Banking
We discuss design considerations to address the challenges older adults encounter when using digital banking.

5.4.1 App-Related Challenges. We uncovered four types of app-related challenges: ambiguous affordance, low information scent, insufficient error recovery support, and lack of feedback or confirmation. The first two challenges were also found in a recent study about older adults using digital maps [129]. The common challenges shared across drastically different apps (i.e., digital banking in our case vs. mobile maps [129]) suggest that mobile app designs suffer from common design flaws for older adults.

Furthermore, modern app designs also tend to use icons to indicate functions [9]. Although such designs can reduce text, our study suggests that icon-based designs may lead to ambiguous affordance issues. Instead of just focusing on solving a particular type of app issue, we, as a community, should conduct more research to carefully scrutinize whether general mobile app design practices and guidelines are suitable for older adults, and, more importantly, how they might need to be adapted based on older adults’ experiences. Toward this goal, we discuss some potential design solutions.

Minimize the number of visual elements per screen. When reviewing digital banking apps, we noticed that they tended to show many options on one screen. Although app designers often use different font sizes and colors to provide grouping and hierarchical information to distinguish different options (e.g., [7, 8]), which is good practice in general, our study and others together provide evidence that UI elements with less visual salience would likely go unnoticed if there are too many visual elements. As a result, reducing the number of visual elements per screen would likely increase the chance of each element being noticed by older adults. What functions should be shown on the screen? Previous research shows that older adults might prefer a multi-layered interface design, which provides reduced functions for initial learning and then progressively increases the interface complexity, over a full-function interface [61]. However, one challenge with multi-layered interface design is how to present additional functions to older adults. One common approach is to hide them in deeper layers, which results in a “deep and narrow” design. This raises the “discoverability” issue. Indeed, our participants mentioned that they did not realize the existence of certain functions that required many steps to reach. It remains an open question how to balance the “visual salience” and “discoverability” of a function in mobile apps.

Understand older adults’ interpretations of common icons. One cause of the ambiguous affordance issue was the misinterpretation of graphical icons. For example, although a triangle icon might be interpreted as clickable by young adults, our participants did not always perceive it as clickable. Indeed, Berget and Sandnes found that icons were not universally known by all, and that age positively correlated with the recognition of aged icons versus timeless icons [39]. Older adults were found to have more problems using existing mobile device icons than younger adults [62]. A recent study also showed that older adults might misinterpret icons and interactive elements in online visualizations [35]. Furthermore, Rock et al. summarized four icon characteristics that improved usability for older adults: semantically close meaning (i.e., natural, with a close link between the depicted objects and the associated function), familiar, labeled, and concrete (i.e., depicting real-world objects). They also suggested allowing users to choose an icon from a set of potentially suitable icons [62]. Similar to icons, we also found misinterpretations of mobile app UI elements. Understanding how older adults perceive icons or common mobile app UI elements would provide insights for app designers to design ones that match older adults’ expectations. Thus, future work should systematically explore older adults’ interpretation of
banking icons or UIs. In addition to universal icons, icons with a specific cultural element can also be designed for older adults living in that culture.

Explore interactive intelligent agents to assist older adults. One way to combat the “low information scent” and “insufficient error recovery support” challenges is to allow older adults to ask questions when they are confused. As voice-based intelligent assistants (e.g., iOS Siri [107], Amazon Echo [120], Google Home [42]) continue to improve, researchers have begun to investigate their potential in assisting older adults [51, 104]. Indeed, our participants also expressed the need to ask others when learning digital banking. Thus, it is worth investigating whether and how to design voice-based intelligent assistants to help older adults overcome challenges when using mobile apps (e.g., banking), such as when they have no idea what to do with an interface (low information scent) or need to recover from an error. Toward this goal, older adults’ think-aloud verbalizations and voice features can be used to build AI models that detect when they encounter problems [32, 33, 36]. Furthermore, bank employees mentioned that some pioneering banks have started to add voice interaction to mobile banking apps. However, for older adults with poor hearing, the devices could not automatically increase their volume to an appropriate level and react like a real bank employee. Future work should explore how to design voice assistants that are not only able to detect the problems older adults encounter but can also interact with them like real bank employees.

Combine human support and advanced technologies. In addition to the aforementioned intelligent voice assistants and voice-based interfaces, we can also consider combining automatic features with human support to help older adults. For example, when older adults first learn to use digital banking, it might be more helpful for them to interact with a real remote bank teller, who can empathize better than a purely automatic agent. Recent technologies, such as virtual and augmented reality, might also allow older adults to interact with a live remote bank teller through embodied avatars and shared visual cues. Moreover, future research could explore ways to design features that enable help givers to draw visual guidance on older adults’ phones to support them in conducting banking transactions and even in performing trial-and-error, which is effective for learning new technologies but is often challenging for older adults to do [34, 35]. Another approach is to design a remote collaborative tool that allows help givers to demonstrate how to complete a banking task with interactive guidance for older adults.

5.4.2 User-Related Challenges. We propose the following design considerations to address three types of user-related challenges: anxiety, low perceived self-efficacy, and memory concerns.

Increase older adults’ understanding of digital banking with social help. The main causes of anxiety were related to the intangibility, trustworthiness, and security and privacy concerns of digital banking platforms. Previous research found that older adults are concerned about security (e.g., [65, 77]) and privacy (e.g., [75]) issues when they access digital technology such as smartphones and health monitoring systems, and that they also experience challenges with managing online security behaviors (e.g., [6, 37, 47, 86]) and privacy settings (e.g., [37, 45, 86]) on their own. More recently, the social support approach provided by older adults’ social networks, especially close ties such as family or peer friends, has been found effective in helping older adults manage the security and privacy of smartphones [56, 76, 116]. Compared to general smartphone usage, digital banking is more sensitive for older adults, with possible economic risks and privacy leaks, which may lead older adults to feel more anxious while using it. Thus, more research is warranted to investigate ways to better inform older adults of how digital banking works, help them debunk false impressions, and build confidence through social support while considering how to avoid invasions of privacy during the support.

In the meantime, banking entities should also pay more attention to security issues and further improve their vulnerability-patching mechanisms (e.g., providing various channels to collect flaws reported by users) [18] to fundamentally alleviate older adults’ anxiety about digital banking platforms.

Integrate the feeling of companionship into the design of digital banking. On the other hand, concerns about digital banking’s security and privacy were not unique to older adults. Younger adults were also shown to worry about the perceived risk and privacy of digital banking services [46, 57]. One potential reason was that online services lack the physically present security personnel found in physical banks [55]. This suggests the importance of the companionship of a trustworthy person for creating a feeling of safety on digital banking platforms. This was echoed both by the fact that older adults would rather wait a long time to receive assistance from bank employees [48] and by bank employees’ observations that older adults felt safe when talking to them and receiving printed receipts of the transactions. The sympathy that help givers express when interacting with older adults is what current digital services lack, as S6 articulated: “Machines are not as smart as bank employees, but more importantly, not as sympathetic as them.” However, bank employee participants felt that it was challenging to remain sympathetic and patient when they had too many customers waiting to be served. One potential future direction is to integrate companionship and sympathy into the design of voice-based assistants so that they not only offer help but also express sympathy and companionship.

Increase older adults’ perceived self-efficacy. Our study found that the main causes of “low perceived self-efficacy” were being afraid of making mistakes and a lack of confidence in their literacy level and their declining physical conditions. One common underlying factor was the potential financial loss associated with mistakes or misoperation. This suggests that older adults might be more sensitive to “loss aversion” [49] when dealing with digital banking platforms. This was evident in the approaches participants took to minimize their potential financial loss: setting an upper limit on the amount of money put in digital banking platforms; binding a dedicated non-primary bank account to the digital banking platforms; and practicing with a small amount of money. Future work should investigate ways to help older adults cope with loss aversion by boosting their confidence when using digital banking platforms.

One potential approach is to deliver multi-modal confirmation instead of just visual confirmation. Current mobile app confirmation design heavily depends on visual feedback. However, older adults often have declining eyesight and may not be sensitive to all sorts of visual feedback (e.g., popup boxes). Instead, the app
could deliver the confirmation through multiple channels, for example, by popping up a box and also reading the confirmation out [59].

Another approach is to design better error recovery mechanisms. Current error-recovery mechanisms often require users to accurately press a series of buttons (e.g., a back button placed in the top left corner) to return to a previous step. However, it is not uncommon for older adults to encounter touch-related motor issues [24, 96, 97], such as with tapping and swiping. Thus, touch-based error recovery mechanisms might even lead to more errors along the way. Furthermore, it is also challenging for older adults to figure out which step they should backtrack to. This raises an open question of how best to help older adults recover from an error. One possible direction is to leverage artificial intelligence to infer the step where older adults start to deviate from a correct completion path and then guide them to backtrack to that step.

Lastly, our findings revealed common “memory concerns”, such as forgetting passwords or forgetting to do a required action. There is a body of literature investigating ways to alleviate these memory issues, such as designing more memorable authentication mechanisms (e.g., [101]) and reminder systems (e.g., [16, 63]). Moreover, participants were also afraid of forgetting the steps of completing a bank transaction and wrote the steps down in a notebook. However, writing the steps down is not a scalable approach to learning. One potential solution is to leverage AI technology to learn the steps for completing a task and relieve older adults from needing to remember the steps. For example, Li et al. showed an approach for a mobile app to learn from the user how to complete a task and then automate the process for the user for the same and even other similar tasks [64]. Future work could investigate similar approaches to help older adults complete tasks without needing them to remember the exact steps.

6 LIMITATIONS AND FUTURE WORK
Our study presents a qualitative understanding of why and how older adults in China use both physical and digital banking platforms, how they learn to use digital banking platforms, and the challenges that they encounter. Although our study included older adults from rural areas (e.g., third-tier and fourth-tier cities), the majority of the participants lived in middle- and big-sized cities. As the general technology development and the availability of digital banking services might differ in different-sized cities, our findings might not reflect the banking practices and challenges of older adults living in regions with different levels of economic development.

China has been experiencing one of the fastest growth rates in electronic transactions. Thus, older adults in China might have felt stronger peer pressure to adopt digital banking compared to countries where digital banking is yet to gain popularity. Moreover, the culture, household income, and occupational and educational background of older adults may also affect their mindsets about money and their financial management strategies. Future work should investigate older adults’ banking practices in different countries to better understand cultural impacts and derive common and unique challenges.

Furthermore, our study did not compare the user experience of different banking apps. A controlled experiment with older adults using different banking apps would help us understand the pros and cons of different app designs.

Last but not least, our study shows that older adults are not helpless but instead are quite willing to learn new digital banking technologies and to apply their accumulated experiences and knowledge to come up with creative solutions before seeking help from someone whom they trust. Inspired by this finding and the advocacy for positive aging [40, 53], we argue that we should design interactive learning approaches for older adults so that they can play a more active and leading role in exploring new technological solutions while still being able to gain support from their close ties (e.g., children, grandchildren). This could also be a great opportunity to foster a stronger intergenerational bond, which might potentially reduce ageism [71].

7 CONCLUSION
Understanding why and how older adults use current banking platforms is key to improving their experiences of managing financial activities. Toward this goal, we conducted a semi-structured interview study with sixteen (N=16) older adults living in different-tiered cities in China to understand why and how they use physical and digital banking platforms, how they learn digital banking, and the challenges they encounter. We also interviewed bank employees to understand older adults’ banking practices and challenges from their perspectives as help givers. Our findings show that older adults used both physical and digital banking platforms and performed different types of transactions based on their perceived pros and cons. They liked the familiarity and safety of physical platforms and the convenience and instantaneity of digital platforms. They tended to use digital platforms for everyday transactions involving a small amount of money and physical platforms for transactions involving a large amount of money.

Our study found three motivations for learning digital banking: perceived usefulness, self-confidence, and social influence. Moreover, our study revealed app-related and user-related challenges that older adults encounter when using digital banking platforms. Specifically, app-related challenges included ambiguous affordance, low information scent, insufficient error recovery support, and lack of feedback or confirmation. User-related challenges included anxiety caused by the intangibility and trustworthiness of digital banking, lower perceived self-efficacy related to the fear of making mistakes and the lack of confidence in their ability, and memory concerns. Taken together, we further discussed the design implications of these findings, potential design considerations, and future work for assisting older adults with learning digital banking and addressing the challenges.

REFERENCES
[1] Maya Abood, Karen Kali, Robert Zdenek, et al. 2015. What can we do to help? Adopting age-friendly banking to improve financial well-being for older adults. Technical Report. Federal Reserve Bank of San Francisco.
[2] AARP Age UK. 2016. Age-friendly banking - What it is and how you do it. Retrieved Jan 20, 2021 from https://www.ageuk.org.uk/globalassets/age-uk/documents/reports-and-publications/reports-and-briefings/money-matters/rb_april16_age_friendly_banking.pdf
[3] Adel M Aladwani. 2001. Online banking: a field study of drivers, development challenges, and expectations. International Journal of Information Management 21, 3 (2001), 213–225.
[4] Alipay. 2021. Alipay | Accessible digital payments for everyone. https://global.alipay.com/platform/site/ihome
[5] Faris Alshubiri, Syed Ahsan Jamil, and Mohamed Elheddad. 2019. The impact of ICT on financial development: Empirical evidence from the Gulf Cooperation Council countries. International Journal of Engineering Business Management 11 (2019). https://doi.org/10.1177/1847979019870670
[6] Catherine L. Anderson and Ritu Agarwal. 2010. Practicing Safe Computing: A Multimedia Empirical Examination of Home Computer User Security Behavioral Intentions. MIS Q. 34 (2010), 613–643.
[7] Apple. 2021. Color - Visual Design - iOS - Human Interface Guidelines - Apple Developer. https://developer.apple.com/design/human-interface-guidelines/ios/visual-design/color/
[8] Apple. 2021. Navigation - App Architecture - iOS - Human Interface Guidelines - Apple Developer. https://developer.apple.com/design/human-interface-guidelines/ios/app-architecture/navigation/
[9] Apple. 2021. System Icons - Icons and Images - iOS - Human Interface Guidelines - Apple Developer. https://developer.apple.com/design/human-interface-guidelines/ios/icons-and-images/system-icons/
[10] ApplePay. 2021. Apple Pay - Apple. https://www.apple.com/apple-pay/
[11] Jorge Arenas Gaitán, Begoña Peral Peral, and Mª Ramón Jerónimo. 2015. Elderly and internet banking: An application of UTAUT2. Journal of Internet Banking and Commerce 20, 1 (2015), 1–23.
[12] BBC. 2020. Coronavirus: Bank branches close as virus affects access - BBC News. https://www.bbc.com/news/business-52021246
[13] Hugh Beyer and Karen Holtzblatt. 1997. Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[14] Silvio Camilleri and Gail Grech. 2017. The Relevance of Age Categories in explaining Internet Banking Adoption Rates and Customers' Attitudes towards the Service. Journal of Applied Finance and Banking 7 (01 2017), 29–47.
[15] Silvio John Camilleri and Gail Grech. 2017. The relevance of age categories in explaining internet banking adoption rates and customers' attitudes towards the service. Camilleri, SJ, and G. Grech (2017), 29–47.
[16] Samantha WT Chan, Thisum Buddhika, Haimo Zhang, and Suranga Nanayakkara. 2019. ProspecFit: In Situ Evaluation of Digital Prospective Memory Training for Older Adults. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3, 3 (2019), 1–20.
[17] Deepak Chawla and Himanshu Joshi. 2018. The moderating effect of demographic variables on mobile banking adoption: an empirical investigation. Global Business Review 19, 3_suppl (2018), S90–S113.
[18] Sen Chen, Ting Su, Lingling Fan, Guozhu Meng, Minhui Xue, Yang Liu, and Lihua Xu. 2018. Are mobile banking apps secure? What can be improved?. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 797–802.
[19] Jyoti Choudrie, Chike-Obuekwe Junio, Brad Mckenna, and Shahper Richter. 2018. Understanding and Conceptualising the Adoption, Use and Diffusion of Mobile Banking in Older Adults: A Research Agenda and Conceptual Framework. Journal of Business Research 88 (2018), 449–465. https://doi.org/10.1016/j.jbusres.2017.11.029
[20] Jyoti Choudrie, Chike Obuekwe Junior, Brad McKenna, and Shahper Richter. 2018. Understanding and conceptualising the adoption, use and diffusion of mobile banking in older adults: A research agenda and conceptual framework. Journal of Business Research 88 (2018), 449–465.
[21] McKinsey & Company. 2021. The 2020 McKinsey Global Payments Report. https://www.mckinsey.com/~/media/mckinsey/industries/financialservices/ourinsights/acceleratingwindsofchangeinglobalpayments/2020-mckinsey-global-payments-report-vf.pdf
[22] Juliet Corbin and Anselm Strauss. 2014. Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage Publications.
[23] Túlio Teixeira Cota, Lucila Ishitani, and Niltom Vieira. 2015. Mobile Game Design for the Elderly. Comput. Hum. Behav. 51, PA (oct 2015), 96–105. https://doi.org/10.1016/j.chb.2015.04.026
[24] Alma Leora Culén. 2013. Touch-Screens and Elderly Users: A Perfect Match. In ACHI 2013, The Sixth International Conference on Advances in Computer-Human Interactions. 460–465.
[25] Sara J Czaja, Neil Charness, Arthur D Fisk, Christopher Hertzog, Sankaran N Nair, Wendy A Rogers, and Joseph Sharit. 2006. Factors predicting the use of technology: Findings from the center for research and education on aging and technology enhancement (CREATE). Psychology and Aging 21, 2 (2006), 333.
[26] Sara J Czaja and Chin Chiin Lee. 2009. Information technology and older adults. Human-computer interaction: Designing for diverse users and domains (2009), 17–32.
[27] Ulrike Darch and Nerina J Caltabiano. 2004. Investigation of automatic teller machine banking in a sample of older adults. Australasian Journal on Ageing 23, 2 (2004), 100–103. https://doi.org/10.1111/j.1741-6612.2004.00024.x
[28] Julie A. Delello and Rochell R. McWhorter. 2017. Reducing the Digital Divide: Connecting Older Adults to iPad Technology. Journal of Applied Gerontology 36, 1 (jan 2017), 3–28. https://doi.org/10.1177/0733464815589985
[29] EMarketer. 2019. Global Ecommerce 2019. Retrieved Jan 20, 2021 from https://www.emarketer.com/content/global-ecommerce-2019
[30] eMarketer. 2019. Global Mobile Payment Users 2019. Retrieved April 13, 2021 from https://www.emarketer.com/content/global-mobile-payment-users-2019
[31] EMarketer. 2019. Global Mobile Payment Users 2019. Retrieved July 20, 2021 from https://www.emarketer.com/content/global-mobile-payment-users-2019
[32] Mingming Fan, Yue Li, and Khai N. Truong. 2020. Automatic Detection of Usability Problem Encounters in Think-Aloud Sessions. ACM Trans. Interact. Intell. Syst. 10, 2, Article 16 (may 2020), 24 pages. https://doi.org/10.1145/3385732
[33] Mingming Fan, Jinglan Lin, Christina Chung, and Khai N. Truong. 2019. Concurrent Think-Aloud Verbalizations and Usability Problems. ACM Trans. Comput.-Hum. Interact. 26, 5, Article 28 (jul 2019), 35 pages. https://doi.org/10.1145/3325281
[34] Mingming Fan and Khai N Truong. 2018. Guidelines for Creating Senior-Friendly Product Instructions. ACM Transactions on Accessible Computing (TACCESS) 11, 2 (2018), 1–35.
[35] Mingming Fan, Yiwen Wang, Yuni Xie, Franklin Mingzhe Li, and Chunyang Chen. 2022. Understanding How Older Adults Comprehend COVID-19 Interactive Visualizations via Think-Aloud Protocol. International Journal of Human–Computer Interaction (2022), 1–17. https://doi.org/10.1080/10447318.2022.2064609
[36] Mingming Fan, Qiwen Zhao, and Vinita Tibdewal. 2021. Older Adults' Think-Aloud Verbalizations and Speech Features for Identifying User Experience Problems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI '21). Association for Computing Machinery, New York, NY, USA, Article 358, 13 pages. https://doi.org/10.1145/3411764.3445680
[37] Alisa Frik, Leysan Nurgalieva, Julia Bernd, Joyce S. Lee, Florian Schaub, and Serge Egelman. 2019. Privacy and Security Threat Models and Mitigation Strategies of Older Adults. In Proceedings of the Fifteenth USENIX Conference on Usable Privacy and Security (Santa Clara, CA, USA) (SOUPS'19). USENIX Association, USA, 21–40.
[38] Chrysoula Gatsou, Anastasios Politis, and Dimitrios Zevgolis. 2017. Seniors' experiences with online banking. In 2017 Federated Conference on Computer Science and Information Systems (FedCSIS). IEEE, 623–627.
[39] Berget Gerd and Sandnes Frode Eika. 2015. On the Understandability of Public Domain Icons: Effects of Gender and Age. In Universal Access in Human-Computer Interaction. Access to Today's Technologies, Margherita Antona and Constantine Stephanidis (Eds.). Springer International Publishing, Cham, 387–396.
[40] Mary M Gergen and Kenneth J Gergen. 2001. Positive aging: New images for a new age. Ageing International 27, 1 (2001), 3–23.
[41] Mary C. Gilly and Valarie A. Zeithaml. 1985. The Elderly Consumer and Adoption of Technologies. Journal of Consumer Research 12, 3 (1985), 353–357. http://www.jstor.org/stable/254379
[42] Google. 2021. Discover what Google Assistant is. https://assistant.google.com/
[43] Philip J. Guo. 2017. Older Adults Learning Computer Programming: Motivations, Frustrations, and Design Opportunities. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI '17). Association for Computing Machinery, New York, NY, USA, 7070–7083. https://doi.org/10.1145/3025453.3025945
[44] Michael Harris, K Chris Cox, Carolyn Findley Musgrove, and Kathryn W Ernstberger. 2016. Consumer preferences for banking technologies by age groups. International Journal of Bank Marketing (2016).
[45] Dominik Hornung, Claudia Müller, Irina Shklovski, Timo Jakobi, and Volker Wulf. 2017. Navigating Relationships and Boundaries: Concerns around ICT-Uptake for Elderly People. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI '17). Association for Computing Machinery, New York, NY, USA, 7057–7069. https://doi.org/10.1145/3025453.3025859
[46] Guangying Hua. 2008. An Experimental Investigation of Online Banking Adoption in China. AMCIS 2008 Proceedings. 36. https://aisel.aisnet.org/amcis2008/36
[47] Mengtian Jiang, Hsin-yi Tsai, Shelia Cotten, Nora Rifon, Robert Larose, and Saleem Alhabash. 2016. Generational Differences in Online Safety Perceptions, Knowledge, and Practices. Educational Gerontology 42 (06 2016). https://doi.org/10.1080/03601277.2016.1205408
[48] Xiaofu Jin, Emily Kuang, and Mingming Fan. 2021. "Too Old to Bank Digitally?": A Survey of Banking Practices and Challenges Among Older Adults in China. Association for Computing Machinery, New York, NY, USA, 802–814. https://doi.org/10.1145/3461778.3462127
[49] Daniel Kahneman. 2011. Thinking, fast and slow. Macmillan.
[50] Muath Alhussain Khawaji. 2017. Overcoming challenges in smart phone use among older adults in Saudi Arabia. (2017).
[51] Sunyoung Kim. 2021. Exploring How Older Adults Use a Smart Speaker–Based Voice Assistant in Their First Interactions: Qualitative Study. JMIR mHealth and uHealth 9, 1 (2021), e20427.
[52] Sharon Kimathi. 2020. Third wave of COVID-19 forces Hong Kong banks to close branches - FinTech Futures. https://www.fintechfutures.com/2020/07/third-wave-of-covid-19-forces-hong-kong-banks-to-close-branches/
[53] Bran Knowles, Vicki L. Hanson, Yvonne Rogers, Anne Marie Piper, Jenny Waycott, Nigel Davies, Aloha Hufana Ambe, Robin N. Brewer, Debaleena Chattopadhyay, Marianne Dee, David Frohlich, Marisela Gutierrez-Lopez, Ben Jelen, Amanda Lazar, Radoslaw Nielek, Belén Barros Pena, Abi Roper, Mark Schlager, Britta Schulte, and Irene Ye Yuan. 2021. The Harm in Conflating Aging with Accessibility. Commun. ACM 64, 7 (jun 2021), 66–71. https://doi.org/10.1145/3431280
[54] KPMG. 2020. Banking in the new reality: China. Retrieved Jan 20, 2021 from https://home.kpmg/cn/en/home/insights/2020/07/banking-in-the-new-reality-china.html
[55] Umbas Krisnanto. 2018. Digital banking made transaction more trusted and secured? International Journal of Civil Engineering and Technology 9 (12 2018), 395–407.
[56] Jess Kropczynski, Zaina Aljallad, Nathan Jeffrey Elrod, Heather Lipford, and Pamela J. Wisniewski. 2021. Towards Building Community Collective Efficacy for Managing Digital Privacy and Security within Older Adult Communities. Proc. ACM Hum.-Comput. Interact. 4, CSCW3, Article 255 (jan 2021), 27 pages. https://doi.org/10.1145/3432954
[57] Sylvie Laforet and Xiaoyan Li. 2005. Consumers' attitudes towards online and mobile banking in China. International Journal of Bank Marketing 23, 5 (aug 2005), 362–380. https://doi.org/10.1108/02652320510629250
[58] Celine Latulipe, Ronnie Dsouza, and Murray Cumbers. 2022. Unofficial Proxies: How Close Others Help Older Adults with Banking. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 601, 13 pages. https://doi.org/10.1145/3491102.3501845
[59] Ju-Hwan Lee, Ellen Poliakoff, and Charles Spence. 2009. The effect of multimodal feedback presented via a touch screen on the performance of older adults. In International conference on haptic and audio interaction design. Springer, 128–135.
[60] Legalweekly. 2013. Hainan Development Bank: The first case of bank failure in China (in Chinese). Retrieved July 7, 2021 from http://finance.sina.com.cn/money/bank/20130626/072415917875.shtml
[61] Rock Leung, Leah Findlater, Joanna McGrenere, Peter Graf, and Justine Yang. 2010. Multi-layered interfaces to improve older adults' initial learnability of mobile applications. ACM Transactions on Accessible Computing (TACCESS) 3, 1 (2010), 1–30.
[62] R. Leung, J. McGrenere, and P. Graf. 2011. Age-related differences in the initial usability of mobile device icons. Behaviour & Information Technology 30 (2011), 629–642.
[63] Franklin Mingzhe Li, Di Laura Chen, Mingming Fan, and Khai N Truong. 2019. FMT: A Wearable Camera-Based Object Tracking Memory Aid for Older Adults. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3, 3 (2019), 1–25.
[64] Toby Jia-Jun Li, Amos Azaria, and Brad A Myers. 2017. SUGILITE: creating multimodal smartphone automation by demonstration. In Proceedings of the 2017 CHI conference on human factors in computing systems. 6038–6049.
[65] Jiunn-Woei Lian and David C. Yen. 2014. Online Shopping Drivers and Barriers for Older Adults. Comput. Hum. Behav. 37, C (aug 2014), 133–143. https://doi.org/10.1016/j.chb.2014.04.028
[66] Francisco Liébana-Cabanillas, Inmaculada García-Maroto, Francisco Muñoz-Leiva, and Iviane Ramos-de Luna. 2020. Mobile payment adoption in the age of digital transformation: The case of Apple Pay. Sustainability 12, 13 (2020), 5443.
[67] Kuan-Yu Lin, Yi-Ting Wang, and Travis K. Huang. 2020. Exploring the antecedents of mobile payment service usage: Perspectives based on cost–benefit theory, perceived value, and social influences. Online Information Review 44, 1 (2020), 299–318.
[68] Alex Lipton, David Shrier, and Alex Pentland. 2016. Digital banking manifesto: the end of banks? Massachusetts Institute of Technology.
[69] Katrien Luijkx, Sebastiaan Peek, and Eveline Wouters. 2015. "Grandma, you should do it—It's cool": Older Adults and the Role of Family Members in Their Acceptance of Technology. International Journal of Environmental Research and Public Health 12, 12 (2015), 15470–15485.
[70] Guoping Mao, Fuzhong Lu, Xuchun Fan, and Debiao Wu. 2020. China's ageing population: The present situation and prospects. In Population change and impacts in Asia and the Pacific. Springer, 269–287.
[71] Sibila Marques, João Mariano, Joana Mendonça, Wouter De Tavernier, Moritz Hess, Laura Naegele, Filomena Peixeiro, and Daniel Martins. 2020. Determinants of ageism against older adults: A systematic review. International Journal of Environmental Research and Public Health 17, 7 (2020), 2560.
[72] Minna Mattila, Heikki Karjaluoto, and Tapio Pento. 2003. Internet banking adoption among mature customers: Early majority or laggards? Journal of Services Marketing 17 (09 2003), 514–528. https://doi.org/10.1108/08876040310486294
[73] Mckinsey. 2015. Digital Banking in Asia: What do consumers really want? Retrieved Jan 18, 2021 from https://www.mckinsey.com/~/media/mckinsey/industries/financial%20services/our%20insights/capitalizing%20on%20asias%20digital%20banking%20boom/digital_banking_in_asia_what_do_consumers_really_want.pdf
[74] Mckinsey. 2019. Rewriting the rules in retail banking. Retrieved April 7, 2021 from https://www.mckinsey.com/industries/financial-services/our-insights/rewriting-the-rules-in-retail-banking#
[75] Andrew McNeill, Pam Briggs, Jake Pywell, and Lynne Coventry. 2017. Functional Privacy Concerns of Older Adults about Pervasive Health-Monitoring Systems. In Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments (Island of Rhodes, Greece) (PETRA '17). Association for Computing Machinery, New York, NY, USA, 96–102. https://doi.org/10.1145/3056540.3056559
[76] Tamir Mendel. 2019. Social Help: Developing Methods to Support Older Adults in Mobile Privacy and Security (UbiComp/ISWC '19 Adjunct). Association for Computing Machinery, New York, NY, USA, 383–387. https://doi.org/10.1145/3341162.3349311
[77] Tracy L Mitzner, Julie B Boron, Cara Bailey Fausset, Anne E Adams, Neil Charness, Sara J Czaja, Katinka Dijkstra, Arthur D Fisk, Wendy A Rogers, and Joseph Sharit. 2010. Older adults talk technology: Technology usage and attitudes. Computers in Human Behavior 26, 6 (2010), 1710–1721.
[78] Tracy L Mitzner, Jyoti Savla, Walter R Boot, Joseph Sharit, Neil Charness, Sara J Czaja, and Wendy A Rogers. 2018. Technology Adoption by Older Adults: Findings From the PRISM Trial. The Gerontologist 59, 1 (09 2018), 34–44. https://doi.org/10.1093/geront/gny113
[79] Hazwani Mohd Mohadisdudis and Nazlena Mohamad Ali. 2014. A study of smartphone usage and barriers among the elderly. In 2014 3rd International Conference on User Science and Engineering (i-USEr). IEEE, 109–114.
[80] Monzo. 2021. Banking made easy. https://monzo.com/
[81] Nkosikhona Msweli and Tendani Mawela. 2020. Enablers and Barriers for Mobile Commerce and Banking Services Among the Elderly in Developing Countries: A Systematic Review. 319–330. https://doi.org/10.1007/978-3-030-45002-1_27
[82] Sonia Mukhtar. 2020. Psychological impact of COVID-19 on older adults. Current Medicine Research and Practice (2020).
[83] Ahmad Nedaei Fard. 2013. Barriers and Facilitators of Iranian Elderly in Use of ATM Machines: A Qualitative Research in the Way of Cultural Probes. Salmand: Iranian Journal of Ageing (01 2013), 17–24.
[84] NetBank. 2021. NetBank - CommBank. https://www.commbank.com.au/digital-banking/netbank.html
[85] BBC News. 2007. The man who invented the cash machine. Retrieved Jan 18, 2021 from http://news.bbc.co.uk/2/hi/business/6230194.stm
[86] James Nicholson, Lynne Coventry, and Pamela Briggs. 2019. "If It's Important It Will Be A Headline": Cybersecurity Information Seeking in Older Adults. Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3290605.3300579
[87] Magnhild Nicolaisen and Kirsten Thorsen. 2014. Who are lonely? Loneliness in different age groups (18–81 years old), using two measures of loneliness. The International Journal of Aging and Human Development 78, 3 (2014), 229–257.
[88] Katherine E Olson, Marita A O'Brien, Wendy A Rogers, and Neil Charness. 2011. Diffusion of technology: frequency of use for younger and older adults. Ageing International 36, 1 (2011), 123–145.
[89] Funmilola O Omotayo and Tolulope A Akinyode. 2020. Digital Inclusion and the Elderly: The Case of Internet Banking Use and Non-Use among older Adults in Ekiti State, Nigeria. Covenant Journal of Business and Social Sciences 11, 1 (2020).
[90] Tiago Nascimento Ordonez, Mônica Sanches Yassuda, and Meire Cachioni. 2011. Elderly online: effects of a digital inclusion program in cognitive performance. Archives of Gerontology and Geriatrics 53, 2 (2011), 216–219.
[91] World Health Organization et al. 2015. China country assessment report on ageing and health. (2015).
[92] Marita A O'Brien, Katherine E Olson, Neil Charness, Sara J Czaja, Arthur D Fisk, Wendy A Rogers, and Joseph Sharit. 2008. Understanding technology usage in older adults. Proceedings of the 6th International Society for Gerontechnology, Pisa, Italy (2008).
[93] Google Pay. 2021. Google Pay - Learn What the Google Pay App Is & How To Use It. https://pay.google.com/intl/usa/about/
[94] Paypal. 2021. Send Money, Pay Online or Set Up a Merchant Account - PayPal. https://www.paypal.com/us/home
[95] Begoña Peral-Peral, Ángel F Villarejo-Ramos, and Jorge Arenas-Gaitán. 2019. Self-efficacy and anxiety as determinants of older adults' use of Internet Banking Services. Universal Access in the Information Society (2019), 1–16.
[96] Ryan M Peters, Monica D McKeown, Mark G Carpenter, and J Timothy Inglis. 2016. Losing touch: age-related changes in plantar skin sensitivity, lower limb cutaneous reflex strength, and postural stability in older adults. Journal of Neurophysiology 116, 4 (2016), 1848–1858.
[97] Andraž Petrovčič, Sakari Taipale, Ajda Rogelj, and Vesna Dolničar. 2018. Design of mobile phones for older adults: An empirical analysis of design guidelines and checklists for feature phones and smartphones. International Journal of Human–Computer Interaction 34, 3 (2018), 251–264.
[98] Jakub Pikna, Nikoleta Fellnerova, and Michal Kozubík. 2018. Information Technology and Seniors. CBU International Conference Proceedings 6 (09 2018), 702. https://doi.org/10.12955/cbup.v6.1236
[99] Naimot Popoola and Md Razib Arshad. 2015. Strategic Approach to Build Customers Trust in Adoption of Internet Banking in Nigeria. Journal of Internet Banking and Commerce 20 (04 2015).
[100] PTI. 2020. COVID-19 crisis: Credit Suisse to close 37 bank branches in Switzerland. https://www.businesstoday.in/sectors/banks/covid-19-crisis-credit-suisse-to-close-37-bank-branches-in-switzerland/story/414037.html
[101] Karen Renaud and Judith Ramsay. 2007. Now what was that password again? A more flexible way of identifying and authenticating our seniors. Behaviour & Information Technology 26, 4 (2007), 309–322.
[102] Wendy A. Rogers, Elizabeth F. Cabrera, Neff Walker, D. Kristen Gilbert, and Arthur D. Fisk. 1996. A Survey of Automatic Teller Machine Usage across the Adult Life Span. Human Factors 38, 1 (1996), 156–166. https://doi.org/10.1518/001872096778940723 PMID: 8682517.
[103] Eastmoney International Securities. [n.d.]. Eastmoney International Securities. https://www.emsec.hk/index.html
[104] Shradha Shalini, Trevor Levins, Erin L Robinson, Kari Lane, Geunhye Park, and Marjorie Skubic. 2019. Development and comparison of customized voice-assistant systems for independent living older adults. In International Conference on Human-Computer Interaction. Springer, 464–479.
[105] Shanghai Dazhihui Information Technology Co., Ltd. [n.d.]. Dazhihui. http://www.dzh.com.cn/#page1
[106] Tamara Sims, Andrew E Reed, and Dawn C Carr. 2017. Information and communication technology use is related to higher well-being among the oldest-old. The Journals of Gerontology: Series B 72, 5 (2017), 761–770.
[107] Siri. 2021. Siri does more than ever. Even before you ask. https://www.apple.com/tw/siri/
[108] Aaron Smith. 2014. Older adults and technology use.
[109] Lauren Sullivan and Mahum Tofiq. 2020. Shuttered for COVID-19, some US banks consider closing branches for good | S&P Global Market Intelligence. https://www.spglobal.com/marketintelligence/en/news-insights/latest-news-headlines/shuttered-for-covid-19-some-us-banks-consider-closing-branches-for-good-58537195
[110] Michele Van Volkom, Janice C Stapley, and Vanessa Amaturo. 2014. Revisiting the digital divide: Generational differences in technology use in everyday life. North American Journal of Psychology 16, 3 (2014), 557–574.
[111] Eleftheria Vaportzis, Maria Giatsi Clausen, and Alan J Gow. 2017. Older adults perceptions of technology and barriers to interacting with tablet computers: a focus group study. Frontiers in Psychology 8 (2017), 1687.
[112] Viswanath Venkatesh, James Y. L. Thong, and Xin Xu. 2012. Consumer Acceptance and Use of Information Technology: Extending the Unified Theory of Acceptance and Use of Technology. MIS Quarterly 36, 1 (2012), 157–178. http://www.jstor.org/stable/41410412
[113] Venmo. 2021. Venmo - Share Payments. https://venmo.com/
[114] John Vines, Mark Blythe, Stephen Lindsay, Paul Dunphy, Andrew Monk, and Patrick Olivier. 2012. Questionable Concepts: Critique as Resource for Designing with Eighty Somethings. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI '12). Association for Computing Machinery, New York, NY, USA, 1169–1178. https://doi.org/10.1145/2207676.2208567
[115] Christian Von Haldenwang. 2004. Electronic government (e-government) and development. The European Journal of Development Research 16, 2 (2004), 417–432.
[116] Zhiyuan Wan, Lingfeng Bao, Debin Gao, Eran Toch, Xin Xia, Tamir Mendel, and David Lo. 2019. AppMoD: Helping Older Adults Manage Mobile Security with Online Social Help. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 3, 4, Article 154 (dec 2019), 22 pages. https://doi.org/10.1145/3369819
[117] Kexian Wang. 2015. A Study on Credit Card Application Restrictions for Older Adults (in Chinese). Legal and Economy 3 (2015).
[118] WeBank. 2021. WeBank. https://www.webank.com/
[119] Wechatpay. 2021. WeChat Pay. https://pay.weixin.qq.com/index.php/public/wechatpay
[120] Wikipedia. 2021. Amazon Echo. https://en.wikipedia.org/wiki/Amazon_Echo
[121] Wikipedia. 2021. Chinese city tier system. https://en.wikipedia.org/wiki/Chinese_city_tier_system
[122] Wikipedia. 2021. The collective failure of China's online lending platforms in 2018 (in Chinese). Retrieved July 7, 2021 from https://zh.wikipedia.org/wiki/2018%E5%B9%B4%E4%B8%AD%E5%9C%8B%E7%B6%B2%E7%B5%A1%E5%80%9F%E8%B2%B8%E5%B9%B3%E5%8F%B0%E9%9B%86%E9%AB%94%E5%80%92%E9%96%89%E4%BA%8B%E4%BB%B6
[123] Wikipedia. 2021. Direct bank. https://en.wikipedia.org/wiki/Direct_bank
[124] Wikipedia. 2021. Mobile banking. https://en.wikipedia.org/wiki/Mobile_banking#cite_note-3
[125] Wikipedia. 2021. NetBank. https://en.wikipedia.org/wiki/NetBank
[126] Wikipedia. 2021. Online Shopping. https://en.wikipedia.org/wiki/Online_shopping
[127] Wikipedia. 2021. PayPal. https://en.wikipedia.org/wiki/PayPal
[128] Jiaqin Yang, Li Cheng, and Xia Luo. 2009. A comparative study on e-banking services between China and USA. International Journal of Electronic Finance 3, 3 (2009), 235–252.
[129] Ja Eun Yu and Debaleena Chattopadhyay. 2020. "Maps are hard for me": Identifying How Older Adults Struggle with Mobile Maps. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–8.
[130] Yee Yen Yuen, Paul HP Yeow, Nena Lim, and Najib Saylani. 2010. Internet banking adoption: Comparing developed and developing countries. Journal of Computer Information Systems 51, 1 (2010), 52–61.
[131] Yunshanfu. 2021. Yunshanfu. https://yunshanfu.unionpay.com/
[132] Salifu Yusif, Jeffrey Soar, and Abdul Hafeez-Baig. 2016. Older people, assistive technologies, and the barriers to adoption: A systematic review. International Journal of Medical Informatics 94 (07 2016). https://doi.org/10.1016/j.ijmedinf.2016.07.004
[133] Valarie A Zeithaml and Mary C Gilly. 1987. Characteristics affecting the acceptance of retailing technologies: A comparison of elderly and nonelderly consumers. Journal of Retailing (1987).
[134] Zelle. 2021. Zelle® | A fast, safe and easy way to send and receive money. https://www.zellepay.com/
[135] Yu Zhao and Sherah Kurnia. 2014. Exploring Mobile Payment Adoption in China. In PACIS. 232.
[136] M. Švecová and E. Odlerová. 2018. Smartphone and mobile application usage among seniors in Slovakia. European Journal of Science and Theology 14 (12 2018), 125–133.
Mobile Phone Use by People with Mild to Moderate Dementia:
Uncovering Challenges and Identifying Opportunities

Emma Dixon∗ (University of Maryland, eedixon@umd.edu), Rain Michaels (Google Inc., rainb@google.com), Xiang Xiao (Google Inc., xiangxiao@google.com), Yu Zhong (Google Inc., yuzhong@google.com), Patrick Clary (Google Inc., pclary@google.com), Ajit Narayanan (Google Inc., ajitnarayanan@google.com), Robin Brewer∗ (University of Michigan, rnbrew@umich.edu), and Amanda Lazar (University of Maryland, lazar@umd.edu)
ABSTRACT
With the rising usage of mobile phones by people with mild dementia, and the documented barriers to technology use that exist for people with dementia, there is an open opportunity to study the specifics of mobile phone use by people with dementia. In this work we provide a first step towards filling this gap through an interview study with fourteen people with mild to moderate dementia. Our analysis yields insights into mobile phone use by people with mild to moderate dementia, challenges they experience with mobile phone use, and their ideas to address these challenges. Based on these findings, we discuss design opportunities to help achieve more accessible and supportive technology use for people with dementia. Our work opens up new opportunities for the design of systems focused on augmenting and enhancing the abilities of people with dementia.

CCS CONCEPTS
• Human-centered computing → Accessibility; Empirical studies in accessibility; Human computer interaction (HCI); Empirical studies in HCI; Accessibility; Accessibility theory, concepts and paradigms.

KEYWORDS
Dementia, Mobile Phones, Accessibility

ACM Reference Format:
Emma Dixon, Rain Michaels, Xiang Xiao, Yu Zhong, Patrick Clary, Ajit Narayanan, Robin Brewer, and Amanda Lazar. 2022. Mobile Phone Use by People with Mild to Moderate Dementia: Uncovering Challenges and Identifying Opportunities. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544809

∗Also affiliated with Google Inc.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544809

1 INTRODUCTION
Dementia is a condition which involves changes in cognition and affects the ability to engage in daily tasks and activities [74]. Dementia is typically progressive, with people in more mild stages of dementia experiencing fewer changes to their functioning than those in more advanced stages [74]. Many everyday technologies are not designed to meet the access needs of individuals with dementia [32, 36–38, 79]. At the same time, research shows a trend towards greater mobile phone use by people with mild cognitive impairment and mild dementia, with almost half using smartphones [42, 57]. The rising usage of mobile phones and the documented barriers that exist lead to the opportunity to study the specifics of mobile phone use by people with dementia. This usage, as well as barriers to use, are key to understand in order to design apps and features that are useful and accessible for people with dementia. In this study we address the following research questions:
• For what purposes do people with dementia use their mobile phones?
• What challenges, if any, exist with mobile phone use?
• What opportunities do people with dementia envision to support them when they encounter challenges with their mobile phone use?

Through semi-structured interviews with fourteen people with mild to moderate dementia, we learned that individuals used mobile phones in everyday life not only to accommodate changes in cognitive ability and emotional regulation, but also to stay productive and manage their health. We uncovered three major challenges with mobile phone use: 1) navigating to apps and features; 2) task execution in moments of high stress, fatigue, and time pressure; and 3) re-learning task flows after updates and upgrades. To address these challenges, participants described ideal interactions with mobile phones, including customizable user interfaces, activity-based customization, proactive technology assistance, and extended modalities for voice-based interactions.
ASSETS '22, October 23–26, 2022, Athens, Greece Emma Dixon et al.
Based on these findings, our work makes four primary contributions to the literature. First, this paper provides an empirical account of how fourteen people with mild to moderate dementia use mobile phones and the challenges they face with mobile phone use. Second, it describes participants' ideas to address challenges with mobile phone use. Third, abstracting from participants' ideas for future, more accessible interactions with mobile phones, this paper discusses design opportunities to help achieve more accessible and supportive technology use for people with dementia. Finally, this work opens up new opportunities for the design of systems focused on augmenting and enhancing the abilities of people with dementia.

2 RELATED WORK
The following section describes research on mobile phone use by neurodiverse users. Following this, we describe research on barriers to general technology use for people with dementia, as well as the provision of technical support when people with dementia experience challenges with technology use.

2.1 Mobile Phone Use by Neurodiverse Users
Researchers have begun to investigate how neurodiverse users, such as those living with traumatic brain injury, Down syndrome, and autism, use mobile phones. For example, researchers have found individuals with traumatic brain injury heavily relied on their phones for reminders out of concern for forgetting upcoming events, and that they only used limited features, as the more advanced features, which may have been of use, were considered too complex [25]. Research with individuals with developmental disabilities broadly, including people with Down syndrome, autism, and other unspecified developmental disabilities, found that mobile phones played a key role in safety [28], increasing independence [28], providing entertainment [58], social connectedness [28, 58], and reminders and scheduling support [29]. Researchers have noted specific phone features which can be barriers for individuals with developmental disabilities (e.g., small buttons and complex menus [28]), as well as how features can be better designed to support neurodiverse users (e.g., large buttons, icons with titles, and a single-level menu structure for individuals with traumatic brain injury [73]).

Past researchers have designed and developed prototype apps for tablets and mobile phones to support people across the stages of dementia (e.g., to promote safe walking with GPS tracking on mobile devices [46, 61, 100, 101]). Apps have also been designed to support self-care and health management by people with mild dementia [40, 52, 54]. Much of this past work focuses on the design and evaluation of new prototype applications to support users with mild to moderate dementia in important aspects of life. Given the increasing number of people with mild dementia now using mobile phones in their daily lives [42, 57], there is an opportunity to understand how to design for people with dementia from another angle: by studying existing use.

2.2 Barriers to Technology Use by People with Dementia
Past research has identified barriers that people living with mild to moderate dementia experience with technology use, such as the cost of devices [56], ethical issues [39, 56], attitudinal aspects [39, 56, 84], condition-related challenges [39, 56, 66, 78, 85], and technology-related challenges [39, 56, 78].

In regards to specific attitudinal aspects, researchers have argued that people with dementia face difficulties with using technology due to low digital self-efficacy (belief in one's capacity to execute technology-related tasks) [6, 39, 78, 79]. Two studies compared digital self-efficacy of people with mild dementia to older adults without dementia, finding participants with mild dementia perceived using technology to be more difficult than participants of the same age [66, 85]. This low digital self-efficacy may impact the uptake of new, potentially useful devices [18].

This low digital self-efficacy may be due, at least in part, to changes that dementia brings, which include changes in memory, sensitivity to stress, orientation to place and time, and interpreting and understanding information [78]. Past work investigating barriers to technology use largely focuses on cognitive challenges [39, 56, 66, 78, 85], though researchers have also described the unique changes to sensory abilities that can cause barriers to technology, such as changes in speech and language patterns and difficulty identifying different sounds [30]. In addition, researchers have pointed to how the progressive nature of the condition affects technology use [39, 56, 84].

Past research has also investigated difficulties people living with dementia experience due to the design of various technologies, such as websites [35, 43, 55], devices for in-home monitoring and support [9, 12, 18, 21], and computers and cell phones [78]. These difficulties include challenges with familiarity [18], conspicuous devices [18, 21], and complex interfaces [40, 43, 55]. Unlike past work with other neurodiverse users [25, 28, 58], little work has investigated the accessibility of mobile phones for people with mild to moderate dementia. One exception reports the difficulty some tech-savvy participants with mild to moderate dementia have typing on their mobile phones due to small key size, and identifying notification ringtones and sounds [31]. With this limited understanding of the technology-related challenges people with mild to moderate dementia experience, in this work we set out to investigate the specific barriers people with mild to moderate dementia have with mobile phone use. Understanding these barriers is essential to designing future systems that provide the necessary technical assistance to combat these barriers.

2.3 Technical Support for People with Dementia
With the challenges that people with dementia face with technology use, research to date has largely discussed ways informal caregivers can provide technical assistance [27, 37, 53, 54, 68, 72, 77, 84, 88]. Past work found that while assistance from others can be useful, it was not always desirable by people with dementia, with individuals going to great lengths to avoid assistance from other people, both to avoid burdening loved ones and to avoid others taking away tasks that they had difficulty with [32]. Some participants in this study suggested that technical assistance mediated by technology may be more desirable than human assistance. Researchers have proposed one alternative to provide necessary technical assistance without relying on caregivers: automatic personalization (e.g., [97, 98]) in combination with AI to determine instances when an individual
Mobile Phone Use by People with Mild to Moderate Dementia: Uncovering Challenges and Identifying Opportunities ASSETS '22, October 23–26, 2022, Athens, Greece
may need assistance and automatically provide this assistance or make suggestions according to the perceived need [31]. Although this past work suggests alternative solutions to provide technical support to people with dementia, researchers have not yet investigated how people living with mild to moderate dementia envision the provision of technical support. In this work, we investigate the perspectives of people with mild to moderate dementia on the optimal provision of technical support when challenges arise with mobile phone use, and future opportunities for technological support.

3 METHODS
To understand how people with mild to moderate dementia are using mobile phones (RQ1), the specific challenges they have with mobile phone use (RQ2), and opportunities for technology to support people with dementia when they encounter these challenges (RQ3), we conducted semi-structured interviews with fourteen people with mild to moderate dementia. Below, we present our approach to recruitment, procedures for data collection, participant demographics, analytic approach, and limitations.

3.1 Recruitment
Participants with mild to moderate dementia were recruited through convenience sampling and snowball sampling. The first author reached out to 20 potential participants with a brief email describing the study and the eligibility criteria. People were eligible to participate in the study if they self-reported a clinical diagnosis of mild to moderate dementia, owned a mobile phone, and used their mobile phone daily. Seventeen potential participants responded to the email from the first author expressing interest in participating in the study. Three of these potential participants who initially showed interest in participating in the study were not comfortable with the interview being recorded and therefore chose not to participate. The other fourteen potential participants consented to participate electronically and completed a demographics questionnaire. Aligning with best practices when working with people with mild to moderate dementia [44], as well as legal and ethical best practices in some countries [59, 93], we assumed participants' capacity to consent. Although people in the mild to moderate stages of dementia can experience changes in cognitive ability [74], they are generally able to participate independently in research studies [44]. All interviews were conducted remotely in November and December of 2021.

3.2 Procedure
We conducted remote, semi-structured video interviews that were split into two segments of questions. The first segment focused on participants' present mobile phone use and any challenges they experienced. The second segment focused on opportunities for future mobile device interactions. This is in response to Lewis, Sullivan, and Hoehl's call to include individuals with cognitive disabilities in the design and development of future, more accessible mobile phones [60]. Speculative futuring in the dementia space has primarily used co-design methods, where people with dementia use sketching and paper prototyping to ideate future technologies [61]. Due to changes in dexterity, other researchers have sketched participants' design ideas as participants describe them, with the participants actively critiquing the designs being drawn [14]. However, researchers have found these low-fidelity prototyping methods are not always effective for people with dementia, who sometimes prefer to verbally describe their design ideas over sketching [33]. For this reason, in the second segment, we first asked participants to verbally describe their ideas to accommodate the challenges disclosed during the first segment of the interview. We used ideas participants shared in earlier interviews to engage later participants in discussions about future mobile phone use.

Past work has also shown how using high-fidelity prototypes helps participants with mild to moderate dementia grasp abstract concepts even with changes in abstract thinking ability [33, 45]. For this reason, after participants had shared their ideas for future technologies, we used a publicly available Android smartphone application as a technology probe [48] to inspire participants to think of new kinds of technology to support their needs and desires. This app was chosen as the technology probe because it was designed for neurodiverse users with cognitive disabilities, providing a simplified user interface with customizable buttons which enable multi-step tasks with a single click (see Figure 1). We demonstrated some functions of the technology probe which we believed may be of specific use to people with dementia, such as providing directions, making phone calls to specific people, using smart-home devices, and opening specific YouTube channels. See Figure 1 for a visual of the mobile phone home screen demonstrated during the interview as the technology probe.

The semi-structured nature of the interview allowed us to ask further probing questions to pursue topics guided by the informants themselves. Following each interview, participants received a $75 gift card as an incentive. Interviews ranged from 47 to 67 minutes (average = 54 minutes). The interviews were audio and video recorded, resulting in 13 hours and 27 minutes of data. We provide the full study protocol, including interview questions, in the supplementary materials.

3.3 Participants
The research team conducted 14 semi-structured interviews with people with mild to moderate dementia. This aligns with average sample sizes for remote interview studies [19]. Table 1 provides more information on participant demographics and mobile phone use. All participants were familiar with voice assistants and some of their functionality. All participants resided in the U.S., except one participant, Miranda, who resided in Canada. Throughout the paper we use pseudonyms for participants.

3.4 Analysis
We used a thematic analysis approach to analyze the interview data [16]. To become familiar with the data, the first author verified computer-generated interview transcripts from audio-recorded interviews. The first author then coded each transcript to generate initial codes. The first author grouped initial codes into potential themes and went back through all interview transcripts to gather quotes relevant to those themes. The research team reviewed and discussed themes to ensure that they were relevant to the codes and quotes extracted. The first author created a thematic map, grouping
Table 1: Participants' Demographic Information

Pseudonym | Gender | Age Group | Education | Racial or Ethnic Group | Mobile Phone Operating System | Regularly received assists from others with mobile phone use
Thomas | Man | 55-64 | Bachelor's degree | White | iOS | Yes
Elenora | Woman | 55-64 | Master's degree | White | Android | No
Kim | Woman | 45-54 | Bachelor's degree | Black or African American | iOS | No
Tristen | Man | 65-74 | Some college credit, no degree | White | Android | Yes
Sylvia | Woman | 55-64 | Some college credit, no degree | Black or African American | Android | No
Preston | Man | 55-64 | Some college credit, no degree | White | Android | No
Miranda | Woman | 55-64 | Some college credit, no degree | White | iOS | Yes
Kennith | Man | 65-74 | Some college credit, no degree | White | iOS | Yes
Tina | Woman | 65-74 | Master's degree | White | iOS | No
Malcolm | Man | 55-64 | Master's degree | White | iOS | Yes
Josslyn | Woman | 65-74 | Some college credit, no degree | White | Android | No
Hall | Man | 65-74 | High school diploma or the equivalent | White | iOS | Yes
Alecia | Woman | 55-64 | Some college credit, no degree | White | Android | No
Sabella | Woman | 55-64 | Some college credit, no degree | Black or African American | iOS | Yes
themes under each of the three research questions of the study. The names of the themes within the thematic map were then refined to more clearly describe the findings in each section. Finally, the research team selected the most vivid and compelling examples that related back to the research questions to include in this report.

3.5 Limitations
This work has demographic limitations. First, most participants (10/14) most likely had early-onset dementia, which represents 9% of dementia cases worldwide [1]. This relatively younger group of participants may be overrepresented in our research due to the contacts we recruited from and the recruitment material language: "we'll discuss how you use your mobile phone in daily life, any challenges you have with mobile phone use, and your ideas for ways to make technology use easier for you." As one study showed, technophilia - high enthusiasm for new technologies - was associated with lower age of people with dementia [42]. The second demographic limitation is the lack of people of color. With research showing a higher prevalence of dementia in non-white people [2] and only three participants identifying as Black or African American, these demographics were underrepresented in our sample. The third demographic limitation is that participants resided primarily in the United States. Findings from this study are therefore limited in terms of geographic and cultural settings, which affect healthcare access, socioeconomic status, and network device coverage.

This study was also limited by the one-hour semi-structured interview method, which was chosen to minimize the time commitment and amount of work required of participants. This choice limited the scope of the data collected, as opposed to a longitudinal study design, which could provide several weeks for participants to report on mobile phone use and future ideas. Further, we chose to conduct interviews, not co-design sessions, as past work has demonstrated that some people with dementia may experience difficulty sharing sketches over video-conference calls [33]. Instead, we relied on participants' verbal explanations of their future ideas, which a UX Designer on our team later illustrated in Figma to include in this report.

4 FINDINGS
Through our interviews we learned of the reliance on, desire for, and interest in their mobile phones that participants with dementia had, and the individually meaningful activities they used their mobile phones for. Even with the heavy use of and reliance on their mobile phones, we also uncovered challenges participants had with navigating to apps and features; time pressure, high stress, and fatigue affecting task execution; and difficulty re-learning task flows after updates and upgrades. To address these challenges, participants outlined their
ideal interactions with mobile phones, providing opportunities for the design of more accessible mobile phone interactions for people with dementia. Below we elaborate on each of these findings.

Figure 1: An Android phone home screen showing customizable single-click buttons for common multi-step tasks specific to the individual. Customizations include personalized images, labels, and sizing.

4.1 Mobile Phone Use
All participants described their reliance on their mobile phone, emphasizing how they "can't live without this phone" [Malcolm]. Further, mobile phones enhanced participants' quality of life, as they described them as "their best and most famous friend" [Kennith] and their "brain" [Thomas, Tristen, Preston]. Most participants described their reliance on the mobile phone to support "memory issues" [Thomas] they experienced affecting their executive functioning skills, leading Kennith to refer to his mobile phone as "the way I cope with my daily existence." Their reliance on the phone for scheduling and reminders of everyday activities resulted in several participants sharing the adage: "if it's [an activity or event] not on the phone, it's not going to happen" [Preston]. Other participants used their phones as a security measure for moments when they struggled to remember where they were. As Hall described, the phone is "something that I have to have with me all the time if I go someplace. That way, [Hall's wife] can keep track of me and/or call me, or I can call them. So, it's a safety catch for me."

4.1.1 Overall Mobile Phone Use. Participants described their use of their mobile phones for diverse and individually meaningful activities. They described their use of apps for social media, navigation, online shopping, mobile banking, reading the news, listening to music, games, entertainment, productivity, and health management. Please see the appendix for the full list of apps participants reported using. This list does not represent all uses of the mobile phone by participants, only those that participants were able to recall during the interview. Participants' specific uses of their mobile phones for purposeful work and health management were particularly interesting and unexpected. Therefore, we describe them further below.

4.1.2 Mobile Phones Facilitating Purposeful Work and Connection. Many participants described the difficulty they had after receiving a dementia diagnosis and subsequent retirement. Malcolm described his experience as feeling like there "was such a void in my life. . . I have to be busy. I don't sit around much because I feel if I sit around, I'm not productive and I like to be productive." Participants described ways that their "phone is the key factor" in "keeping myself pretty busy" [Tina].

All participants described the importance of their calendar to support general productivity through connecting with others and keeping track of daily tasks. A few participants described productivity in relation to helping others through advocacy work. As Miranda described, "I do a lot of one-on-one support things with other people with dementia. So quite often I do video calls on my phone with someone who is in need."

Still others worked on more social and creative activities, which they considered to be purposeful part-time jobs. For example, Thomas works part-time as a freelance court reporter. Sylvia "use[s] my phone a lot on Facebook live. . . I show people how to make natural herbal lip balm, and I tape the whole thing." Malcolm started to co-host a podcast, which he posts online from his phone. He explained how his podcast "is sort of my salvation. It's the thing that without it I think I would probably be very depressed because it's the one thing that I so look forward to" [Malcolm]. Staying productive through these various methods was as much about enjoyment for participants as it was about "keep[ing] my brain active" through "something to prepare for... I do it all pro bono, but it's that little job that I have. It is purposeful work, and it keeps me going" [Malcolm].

Participants used their mobile phones to support general productivity, meaningful hobbies, and more creative activities - all providing purposeful work for participants.

4.1.3 Mobile Phones Making Health Management Easier. Participants also described using their phones for health management activities. For example, some participants used their phone for medication reminders, asking Siri to "remind me to take my medication every day at 7:00. So, I use that feature and it's extremely helpful with making sure I take my medication" [Kim]. For Miranda, these medication reminders included specific days and dates because "I often struggle with what day we're on, what the day and date is," which is important for specific medications she takes.
Participants described how valuable it was to be able to use apps on their phone to keep track of medical results [Josslyn, Sabella] and to collect medical data [Thomas]. Thomas, who is living with type 2 diabetes, demonstrated his use of his "glucose monitor, which I use with my phone," where all he has to do is place the phone up to his left triceps to check his blood sugar. This data is then uploaded to the cloud so physicians can access this information, sparing Thomas from "hav[ing] to stick my finger all the time. They were just able to share the data and see on a day-to-day, hour-to-hour way all my blood sugar."

The phone was also used for recalling medical information during doctor's visits. Elenora described how during doctor's visits they "always ask the same standard information about your medications and surgeries and all this stuff. And I can just pull it up on my phone. . . and I can share it with the doctors" [Elenora]. This included notes on her phone with "specific links to my prescription list document, my supplements document" [Elenora].

Participants also described how valuable the phone is to call emergency medical services. For instance, when Miranda's "phone is upended or shaken too severely, it will send out an SOS¹ and they will actually come on and say, 'are you okay? Do you need help?'" In addition to this feature, Miranda has used her phone "to call 911² to have an ambulance come for me."

Participants used their mobile phones for medication reminders, keeping track of medical results, collecting medical data, recalling medical information during doctor's appointments, and calling 911 in emergency medical situations.

4.2 Challenges with Mobile Phone Use
Although participants described their reliance on their mobile phones and used their phones for a variety of individually meaningful activities, they also described the challenges they had with navigation, task execution, and relearning changed interfaces after updates and upgrades.

4.2.1 Difficulty Navigating to Apps and Features. Participants described experiencing challenges with navigating to apps and features on their mobile phone. Thomas described this as "maneuvering through the phone," where "as it becomes more useful I think it can become more challenging because there's more stuff squeezed in there and finding it all and maneuvering through it all can be a challenge."

Participants often knew exactly what they wanted to find on their mobile phone but had difficulty "sift[ing] through the vast amount of information. . . zoom[ing] down into it" to get to what they were looking for [Thomas]. For example, several participants described their difficulty with "navigating the contact lists" in high-stress environments [Miranda]. Preston described how "A lot of people with dementia would forget to scroll. They would think I only can see what's on that page. . . Somebody with dementia that gets overwhelmed or finally realizes they're lost. All a sudden your anxiety is so high. It's like, 'Where do I make a phone call? Where's the number?'" We can understand this example through the lens of affordances - an interaction element that tells us what we can do with it [75]. Affordances are binary - they either exist or they don't - and contextual, meaning if we cannot perceive or understand an affordance because of our situation, it does not exist. In the example Preston describes, high anxiety makes scrolling to find a phone number in his address book challenging. For Preston, there is no identifiable indication of how to interact with the phone in those moments, though Preston is able to understand how to do this in other contexts when he is less anxious.

In other instances, participants described difficulty navigating to specific applications and content on their mobile phones. For example, Sabella described an instance when her daughter "made a post about me on her Snapchat. . . it was a beautiful post about living with dementia and how she feels about me living with dementia. And so, I was trying to go back to find it to save it and I just could not understand the concept of doing that... I wanted it so bad, but I just couldn't figure out how to do it [get back to the Snapchat post]. So, I lost it which was the most frustrating for me."

Participants described difficulty with navigating to apps and features due to the amount of information available on their phone and challenges with remembering where items were saved on their phone.

4.2.2 Time Pressure, High Stress and Fatigue Impairing Task Execution. Participants experienced challenges executing tasks with precision on their mobile phones even though they understood the necessary steps to complete them (i.e., the Gulf of Execution [76]). This challenge was primarily described in instances of time pressure or when participants experienced moments of high stress and fatigue.

Participants described feeling time pressure for tasks, which made them more difficult to execute with precision. For example, when inputting calendar events on their mobile phone, participants described experiencing "the pressure of the person that you're doing business with kind of standing there going 'Why is this so hard for you?'" [Alecia] as well as the "people behind you" in line [Josslyn]. In some instances, this time pressure can lead to the event not "get[ting] on the calendar and I'm thinking, I thought I did it. And I think I'm missing that last click" [Josslyn]. In other instances, this time pressure led participants to input the exact month and time of events incorrectly, therefore scheduling events "in the wrong month" and consequently "show[ing] up at wrong appointments at the wrong time because I've screwed them up" [Alecia].

Participants also described challenges with executing tasks in moments of high stress and fatigue. For example, Elenora, who lives in a large city, described a time when, after a long day of errands, she "was very cognitively exhausted and I was waiting for a specific bus to come home." This was the only way she knew how to get home. But the bus she was waiting for was delayed for three hours, leaving her sitting at the bus stop into the night. Although Elenora was able to check on her phone to see the bus continued to be delayed, she "didn't have the cognitive wherewithal to figure anything else out," meaning use her phone to search for other, alternative ways to get home. Similarly, after a stressful day of work and navigating transportation in a large, unfamiliar city, Thomas described how on his phone he was "looking at this map

¹ SOS is an internationally recognized signal of distress in radio code, used especially by ships calling for help.
² 911 is a phone number used in North America to contact emergency services.
of these train stations and it's just not making any sense. And all I could think of was the train somehow jumped the tracks into a different line and they'd taken me to the wrong place." He explained this as "a situation where our brains aren't working right," making it difficult to execute routine tasks, such as navigation, on their mobile phones. This created high-stress situations for participants, who were aware of the increased risk of harm from being lost for people living with dementia [86, 104].

In these instances of high stress and fatigue, participants were either not willing to ask for help from other people or had difficulty communicating with others. Elenora described how in these moments "I'm certainly not going to ask a stranger on the street because when you get in a fog like that, as friendly as I tend to be and willing to ask for help, when I'm that deep in the fog, I really enclose in on myself." Thomas was willing to ask for help, but when he "asked the conductor, he was very gruff, and made me feel stupid. . . and look[ed] at me like I was crazy," and his directions just were "not computing to me." In this instance, Thomas called his husband, who drove to pick him up, describing how "people with challenges need the ability to reach out to a human even more than" other people. Thomas is referring to reaching out to significant others who understand his situation and can provide sympathetic assistance, rather than people, such as the conductor, who do not understand his situation and therefore cannot provide the necessary assistance.

Participants also described high-stress interactions with voice-based systems which impaired them from executing tasks on their mobile phone. For example, Malcolm describes using his bank's interactive voice response system regularly to pay his bills. However, these systems do not always understand Malcolm, in which case the system will say "I didn't hear you." This could be due to changes in speech patterns that people with dementia experience [71, 82], such as slowed speech, a developed stutter, and greater pauses between words. Preston posits this may be in part due to participants asking "question[s] and say[ing] the wrong word or the wrong adjective sometimes by mistake" [Preston]. In these instances, Malcolm then responds by "yelling and screaming because they don't hear my

upgrades in relation to changes made to software and/or interfaces in order to correct accessibility barriers, as described in the Web Content Accessibility Guidelines [49]. Instead, these barriers were due more to the change itself, rather than the substance of the change, as well as the fact that these changes are not explicitly called out by the system.

Several participants expressed fear over upgrading their phones to a newer version. Tina described how she has "always kind of updated or upgraded as needed. But now with my Alzheimer's, I think that if I got another phone that had more fancy things that might make it more difficult. . . I don't try to start all over again and get a whole brand-new phone" because she was "afraid I'm gonna mess up and then it'll screw up my brain." Malcolm shared this fear: "it's a five-year-old [phone]. I'm afraid to get a new phone because I'm afraid I'm not going to know how to use all of this stuff. So, I keep the phone. It works fine."

Participants also described not wanting to switch the operating system they used because of the difficulty they experienced with learning new systems. Josslyn explained that when she retired "I just kept [previous operating system] because I knew how to use it and I didn't want to have to relearn everything." One participant, Preston, intentionally got a new phone with a larger screen and memory "with the hopes I probably never have to get a new phone again," as he plans for this phone to last him the rest of his phone use.

Updating, upgrading, or switching the operating systems of their mobile phones complicated discovery for participants, forcing them to re-learn mobile phone skills, which led to frustration and fear for some participants.

4.3 Participants' Ideas to Address Challenges with Mobile Phone Use
In this section we describe participants' ideas to address challenges with mobile phone use, including customizable user interfaces, activity-based customization, proactive technology assistance, and extended use of voice-based interactions. These design opportuni-
voice,” which led to “a very high level of frustration. . . and I end up
ties are based on feedback provided during the speculative design
hanging up the phone because it’s horrible.” This example shows
portion of the interview, where participants frst articulated their
that execution of tasks on mobile phones may be impaired due to
ideal interactions with mobile phones and then saw the technology
voice assistants not understanding their verbal commands.
probe demonstration.
Participants described challenges with executing tasks on their
mobile phones when under time pressure and in moments of high
stress and fatigue, such as after a long day trying to navigate home 4.3.1 Customizable User Interface on Mobile Phones. To address
or when interacting with mobile phone voice-based systems. challenges with navigating their mobile phones, participants
wanted to customize the size of apps and the icons so that they
4.2.3 Dificulty Re-learning Task Flows Afer Updates and Upgrades. could more easily recognize icons. For example, Thomas wanted
Similar to past work [50], participants expressed the challenge it “to sort of mold the device to what you need.” Molding the device
was to re-learn tasks fows after updates and upgrades because “this included making app icons that are “recognizable. . . easy for me
is the new path I have to go through to do the things that I’m used to identify, even when I’m at my worst with my brain fog with
to doing. I’m not to the point where I can’t do it, but it defnitely my dementia” [Miranda]. This also included “hide[ing] all those
can give me pause when a program changes or updates” [Thomas]. other things that we don’t use or don’t want” [Miranda] and plac-
For Tristen, upgrades were difcult because “nobody gives you ing only the most used “app[s] on the home screen” [Thomas].
instructions that these are the changes that have taken place. It Malcom describes how he needs “less options. I don’t need 90% of
just happens on the phone.” He explained that “sometimes I get the things [apps]. . . Because I think that’s where I have problems,
very anxious and frustrated, when changes [updates] take place.” when there’s a lot of options. Limit the options so I’m not searching
Notably, participants did not describe barriers with updates and for things.”
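Malcolm's request to "limit the options" suggests a simple mechanism: rank apps by how often they are launched and surface only the top few on the home screen. A minimal sketch of that idea — the function name, app names, and usage log are illustrative, not from the study:

```python
from collections import Counter

def home_screen_apps(usage_log: list[str], n: int = 4) -> list[str]:
    """Return the n most frequently launched apps, most used first."""
    # Counter.most_common sorts by count, keeping first-seen order on ties.
    return [app for app, _ in Counter(usage_log).most_common(n)]

# Hypothetical launch history for one user.
log = ["Calendar", "Phone", "Calendar", "Maps", "Photos",
       "Calendar", "Phone", "Banking", "Maps"]
print(home_screen_apps(log, n=3))  # ['Calendar', 'Phone', 'Maps']
```

In line with participants' preference for hiding rather than deleting, the less-used apps would stay reachable through search or a full app list rather than being removed outright.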
ASSETS ’22, October 23–26, 2022, Athens, Greece Emma Dixon et al.

Relating to the challenges with re-learning task flows after updates and upgrades, participants wanted a pre-specified easy version of their phone rather than having to make user interface customizations themselves. Josslyn proposed the idea of being able to "choose an easy format" where modifications were "set for you," making the phone similar to "these phones advertised for seniors with the simplicity things." Though when asked if she had considered buying and using one of these phones, Josslyn explained how she would prefer "just a simplified version of my phone" because this is what she is familiar with.

4.3.2 Activity-based Customization on Mobile Phones. As another way to address challenges with navigating their mobile phones, participants described their need to bundle common activities - meaning to create specific categories of activities - and store these bundles in distinct, easily identifiable places. For example, Josslyn described how she "had to have it [her calendar] compartmentalized" where she "save[s] it [her phone calendar] for doctor's appointments, or hair appointments or something that I'm leaving the house for. Not just like to remember to do the laundry. . . routine things are put on the paper calendar." Josslyn made this modification to her calendar use after her diagnosis, as she described "when I worked I had all that" on her phone calendar, "but once I had to simplify my life that's how I simplified it - paper calendar, phone calendar."

This idea of bundling common activities and storing them in distinct places was applied to several envisioned futures for mobile phone use. First, after seeing that the technology probe could make a phone call to a pre-specified individual using a single-click button (a common feature on devices geared towards older adults and people with dementia [72, 90]), Tina described her desire for a "my tribe" button to "push the button that would have the name and the phone number of those seven people that are in my tribe so that I don't have to look it up" in her contacts. Both iOS [5] and Android smartphones [38] provide this service through groups.

In another example of activity-based customization by bundling common activities, one participant, Sylvia, described her desire to present voice-assistants as support for specific aspects of life. For example, Sylvia wanted her phone to "have five buttons" each with a different "personal assistant. . . one is for doctor's appointments. One is to help you with the grocery store. One is to [remind you to] take medication." Practically this would be one voice-assistant (e.g., Alexa, Google Assistant, Siri) but presented as different types of assistants to help with different bundled aspects of their life. We found this concept compelling, and so proposed it in later interviews to get other participants' feedback. This idea was well received, with several participants proposing different personal assistants (e.g., a life coach assistant to support chores [Alecia, Josslyn], and technical support [Kennith]).

Participants described their desire to have activity-based customization by bundling common activities and storing these bundles in distinct, easily identifiable places to address challenges with navigating their mobile phones.

4.3.3 Proactive Technology Assistance on Mobile Phones. To assist in moments of pressure, stress and fatigue that can hinder task execution on mobile phones, participants described their desire for more proactive technology, where their mobile phones take some action on their behalf without them asking for this assistance. For instance, Sylvia described her desire for a single button that, when pushed, enabled a personal assistant that could help her during doctor's visits by "advocat[ing] for me" by "asking [questions] for me" if she forgets to ask them. Sylvia also wanted the personal assistant to tape her conversations with her doctor, because "if I can record it then I can play it back," supporting her memory changes. In later interviews, we used Sylvia's idea to probe other participants' perception of the line between helpful and disconcerting with this kind of assistance from mobile phones. Tina described her perspective: "I don't think it's creepy because you're making the choice to tell [the system] to start filming [recording]. . . It's not like she [the system] knows you just got into the doctor's office and now automatically the microphone's going to show up." However, Sabella surfaced a legitimate concern: "the thing about the button [to activate the personal assistant] is you have to remember to tell the button to do it [provide assistance]." Because memory can be increasingly challenging for people with dementia, Tina later concluded: "I think[s] there's going to be a time where we really need all those crazy voices [voice assistants] kind of helping us" without being prompted.

In addition to wanting more proactive assistance in doctor's visits, participants also described how they would like their phone to track their activities on their mobile phone to play a part in sustaining their relationships. Participants wanted their phone to help them respond to others and initiate social interactions with others - tasks that were becoming increasingly difficult. For example, Alecia wanted her phone to be "snooping in your text" to then remind you "'hey, you haven't checked in on or checked up on so and so since such and such'", similar to existing Microsoft Outlook email reminders [69]. Though Alecia describes how it would be important for the system to be "synced to your text" rather than social media, in order to only track those more active social connections.

One participant, Preston, proposed another instance when tracking mobile phone use to provide proactive technology assistance may be useful: in identifying unused but potentially useful features (e.g., a stylus). For example, Preston proposed the phone could identify: "'Hey, he really hasn't used that stylus pen. Maybe we should send him another opportunity'" and then provide "a tutorial" of how to use that feature. When this idea was presented to participants in later interviews it was well received as a potential solution to facilitate learning new task flows after updates or upgrades.

Participants also wanted their phone to provide more proactive assistance by learning and automating routine uses of their phone, which could further support them in executing tasks on their mobile phones in stressful situations or when they experience fatigue. For example, participants described using specific intervals of reminders for every calendar event: "it's always two days before the event, one day before the event, one hour before the event, and maybe 15 or 20 minutes before the event, and then a five-minute reminder" [Preston]. Preston wanted his phone to be "smart enough to realize, I'm doing the same reminders every time. There should be an option where I can just say, 'use your common reminders'." Sabella wanted to completely automate the process of inputting calendar events so that "when you call to say 'Sabella, I'm going to set us up a Zoom for so and so' and it automatically just kind of
Mobile Phone Use by People with Mild to Moderate Dementia: Uncovering Challenges and Identifying Opportunities ASSETS ’22, October 23–26, 2022, Athens, Greece

Figure 2: Mockup of adding a calendar event using voice-based interactions with Google Assistant on a mobile phone, including
proactive technology assistance with reminder intervals.
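The proactive reminder intervals in this mockup follow Preston's fixed schedule (two days, one day, one hour, roughly twenty minutes, and five minutes before an event). Applying such a "common reminders" preset to a parsed event time could look like the following sketch — the offsets chosen and all names are illustrative, not part of any real assistant API:

```python
from datetime import datetime, timedelta

# Preston's "common reminders", expressed as offsets before the event.
COMMON_REMINDERS = [
    timedelta(days=2),
    timedelta(days=1),
    timedelta(hours=1),
    timedelta(minutes=20),
    timedelta(minutes=5),
]

def reminder_times(event_time: datetime) -> list[datetime]:
    """Return the timestamps at which reminders should fire."""
    return [event_time - offset for offset in COMMON_REMINDERS]

# "Doctor's appointment, this date, this time" parsed from a voice command.
event = datetime(2022, 10, 24, 9, 30)
for t in reminder_times(event):
    print(t.isoformat())
```

A phone could offer this preset whenever it notices the user entering the same intervals by hand, rather than requiring them to re-enter the schedule for every event.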

records it in some little strange way and it sets it on the calendar right then. . . Not for me to do it [input a calendar invite] but its own self."

Proactive assistance could also take the form of voice-assistants providing more structured prompts in a conversational style to help people execute tasks on their mobile phone. For example, Elenora wanted to be able to

"tell my phone while I'm out: 'I'm confused' or 'I'm not sure where to go.' Then it could start asking me questions. Not, 'Do you want to do this?' But literally ask me questions to help me figure out what it is I'm trying to even do. And, I could just say 'I'm trying to get home' and it could say, 'Well, there's all kinds of ways. What do you prefer? Want to go home the fastest way? The easiest way? The cheapest way?' And then I can just pick one."

This system would allow Elenora to "interact with it in a more natural language. . . almost have a conversation with the system." This example reflects the challenges of balancing 100% open-ended conversation with structured prompting (as previous work with older adults has highlighted [17]). In our findings, participants describe their need for structured prompting but in a more conversational style, and for voice-assistants to understand less structured responses.

Participants described their desire for more proactive technology assistance through unprompted assistance, more in-depth tracking of their mobile phone use to support relearning task flows after updates and upgrades, as well as automating routine tasks to provide support in moments of high stress or fatigue that impair mobile phone use.

4.3.4 Extending Voice-based Interactions with Mobile Phones. Participants described their desire to use voice-based interaction to address challenges with navigating their phone. For instance, Tina wanted to just be able to speak to her preferred social media app to have it search for pictures of her and her daughter on vacation rather than having to scroll through her past posts to find the album. Many participants described their desire to "click on the calendar [app] and use your voice and say, 'doctor's appointment, this date, this time'" and have it add an event [Josslyn]. Figure 2 provides an example of adding a calendar event using voice-based interactions with Google Assistant on the mobile phone, which also takes into consideration more proactive technology assistance. Participants also wanted to search for content across apps [Elenora, Josslyn, Alecia, Sabella, Kim, Preston] (e.g., saved recipes on different social media apps [Kim], social connections with people [Thomas]). Preston wanted to "name your phone, which would then turn it on like Alexa" when you spoke its name.

Several participants described needing an avatar in combination with a voice-based virtual assistant on their mobile phone to help with attention, which would ultimately help them to execute tasks on their mobile phones. For instance, Malcolm described how "I don't like people talking to me over the phone, it's difficult for me. This [points at the video conference call camera] is easier." Preston describes this as "a dual connection because you're connecting with your eyes and your ears, whereas if my screen is gone, we've only got one connection and that's just listening." Preston believes this
dual connection would help to "draw you in and you're going to pay more attention than a voice that's just randomly talking." Because of this need for a dual connection, participants described the need for voice-based interactions on their phone to have an "artificial person" or "an android person" [Preston]. Sylvia described wanting to personalize her virtual assistant by having "an animated person" where "I can pick his voice. I can pick the culture. Like, it'll probably be a person of color." She explained this concept by relating to "the Wii game you can describe your person. . . You can pick the hair, you pick the glasses, the voice, the outfit. I would like to pick my virtual assistant" [Sylvia].

Participants also described clear lines for acceptable use of avatars with virtual assistants for people with dementia, relating back to the uncanny valley effect [70]. Preston notes that the system needs to make it clear that "it's not a real person" by having "in the background saying 'You're speaking to Robby the Robot' something that is displayed. So that they know this person isn't real because for some people they could get confused." One participant, Alecia, was open to support from avatars in combination with virtual assistants as long as it remained two-dimensional, where any physical or three-dimensional robots were considered "scary" and "going over the top."

Even with the enthusiasm of many participants about extending voice-based interactions with their mobile phone, several participants also noted the importance of providing human assistance in high stress or time sensitive instances where voice assistants were not understanding them, which made executing tasks on their mobile phones difficult. Thomas stated: "there are times when we just need, particularly people with cognitive challenges, need the ability to reach out and touch another person. Especially if we're in a situation where our brains aren't working right." Similarly, Miranda describes how she "prefers person-centered help" especially "when you're talking to an artificial intelligence [about] anything more complex." Though she elaborates "that doesn't mean that I don't think things like [voice assistants] aren't good. But I think within those [having] the ability to easily access an actual person is also important" [Miranda].

Participants described their desire to extend the capabilities of voice-based interactions by: facilitating voice-based interactions within and across apps to better support navigation; supporting task execution by using avatars in combination with voice assistants to support attention and further personalization; and providing access to human assistance in instances when voice assistants did not understand them.

5 DISCUSSION

Through interviews with fourteen people with mild to moderate dementia, we address each of our three research questions. First, we investigated how people with mild to moderate dementia use their mobile phones, finding they use a range of apps for: navigation, mobile shopping, online banking, games, social media, news, entertainment, communication, connection, productivity, and health management. These findings take a first step towards filling an empirical gap in the literature by detailing some ways people with mild to moderate dementia use their mobile phones. Because past research in the development of mobile apps for people with dementia has primarily focused on designing apps to improve memory [40] or for GPS tracking to assist in navigation [46, 61, 100, 101], we also intentionally highlight two common activities on mobile phones, productivity and health management, as promising areas for future development. Additionally, participants described some ways their use of apps changed as they progressed with dementia. For example, they experienced a new reliance on their calendar apps to dictate their daily activities. They also described having to compartmentalize calendars based on different aspects of life, as well as the need for set intervals of reminders for each calendar event. As our findings report participants' reflections on how their mobile phone use has changed, further research is needed to observe the longitudinal use of mobile phone applications as people with dementia progress with the condition.

Second, we investigated the pain points and barriers to completing tasks on mobile phones that people with mild to moderate dementia experience. We uncovered challenges with: 1) navigating to apps and features; 2) task execution in moments of high stress, fatigue, and time pressure; and 3) re-learning task flows after updates and upgrades. These findings expand on past work on the barriers tech-savvy people with mild to moderate dementia have with typing on their mobile phone due to small key size and difficulty identifying notification ringtones and sounds [31] by providing a more thorough examination of the barriers people with mild to moderate dementia experience with mobile phone use.

Addressing our third research question concerning opportunities for technology to support people with dementia when they encounter challenges with their mobile phones, we uncover four design opportunities based on participants' envisioned future interactions with mobile phones: customizing for accessibility, activity-based customization, proactive technology assistance, and extended modalities for voice-based interactions. These findings demonstrate the considerable creativity of people with dementia in generating ideas for future technologies.

Many of these ideas incorporate AI and automation to support more accessible interactions with mobile phones for people with dementia. This provides a different perspective from past work in AI and dementia, which has primarily focused on ways AI could be used to detect and monitor the progress of dementia [8, 41, 87, 92, 96], in smart home environments to support care partners in monitoring the activities of individuals with dementia [3, 24, 26, 63, 80], or to support therapy [13, 20, 95]. Therefore, our work opens up new opportunities for the design of future AI systems to support the abilities of people with dementia, as in ability-based design [105].

In the remainder of the discussion, we describe design opportunities to support individuals with progressive disabilities and tensions with automation for people with dementia.

5.1 Design Opportunities to Support Individuals with Progressive Disabilities

One contribution of this work is taking a first step towards understanding how to design for the access needs of a group of neurodiverse users that experience progressive changes in ability, going beyond the traditional binary representation of disability. For instance, our
findings demonstrate the importance of adaptive user interfaces that minimize navigation on mobile phones, which can become increasingly difficult for people with dementia as they progress with the condition. Our findings also demonstrate the importance of providing increasingly proactive technological support for people with dementia as they progress with the condition. In the following section we discuss each of these directions for future design.

5.1.1 Adaptive User Interfaces for Progressive Changes in Ability. Participants described their desire for adaptive interfaces that could simplify navigating their phones through customizable home screens, adjustable app sizes, and personalizable icons (aligning with guidance from the WCAG Cognitive Accessibility Task Force) [99]. But they also described challenges with learning new interaction patterns, which are inevitable with adaptive interfaces. As one way to navigate this tension, designers could ensure that users are made aware of any changes with updates and upgrades and then provide training on new task flows.

Another potential way to navigate this tension is to use temporal dimensions for adaptive user interfaces for people with dementia, building off of previous work on ephemeral adaptations - where only the most used menu items are displayed abruptly and then all other menu items gradually fade in [34]. In the case of mobile phone accessibility for people with dementia, it may be necessary for the less used apps to gradually fade out and no longer be displayed due to changes in visual ability with the progression of the condition, including difficulty selecting an object from a visually busy environment [11, 91]. To be clear, we are not advocating for systems that strip away functions of a device to the bare minimum due to the inability of people with dementia to understand complex functions of devices (as in past work [4, 18, 43, 55, 62, 72, 88]). Rather, we are proposing systems which display the most used features in a way that is easy for people with mild to moderate dementia to navigate to, while still providing access to less used features if/when they want them.

Still another potential future direction is the design of adaptable systems in combination with activity-centric thinking [7], which shifts away from traditional application-centric computing and towards human goal-oriented activities, cutting across system boundaries [7]. Participants wanted their user interfaces to bundle common activities and store these bundles in distinct, easily identifiable places. For instance, participants bundled their calendars by type of activity (e.g., in-home activities vs. out of the house activities). Adapting user interfaces to reflect bundles of common activities may be one way to make systems more accessible to people with dementia as they experience progressive changes in ability.

These are just two examples of potential areas to explore in future work on adaptive user interfaces to support people with dementia. There is room for much further exploration concerning adaptive user interfaces to support more accessible interactions with technology for people with progressive changes in ability.

5.1.2 Proactive Technological Support. One way to provide more proactive technological support is through context-aware computing. Participants described the need for their phone to recognize their location as well as their conversational partner (as in the TalkAbout system for people with aphasia [51]) to assist them in executing tasks in moments of high stress and fatigue. This assistance could be provided through context-aware prompting for tasks, as in [22, 23], which provided prompts for users with cognitive impairments to complete tasks. Though, as participants in our study mentioned, there may come a time with the progression of the condition where prompting will not be enough and the system will need to provide more active assistance. For example, assistance could be provided automatically when the system recognized the context the person with dementia was in (e.g., in a doctor's appointment speaking with their doctor). One participant even wanted their mobile phone to listen in to their phone conversations to automatically add calendar events as they were confirmed during phone calls, as in past work [64, 65]. Future systems should consider utilizing context-aware computing to provide more active support to people with dementia.

Another way proactive technological support could be provided is through tracking technology use. For example, several participants described their desire for their phone to monitor their texts and emails to prompt them to respond to and initiate social interactions with others. Participants also wanted their mobile phones to track what features they used and propose new or previously unused features that may be helpful to them based on their historic phone usage. Current systems provide nudges and notifications for new features once, rather than providing additional reminders, as these additional reminders are assumed to be unhelpful and annoying to users [81]. Our findings suggest that people with dementia may need to be reminded about features more than once, due to the progressive nature of the condition and their changes in memory over time. Future systems could integrate more proactive technological support (e.g., context-aware proactive smart-speakers [83, 102, 103]) by highlighting potentially useful features and providing regular reminders of these features if they begin to go unused.

Although these design directions may provide necessary support to prolong mobile phone use for people with dementia, they also introduce privacy concerns due to the level of data collection necessary to provide this support. As past work has outlined [67], such tracking could be used as a mediator of coercive control and abuse. To manage this tension between the need for more proactive technology support and privacy concerns, we urge researchers, designers and developers to keep privacy considerations central to their work.

5.2 Tensions in Designing Technologies for Use by People with Dementia

Participants described extensive future applications for automation, surfacing tensions with autonomy and who would be in control of initiating support, the person or the system. For instance, participants in our study described ways that automation could be used to improve their task execution on their mobile phones, centering their own role in completing the task (e.g., having a button where they could prompt voice assistants). This may be one solution to the concern of some people with dementia in past work towards receiving support from AI to assist with managing their daily life out of concern for the loss of autonomy [32]. However, participants also described how, with the progression of the condition, they will need systems to eventually act autonomously to perform tasks on
their behalf (e.g., start audio recording in doctor's visits). Participants described how these progressive changes in abilities due to the condition necessitate less autonomy with technology, to support them to continue to be active in everyday activities (e.g., attending doctor's visits independently). However, with more proactive technological support and less autonomy, this could pose additional risks for abuse facilitated via the proactive systems (as described in [67]). With evidence of elder abuse through automation in the form of smart home technologies [15], we join in the call of past work [94] to design automated systems for use by people with dementia rather than on people with dementia.

Participants also described instances where their reliance on technology negatively affected them. For example, moments of intense brain fog can render technology assistance useless (e.g., when navigating), requiring them to reach out to another person for support. These findings suggest future systems should be designed to keep humans in-the-loop (as proposed in [89]) around users with dementia to provide support when necessary. Importantly, participants needed to be connected to sympathetic human assistance. Therefore, future systems could sense frustration or negative emotion from voice interactions, and then automatically connect the user to a friend or loved one to provide human assistance.

Still another tension which emerged from our work concerns the use of a visual representation for voice assistance. Although some participants noted a visual representation may help with attention when interacting with voice assistants, they also noted how such visuals may be confusing or disturbing to some people with dementia who may not be able to discern that the avatar is not a real person. Future work is needed to understand how people with dementia perceive different types of visual representations of voice assistants (e.g., disembodied agents, artificial embodied agents, and photorealistic embodied agents [10]) and whether their perceptions of visual representations change with the progression of the condition. We recognize that this is not a full investigation of all possible tensions that may arise with the design of technologies for use

augment and enhance the abilities of people with dementia. With the pervasive use of mobile phones in our society, these findings will help researchers and creators of technology design environments that assure societal inclusion [47] for people with mild to moderate dementia.

ACKNOWLEDGMENTS

Thank you to participants, as well as Shaun Kane, Hernisa Kacorri, Jonathan Lazar, Gregg Vanderheiden, Matthew Goupell and the anonymous reviewers who provided feedback on versions of this paper. This work was supported, in part, by grant 90REGE0008, U.S. Admin. for Community Living, NIDILRR, Dept. of Health & Human Services, NSF Grant IIS-2045679, and the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 1840340. Opinions expressed do not necessarily represent official policy of the Federal government.

REFERENCES

[1] Alzheimer's Association. Younger/Early-Onset Alzheimer's. Alzheimer's Disease and Dementia. Retrieved March 16, 2020 from https://alz.org/alzheimers-dementia/what-is-alzheimers/younger-early-onset
[2] Alzheimer's Association. 2021. 2021 Alzheimer's Disease Facts and Figures. Alzheimers Dement, Chicago, IL.
[3] Mohsen Amiribesheli and Abdelhamid Bouchachia. 2015. Smart Homes Design for People with Dementia. In 2015 International Conference on Intelligent Environments, 156–159. https://doi.org/10.1109/IE.2015.33
[4] Claire Ancient, Alice Good, Clare Wilson, and Tineke Fitch. 2013. Can Ubiquitous Devices Utilising Reminiscence Therapy Be Used to Promote Well-Being in Dementia Patients? An Exploratory Study. In Universal Access in Human-Computer Interaction. Applications and Services for Quality of Life (Lecture Notes in Computer Science), 426–435. https://doi.org/10.1007/978-3-642-39194-1_50
[5] Apple Inc. 2022. Create and manage groups of contacts on iCloud.com. Apple Support. Retrieved June 24, 2022 from https://support.apple.com/guide/icloud/create-and-manage-groups-mmfba73c71/icloud
[6] Norm Archer, Karim Keshavjee, Catherine Demers, and Ryan Lee. 2014. Online self-management interventions for chronically ill patients: Cognitive impairment and technology issues. International Journal of Medical Informatics 83, 4: 264–272. https://doi.org/10.1016/j.ijmedinf.2014.01.005
[7] Jakob E. Bardram, Steven Jeuris, Paolo Tell, Steven Houben, and Stephen Voida. 2019. Activity-centric computing systems. Communications of the ACM 62, 8: 72–81. https://doi.org/10.1145/3325901
by people with dementia (e.g., trust, explainability). Further work [8] Flavio Bertini, Davide Allevi, Gianluca Lutero, Danilo Montesi, and Laura Calzà.
2021. Automatic Speech Classifer for Mild Cognitive Impairment and Early
is needed to investigate these tensions and ways for designers, Dementia. ACM Transactions on Computing for Healthcare 3, 1: 8:1-8:11. https:
developers, and researchers to better navigate these tensions. //doi.org/10.1145/3469089
[9] Inga-Lill Boman, Stefan Lundberg, Sofa Starkhammar, and Louise Nygård. 2014.
Exploring the usability of a videophone mock-up for persons with dementia and
their signifcant others. BMC Geriatrics 14, 1: 49. https://doi.org/10.1186/1471-
6 CONCLUSION 2318-14-49
This work details ways people with mild to moderate dementia use [10] Michael Bonfert, Nima Zargham, Florian Saade, Robert Porzel, and Rainer
Malaka. 2021. An Evaluation of Visual Embodiment for Voice Assistants on
their mobile phones surfaced through an analysis of interviews with Smart Displays. In CUI 2021 - 3rd Conference on Conversational User Interfaces,
fourteen people with mild to moderate dementia. Findings from 1–11. https://doi.org/10.1145/3469595.3469611
[11] François-Xavier Borruat. 2013. Posterior Cortical Atrophy: Review of the Recent
this study showed three major challenges with mobile phone use: Literature. Current Neurology and Neuroscience Reports 13, 12: 406. https://doi.
1) navigating to apps and features; 2) task execution in moments of org/10.1007/s11910-013-0406-8
time pressure, high stress and fatigue, and 3) re-learning task fows [12] Ann L Bossen, Heejung Kim, Kristine N Williams, Andreanna E Steinhof, and
Molly Strieker. 2015. Emerging roles for telemedicine and smart technologies
after updates and upgrades. To address these challenges participants in dementia care. Smart homecare technology and telehealth 3: 49–57. https:
described their ideal interactions with their mobile phones, which //doi.org/10.2147/SHTT.S59500
included customizing for accessibility, activity-based customization, [13] Eleni Boumpa, Ioanna Charalampou, Anargyros Gkogkidis, and Athanasios
Kakarountas. 2017. Home Assistive System for Dementia. In Proceedings of the
proactive technology assistance, and extended modalities for voice- 21st Pan-Hellenic Conference on Informatics (PCI 2017), 1–6. https://doi.org/10.
based interactions. This paper contributes to the literature by 1) 1145/3139367.3139435
[14] Aikaterini Bourazeri and Simone Stumpf. 2018. Co-designing smart home tech-
providing an empirical account of how fourteen people with mild to nology with people with dementia or Parkinson’s disease. In Proceedings of the
moderate dementia use mobile phones and the challenges they face 10th Nordic Conference on Human-Computer Interaction - NordiCHI ’18, 609–621.
with mobile phone use; 2) uncovering design opportunities to help https://doi.org/10.1145/3240167.3240197
[15] Bonnie Brandl, Carmel Bitondo Dyer, Candace J. Heisler, Joanne Marlatt Otto,
achieve more accessible mobile phone use for people with dementia; Lori A. Stiegel, and Randolph W. Thomas. 2006. Elder Abuse Detection and
and 3) providing new directions for the design of future systems to Intervention: A Collaborative Approach. Springer Publishing Company.
Mobile Phone Use by People with Mild to Moderate Dementia: Uncovering Challenges and Identifying Opportunities ASSETS ’22, October 23–26, 2022, Athens, Greece
[16] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2: 77–101. https://doi.org/10.1191/1478088706qp063oa
[17] Robin Brewer, Raymundo Cornejo Garcia, Tedmond Schwaba, Darren Gergle, and Anne Marie Piper. 2016. Exploring Traditional Phones as an E-Mail Interface for Older Adults. ACM Transactions on Accessible Computing 8, 2: 6:1–6:20. https://doi.org/10.1145/2839303
[18] Suzanne Mary Cahill, Emer Begley, Jon Paul Faulkner, and I. Hagen. 2007. "It gives me a sense of independence" – Findings from Ireland on the use and usefulness of assistive technology for people with dementia. Technology and Disability 19, 2–3: 133–142. https://doi.org/10.3233/TAD-2007-192-310
[19] Kelly Caine. 2016. Local Standards for Sample Size at CHI. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16), 981–992. https://doi.org/10.1145/2858036.2858498
[20] Mariona Carós, Maite Garolera, Petia Radeva, and Xavier Giro-i-Nieto. 2020. Automatic Reminiscence Therapy for Dementia. In Proceedings of the 2020 International Conference on Multimedia Retrieval, 383–387. Retrieved March 4, 2022 from https://doi.org/10.1145/3372278.3391927
[21] Filippo Cavallo, Michela Aquilano, and Marco Arvati. 2015. An Ambient Assisted Living Approach in Designing Domiciliary Services Combined With Innovative Technologies for Patients With Alzheimer's Disease: A Case Study. American Journal of Alzheimer's Disease & Other Dementias® 30, 1: 69–77. https://doi.org/10.1177/1533317514539724
[22] Yao-Jen Chang, Wan Chih Chang, and Tsen-Yung Wang. 2009. Context-aware prompting to transition autonomously through vocational tasks for individuals with cognitive impairments. In Proceedings of the 11th international ACM SIGACCESS conference on Computers and accessibility (Assets '09), 19–26. https://doi.org/10.1145/1639642.1639648
[23] Yao-Jen Chang, Shih-Kai Tsai, and Tsen-Yung Wang. 2008. A context aware handheld wayfinding system for individuals with cognitive impairments. In Proceedings of the 10th international ACM SIGACCESS conference on Computers and accessibility (Assets '08), 27–34. https://doi.org/10.1145/1414471.1414479
[24] Gibson Chimamiwa, Marjan Alirezaie, Hadi Banaee, Uwe Köckemann, and Amy Loutfi. 2019. Towards Habit Recognition in Smart Homes for People with Dementia. In Ambient Intelligence (Lecture Notes in Computer Science), 363–369. https://doi.org/10.1007/978-3-030-34255-5_29
[25] Yi Chu, Pat Brown, Mark Harniss, Henry Kautz, and Kurt Johnson. 2014. Cognitive support technologies for people with TBI: current usage and challenges experienced. Disability and Rehabilitation: Assistive Technology 9, 4: 279–285. https://doi.org/10.3109/17483107.2013.823631
[26] Dagoberto Cruz-Sandoval and Jesus Favela. 2016. Human-robot interaction to deal with problematic behaviors from people with dementia. In Proceedings of the 10th EAI International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth '16), 274–275.
[27] Richard Davies, Chris Nugent, Mark Donnelly, Marike Hettinga, Franka Meiland, Ferial Moelaert, Maurice Mulvenna, Johan Bengtsson, David Craig, and Rose-Marie Dröes. 2009. A user driven approach to develop a cognitive prosthetic to address the unmet needs of people with mild dementia. Pervasive and Mobile Computing 5, 3: 253–267. https://doi.org/10.1016/j.pmcj.2008.07.002
[28] Melissa Dawe. 2006. Desperately seeking simplicity: how young adults with cognitive disabilities and their families adopt assistive technologies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1143–1152. https://doi.org/10.1145/1124772.1124943
[29] Melissa Dawe. 2007. Understanding mobile phone requirements for young adults with cognitive disabilities. In Proceedings of the 9th international ACM SIGACCESS conference on Computers and accessibility (Assets '07), 179–186. https://doi.org/10.1145/1296843.1296874
[30] Emma Dixon, Jesse Anderson, and Amanda Lazar. 2022. Understanding How Sensory Changes Experienced by Individuals with a Range of Age-Related Cognitive Changes can Affect Technology Use. ACM Transactions on Accessible Computing. https://doi.org/10.1145/3511906
[31] Emma Dixon and Amanda Lazar. 2020. The Role of Sensory Changes in Everyday Technology use by People with Mild to Moderate Dementia. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20), 1–12. https://doi.org/10.1145/3373625.3417000
[32] Emma Dixon, Anne Marie Piper, and Amanda Lazar. 2021. "Taking care of myself as long as I can": How People with Dementia Configure Self-Management Systems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3411764.3445225
[33] Emma Dixon, Ashrith Shetty, Simone Pimento, and Amanda Lazar. 2021. Lessons Learned from Remote User-Centered Design with People with Dementia. In Proceedings of the 2021 Dementia Lab Conference (Design for Inclusion).
[34] Leah Findlater, Karyn Moffatt, Joanna McGrenere, and Jessica Dawson. 2009. Ephemeral adaptation: the use of gradual onset to improve menu selection performance. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1655–1664. https://doi.org/10.1145/1518701.1518956
[35] Ed Freeman, Linda Clare, Nada Savitch, Lindsay Royan, Rachael Litherland, and Margot Lindsay. 2005. Improving website accessibility for people with early-stage dementia: A preliminary investigation. Aging & Mental Health 9, 5: 442–448. https://doi.org/10.1080/13607860500142838
[36] Grant Gibson, Claire Dickinson, Katie Brittain, and Louise Robinson. 2015. The everyday use of assistive technology by people with dementia and their family carers: a qualitative study. BMC Geriatrics 15, 1: 1–10. https://doi.org/10.1186/s12877-015-0091-3
[37] Grant Gibson, Claire Dickinson, Katie Brittain, and Louise Robinson. 2019. Personalisation, customisation and bricolage: how people with dementia and their families make assistive technology work for them. Ageing & Society 39, 11: 2502–2519. https://doi.org/10.1017/S0144686X18000661
[38] Google Inc. 2022. View, group & share contacts - Android - Contacts Help. Google Support - Contacts Help. Retrieved June 24, 2022 from https://support.google.com/contacts/answer/30970?hl=en&co=GENIE.Platform%3DAndroid
[39] Estefanía Guisado-Fernández, Guido Giunti, Laura M. Mackey, Catherine Blake, and Brian Michael Caulfield. 2019. Factors Influencing the Adoption of Smart Health Technologies for People With Dementia and Their Informal Caregivers: Scoping Review and Design Framework. JMIR Aging 2, 1: e12192. https://doi.org/10.2196/12192
[40] Yuqi Guo, Fan Yang, Fei Hu, Wei Li, Nicole Ruggiano, and Hee Yun Lee. 2020. Existing Mobile Phone Apps for Self-Care Management of People With Alzheimer Disease and Related Dementias: Systematic Analysis. JMIR Aging 3, 1: e15290. https://doi.org/10.2196/15290
[41] Chathurika Palliya Guruge, Sharon Oviatt, Pari Delir Haghighi, and Elizabeth Pritchard. 2021. Advances in Multimodal Behavioral Analytics for Early Dementia Diagnosis: A Review. In Proceedings of the 2021 International Conference on Multimodal Interaction, 328–340. Retrieved March 4, 2022 from https://doi.org/10.1145/3462244.3479933
[42] Jose Guzman-Parra, Pilar Barnestein-Fonseca, Gloria Guerrero-Pertiñez, Peter Anderberg, Luis Jimenez-Fernandez, Esperanza Valero-Moreno, Jessica Marian Goodman-Casanova, Antonio Cuesta-Vargas, Maite Garolera, Maria Quintana, Rebeca I García-Betances, Evi Lemmens, Johan Sanmartin Berglund, and Fermin Mayoral-Cleries. 2020. Attitudes and Use of Information and Communication Technologies in Older Adults With Mild Cognitive Impairment or Early Stages of Dementia and Their Caregivers: Cross-Sectional Study. Journal of Medical Internet Research 22, 6: e17253. https://doi.org/10.2196/17253
[43] Bart Hattink, Rose-Marie Dröes, Sietske Sikkes, Ellen Oostra, and Afina W Lemstra. 2016. Evaluation of the Digital Alzheimer Center: Testing Usability and Usefulness of an Online Portal for Patients with Dementia and Their Carers. JMIR Research Protocols 5, 3. https://doi.org/10.2196/resprot.5040
[44] Soumya Hegde and Ratnavalli Ellajosyula. 2016. Capacity issues and decision-making in dementia. Annals of Indian Academy of Neurology 19, Suppl 1: S34–S39. https://doi.org/10.4103/0972-2327.192890
[45] Niels Hendriks, Liesbeth Huybrechts, Andrea Wilkinson, and Karin Slegers. 2014. Challenges in doing participatory design with people with dementia. In Proceedings of the 13th Participatory Design Conference on Short Papers, Industry Cases, Workshop Descriptions, Doctoral Consortium papers, and Keynote abstracts - PDC '14 - volume 2, 33–36. https://doi.org/10.1145/2662155.2662196
[46] Kristine Holbø, Silje Bøthun, and Yngve Dahl. 2013. Safe walking technology for people with dementia: what do they want? In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '13), 1–8. https://doi.org/10.1145/2513383.2513434
[47] Kat Holmes. 2018. Mismatch: How Inclusion Shapes Design. MIT Press, Cambridge, MA, USA.
[48] Hilary Hutchinson, Wendy Mackay, Bo Westerlund, Benjamin B. Bederson, Allison Druin, Catherine Plaisant, Michel Beaudouin-Lafon, Stéphane Conversy, Helen Evans, Heiko Hansen, Nicolas Roussel, and Björn Eiderbäck. 2003. Technology probes: inspiring design for and with families. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '03), 17–24. https://doi.org/10.1145/642611.642616
[49] W3C Web Accessibility Initiative (WAI). Web Content Accessibility Guidelines (WCAG) Overview. Web Accessibility Initiative (WAI). Retrieved January 9, 2021 from https://www.w3.org/WAI/standards-guidelines/wcag/
[50] Shaun K. Kane, Chandrika Jayant, Jacob O. Wobbrock, and Richard E. Ladner. 2009. Freedom to roam: a study of mobile device adoption and accessibility for people with visual and motor disabilities. In Proceedings of the 11th international ACM SIGACCESS conference on Computers and accessibility (Assets '09), 115–122. https://doi.org/10.1145/1639642.1639663
[51] Shaun K. Kane, Barbara Linam-Church, Kyle Althoff, and Denise McCall. 2012. What we talk about: designing a context-aware communication tool for people with aphasia. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility - ASSETS '12, 49. https://doi.org/10.1145/2384916.2384926
[52] Yvonne Kerkhof, Ad Bergsma, Maud Graff, and Rose-Marie Dröes. 2017. Selecting apps for people with mild dementia: Identifying user requirements for apps enabling meaningful activities and self-management. Journal of Rehabilitation and Assistive Technologies Engineering 4: 2055668317710593. https://doi.org/10.1177/2055668317710593
ASSETS ’22, October 23–26, 2022, Athens, Greece Emma Dixon et al.
[53] Yvonne Kerkhof, Gianna Kohl, Melanie Veijer, Floriana Mangiaracina, Ad Bergsma, Maud Graff, and Rose-Marie Dröes. 2020. Randomized controlled feasibility study of FindMyApps: first evaluation of a tablet-based intervention to promote self-management and meaningful activities in people with mild dementia. Disability and Rehabilitation: Assistive Technology 0, 0: 1–15. https://doi.org/10.1080/17483107.2020.1765420
[54] Yvonne Kerkhof, Myrna Pelgrum-Keurhorst, Floriana Mangiaracina, Ad Bergsma, Guus Vrauwdeunt, Maud Graff, and Rose-Marie Dröes. 2019. User-participatory development of FindMyApps; a tool to help people with mild dementia find supportive apps for self-management and meaningful activities. DIGITAL HEALTH 5: 205520761882294. https://doi.org/10.1177/2055207618822942
[55] Helianthe S. M. Kort and Joost van Hoof. 2014. Design of a website for home modifications for older persons with dementia. Technology and Disability 26, 1: 1–10. https://doi.org/10.3233/TAD-140399
[56] Clemens Scott Kruse, Joanna Fohn, Gilson Umunnakwe, Krupa Patel, and Saloni Patel. 2020. Evaluating the Facilitators, Barriers, and Medical Outcomes Commensurate with the Use of Assistive Technology to Support People with Dementia: A Systematic Review Literature. Healthcare 8, 3. https://doi.org/10.3390/healthcare8030278
[57] Haley M LaMonica, Amelia English, Ian B Hickie, Jerome Ip, Catriona Ireland, Stacey West, Tim Shaw, Loren Mowszowski, Nick Glozier, Shantel Duffy, Alice A Gibson, and Sharon L Naismith. 2017. Examining Internet and eHealth Practices and Preferences: Survey Study of Australian Older Adults With Subjective Memory Complaints, Mild Cognitive Impairment, or Dementia. Journal of Medical Internet Research 19, 10: e358. https://doi.org/10.2196/jmir.7981
[58] Jonathan Lazar, Libby Kumin, and Jinjuan Heidi Feng. 2011. Understanding the computer skills of adult expert users with down syndrome: an exploratory study. In The proceedings of the 13th international ACM SIGACCESS conference on Computers and accessibility (ASSETS '11), 51–58. https://doi.org/10.1145/2049536.2049548
[59] Raphael J. Leo. 1999. Competency and the Capacity to Make Treatment Decisions: A Primer for Primary Care Physicians. Primary Care Companion to The Journal of Clinical Psychiatry 1, 5: 131–141.
[60] Clayton Lewis, James Sullivan, and Jeffery Hoehl. 2009. Mobile Technology for People with Cognitive Disabilities and Their Caregivers – HCI Issues. In Universal Access in Human-Computer Interaction. Addressing Diversity (Lecture Notes in Computer Science), 385–394. https://doi.org/10.1007/978-3-642-02707-9_44
[61] Stephen Lindsay, Katie Brittain, Daniel Jackson, Cassim Ladha, Karim Ladha, and Patrick Olivier. 2012. Empathy, participatory design and people with dementia. In Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems - CHI '12, 521. https://doi.org/10.1145/2207676.2207749
[62] Philippe Lopes, Maribel Pino, Giova Carletti, Sofiana Hamidi, Sylvie Legué, Helene Kerhervé, Samuel Benveniste, Guillaume Andéol, Pierre Bonsom, S Reingewirtz, and Anne-Sophie Rigaud. 2016. Co-Conception Process of an Innovative Assistive Device to Track and Find Misplaced Everyday Objects for Older Adults with Cognitive Impairment: The TROUVE Project. IRBM 37, 2: 52–57. https://doi.org/10.1016/j.irbm.2016.02.004
[63] Ahmad Lotfi, Caroline Langensiepen, Sawsan M. Mahmoud, and M. J. Akhlaghinia. 2012. Smart homes for the elderly dementia sufferers: identification and prediction of abnormal behaviour. Journal of Ambient Intelligence and Humanized Computing 3, 3: 205–218. https://doi.org/10.1007/s12652-010-0043-x
[64] Kent Lyons, Christopher Skeels, and Thad Starner. 2005. Providing support for mobile calendaring conversations: a wizard of oz evaluation of dual-purpose speech. In Proceedings of the 7th international conference on Human computer interaction with mobile devices & services (MobileHCI '05), 243–246. https://doi.org/10.1145/1085777.1085821
[65] Kent Lyons, Christopher Skeels, Thad Starner, Cornelis M. Snoeck, Benjamin A. Wong, and Daniel Ashbrook. 2004. Augmenting conversations using dual-purpose speech. In Proceedings of the 17th annual ACM symposium on User interface software and technology (UIST '04), 237–246. https://doi.org/10.1145/1029632.1029674
[66] Camilla Malinowsky, Ove Almkvist, Anders Kottorp, and Louise Nygård. 2010. Ability to manage everyday technology: a comparison of persons with dementia or mild cognitive impairment and older adults without cognitive impairment. Disability and Rehabilitation: Assistive Technology 5, 6: 462–469. https://doi.org/10.3109/17483107.2010.496098
[67] Dana McKay and Charlynn Miller. 2021. Standing in the Way of Control: A Call to Action to Prevent Abuse through Better Design of Smart Technologies. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3411764.3445114
[68] Franka Meiland, Ans Bouman, Stefan Sävenstedt, Sanne Bentvelzen, Richard Davies, Maurice Mulvenna, Chris Nugent, Ferial Moelaert, Marike Hettinga, Johan Bengtsson, and Rose-Marie Dröes. 2012. Usability of a new electronic assistive device for community-dwelling persons with mild dementia. Aging & Mental Health 16, 5: 584–591. https://doi.org/10.1080/13607863.2011.651433
[69] Microsoft. 2022. Send an email message with a follow-up reminder. Microsoft Support. Retrieved June 24, 2022 from https://support.microsoft.com/en-us/office/send-an-email-message-with-a-follow-up-reminder-740a3b9e-e837-4711-938a-08dd0ea5ac64
[70] Masahiro Mori, Karl F. MacDorman, and Norri Kageki. 2012. The Uncanny Valley [From the Field]. IEEE Robotics Automation Magazine 19, 2: 98–100. https://doi.org/10.1109/MRA.2012.2192811
[71] Kimberly D. Mueller, Bruce Hermann, Jonilda Mecollari, and Lyn S. Turkstra. 2018. Connected speech and language in mild cognitive impairment and Alzheimer's disease: A review of picture description tasks. Journal of Clinical and Experimental Neuropsychology 40, 9: 917–939. https://doi.org/10.1080/13803395.2018.1446513
[72] Maurice Mulvenna, Suzanne Martin, Stefan Sävenstedt, Johan Bengtsson, Franka Meiland, Rose Marie Dröes, Marike Hettinga, Ferial Moelaert, and David Craig. 2010. Designing & evaluating a cognitive prosthetic for people with mild dementia. In Proceedings of the 28th Annual European Conference on Cognitive Ergonomics - ECCE '10, 11. https://doi.org/10.1145/1962300.1962306
[73] David Nandigam, Judith Symonds, Nicola Kayes, and Kathryn McPherson. 2010. Mobile phone user interface design for patients with traumatic brain injury. In Proceedings of the 11th International Conference of the NZ Chapter of the ACM Special Interest Group on Human-Computer Interaction (CHINZ '10), 69–72. https://doi.org/10.1145/1832838.1832850
[74] National Institute of Aging. 2021. Alzheimer's Disease Fact Sheet. National Institute on Aging. Retrieved October 11, 2021 from http://www.nia.nih.gov/health/alzheimers-disease-fact-sheet
[75] Donald Norman. 1999. Affordance, conventions, and design. Interactions 6: 38–42. https://doi.org/10.1145/301153.301168
[76] Donald A. Norman and Stephen W. Draper (eds.). 1986. User Centered System Design: New Perspectives on Human-computer Interaction. CRC Press, Hillsdale, NJ.
[77] Chris D. Nugent, Richard J. Davies, Mark P. Donnelly, Josef Hallberg, Mossaab Hariz, David Craig, Franka Meiland, Ferial Moelaert, Johan E. Bengtsson, Stefan Savenstedt, Maurice Mulvenna, and Rose-Marie Droes. 2008. The development of personalised cognitive prosthetics. In 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 787–790. https://doi.org/10.1109/IEMBS.2008.4649270
[78] Louise Nygård and S Starkhammar. 2007. The use of everyday technology by people with dementia living alone: Mapping out the difficulties. Aging & mental health 11: 144–55. https://doi.org/10.1080/13607860600844168
[79] Siobhan O'Connor, Matt-Mouley Bouamrane, Catherine A. O'Donnell, and Frances Mair. 2016. Barriers to Co-Designing Mobile Technology with Persons with Dementia and Their Carers. Nursing Informatics. https://doi.org/10.3233/978-1-61499-658-3-1028
[80] Roger Orpwood, Chris Gibbs, Timothy Adlam, Richard Faulkner, and D. Meegahawatte. 2005. The design of smart homes for people with dementia—user-interface aspects. Universal Access in the Information Society 4, 2: 156–164. https://doi.org/10.1007/s10209-005-0120-7
[81] Martin Pielot, Karen Church, and Rodrigo de Oliveira. 2014. An in-situ study of mobile phone notifications. In Proceedings of the 16th international conference on Human-computer interaction with mobile devices & services - MobileHCI '14, 233–242. https://doi.org/10.1145/2628363.2628364
[82] Matthew L. Poole, Amy Brodtmann, David Darby, and Vogel. 2017. Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech. Journal of Speech, Language, and Hearing Research 60, 4: 897–911. https://doi.org/10.1044/2016_JSLHR-S-16-0140
[83] Leon Reicherts, Nima Zargham, Michael Bonfert, Yvonne Rogers, and Rainer Malaka. 2021. May I Interrupt? Diverging Opinions on Proactive Smart Speakers. In CUI 2021 - 3rd Conference on Conversational User Interfaces (CUI '21), 1–10. https://doi.org/10.1145/3469595.3469629
[84] Merja Riikonen, Eija Paavilainen, and Hannu Salo. 2013. Factors supporting the use of technology in daily life of home-living people with dementia. Technology & Disability 25, 4: 233–243. https://doi.org/10.3233/TAD-130393
[85] Lena Rosenberg, Anders Kottorp, Bengt Winblad, and Louise Nygård. 2009. Perceived difficulty in everyday technology use among older adults with or without cognitive deficits. Scandinavian Journal of Occupational Therapy 16, 4: 216–226. https://doi.org/10.3109/11038120802684299
[86] Meredeth A. Rowe and Vikki Bennett. 2003. A look at deaths occurring in persons with dementia lost in the community. American Journal of Alzheimer's Disease & Other Dementias® 18, 6: 343–348. https://doi.org/10.1177/153331750301800612
[87] Yoichi Sakai, Yuuko Nonaka, Kiyoshi Yasuda, and Yukiko I. Nakano. 2012. Listener agent for elderly people with dementia. In Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction (HRI '12), 199–200. https://doi.org/10.1145/2157689.2157754
[88] Vardit Sarne-Fleischmann, Noam Tractinsky, Tzvi Dwolatzky, and Inbal Rief. 2011. Personalized reminiscence therapy for patients with Alzheimer's disease using a computerized system. In Proceedings of the 4th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '11), 1–4. https://doi.org/10.1145/2141622.2141679
[89] Ben Shneiderman. 2022. Human-Centered AI. Oxford University Press.
[90] Andrew Sixsmith, Grant Gibson, Roger Orpwood, and Judith Torrington. 2007. Developing a technology 'wish-list' to enhance the quality of life of people with dementia. Gerontechnology 6. https://doi.org/10.4017/gt.2007.06.01.002.00
[91] Aida Suárez-González, Susie M. Henley, Jill Walton, and Sebastian J. Crutch. 2015. Posterior Cortical Atrophy: An Atypical Variant of Alzheimer Disease. Psychiatric Clinics 38, 2: 211–220. https://doi.org/10.1016/j.psc.2015.01.009
[92] Hiroki Tanaka, Hiroyoshi Adachi, Norimichi Ukita, Takashi Kudo, and Satoshi Nakamura. 2016. Automatic detection of very early stage of dementia through multimodal interaction with computer avatars. In Proceedings of the 18th ACM International Conference on Multimodal Interaction (ICMI '16), 261–265. https://doi.org/10.1145/2993148.2993193
[93] The Health Research Authority. 2005. Mental Capacity Act. Retrieved June 29, 2021 from https://www.hra.nhs.uk/planning-and-improving-research/policies-standards-legislation/mental-capacity-act/
[94] Federico Tiersen, Philippa Batey, Matthew J. C. Harrison, Lenny Naar, Alina-Irina Serban, Sarah J. C. Daniels, and Rafael A. Calvo. 2021. Smart Home Sensing and Monitoring in Households With Dementia: User-Centered Design Approach. JMIR Aging 4, 3: e27047. https://doi.org/10.2196/27047
[95] Esther Y. C. Tiong, David M. W. Powers, and Anthony J. Maeder. 2018. Dementia virtual assistant as trainer and therapist: identifying significant memories and interventions of dementia patients. In Proceedings of the Australasian Computer Science Week Multiconference (ACSW '18), 1–9. https://doi.org/10.1145/3167918.3167953
[96] Kelvin KF Tsoi, Max WY Lam, Christopher TK Chu, Michael PF Wong, and Helen ML Meng. 2018. Machine Learning on Drawing Behavior for Dementia Screening. In Proceedings of the 2018 International Conference on Digital Health (DH '18), 131–132. https://doi.org/10.1145/3194658.3194659
[97] Gregg Vanderheiden, Jonathan Lazar, J. Bern Jordan, Yao Ding, and Rachel E. Wood. 2020. Morphic: Auto-Personalization on a Global Scale. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–12. Retrieved January 27, 2022 from https://doi.org/10.1145/3313831.3376204
[98] Gregg Vanderheiden, Jutta Treviranus, José Angel Martinez Usero, Evangelos Bekiaris, Maria Gemou, and Amrish Chourasia. 2012. Auto-Personalization: Theory, Practice and Cross-Platform Implementation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 56: 926–930. https://doi.org/10.1177/1071181312561193
[99] W3C Working Group. 2021. Making Content Usable for People with Cognitive and Learning Disabilities. Retrieved March 4, 2022 from https://www.w3.org/TR/coga-usable/#dfn-cognitive-and-learning-disabilities
[100] Lin Wan, Claudia Müller, Dave Randall, and Volker Wulf. 2016. Design of A GPS Monitoring System for Dementia Care and its Challenges in Academia-Industry Project. ACM Transactions on Computer-Human Interaction 23, 5: 1–36. https://doi.org/10.1145/2963095
[101] Lin Wan, Claudia Müller, Volker Wulf, and David William Randall. 2014. Addressing the subtleties in dementia care: pre-study & evaluation of a GPS monitoring system. In Proceedings of the 32nd annual ACM conference on Human factors in computing systems - CHI '14, 3987–3996. https://doi.org/10.1145/2556288.2557307
[102] Jing Wei, Tilman Dingler, and Vassilis Kostakos. 2021. Developing the Proactive Speaker Prototype Based on Google Home. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 1–6. Retrieved March 4, 2022 from https://doi.org/10.1145/3411763.3451642
[103] Jing Wei, Tilman Dingler, and Vassilis Kostakos. 2022. Understanding User Perceptions of Proactive Smart Speakers. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 5, 4: 185:1–185:28. https://doi.org/10.1145/3494965
[104] Eleanor Bantry White and Paul Montgomery. 2015. Dementia, walking outdoors and getting lost: incidence, risk factors and consequences from dementia-related police missing-person reports. Aging & Mental Health 19, 3: 224–230. https://doi.org/10.1080/13607863.2014.924091
[105] Jacob O. Wobbrock, Shaun K. Kane, Krzysztof Z. Gajos, Susumu Harada, and Jon Froehlich. 2011. Ability-Based Design: Concept, Principles and Examples. ACM Transactions on Accessible Computing 3, 3: 9:1–9:27. https://doi.org/10.1145/1952383.1952384
A APPENDICES
Table 2: Table of Apps and Features from Participants Self-reported Mobile Phone Usage

Type of App or Feature | App or Feature | Participants who Reported Using that App or Feature | Number of Participants
Social Media Facebook Thomas, Elenora, Kim, Tristen, Sylvia, Kennith, Tina, Malcolm, Josslyn, Hall, 11
Alecia
YouTube Sylvia, Preston, Kennith, Hall 4
Instagram Elenora, Tina 2
SnapChat Alecia, Sabella 2
TikTok Sabella 1
LinkedIn Thomas 1
Twitter Thomas 1
Navigation Waze Tristen, Kennith, Malcolm 3
Google Maps Elenora, Hall 2
Apple Maps Thomas 1
Life360 Thomas 1
Uber Elenora 1
Lift Elenora 1
Unspecified Kim, Miranda 2
Online Shopping Amazon Kim, Josslyn, Alecia 3
Sams App Kim 1
Ebay Alecia 1
Mobile Banking Capital One App Elenora 1
Apple Wallet Thomas 1
Venmo Alecia 1
unspecified Kim, Tristen, Sylvia, Sabella 4
News BBC Malcolm 1
CNN Malcolm 1
Fox Malcolm 1
NPR Malcolm 1
unspecified Josslyn, Hall 2
Music Pandora Elenora, Hall 2
Shazam Alecia 1
unspecified Thomas, Tristen, Tina, Alecia, Sabella 5
Communication Email Thomas, Elenora, Tristen, Sylvia, Miranda, Tina, Malcolm, Josslyn, Sabella 9
phone calls Kim, Tristen, Preston, Kennith, Josslyn, Hall, Sabella 7
Messenger Kim, Kennith, Tina, Malcolm, Hall, Sabella 6
SMS Tristen, Sylvia, Preston, Josslyn, Alecia 5
FaceTime Miranda, Tina, Malcolm, Sabella 4
Zoom Sylvia, Alecia 2
Google Meet Miranda 1
Evite Kennith 1
Blogs Tina 1
WhatsApp Elenora 1
Games Words with Friends Kennith 1
Solitaire Hall 1
memory games Sabella 1
unspecified Kim, Alecia 2
Entertainment Photos Thomas, Elenora, Sylvia, Preston, Hall, Alecia 6
Audible Kim, Hall, Alecia 3
Bible app Kim, Hall 2
Deer Cast Hall 1
Apple TV Sabella 1
Planter - Garden Planner app Kim 1
ESPN Hall 1

Frameo Alecia 1
GoFan Hall 1
E-books Thomas 1
Productivity Reminders Thomas, Elenora, Tristen, Sylvia, Miranda, Malcolm, Josslyn, Hall 8
Google Search Preston, Miranda, Kennith, Tina, Malcolm, Josslyn, Sabella 7
Calendar Thomas, Kim, Preston, Kennith, Tina, Hall, Alecia 7
Google Calendar Elenora, Tristen, Sylvia, Miranda, Malcolm, Josslyn 6
Calculator Thomas, Elenora, Kim, Alecia 4
Weather Kim, Tina, Hall, Alecia 4
KeepNote Elenora, Sylvia, Alecia 3
Clock Elenora, Tristen, Alecia 3
Google Workspace Thomas, Sylvia 2
Timer Kim 1
Notes Kim 1
Outlook Alecia 1
Evernote Elenora 1
Flashlight Thomas 1
Health Management Fitbit Alecia 1
pharmacy apps Kim 1
unspecified patient portals Sabella 1
glucose monitoring app Elenora 1
fall detection apps Miranda 1
Freedom to Choose: Understanding Input Modality Preferences
of People with Upper-body Motor Impairments for Activities of
Daily Living
Franklin Mingzhe Li
Carnegie Mellon University
Pittsburgh, Pennsylvania, USA
mingzhe2@cs.cmu.edu

Michael Xieyang Liu
Carnegie Mellon University
Pittsburgh, Pennsylvania, USA
xieyangl@cs.cmu.edu

Yang Zhang
University of California, Los Angeles
Los Angeles, California, USA
yangzhang@ucla.edu

Patrick Carrington
Carnegie Mellon University
Pittsburgh, Pennsylvania, USA
pcarrington@cmu.edu

ABSTRACT

Many people with upper-body motor impairments encounter challenges while performing Activities of Daily Living (ADLs) and Instrumental Activities of Daily Living (IADLs), such as toileting, grooming, and managing finances, which have impacts on their Quality of Life (QOL). Although existing assistive technologies enable people with upper-body motor impairments to use different input modalities to interact with computing devices independently (e.g., using voice to interact with a computer), many people still require Personal Care Assistants (PCAs) to perform ADLs. Multimodal input has the potential to enable users to perform ADLs without human assistance. We conducted 12 semi-structured interviews with people who have upper-body motor impairments to capture their existing practices and challenges of performing ADLs, identify opportunities to expand the input possibilities for assistive devices, and understand user preferences for multimodal interaction during everyday tasks. Finally, we discuss implications for the design and use of multimodal input solutions to support user independence and collaborative experiences when performing daily living tasks.

CCS CONCEPTS

• Human-centered computing → Empirical studies in accessibility.

KEYWORDS

People with upper-body motor impairments, multimodal, assistive technology, activity of daily living

ACM Reference Format:
Franklin Mingzhe Li, Michael Xieyang Liu, Yang Zhang, and Patrick Carrington. 2022. Freedom to Choose: Understanding Input Modality Preferences of People with Upper-body Motor Impairments for Activities of Daily Living. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3517428.3544814

This work is licensed under a Creative Commons Attribution International 4.0 License.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544814

1 INTRODUCTION

Activities of Daily Living (ADLs) is a term used to collectively describe the fundamental tasks that are required for a person to care for themselves independently [18]. They are often broken up into two categories: 1) basic ADLs, such as eating, bathing, and grooming, and 2) Instrumental Activities of Daily Living (IADLs), such as managing finances, shopping, house cleaning, and managing communication. Throughout this paper, we will use “ADLs” to refer to tasks in both of these categories. Collectively, these tasks form a base from which society understands a person’s essential independence and begins to evaluate Quality of Life (QOL) [22]. Since many of these tasks involve interacting with objects and one’s environment using hands and arms, people with upper-body motor impairments can experience difficulties completing these activities [56]. Prior work has shown that people generally adopt one of two solutions to help with ADL tasks: digital assistance through assistive technologies [24, 74] or human assistance provided by Personal Care Assistants (PCAs) [50]. Although there has been prior work to develop assistive technologies that can empower people with disabilities to interact with different computing devices by leveraging alternate input modalities, many users still require human assistance to complete tasks effectively [50].

The reliance on human assistance is due, in part, to a lack of knowledge on how to effectively leverage various input modalities, which have become increasingly common among emerging computing devices. We have not fully explored the potential for multimodal input to support people with upper-body motor impairments in ADL tasks that do not completely rely on computing technology like mobile devices and laptops. Furthermore, we do not fully understand the collaborative role of technology and PCAs in completing ADL tasks.

To begin addressing these gaps, we conducted an interview study with 12 people with upper-body motor impairments. We first aimed to understand the current practices of people with upper-body motor impairments while performing ADLs, along with the challenges they encounter. We then explored the potential uses of different
input modalities during specific daily living tasks. Finally, we highlighted potential applications and implications for the design of multimodal assistive technologies to facilitate user input during ADL tasks. Our research questions are:

RQ1: What are the current practices and challenges of people with upper-body motor impairments during ADLs?
RQ2: How and why would various input modalities benefit people with upper-body motor impairments during ADLs?
RQ3: How can we leverage multimodal input to support people with upper-body motor impairments during ADLs?

In the sections that follow, we first describe upper-body motor impairments and ADLs (Section 2.1). We then summarize related literature on both common and emerging input modalities for people with upper-body motor impairments (Section 2.2) as well as prior research examining multimodal interactions for assistive applications (Section 2.3). Next, we describe our semi-structured interview study, the subsequent analysis, and our findings. Specifically, we describe practices and challenges of existing approaches toward ADLs (Section 4). We interrogate users’ preferences for applying individual inputs and combinations of inputs to support specific daily living tasks (Section 5). We then describe reasons for individual input modality and multimodal input alternatives (Section 6). Finally, we provide design recommendations for future input solutions to support people with upper-body motor impairments during ADLs (Section 7), including creating multimodal designs that consider collaborative experiences in ADLs, differentiating between interaction with computing devices and systems that support traditional ADLs (e.g., toileting), and consideration for actuation and human-robot interaction during multimodal interactions.

2 BACKGROUND AND RELATED WORK

In this section, we first provide background information regarding upper-body motor impairments and the importance of performing ADLs (Section 2.1). Next, we describe existing literature that explored different input modalities to assist people with upper-body motor impairments (Section 2.2). Finally, we show existing research on multimodal input and how it may benefit ways of interaction (Section 2.3).

2.1 Upper-body Motor Impairments and ADLs

The term upper-body motor impairment usually refers to motor impairments that affect the upper extremities, which are often caused by spinal cord injury, cerebral palsy, muscular dystrophy, etc. [69]. People with upper-body motor impairments may have different mobility conditions of their upper limbs, such as fine motor and gross motor impairments. When rehabilitation specialists assess the QOL of people with upper-body motor impairments, they usually evaluate the capability of performing basic and instrumental activities of daily living (i.e., ADLs and IADLs) [22]. Basic ADLs involve essential tasks such as grooming, cooking, toileting, dressing, and showering, and instrumental ADLs (IADLs) refer to activities that require more planning and thinking, such as using a phone and managing finances [33]. According to existing research, many people with upper-body motor impairments rely on PCAs for ADLs [50] or use inputs to control devices (e.g., voice) to overcome the limitations of upper-body motor impairments [10, 12, 13, 28]. In order to further support the independence of people with upper-body motor impairments in ADLs and improve their QOL, it is important to first understand how people with upper-body motor impairments perform different ADLs and the associated challenges.

2.2 Input Modalities for People with Upper-body Motor Impairments

Prior research also explored various input modalities for people with upper-body motor impairments to interact with different systems to compensate for their upper-body motor impairments [28, 36], such as controlling mobile devices [15, 19, 41], performing text-entry tasks [60, 71], interacting with TVs [64], using robotic arms [5], and playing games [21]. These approaches either rely on readily available devices (e.g., [19]) or customized technologies (e.g., [5, 64]). To interact with these devices, prior research explored various input modalities for people with upper-body motor impairments—touch input [11, 60, 65], hand or arm gestures [3, 4, 58], voice input [6, 24, 57], eye-based input [19, 20, 31, 74], head-movement input [15, 55], brain-computer input [17, 46], facial or mouth-based gestures [23, 45, 67], and biometrics [30, 32, 47]. Table 1 summarizes input modalities presented in prior work, from which we took much inspiration in generating our user studies.

Touch Input: Existing research has explored touch input methods and customization on computing devices to support people who have upper-body motor impairments. For example, Vatavu and Ungurean [65] conducted stroke-gesture analyses from a dataset of 9681 gestures collected from 70 participants with motor impairments to outline the research roadmap for accessible gesture input on touchscreens. Similar explorations on touch input have also been conducted using various devices, such as trackballs [71, 72], joysticks [60], smartphones [44, 48], tablets [65], smartwatches [41], head-mounted displays [42], and customized touchpads [11].

Voice-based Input: Voice-based input has been studied to help people with upper-body motor impairments in controlling different IoT devices (e.g., [53]), substituting inaccessible input techniques (e.g., [6]), or performing specific tasks (e.g., drawing [24], programming [57]). For example, Rosenblatt et al. [57] demonstrated that using vocal input could help people with upper-body motor impairments navigate and edit code in programming.

Eye-based Gesture: Furthermore, we found that eye-based gestures, such as using eye-gaze fixations [74] and eyelid gestures [19], could support people with upper-body motor impairments in interacting with digital interfaces. For example, Zhang et al. [74] demonstrated decoding eye gestures (e.g., looking up or looking down) into commands that can be used to enable people with upper-body motor impairments to control mobile devices.

Head-movement Input: Similar to eye-gaze directions, prior research also leveraged head movements and orientations to control the pointer on a device [15, 55]. For instance, Cicek et al. [15] proposed a calibration-free head-tracking input mechanism for mobile devices that allows people with upper-body motor impairments to achieve pixel-level pointing precision on small screens.

Face-based or Mouth-based Input: Besides using eye-based or head-based input, we found that existing research also explored face-based or mouth-based gestures as an input modality to help
Table 1: Input modalities for people with upper-body motor impairments.

Category | Description | Diagnosis of Motor Impairments | Reference
Touch Input | Touch control through joystick, trackball, smartphone, tablet, smartwatch, or customized touchpads. | Spinal cord injuries, spina bifida, orthostatic tremor, cerebral palsy, muscular dystrophy, multiple sclerosis, osteogenesis imperfecta, juvenile rheumatoid arthritis, radial nerve injury, spastic quadriplegia, hemorrhagic stroke | [11], [41], [42], [44], [48], [60], [65], [71], [72]
Voice-based Input | Use voice to control different IoT devices, substitute inaccessible input techniques, or perform specific tasks. | Spinal cord injury, cerebral palsy, stroke, spinal dysmorphism | [6], [24], [53], [57]
Eye-based Input | Use eye-gaze position or eyelid gestures to control devices. | Spinal cord injury, amyotrophic lateral sclerosis | [19], [74]
Head-movement Input | Use head movements and orientations to control a pointer on devices. | Non-specific upper body | [15], [55]
Face-based or Mouth-based Input | Use facial expressions or mouth-based input (e.g., teeth tapping, sip-and-puff) to interact with devices. | Non-specific upper body | [23], [45], [67]
Hand or Arm Gesture | Use personalized hand or arm gestures to interact with computing systems depending on the upper-body mobility. | Cerebral palsy, hydrocephalus, quadriplegia, spastic quadriplegia, static encephalitis, pseudobulbar palsy | [3], [4]
Brain-Computer Interface | Use Electroencephalogram (EEG) or Electromyography (EMG) to allow people with upper-body motor impairments to interact with devices or to understand their needs. | Muscular dystrophy, stroke | [16], [17], [46]
Biometric Input | Use biometric information (e.g., fingerprint, voice) for security and privacy purposes. | Cerebral palsy | [30], [47]
Automatic Recognition or Other Input | Use automatic recognition to respond to the user without explicit input (e.g., automatic door opener), or other input modalities. | Non-specific upper body | [8]

people with motor impairments to interact with their devices [23, 45, 61, 67]. For example, Wang et al. [67] introduced the use of facial expressions as controls in games, such as Super Mario Bros., and Grewal et al. [23] showed the approach of using sip-and-puff systems to issue commands to a power wheelchair.

Hand or Arm Gesture: From existing literature, we learned that people with upper-body motor impairments might have various levels and conditions of controlling their upper extremity to interact with their devices (e.g., fine motor, gross motor) [48]. Thus, prior research explored various recognition approaches that allow detection of personalized hand or arm gestures for interaction with computing systems based on the abilities of people with upper-body motor impairments [3, 4].

Brain-Computer Interface: In addition, existing research also explored EEG (Electroencephalography) or EMG (Electromyography) approaches that allow people with upper-body motor impairments to interact with technologies through brain-computer interfaces (BCI) or to understand their needs [16, 17, 46, 49]. For example, Neuper et al. [49] introduced the approach of an EEG-based brain-computer interface to help people with severe motor impairments accomplish tasks like selecting fine-grain letters, where people need to precisely specify the start and end boundary of the selection.
Biometric Input: As people with upper-body motor impairments are becoming more conscious of privacy and security through daily activities [32], researchers further proposed approaches to leverage biometric information for people with upper-body motor impairments [30, 34, 47], such as fingerprints. For instance, Lewis and Venkatasubramanian [34] mentioned the existing approaches of using fingerprint and face recognition to assist people with upper-body motor impairments in completing the authentication process to control their devices.

Overall, we introduced existing research that explored various input modalities to help people with upper-body motor impairments interact with their computing devices for different accomplishments and under certain environments (e.g., [28]). Existing work mostly focused on input modalities for computing devices (e.g., computers), which does not fully address the problem of high reliance on PCAs for ADLs (e.g., toileting, dressing). We chose to focus on practices and challenges of leveraging different input modalities in various ADLs. We are also interested in exploring which input modalities are more preferred from the perspectives of people with upper-body motor impairments, such as whether people prefer using voice control or head gestures to tweak the water temperature.

2.3 Multimodal Input Methods for People with Upper-body Motor Impairments

Existing research has studied the overall benefits of multimodal input in human activities, such as achieving low false-positive rates [63] and the accommodation of tasks and context changes [54]. More importantly, Reeves et al. [54] also implied the potential benefits of multimodal input to adapt to individual differences in mobility conditions (e.g., sensory or motor impairments). Similar to what Reeves et al. [54] projected, prior research leveraged multiple input modalities to support people with upper-body motor impairments to interact with their computing devices (e.g., [7, 16, 26, 39, 40, 62, 67, 68]). For example, Wang et al. [67] combined eye input gestures (e.g., eye movements to the left and right, double blink) with facial expressions (e.g., smile, open and close mouth) to enable people with upper-body motor impairments to provide input for VR games. As another example, Dupres et al. [16] combined hand input with brain-computer interfaces for both better control of applications in daily life (e.g., web browser, video game) and enabling researchers to understand behaviors of people with upper-body motor impairments. Finally, Tomari et al. [62] proposed leveraging multimodal input by combining a momentary switch and head recognition to control the direction and orientation of smart wheelchairs.

Although several works leveraged the benefits of combining multiple input modalities to better assist people with upper-body motor impairments [70], there exist gaps between people with upper-body motor impairments in ADLs and how multimodal modalities may benefit the overall input experiences. We are interested in comprehensively investigating multimodal input as opposed to only a few niche interactions, and aim to investigate the applications of multimodal input in ADLs. As a result, our research provides guidelines to HCI and Accessibility researchers on designing multimodal input for people with upper-body motor impairments to help with their ADLs.

3 METHOD

In this work, we conducted semi-structured interviews with people who have upper-body motor impairments to learn about their existing practices and challenges when performing ADLs, and their preferences for various input modalities. In addition, we investigated opportunities for multimodal input to help people with upper-body motor impairments in ADLs.

3.1 Participants

We recruited 12 participants with upper-body motor impairments to participate in our study (Table 2). Participants were recruited through online platforms (e.g., Reddit, Twitter, Facebook) and snowball sampling. To participate in our study, participants had to be 18 years or older, have upper-body motor impairments, have experience with assistive technologies, and be able to communicate in English. Among the 12 participants we recruited, three were female and nine were male (Table 2). They had an average age of 31.6 (SD = 8.0). Four participants stated that they had spinal cord injuries, four had cerebral palsy, one had stroke, one had primary lateral sclerosis, one had arthrogryposis multiplex congenita, and one had muscular dystrophy. The study took around 75 to 90 minutes per participant. Participants were compensated with a $20 Amazon gift card. The recruitment and study procedure was approved by the Institutional Review Board (IRB).

3.2 Study Procedure

3.2.1 Demographic Background. In our semi-structured interviews, we first asked about the demographic background of our participants (e.g., age, gender, descriptions of upper-body motor impairments).

3.2.2 Current Practices and Challenges of ADLs. We then asked our participants about their current practices of performing certain ADLs [51] (Figure 1) and the associated challenges across different ADLs (e.g., relying on PCAs, additional effort).

3.2.3 Input Modality Preferences. Afterwards, we introduced participants to existing input modalities for people with upper-body motor impairments from the literature (e.g., head-movement input [15, 55], eye-based input [19, 74], brain-computer interfaces [16, 17, 46, 49]) (Table 1). To ensure participants understood the various input modalities and to reduce bias, we created introduction slides for each input modality by including figures from existing research (e.g., [74]) and commercially available products (e.g., [2]). After introducing each input modality, we confirmed with participants to make sure they understood the different input modalities and associated applications. If they still had difficulties understanding the different input modalities, we provided video demonstrations to participants for better understanding. We then asked participants to describe how the different input modalities may benefit their experiences (Table 1) completing each ADL (Figure 1), and the associated reasons.

3.2.4 Multimodal Input Preferences. Finally, we asked participants for their opinions and preferences on combining different input modalities for each ADL (Figure 1) and the associated reasons.
Table 2: Participants’ demographic information.

Participant | Gender | Age | Upper-body Motor Impairments | Details
P1 | Female | 41 | Arthrogryposis multiplex congenita | Range of motion and strength of arm and leg are limited. Cannot lift arm or stretch.
P2 | Female | 23 | Cerebral palsy | Gross motor and fine motor difficulties.
P3 | Male | 35 | Spinal cord injury (C5) | Paralyzed from shoulder down, no finger flexion.
P4 | Female | 38 | Spinal cord injury (C5) | Some wrist function and no hand dexterity.
P5 | Male | 47 | Muscular dystrophy | Strength is limited and bicep is extremely weak, cannot lift the arm without gravity.
P6 | Male | 27 | Spinal cord injury (C4/C5) | Wrist extension on one side, no finger mobility, no tricep control, no fine motor.
P7 | Male | 24 | Cerebral palsy | Paralyzed left arm.
P8 | Male | 33 | Spinal cord injury (C5) | Has use of bicep, no triceps, sensation down to the midway point of the bicep, no sensation further down. No fine motor function on either hand.
P9 | Male | 22 | Stroke | Right arm cannot go past 45 degrees.
P10 | Male | 34 | Cerebral palsy | Difficulty moving wrist and hand, cannot flex arm.
P11 | Male | 20 | Cerebral palsy | Floppy limbs due to cerebral palsy. Cannot use the left arm at all. Bicep and tricep functionality are limited.
P12 | Male | 35 | Primary lateral sclerosis | Has difficulty holding objects and moving around.

Figure 1: Activities of Daily Living [51]. (Bathing, Dressing, Grooming, Oral Care, Toileting, Transferring, Moving Around, Eating, Shopping, Cooking, Managing Medications, Using the Phone, Housework, Laundry, Driving, Managing Finances, Leisure and Other Activities)

3.3 Data Analysis

The semi-structured interviews were conducted through Zoom [27], and all interviews were audio-recorded and transcribed. After the interviews, two researchers independently performed open coding [14] on the transcripts. Then the coders met to discuss their codes and resolve any conflicts (e.g., missing codes, disagreement on codes). After the two researchers reached a consensus and consolidated the list of codes, they performed affinity diagramming [25] using a Miro board [29] to cluster the codes and identify emergent themes.

4 FINDINGS: CURRENT PRACTICES AND CHALLENGES OF ADLS BY PEOPLE WITH UPPER-BODY MOTOR IMPAIRMENTS

In this section, we first show existing practices of ADLs by our participants (Section 4.1). We then present the associated challenges in ADLs from people with upper-body motor impairments (Section 4.2).

4.1 Practice of ADLs by People with Upper-body Motor Impairments

In terms of current practices of ADLs, we learned that participants rely heavily on a Personal Care Assistant (PCA) or their family members to help with different ADLs (e.g., dressing, bathing, toileting, driving) (Table 3). Based on the responses to each activity of daily living and statistical analysis, we found that about 67.2% (SD = 29.7%) of participants require PCAs for each ADL on average (Table 3). P3 commented on this situation:

“...Many of the activities still require my personal assistant for help...tasks like doing my laundry, toileting, and shopping...I have a very limited range of motion, and current technologies are not there yet to support me living independently...”

We also found that there is a high disparity in the reliance on PCAs between ADLs that involve computing devices (e.g., managing finances) and basic ADLs (e.g., toileting, dressing). Only 10% of our participants require other people for assistance
with using the phone and leisure activities (Table 3). However, for activities like toileting, dressing, and cooking, over 90% of our participants still require other people for assistance (Table 3). P5 explained this:

“...Traditional daily activities or essential activities are usually restricted based on my motion capability and also the inaccessible environment. One example is that my washer and dryer are in the basement, which forced me to use PCAs for assistance...”

Table 3: The number of participants that leveraged technologies or PCAs for each ADL and associated sample tasks.

Activities of Daily Living | Number of Participants Who Required PCAs | Sample Tasks through Personal Assistance | Number of Participants Who Used Technology | Sample Tasks through Technology
Bathing | 10 | Transfer between wheelchair and shower bench, control water temperature, dry after a shower, turn on/off the water | 2 | Turn on/off the water, control water temperature
Dressing | 11 | Help with zipping, help with small buttons, feel the clothes to make sure they are flat or smooth | 1 | Use tablet to select clothes
Grooming | 8 | Comb the hair, trim nails, shaving | 6 | Customized electric hair remover to reduce the required range of motion, automatic hand-wave dispenser
Oral Care | 9 | Turn on/off electric toothbrush, cleaning | 11 | Electric toothbrush, electric flosser
Toileting | 12 | Transfer, flushing, cleaning | 3 | Automatic height-adjusting hydraulics, voice-controlled flushing systems
Transferring | 9 | Transfer from wheelchair to bed | 8 | Seat elevation, joystick and remote control to change into a comfortable position
Moving Around | 3 | Open/close doors | 9 | Different input modalities to control the wheelchair's moving direction and speed
Eating | 11 | Choose the food on the plate, lift utensils, cut the food | 3 | Omnibot to scoop and feed the person, spoon with an accelerometer that could level itself
Shopping | 9 | Overhead reach of products, put food in bags, carry food | 7 | Online shopping through accessible smart devices
Cooking | 11 | Slicing, getting hot plates off the stove, everyday cooking | 4 | Voice-controlled microwave, electric cooker, automatic peeler and cutter
Managing Medication | 8 | Pill refill, picking up drugs | 6 | Pill organizer, reminders
Use the Phone | 1 | Plug in the charger | 10 | Smaller phone for touch with less range of motion, use voice to reduce the need for motion
Housework | 10 | Mopping the floor, dish washing, vacuuming | 7 | Robot sweeper, Roomba, voice-based light control, TV control
Laundry | 11 | Folding clothes, general laundry | 6 | Customized dials and buttons to control the washer, voice-assisted camera systems to place commands and instructions
Driving | 9 | Change settings that are hard to reach, general driving | 5 | Hand control of A/C, touchscreen for radio
Managing Finances | 4 | Accessing mailbox, write the information on a check, count cash | 11 | Keep records on computers, camera to deposit checks, online banking
Leisure and other activities | 1 | Horse riding | 8 | Play video games, walk dog, control TV
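As a quick arithmetic check of the aggregate figure reported above, the per-ADL reliance percentages can be recomputed from the PCA counts in Table 3. The following is a minimal Python sketch, not the authors' analysis script; the counts are transcribed from Table 3 and the 12-participant denominator is taken from the study design:

```python
from statistics import mean, stdev

# Participants (out of 12) who reported requiring a PCA for each of the
# 17 ADLs, transcribed from Table 3 in row order: Bathing, Dressing,
# Grooming, Oral Care, Toileting, Transferring, Moving Around, Eating,
# Shopping, Cooking, Managing Medication, Use the Phone, Housework,
# Laundry, Driving, Managing Finances, Leisure and other activities.
pca_counts = [10, 11, 8, 9, 12, 9, 3, 11, 9, 11, 8, 1, 10, 11, 9, 4, 1]
N = 12  # total participants

# Convert each count to the percentage of participants relying on a PCA.
percentages = [100 * c / N for c in pca_counts]

print(f"mean = {mean(percentages):.1f}%, SD = {stdev(percentages):.1f}%")
# Reproduces the reported 67.2% (SD = 29.7%) when SD is the sample
# standard deviation across the 17 ADLs.
```

This also makes the disparity noted in the text easy to read off: Use the Phone and Leisure correspond to 1/12 ≈ 8–10% reliance, while Toileting (12/12), Dressing, Cooking, and Laundry (11/12) exceed 90%.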
Understanding Input Modality Preferences of People with Upper-body Motor Impairments ASSETS ’22, October 23–26, 2022, Athens, Greece

Among all technologies that are being used to help people with upper-body motor impairments in ADLs, we found that the majority of existing input systems for traditional ADLs require additional effort for installation, adjustment, and modification to support individuals with different upper-extremity capabilities. For example, P12 explained how his multimodal shower control system was installed based on the mobility of his upper extremity:

“...My cousin helped me to use my voice to control the temperature and added a big button on the left side about my shower seat height to allow me to use my less affected left hand to turn on or off the water...He had to seal all wires, the button, and the microphone...He also needs to program and map my preferred way of controlling the water...”

Furthermore, we realized that some of our participants leverage both technology and PCAs to collaboratively accomplish certain ADLs. For example, P11 explained how he could use the electric toothbrush by himself, but needed someone to put toothpaste on the brush and press the on/off button:

“...I have limited motion control of my arm, which made me hard to press the on/off button on the electric toothbrush, and it is a tiresome process to put toothpaste on the brush as well. But I could use the limited motion control to brush my teeth after the power is on. So I usually have my personal assistant add toothpaste for me and turn on the toothbrush, so she can focus on other things while I am brushing my teeth...”

4.2 Challenges with ADLs by People with Upper-body Motor Impairments

According to what we showed above, there exist unique practices for people with upper-body motor impairments across different ADLs. Thus, we further show the challenges associated with the different existing approaches toward ADLs. From the interview, we learned that participants in our study rely heavily on PCAs or their family members to help with different ADLs (e.g., dressing, bathing, toileting, driving). Despite the usefulness and necessity of having PCAs for ADLs, they also mentioned limitations and concerns of having other people assist with ADLs. For example, P2 mentioned that always relying on a PCA for help with ADLs may affect her choice of time to do certain ADLs. P2 further elaborated on this:

“...My PCA helped me with basically everything in my daily activities. I appreciate everything. However, this forced me to only be able to do certain activities while my PCA was around. For example, toileting, bathing, and dressing. It would be impossible if I wanted to go out and visit my friend at a certain time without having my PCA put my clothes on...”

Furthermore, P6 complained about the financial burden of hiring personal assistants:

“...I am already low income, and hiring personal assistants is a huge part of my monthly cost. If I can do some tasks myself with my voice or eye, it would reduce the number of times that I need personal assistants per month...”

We also learned that our participants have concerns about privacy while relying on other people for assistance in certain ADLs, such as dressing, toileting, and bathing. P2 further commented:

“...Although having a personal assistant is the only way to help me with some activities like toileting and bathing, sometimes I feel embarrassed when they see me using the bathroom or having a shower. Especially when a new PCA comes...”

Moreover, we found that there also exist communication barriers between PCAs and people with upper-body motor impairments, such as language barriers (P2) and difficulties in verbally describing detailed instructions if the person with upper-body motor impairments cannot be physically in front of certain appliances (P2, P4). P2 further commented on this:

“...If a PCA doesn’t know English well, it’s hard for me to communicate with the PCA regarding my needs, especially for doing the laundry where my washer is in the basement that is not accessible...like putting items that shrink separately, and putting bras in the bra bags...”

Besides the concerns and challenges while working with PCAs, five of our participants also expressed concerns about the inability to contribute to housework when their family members are busy. P8 commented on his situation:

“...I live with my family. My sister and parents usually help with various daily activities, such as doing laundry and bathing. I know they are very busy with other stuff too, and I really want to help them sometimes. For example, I would like to cook them some meal before they come back home, but existing technology does not support me to do so...”

In terms of the current challenges of existing technologies, all of the participants noted the limited input options available to them and expressed willingness to try new input modalities. Specifically, four participants mentioned that existing input systems mostly rely on a fixed single input modality for a particular task. They mentioned concerns about convenience, effort, and reachability for a fixed input modality (P3, P6, P7, P9). For example, P7 explained the limitation of using hand waving to control the A/C of his car when he does not sit in the driver’s seat:

“...I had my car customized by allowing me to use hand-waving gestures to control my A/C while driving. However, this only works when I am in the driver’s seat. It becomes unreachable for me while I am sitting in the passenger’s seat...”

Beyond the fixed input modality, three participants mentioned concerns about the fixed mapping between certain inputs and specific tasks after installation, which is almost impossible for them to modify by themselves. P5 commented on her customized dial pads to control the washer:

“...I had a technician install this dial pad to control my washer a couple of years ago. At that time, I was able to reach the top level of buttons to control the water
ASSETS ’22, October 23–26, 2022, Athens, Greece Li et al.

level and the heat, but because of the muscle decline, reaching the top row became hard for me last year, and I did not know how to reprogram the pads...”

Compared with having personal assistance for ADLs, the use of technology could potentially address participants’ concerns regarding privacy and the time and effort required from PCAs, and could help maintain self-confidence in social interactions. Overall, we showed the current limitations of existing technologies in ADLs and found that all of our participants showed strong needs and preferences for being able to use various input modalities in ADLs towards independence. Therefore, it is important to uncover their preferences among different input modalities in ADLs and how these input modalities may help people with upper-body motor impairments accomplish ADLs while maintaining independence.

5 FINDINGS: APPLICATIONS OF INDIVIDUAL INPUT MODALITY AND MULTIMODAL INPUT IN ADLS

In this section, we first present preferred applications of individual input modalities in ADLs for people with upper-body motor impairments (Section 5.1). We then show participants’ preferences on applications of different combinations of input modalities (Section 5.2).

5.1 Applications of Individual Input Modality

In our interview, we asked participants whether they would like to change their existing ways of doing ADLs and what input modalities (Table 1) they would prefer to use for specific ADLs (Fig. 1). Almost all (11/12) participants mentioned that they would like to change their existing method for ADLs (Table 3). Among the participants who had already been using technologies in ADLs, nearly all of them would like to change the technology involved by having new input modalities. Only one participant did not want to change the technology involved for toileting, transferring, housework, or leisure and other activities. For participants who do not use technologies in certain ADLs, we found that all of them prefer having new input modalities that enable them to accomplish these ADLs independently or collaboratively, especially for the ADLs for which they may fully rely on a PCA or family members, such as dressing, bathing, toileting, and eating.

As we examined each ADL, it became clear that certain input types are favored over others. Across all tasks, we found that touch (e.g., joysticks, touchscreens) and voice inputs are highly desirable compared with others (Table 4). On average, seven participants prefer using touch or voice input as an input modality to substitute their existing ways of handling ADLs. Specifically, we found that our participants mostly prefer using touch input for bathing (10), toileting (8), and cooking (8). Furthermore, our participants would also like to use voice input for cooking (10) and driving (10) (Table 4). For example, P8 mentioned that he would prefer using touch input rather than the existing shower knob:

“...I usually have my dad or sister help with the bathing process because I cannot rotate the shower knob due to the lack of control with my fingers and hands. However, I can ‘tap’ with my palm, I can definitely use the touch interface to turn on/off the water or control the water temperature...”

Beyond touch or voice input, we learned that hand-only gestures (e.g., waving hands) are the third preferred input method among our participants, with about three participants preferring hand-only gestures as an efficient input method for each ADL. Specifically, we found five participants preferred hand-only gestures for cooking, and four participants chose hand-only gestures for grooming, toileting, and housework (Table 4). P3 commented on his preference for waving his hand to flush the toilet or control cooking appliances:

“...I have spinal cord injuries, and it is hard for me to push or press buttons with my fingers. I think waving hands is a very compelling way for me to control or interact with my devices. Especially for toileting, I always have trouble pressing the button to flush the toilet after use, it would be better to wave my hand after use to flush it...For cooking, I also have trouble using the knob to control the time and temperature. Thus, using waving to control a scrolling panel would be a great option for me...”

Eye-based input (e.g., eye gaze, eyelid gestures), head-movement input, and brain-computer input all have the same number of selections (2) by participants who prefer to use these methods across all ADLs. By further analyzing the data quantitatively, we found that participants highly prefer eye-based input (6) and head-movement input (5) for shopping (Table 4). P10 explained his existing way of shopping and why he prefers eye-based input:

“I do not often go shopping physically, I usually just use online platforms such as Amazon, for most of the things. The main barrier is that I cannot buy things at a store by myself. If it is possible, I want to use eye-based input to dwell at a product, and a robotic arm can grab that product for me...”

Finally, we learned that our participants prefer to use facial gestures, biometric input, and automatic recognition or other input only for limited tasks, which end up being the least selected input modalities across all ADLs (1). For instance, we found participants mostly prefer using biometric input for managing finances, because biometric information could be used for authentication. In terms of automatic recognition or other input, seven participants prefer it when in motion. P6 commented on the difficulty of moving physical obstacles (e.g., a door):

“Opening my door is the hardest thing ever, I am always in my wheelchair, and I do not have much control of my upper body, which is not sufficient to open my front door, which forced me to ask my family members to open the door for me all the time. This is why I want an automatic door that can open when I approach...”

5.2 Applications of Multiple Modality Inputs

We further identify participants’ preferences for applications of different combinations of input modalities. The combination means

Table 4: Input modality preferences for specific ADLs among all 12 participants.

Touch Input | Hand-only Gestures | Voice Input | Eye-based Input | Head-movement Input | Brain-computer Input | Facial Gestures | Biometrics | Others
Bathing 10 3 9 2 3 3 5 0 0
Dressing 4 3 8 2 2 0 2 0 1
Grooming 7 4 6 3 3 1 3 1 2
Oral Care 5 2 2 3 3 2 4 2 1
Toileting 8 4 6 0 3 2 3 0 2
Transferring 5 2 5 1 2 1 1 0 1
Moving Around 6 2 9 1 3 2 0 6 7
Eating 7 2 6 4 2 2 0 0 1
Shopping 8 3 7 6 5 3 0 0 0
Cooking 8 5 10 2 1 1 0 0 1
Managing Medications 7 1 7 2 0 3 0 1 2
Uses the Phone 7 3 8 5 3 2 1 4 0
Housework 6 4 9 3 1 2 0 0 0
Laundry 8 2 6 0 0 2 0 1 1
Driving 8 2 10 2 3 1 1 2 0
Managing Finances 6 0 5 2 1 0 0 8 1
Leisure and Other Activities 8 2 6 3 3 4 0 0 0
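The per-ADL counts in Table 4 can be queried programmatically. A minimal sketch in Python: the `> 5` threshold mirrors the "popular combination (selection > 5)" rule used in Section 5.2, though as a simplification it thresholds the individual-modality counts above rather than the paper's separate combination-selection counts, which are not reproduced here (only four example rows are included; all names are ours):

```python
# Query Table 4: for each ADL, list the modalities whose individual
# selection count exceeds a threshold -- a simplified proxy for the
# paper's "popular combination (selection > 5)" rule in Section 5.2.
MODALITIES = ["touch", "hand_gesture", "voice", "eye", "head",
              "brain", "facial", "biometric", "other"]

# Rows copied from Table 4, in the same column order as MODALITIES.
TABLE4 = {
    "bathing":           [10, 3, 9, 2, 3, 3, 5, 0, 0],
    "moving_around":     [6, 2, 9, 1, 3, 2, 0, 6, 7],
    "shopping":          [8, 3, 7, 6, 5, 3, 0, 0, 0],
    "managing_finances": [6, 0, 5, 2, 1, 0, 0, 8, 1],
}

def popular(adl, threshold=5):
    """Modalities selected by more than `threshold` participants for an ADL."""
    return [m for m, c in zip(MODALITIES, TABLE4[adl]) if c > threshold]
```

Under this reading, `popular("moving_around")` yields touch, voice, biometric, and other input, matching the four-modality combination reported for moving around in Section 5.2.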

participants either prefer using several input methods for redundancy purposes or use these input methods to accomplish different sections of a specific task. For example, people may prefer using both touch input and voice-based input for switching the water temperature during showering. On the other side, people may use eye-based input to target an object and voice-based input to open or close the object.

In our paper, if a certain combination of input modalities got selected more than five times by our participants, we define it as a popular combination. Based on participants’ responses, we found that there is a popular combination (selection > 5 among all participants) between touch and voice input for the majority of ADLs, including bathing, grooming, toileting, eating, cooking, managing medications, using the phone, housework, laundry, driving, and leisure and other activities (Table 4). We found that moving around has the largest number of input modalities in popular combinations (selection > 5 among all participants), which include touch input, voice-based input, biometric input, and automatic recognition or other input. From what we uncovered in the interviewees’ responses, moving around usually involves more complex environments; having multiple input modalities would allow people to more easily accommodate the complexity of such interactions. We also realized people prefer combining touch input, voice-based input, and eye-based input (selection > 5 among all participants) for shopping specifically (Table 4). Furthermore, we found that managing finances has a popular combination (selection > 5 among all participants) between touch input and biometric input.

6 FINDINGS: CHOOSING INPUT ALTERNATIVES FOR PERFORMING ADLS

We showed preferred applications of both individual input modalities and multimodal input in the previous section. In this section, we first show the reasons for choosing individual input modality alternatives in ADLs (Section 6.1). We then present the reasons for choosing multimodal input alternatives (Section 6.2).

6.1 Reasons for Individual Input Modality Alternatives

In the interview, we asked our participants why they would prefer one input modality over another based on their responses to the input preferences that we showed in the prior section. We uncovered five main factors (i.e., usability, efficiency, consequences, personalization, and context) that may affect how people with upper-body motor impairments choose a specific input modality, and we present each of them in detail.

Usability: We learned that participants highly value reliability and confidence while choosing a specific input modality (P6, P8, P10, P11). For example, P10 mentioned that he prefers touch input over voice-based input for some ADLs because he knows what the outcome of the touch interaction will be, which makes him more confident that it provides reliable output. P11 further explained the involvement of biometric information in driving to ensure reliability and reduce false positives:

“...I can only use my right side of the body to drive. This makes driving a hard task for me because there are so many different functions, and I do not have control of my left hand or foot, which causes the huge concern of accidentally touching and safety issues. That is why I prefer to use fingerprint as a verification approach to make sure I do not accidentally touch somewhere and prevent false activation...”

Another important factor for choosing an input modality is input precision, which indicates how precisely the user can place a command with a certain input modality (P4, P8, P9). For instance, P9 mentioned that he prefers using touch input for some activities because it can make precise commands. P4 further commented on why she likes using voice or a joystick compared with hand-only gestures for certain ADLs:

“...For tasks that need you to set up a specific time or temperature, using my voice or joystick would be easier. You can simply say, ‘turn the water temperature
to 39 degrees,’ but it would be very difficult if you use hand gestures to indicate such information...”

Furthermore, we found that learnability and intuitiveness affect how people with upper-body motor impairments choose to use a certain input modality for ADLs (P4, P9, P10). This means that they prefer an input modality with a lower learning curve (P10). P9 further compared voice-based input with hand gestures in terms of new skill training effort:

“...Voice is more straightforward to me. If I need to place commands for my kitchen appliances, I have to learn different hand gestures that map to various commands. However, using voice would not require me to spend time to learn and remember certain commands...”

Similar to learnability, we also uncovered that our participants highly value adaptability and compatibility, meaning that a single input modality on a specific system should be able to interact with different devices for various purposes (P8, P9, P10, P12). P10 commented on how he would like to use the same touchscreen system both to control the kitchen appliances at home and to help with navigation in grocery stores:

“...I hate switching input devices for different purposes due to the difficulty of switching the hardware and adapting new methods based on mobility conditions. Therefore, I would like to control or place commands from a single unit with a single method, such as having my touchscreen connect to my instant pot while cooking and also accessibility maps of the grocery store for navigation. This compatibility will reduce the effort of changing devices for different purposes...”

Efficiency: From the interview, we found that our participants mentioned the necessity of ensuring efficiency for a specific input modality. First of all, response time is a key factor for certain input modalities to be used for some specific ADLs (e.g., temperature change in showering) (P4, P6, P7, P8, P11). P6 further elaborated on the importance of quickly changing the temperature while taking a shower for people with spinal cord injury:

“People with spinal cord injuries care about skin temperature a lot because of the lack of sensation. I prefer hand gestures over voice while changing the water temperature, because using voice can take a long time, including activation, placing specific commands, and waiting for the responses from the system. I might get burned with hot water by that time.”

Moreover, some people with upper-body motor impairments (e.g., spinal cord injury, cerebral palsy) have a hard time making movements with both their upper limbs and their lower limbs. Therefore, they mentioned the importance of less physical effort (P3, P4, P5, P6, P9). P3 explained the necessity of requiring less movement for certain ADLs:

“...I have a C5 level of spinal cord injury and have to stay in my wheelchair all the time. Currently, most of the devices now require me to physically move close to them and touch them to place commands. This situation is frustrated because some spaces are too narrow for me to move around and take turns. My bathroom is a good example, that is why I would prefer to use eye-based input or brain-computer input as an input compared with touch input for these interactions that requires me to move around...”

Besides requiring less physical effort, we learned that our participants also mentioned a preference for simpler levels of interaction, according to their mobility conditions when performing the input (P1, P4, P8). P8 commented on this:

“...Compared with different hand gestures, I like simple input methods, just like eye blinking or voice. I might not be able to perform certain gestures, and it will end up with poor performance...”

Consequences: Our participants mentioned concerns about various consequences of using certain input modalities, which might change their minds about continuing with that specific input method. First, both P6 and P9 commented on the importance of mess prevention while using certain input modalities (e.g., touch) in ADLs (e.g., bathing, cooking). P6 explained his concerns about using touch interfaces while having a shower, preferring hand gestures or voice instead:

“...I like using touch screens because they can provide me with accurate output. However, it can also be problematic when the touch interfaces already have water on them, or my hand has shampoos. For this situation, I will just use hand gestures or voice input in a better way without creating a mess in the bathroom...”

Next, we also learned that safety is an important factor for people with upper-body motor impairments when choosing input modalities for ADLs (P1, P3). For example, P3 mentioned the importance of having hands-free interactions because he only has control of the arm and no sensation in the hand or fingers. P3 further commented on this:

“I cannot tell if my finger gets burned or not due to the loss of sensation from spinal cord injury, so I would prefer not using touch input while cooking or boiling water.”

Beyond safety concerns, our participants also mentioned the importance of confidentiality and security for input modalities (P1, P6, P8, P11). P8 mentioned his concerns about confidentiality while checking out as a person with upper-body motor impairments and why he prefers biometric input for payment:

“...I often have a hard time pulling out my credit card and paying at the cashier. Not just that, due to the difficulty of moving my upper body, it can take a long time to finish the payment process, and other people behind me may see my personal belongings in my wallet, which can be dangerous to me. That is why I think biometric input could reduce the consequences of breaching my personal identities...”

Personalization: Participants in our study had different motor impairments affecting their bodies in different ways, including their upper and lower limbs. To account for these differences, personalized input modalities are preferred when selecting a specific input
method. For example, P7 and P9 mentioned the importance of taking physical ability and mobility into consideration. P7 further explained:

“...The limited mobility of my hand affects my choices of input methods, which means I would prefer the input method that can take my personal situations into consideration. This is the reason why I prefer eye movement over touch...”

Furthermore, we learned that familiarity is another factor in choosing personalized input modalities (P1, P5, P9). We found that our participants preferred certain input modalities because they had already used them for other ADLs. P1 further elaborated on her existing experience with the joystick, which made her prefer touch input:

“...Touch input would be good for me to do other tasks if the technology is available. Because I am already familiar with different ways of using joystick, touchscreen, and trackball. I have been driving with joystick for many years...”

Similar to what we mentioned previously (Section 4.2), our participants prefer using input modalities that can enable independence for specific ADLs based on individual circumstances (P2, P4). P2 further commented:

“...I prefer input methods that can ensure independence when I am taking a shower, eating, toileting or dressing. More specifically, I do not have to wait for my PCA all the time, I can then eat whenever I want...”

Context: We identified that the context of specific environments also affects how people perceive different input modalities. For instance, six participants mentioned that environmental influences affect how they choose a specific input method. The major consideration was using voice as input while the surrounding environment is noisy, which may affect the accuracy. P3 commented on his preferences during gaming:

“...My body limits me from using touch input for aiming in playing FPS games...I do not like voice input when playing games because either the game sound is noisy and the recognition system cannot detect my voice correctly, or I have to use my voice to communicate with my teammates from the game. Therefore, I would go a lot with eye-based input and simple touch commands as a confirmation for shooting...”

In addition, P6 further mentioned the importance of reachability in a space when using certain input modalities, which includes considerations of both the space and scalability:

“...Some input methods require certain space for interactions, such as eye tracking, head tracking, and brain-computer interfaces. I do not have a huge space at home. This brings the concerns of whether I should use a certain input for daily tasks...”

Finally, participants also expressed concerns regarding the social acceptability of using certain input modalities during interactions with or around other people (P5, P6, P8, P9). Three of them explicitly mentioned that using voice input in a store or during shopping can be embarrassing and attract unwanted attention. P6 further explained this:

“...It is less socially acceptable to use voice input at a store, and I feel it is weird to have someone see me yelling at my wheelchair or a specific product in a store. That is why I would prefer to use eye-based or touch interfaces that have gentle reactions from other people...”

6.2 Reasons for Multimodal Input Alternatives

We asked participants about their preferences for combining various input modalities. We illustrate reasons for two main uses of multiple modality input: multi-modality for input redundancy (Section 6.2.1) and multi-modality for input variability (Section 6.2.2).

6.2.1 Multi-modality for Input Redundancy. Based on our participants’ responses about why they prefer having multimodal input for redundancy, we uncovered four main factors (i.e., reliability, convenience, customization, appropriateness) that affect how and why our participants choose multiple input modalities for the same purpose versus having a single input modality.

Reliability: We learned that our participants prefer having redundant input methods for the same purpose to make sure they can accomplish tasks in a reliable way. The first reason our participants mentioned is that having a backup input modality could enable people with upper-body motor impairments to accomplish certain tasks regardless of physical and mobility limitations (P2, P4, P5, P8). P5 further commented on this:

“...I have muscular dystrophy, I continuously lose my muscles every day. On the one hand, having multiple input methods for the same purpose could allow me to accommodate my muscle decline as time passes. On the other hand, having two or three options could allow me to finish the tasks with the most suitable way of input based on my physical situation. For example, I could use voice to turn on the water remotely if I want to have a bath while I am in my wheelchair, this just gives me more options for specific tasks...”

Another reason is that having multiple inputs for redundancy could ensure task completion through environmental changes (P1, P2, P4). We mentioned this in Section 6.1 as a reason why they prefer certain input modalities. P4 further commented on having redundant input while taking a shower:

“...The benefit of having multiple inputs could benefit the showering experiences for sure. Because the environment could be very noisy at that time, if I only used voice, it might not work nicely. That is why having touch input as an alternative is a great option...”

Beyond always having a backup and ensuring task completion through environmental changes, we found that our participants consider redundant inputs as having better accuracy and precision for various tasks (P4, P10). P4 further commented on this:

“...Different input methods usually have different precision and accuracy. Take showering as an example, I would like to have both voice control and joystick as the input methods to control the water. However, a
joystick is more in control and more precise than just using voice...”

Convenience: We found that five participants claimed that they like redundant input mainly for convenience. First, P4, P6, and P12 commented on the importance of maintaining multi-purposing during complex situations, which allows them to choose their preferred input method freely. For example, P4 mentioned her experience of handling two or more things at a time:

“...I sometimes have multiple things happening at the same time, such as I may have to answer the phone while I am driving through touch. Therefore, having multiple input methods for different tasks offers me great flexibility in daily activities...”

Moreover, we found redundant input offers better reachability for participants with upper-body motor impairments due to mobility issues. It allows people to control their devices at a certain distance with comfort. P1 commented on her preference for combining the joystick with voice to control the seat elevation while transferring:

“...I use seat elevation to do some of the transfers, such as transferring from my wheelchair to my bed. After I raise my wheelchair, I need someone else to put the wheelchair back down for transferring back because I can only control the seat elevation through the joystick on my wheelchair. If I can use voice to control my wheelchair, I won’t need someone else for assistance in transferring...”

Customization: We learned that our participants preferred having redundant input for customized input opportunities that accommodate their own mobility conditions. P6 commented on her situation of only being able to perform a portion of gestures with her hands even though she prefers using her hands for practice and exercise:

“...Having more input options allows me to choose the way that I feel most comfortable with based on my body conditions. For example, although I can only move my elbow to move my hand to tap on touch screens, which limits the input I can do with my hands, I still want to use my hand for practice because my doctor told me to practice my hands often. That is why I need more than one input method for tasks...”

Furthermore, we found that our participants also like to use multiple input methods to cope with inaccessible environments. This indicates that having more input methods could allow people to access technology or interact with their devices in a less accessible environment. P11 further commented:

“...It takes effort to modify the home layout, I sometimes stay at my parents’ or my sister’s houses. Some of their bathrooms or stairs are not that accessible to me. A good system should have more than one input method that allows me to interact with it under complex environments...”

Appropriateness: By analyzing participants’ responses, we found that having multiple input modalities reduces unwanted attention during social interactions (P3, P4). P4 commented on her feedback on using multiple input methods to control a robot to help her eat:

“...Having multiple input methods allow me to prevent awkward times, such as I do not want to talk with my devices while I am having dinner with someone...”

In addition to reducing unwanted attention, we also learned that having redundant input modalities could potentially increase the self-confidence of people with upper-body motor impairments in social interactions. P10 commented on this:

“...Having multiple ways of controlling my devices is really cool! I could show off to my friends not only I can finish tasks independently, but also I could interact with my devices ‘intelligently’...”

6.2.2 Multi-modality for Input Variability. During the interview, we also asked our participants about the perceived benefits and concerns of having multiple inputs through task completion, which means combining different input modalities to accomplish various sub-tasks of a complete task. We uncovered three main factors for the usefulness of having varied inputs throughout task completion (i.e., usability, efficiency, personalization).

Usability: Using multiple input modalities for sub-tasks enables participants to have a natural mapping between input modalities and the actual meaning of the interface for complementary purposes (P4, P5). P4 explained why she likes to use varied inputs for different sub-tasks:

“...Voice control is really bad with placing commands which require multiple steps. Therefore, all I want to control my Roomba in the future is just saying ‘start cleaning Roomba,’ and if I need to select specific places to clean, I would just either use touch or eye-based input to make a more precise command...”

Our participants also mentioned that they prefer having multiple input modalities for sub-tasks to prevent false activation and provide a more effective activation approach. P3 further elaborated on why he prefers to use hand waving as an activation approach for voice-based systems:

“...If I have a smart bathroom with all the input methods you just showed, I would like to use voice because I can control it remotely. However, I would like to use hand waving or eye-based gestures as an activation approach, because I found that the current voice input accuracy and precision are not sufficient, it might falsely activate through conversations...”

Having multiple input modalities to assist people with upper-body motor impairments with sub-tasks also allows additional input dimensions for various tasks, because a certain input modality may not have sufficient input dimensions, or people with upper-body motor impairments may be restricted to using a subset of them (P1, P2, P3, P7, P12). P8 commented on his limited options in gaming:

“...I spend about ten hours in front of my computer, gaming becomes an important leisure activity for me. However, I can only play chess by using the light pointer controlled by my head. It would be great if I could have more input methods that could support me to play RPG or FPS games...”
Understanding Input Modality Preferences of People with Upper-body Motor Impairments ASSETS ’22, October 23–26, 2022, Athens, Greece
Efficiency: besides providing additional input dimensions, we learned that using multiple inputs could reduce the effort on each single input modality, thus reducing the complexity of learning efforts. P8 further commented on this:

“...I do not have full control of my fingers, I can definitely use touch to complete tasks, like document editing, but it would be really slow and can easily make me feel tired. Especially for complex tasks, that is why I prefer combining eye, head, touch, and brain control to form a hybrid control system...”

By reducing the effort of a single input modality, we further learned that it would make people with upper-body motor impairments feel more comfortable (P5, P9). P9 elaborated on the reason for extra comfort:

“...Let me use gaming as an example, I have to hold a certain body posture while playing the game just by using the touch input. Having more input varieties allow me to relax and change postures during gaming...”

Beyond reducing the effort of a single input modality, P10 and P11 also mentioned the benefits of reducing the time cost by combining multiple input modalities. P10 further explained:

“...Having multiple input methods on small tasks could allow me to accomplish the task faster than just using a single input. For example, I can use head or eye gestures to control where I want to shoot and then use a touch button to shoot. It would be a lot faster than I use a joystick to aim and shoot...”

Privacy and Security: we mentioned in Section 6.1 that our participants have privacy and confidentiality concerns about using existing methods in ADLs (e.g., shopping). Our participants mentioned the use of additional input for verification and security purposes. P1 commented on combining biometric information with a joystick for security purposes:

“...My current power wheelchair does not have a good authentication process, which means anyone can easily drive it by using the built-in joystick. That is why I prefer future power wheelchairs to have biometric verification for ease of use and security...”

Finally, we also learned that having multiple input modalities for different purposes ensures and extends independence in social activities. P10 commented on this:

“...A combination of multiple input methods for different small tasks could help me to become more independent while moving outside. For example, using a joystick allows me to move my wheelchair, using biometric information allows me to lock my door, and using voice can allow me to answer phone calls while my hand is occupied by controlling the joysticks. This allows me to have a complete trip independently...”

6.3 Summary of Individual versus Multimodal Input Approaches

In this section, we presented specific reasons for choosing individual input modalities (Section 6.1) and multimodal input alternatives (Section 6.2) for helping people with upper-body motor impairments in ADLs. By comparing the reasons for individual and multimodal input alternatives, we found that similarities exist, such as design for reliability, reachability, and efficiency. The use of multimodal input could further extend these individual input preferences either through input redundancy (e.g., reliability, reachability) or variability (e.g., efficiency, privacy). On the other hand, we also found that multimodal input brings opportunities that complement individual input modalities. For instance, multimodal input could enhance self-confidence in social interactions through input redundancy and prevent false activation through input variability. Overall, we present how multimodal inputs provide opportunities for working within upper-extremity mobility limitations and adapting to the complexity of contexts (Section 6.2). We should also consider existing concerns that might also be applicable to certain multimodal inputs, such as financial burden and the additional effort of installation and mapping (Section 4).

7 DISCUSSION

In the previous sections, we presented the overall practices and challenges of different approaches in ADLs (Section 4), preferred applications of individual input modality and multimodal input (Section 5), and associated reasons for choosing certain individual input modality and multimodal input alternatives (Section 6). Reflecting on these findings, we will discuss future research opportunities and design recommendations for more accessible applications of multimodal input systems.

7.1 Multimodal Input Towards Collaborative Experiences in ADLs

In our findings, we showed that our participants have already adopted practices of co-accomplishing certain ADLs with their PCAs. For instance, P11 mentioned that he used the electric toothbrush by himself, but needed a PCA to put toothpaste on the brush and press the on/off button (Section 4.1). Understanding the preferences for multimodal input may reduce the effort and time cost of PCAs for assisting people with upper-body motor impairments through ADLs. Furthermore, our findings of the needs and preferences of collaboratively accomplishing ADLs may further bring more opportunities for enhancing collaborative experiences [9] through multimodal input systems. For example, five of our participants expressed the desire to contribute to housework when their family members are busy (Section 4.2), such as preparing meat and vegetables before their family members come back home and cook (P8). Thus, our paper proposes the following questions for future research to consider while leveraging multimodal input to support collaborative experiences in ADLs: 1) how to support communications and interactions between PCAs and multimodal input systems? 2) how to set up collaborative tasks based on upper-extremity capabilities and different ADLs? 3) how to enable certain multimodal input systems to adapt to different collaborative tasks among ADLs with PCAs?
ASSETS ’22, October 23–26, 2022, Athens, Greece Li et al.
7.2 Differentiate Computing Devices with Sensing Systems for Traditional ADLs

From the findings, we recognized the disparity between technology adoption and reliance on PCAs across computing-based activities (e.g., managing finances) and traditional ADLs (e.g., toileting). Prior research uncovered the practices of multimodal input on computing devices (e.g., gaming contexts) [70] and design considerations of sensing systems [28]. The findings from our paper uncovered the demand for taking the consequences and context of use into consideration while developing multimodal input systems (e.g., safety, privacy). For example, input systems for computers would require more modifications to be able to adapt to the environment in the bathroom or for cooking scenarios [37], such as location, water resistance, and surrounding objects.

From our results, we showed that our participants prefer having various input options for different ADLs (Section 4.2). This would likely lead to increased effort to install sensing units for various devices when used in the home context [28]. However, home environments may have unique layouts, spaces, and requirements for installation, which require designers to provide more customized solutions. Furthermore, participants expressed concerns regarding the maintenance and replacement of assistive technologies. Therefore, to reduce the effort required to exchange or repair devices and systems, it is important to explore ultra-low power or self-sustaining solutions (e.g., [66]).

7.3 Consider Actuation and Human-Robot Interaction (HRI) during Multimodal Interactions

We also found that the significance of input modality can depend on actuation techniques when it comes to physical systems. In the context of ADLs, actuation techniques [75] include a wide spectrum of mechanisms that change the physical world, ranging from interacting with the environment and automatic doors to connected appliances and service robots [73]. Working together with input modalities, these actuation techniques close the loop of interaction paradigms, examples of which can be found in P12’s case of the voice-controlled shower system, P10’s robotic arm, P6’s automatic door, P8’s omnibot for eating, and P4’s Roomba in our study (Table 3). Based on our findings on the challenges of existing practices (Section 4.2), leveraging actuation and robots could further support independence and reduce privacy concerns and financial burdens. In particular, robots could potentially benefit traditional ADLs (e.g., toileting, showering, and cooking) and reduce the reliance on upper-extremity mobility. Moreover, existing HRI research also explored how users could leverage multimodal input to interact with robots for complex tasks [1, 43, 52].

We propose the following research directions for HRI and supporting people with upper-body motor impairments in ADLs. First, existing research in HRI has explored how to support robot learning from multiple modalities [43]. Future research could further combine our findings of preferred input modalities with robot learning research to support robot learning from customized multiple modalities by people with upper-body motor impairments. Second, end-user robot programming is also important through the interaction with robots, which enables end-users to overcome the complexities of specifying robot motions [1]. Supporting end-user robot programming for people with upper-body motor impairments would reduce their effort and further enable robots to handle complex ADLs. Third, we showed the existing high reliance on PCAs for ADLs in general. Extending from robot learning through multiple modalities [43], supporting active learning of robots from PCAs would strongly reduce the effort required from end-users. Finally, we mentioned that some of our participants prefer working collaboratively on ADLs with PCAs or their family members. Thus, robots may become a medium to support collaborative experiences between people with upper-body motor impairments and their PCAs or family members in social interactions.

7.4 Personalize Multimodal Designs Based on Socially Categorized ADLs

We learned that people with upper-body motor impairments have different preferences for input modalities depending on the social context in which they will be used (Section 6.1), which correlates to prior research which proposed designing assistive technologies for social interactions [38, 59]. For example, our participants mentioned the necessity of combining biometric input with joystick input for security purposes in public. Thus, future designers may additionally consider categorizing daily tasks with more social dimensions. Based on our findings, we offer three examples of sub-categories that may be considered based on the relative social expectations in each. One might be to group tasks by location and consider which inputs are more appropriate depending on whether a location is private (e.g., a bathroom) or public (e.g., a restaurant). A second category might be based on the relative expectations of particular events or activities. For example, bathing is a private activity that has relatively low expectations of social interactions, while shopping is often a completely public activity with high expectations of social connections (albeit direct or indirect contact). The third dimension we offer is based on the sensitivity or expectation of privacy for an individual in a given scenario. For this example, one might consider how a wheelchair user may again feel that the expectation of privacy in highly social activities like shopping is conversely low, whereas managing one’s finances, no matter the environment or occasion, carries high expectations or desire for privacy. We offer these few examples as a starting point for future researchers and developers to consider when designing multimodal systems for use in different contexts.

8 LIMITATION AND FUTURE WORK

In our study, we chose to interview people with upper-body motor impairments to understand all three research questions (RQ1–RQ3). Although we are confident about our current contributions to the HCI and Accessibility community, there might be more opportunities by conducting contextual inquiries to further explore the detailed interactions in-depth. Beyond showing participants different input modalities, future research could leverage technology probes to actually deploy in the living environments [35] of people with upper-body motor impairments to uncover more specific designs of multimodal input systems for ADLs.
9 CONCLUSION

In this paper, we describe the results of an interview study involving 12 people with upper-body motor impairments, which aimed to understand their current and potential future use of emerging input techniques for ADLs. We highlight the significance of incorporating new input modalities to potentially decrease reliance on PCAs and increase opportunities for independence. We assert that by understanding these opportunities based on the social- and task-based preferences of people with upper-body motor impairments, future research and development efforts can better utilize different input modalities in ADLs (Section 6). Overall, we believe our findings contribute opportunities to support end-users’ ability to choose how technology can best adapt to their unique preferences, abilities, and goals for independent and collaborative achievement of ADLs.

REFERENCES
[1] Gopika Ajaykumar and Chien-Ming Huang. [n.d.]. Multimodal Robot Programming by Demonstration: A Preliminary Exploration. ([n. d.]).
[2] Apple. 2021. Use Dictation on your iPhone, iPad, or iPod touch - Apple Support. https://support.apple.com/en-us/HT208343. (Accessed on 03/27/2022).
[3] Rúbia EO Schultz Ascari, Roberto Pereira, and Luciano Silva. 2020. Computer Vision-based Methodology to Improve Interaction for People with Motor and Speech Impairment. ACM Transactions on Accessible Computing (TACCESS) 13, 4 (2020), 1–33.
[4] Rúbia EO Schultz Ascari, Luciano Silva, and Roberto Pereira. 2020. Personalized gestural interaction applied in a gesture interactive game-based approach for people with disabilities. In Proceedings of the 25th International Conference on Intelligent User Interfaces. 100–110.
[5] Tommaso Lisini Baldi, Giovanni Spagnoletti, Mihai Dragusanu, and Domenico Prattichizzo. 2017. Design of a wearable interface for lightweight robotic arm for people with mobility impairments. In 2017 International Conference on Rehabilitation Robotics (ICORR). IEEE, 1567–1573.
[6] Jeff Bilmes, Xiao Li, Jonathan Malkin, Kelley Kilanski, Richard Wright, Katrin Kirchhoff, Amarnag Subramanya, Susumu Harada, James Landay, Patricia Dowden, et al. 2005. The Vocal Joystick: A voice-based human-computer interface for individuals with motor impairments. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 995–1002.
[7] Pradipta Biswas and Patrick Langdon. 2012. Developing multimodal adaptation algorithm for mobility impaired users by evaluating their hand strength. International Journal of Human-Computer Interaction 28, 9 (2012), 576–596.
[8] G Bourhis and P Pino. 1996. Mobile robotics and mobility assistance for people with motor impairments: Rational justification for the VAHM project. IEEE Transactions on Rehabilitation Engineering 4, 1 (1996), 7–12.
[9] Stacy M Branham and Shaun K Kane. 2015. Collaborative accessibility: How blind and sighted companions co-create accessible home spaces. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 2373–2382.
[10] Patrick Carrington, Jian-Ming Chang, Kevin Chang, Catherine Hornback, Amy Hurst, and Shaun K Kane. 2016. The gest-rest family: Exploring input possibilities for wheelchair armrests. ACM Transactions on Accessible Computing (TACCESS) 8, 3 (2016), 1–24.
[11] Patrick Carrington, Amy Hurst, and Shaun K Kane. 2014. The gest-rest: a pressure-sensitive chairable input pad for power wheelchair armrests. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility. 201–208.
[12] Patrick Carrington, Amy Hurst, and Shaun K Kane. 2014. Wearables and chairables: inclusive design of mobile input and output techniques for power wheelchair users. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 3103–3112.
[13] Patrick Alexander Carrington. 2017. Chairable Computing. University of Maryland, Baltimore County.
[14] Kathy Charmaz. 2006. Constructing grounded theory: A practical guide through qualitative analysis. Sage.
[15] Muratcan Cicek, Ankit Dave, Wenxin Feng, Michael Xuelin Huang, Julia Katherine Haines, and Jeffrey Nichols. 2020. Designing and Evaluating Head-Based Pointing on Smartphones for People with Motor Impairments. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–12.
[16] Alban Duprès, François Cabestaing, José Rouillard, Vincent Tiffreau, and Charles Pradeau. 2019. Toward a hybrid brain-machine interface for palliating motor handicap with Duchenne muscular dystrophy: A case report. Annals of Physical and Rehabilitation Medicine 62, 5 (2019), 379–381.
[17] Alban Duprès, José Rouillard, and François Cabestaing. 2014. Hybrid BCI for palliation of severe motor disability. In Proceedings of the 26th Conference on l’Interaction Homme-Machine. 171–176.
[18] Peter F Edemekong, Deb L Bomgaars, Sukesh Sukumaran, and Shoshana B Levy. 2021. Activities of daily living. In StatPearls [internet]. StatPearls Publishing.
[19] Mingming Fan, Zhen Li, and Franklin Mingzhe Li. 2020. Eyelid Gestures on Mobile Devices for People with Motor Impairments. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–8.
[20] Mingming Fan, Zhen Li, and Franklin Mingzhe Li. 2021. Eyelid gestures for people with motor impairments. Commun. ACM 65, 1 (2021), 108–115.
[21] Sergio García-Vergara, Yu-Ping Chen, and Ayanna M Howard. 2013. Super Pop VR TM: An adaptable virtual reality game for upper-body rehabilitation. In International Conference on Virtual, Augmented and Mixed Reality. Springer, 40–49.
[22] Robbert J Gobbens. 2018. Associations of ADL and IADL disability with physical and mental dimensions of quality of life in people aged 75 years and older. PeerJ 6 (2018), e5425.
[23] Harkishan Singh Grewal, Aaron Matthews, Richard Tea, Ved Contractor, and Kiran George. 2018. Sip-and-Puff Autonomous Wheelchair for Individuals with Severe Disabilities. In 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON). IEEE, 705–710.
[24] Susumu Harada, Jacob O Wobbrock, and James A Landay. 2007. VoiceDraw: a hands-free voice-driven drawing application for people with motor impairments. In Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility. 27–34.
[25] Rex Hartson and Pardha S Pyla. 2012. The UX Book: Process and guidelines for ensuring a quality user experience. Elsevier.
[26] Andreas Holzinger and Alexander K Nischelwitzer. 2006. People with motor and mobility impairment: Innovative multimodal interfaces to wheelchairs. In International Conference on Computers for Handicapped Persons. Springer, 989–991.
[27] Zoom Video Communications Inc. 2012. Video Conferencing, Cloud Phone, Webinars, Chat, Virtual Events | Zoom. https://zoom.us/. (Accessed on 07/20/2021).
[28] Shaun K Kane, Anhong Guo, and Meredith Ringel Morris. 2020. Sense and accessibility: Understanding people with physical disabilities’ experiences with sensing systems. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–14.
[29] Andrey Khusid and Oleg Shardin. 2011. An Online Visual Collaboration Platform for Teamwork | Miro. https://miro.com/. (Accessed on 06/02/2021).
[30] Ki-Hong Kim, Jae-Kwon Yoo, Hong Kee Kim, Wookho Son, and Soo-Young Lee. 2006. A practical biosignal-based human interface applicable to the assistive systems for people with motor impairment. IEICE Transactions on Information and Systems 89, 10 (2006), 2644–2652.
[31] Vinay Krishna Sharma, Kamalpreet Saluja, Vimal Mollyn, and Pradipta Biswas. 2020. Eye gaze controlled robotic arm for persons with severe speech and motor impairment. In ACM Symposium on Eye Tracking Research and Applications. 1–9.
[32] Jonathan Lazar, Brian Wentz, and Marco Winckler. 2017. Information privacy and security as a human right for people with disabilities. In Disability, Human Rights, and Information Technology. University of Pennsylvania Press, 199–211.
[33] Carol Levine, Susan Reinhard, Lynn Friss Feinberg, Steven Albert, and Andrea Hart. 2003. Family caregivers on the job: Moving beyond ADLs and IADLs. Generations 27, 4 (2003), 17–23.
[34] Brittany Lewis and Krishna Venkatasubramanian. 2021. “I... Got my Nose-Print. But it Wasn’t Accurate”: How People with Upper Extremity Impairment Authenticate on their Personal Computing Devices. (2021).
[35] Franklin Mingzhe Li, Di Laura Chen, Mingming Fan, and Khai N Truong. 2019. FMT: A wearable camera-based object tracking memory aid for older adults. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3, 3 (2019), 1–25.
[36] Franklin Mingzhe Li, Di Laura Chen, Mingming Fan, and Khai N Truong. 2021. “I Choose Assistive Devices That Save My Face”: A Study on Perceptions of Accessibility and Assistive Technology Use Conducted in China. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–14.
[37] Franklin Mingzhe Li, Jamie Dorst, Peter Cederberg, and Patrick Carrington. 2021. Non-Visual Cooking: Exploring Practices and Challenges of Meal Preparation by People with Visual Impairments. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–11.
[38] Franklin Mingzhe Li, Franchesca Spektor, Meng Xia, Mina Huh, Peter Cederberg, Yuqi Gong, Kristen Shinohara, and Patrick Carrington. 2022. “It Feels Like Taking a Gamble”: Exploring Perceptions, Practices, and Challenges of Using Makeup and Cosmetics for People with Visual Impairments. In CHI Conference on Human Factors in Computing Systems. 1–15.
[39] Michael Xieyang Liu, Aniket Kittur, and Brad A. Myers. 2022. Crystalline: Lowering the Cost for Developers to Collect and Organize Information for Decision Making. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI ’22). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3491102.3501968 New Orleans, LA, USA.
[40] Michael Xieyang Liu, Andrew Kuznetsov, Yongsung Kim, Joseph Chee Chang, Aniket Kittur, and Brad A. Myers. 2022. Wigglite: Low-cost Information Collection and Triage. In The 35th Annual ACM Symposium on User Interface Software and Technology (UIST ’22). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3526113.3545661 Bend, OR, USA.
[41] Meethu Malu, Pramod Chundury, and Leah Findlater. 2018. Exploring accessible smartwatch interactions for people with upper body motor impairments. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–12.
[42] Meethu Malu and Leah Findlater. 2015. Personalized, wearable control of a head-mounted display for users with upper body motor impairments. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 221–230.
[43] Anahita Mohseni-Kabir, Changshuo Li, Victoria Wu, Daniel Miller, Benjamin Hylak, Sonia Chernova, Dmitry Berenson, Candace Sidner, and Charles Rich. 2019. Simultaneous learning of hierarchy and primitives for complex robot tasks. Autonomous Robots 43, 4 (2019), 859–874.
[44] Martez E Mott and Jacob O Wobbrock. 2019. Cluster Touch: Improving touch accuracy on smartphones for people with motor and situational impairments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[45] Imad Mougharbel, Racha El-Hajj, Houda Ghamlouch, and Eric Monacelli. 2013. Comparative study on different adaptation approaches concerning a sip and puff controller for a powered wheelchair. In 2013 Science and Information Conference. IEEE, 597–603.
[46] John E Muñoz, Ricardo Chavarriaga, and David S Lopez. 2014. Application of hybrid BCI and exergames for balance rehabilitation after stroke. In Proceedings of the 11th Conference on Advances in Computer Entertainment Technology. 1–4.
[47] Syifaun Nafisah, Oyas Wahyunggoro, and Lukito Edi Nugroho. 2016. Evaluating the usage of short-time energy on voice biometrics system for cerebral palsy. In 2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE). IEEE, 1–6.
[48] Maia Naftali and Leah Findlater. 2014. Accessibility in context: understanding the truly mobile experience of smartphone users with motor impairments. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility. 209–216.
[49] Christa Neuper, Gernot R Müller, Andrea Kübler, Niels Birbaumer, and Gert Pfurtscheller. 2003. Clinical application of an EEG-based brain–computer interface: a case study in a patient with severe motor impairment. Clinical Neurophysiology 114, 3 (2003), 399–409.
[50] U.S. Department of Labor. [n.d.]. Personal Assistance Services | U.S. Department of Labor. https://www.dol.gov/agencies/odep/program-areas/employment-supports/personal-assistance-services. (Accessed on 08/31/2021).
[51] PBS.org and the AARP. 2021. Activities of Daily Living Checklist & Assessments. https://www.payingforseniorcare.com/activities-of-daily-living. (Accessed on 07/26/2021).
[52] Dennis Perzanowski, Alan C Schultz, William Adams, Elaine Marsh, and Magda Bugajska. 2001. Building a multimodal human-robot interface. IEEE Intelligent Systems 16, 1 (2001), 16–21.
[53] Alisha Pradhan, Kanika Mehta, and Leah Findlater. 2018. “Accessibility Came by Accident”: Use of Voice-Controlled Intelligent Personal Assistants by People with Disabilities. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–13.
[54] Leah M Reeves, Jennifer Lai, James A Larson, Sharon Oviatt, TS Balaji, Stéphanie Buisine, Penny Collings, Phil Cohen, Ben Kraal, Jean-Claude Martin, et al. 2004. Guidelines for multimodal user interface design. Commun. ACM 47, 1 (2004), 57–59.
[55] Syed Asad Rizvi, Ella Tuson, Breanna Desrochers, and John Magee. 2018. Simulation of Motor Impairment in Head-Controlled Pointer Fitts’ Law Task. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 376–378.
[56] Young Hak Roh. 2013. Clinical evaluation of upper limb function: Patient’s impairment, disability and health-related quality of life. Journal of Exercise Rehabilitation 9, 4 (2013), 400.
[57] Lucas Rosenblatt, Patrick Carrington, Kotaro Hara, and Jeffrey P Bigham. 2018. Vocal programming for people with upper-body motor impairments. In Proceedings of the Internet of Accessible Things. 1–10.
[58] David M Roy, Marilyn Panayi, Roman Erenshteyn, Richard Foulds, and Robert Fawcus. 1994. Gestural human-machine interaction for people with severe speech and motor impairment due to cerebral palsy. In Conference Companion on Human Factors in Computing Systems. 313–314.
[59] Kristen Shinohara and Jacob O Wobbrock. 2011. In the shadow of misperception: assistive technology use and social interactions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 705–714.
[60] Young Chol Song. 2010. Joystick text entry with word prediction for people with motor impairments. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility. 321–322.
[61] Wei Sun, Franklin Mingzhe Li, Benjamin Steeper, Songlin Xu, Feng Tian, and Cheng Zhang. 2021. TeethTap: Recognizing discrete teeth gestures using motion and acoustic sensing on an earpiece. In 26th International Conference on Intelligent User Interfaces. 161–169.
[62] Mohd Razali Md Tomari, Yoshinori Kobayashi, and Yoshinori Kuno. 2012. Development of smart wheelchair system for a user with severe motor impairment. Procedia Engineering 41 (2012), 538–546.
[63] Matthew Turk. 2014. Multimodal interaction: A review. Pattern Recognition Letters 36 (2014), 189–195.
[64] Ovidiu-Ciprian Ungurean and Radu-Daniel Vatavu. 2021. Coping, Hacking, and DIY: Reframing the Accessibility of Interactions with Television for People with Motor Impairments. In ACM International Conference on Interactive Media Experiences. 37–49.
[65] Radu-Daniel Vatavu and Ovidiu-Ciprian Ungurean. 2019. Stroke-Gesture Input for People with Motor Impairments: Empirical Results & Research Roadmap. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[66] Anandghan Waghmare, Qiuyue Xue, Dingtian Zhang, Yuhui Zhao, Shivan Mittal, Nivedita Arora, Ceara Byrne, Thad Starner, and Gregory D Abowd. 2020. UbiquiTouch: Self sustaining ubiquitous touch interfaces. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4, 1 (2020), 1–22.
[67] Ker-Jiun Wang, Quanbo Liu, Yifan Zhao, Caroline Yan Zheng, Soumya Vhasure, Quanfeng Liu, Prakash Thakur, Mingui Sun, and Zhi-Hong Mao. 2018. Intelligent wearable virtual reality (VR) gaming controller for people with motor disabilities. In 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). IEEE, 161–164.
[68] Ker-Jiun Wang, Caroline Yan Zheng, and Zhi-Hong Mao. 2019. Human-centered, ergonomic wearable device with computer vision augmented intelligence for VR multimodal human-smart home object interaction. In 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 767–768.
[69] WebAIM. 2012. Types of Motor Disabilities. https://webaim.org/articles/. (Accessed on 07/26/2021).
[70] Johann Wentzel, Sasa Junuzovic, James Devine, John Porter, and Martez Mott. 2022. Understanding How People with Limited Mobility Use Multi-Modal Input. In CHI Conference on Human Factors in Computing Systems. 1–17.
[71] Jacob Wobbrock and Brad Myers. 2006. Trackball text entry for people with motor impairments. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 479–488.
[72] Jacob O Wobbrock and Krzysztof Z Gajos. 2008. Goal crossing with mice and trackballs for people with motor impairments: Performance, submovements, and design directions. ACM Transactions on Accessible Computing (TACCESS) 1, 1 (2008), 1–37.
[73] Muna Khalil Yousef. 2001. Assessment of metaphor efficacy in user interfaces for the elderly: a tentative model for enhancing accessibility. In Proceedings of the 2001 EC/NSF Workshop on Universal Accessibility of Ubiquitous Computing: Providing for the Elderly. 120–124.
[74] Xiaoyi Zhang, Harish Kulkarni, and Meredith Ringel Morris. 2017. Smartphone-based gaze gesture communication for people with motor disabilities. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2878–2889.
[75] Michael Zinn, Bernard Roth, Oussama Khatib, and J Kenneth Salisbury. 2004. A new actuation approach for human friendly robot design. The International Journal of Robotics Research 23, 4-5 (2004), 379–398.
It’s Enactment Time!: High-fidelity Enactment Stage for Accessible Automated
Driving System Technology Research
Aaron Gluck Hannah Solini Julian Brinkley
Clemson University Clemson University Clemson University
Clemson, South Carolina Clemson, South Carolina Clemson, South Carolina
amgluck@g.clemson.edu hsolini@g.clemson.edu jbrinkl@g.clemson.edu

ABSTRACT
Automated driving system (ADS) technology has been incorporated into critical driving
functions (e.g., adaptive cruise control and autonomous braking) for over two
decades. Now companies like GM and Google are developing and testing fully
autonomous vehicles (AVs). However, current AV designs target individuals who
can drive rather than those who cannot, the people who would benefit most from these
vehicles. As the technology is still primarily experimental, conducting research with
AVs and older adults or people with disabilities is infeasible. Therefore, to explore
the accessibility of AVs, we conduct user enactment studies, as this method works well
with technologies that participants have little to no experience utilizing. This pictorial
describes the iterative process of developing a high-fidelity enactment environment, the
space and objects with which participants interact. The aim is to encourage the HCI
community to utilize high-fidelity enactment environments for conducting accessible
future technology research.
Author Keywords
User enactment; Automated driving system; High-fidelity enactment stage
CCS Concepts
• Human-centered computing • Human-computer interaction (HCI) • HCI design and
evaluation methods • Usability testing

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from permissions@acm.org.

ASSETS ’22, October 23–26, 2022, Athens, Greece


© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10…$15.00.
DOI: https://doi.org/10.1145/3517428.3551351
INTRODUCTION
Automated driving system (ADS) technology holds the
promise of redefining the way we drive, minimizing
roadway accidents and injuries, and saving lives
[10,17,22]. While important, the potential for mobility
independence that ADSs provide for people who cannot
drive (e.g., older adults and people who are blind) is
arguably more significant [2]. However, current ADS
development trends appear to focus on occupants who can
presently drive conventional motor vehicles [16]. This
lack of accessible development may be due to designers
waiting for ADS technology to reach SAE International’s
Level 5 of driving automation, in which the autonomous
vehicle (AV) can handle all driving situations without
occupant input [18,23]. The problem with this approach
is that design patterns for human-machine interfaces
(HMIs) are being developed without standardization and
without implementing accessibility features from the
beginning [3,5].

Figure 1: Messy teenager’s room stage [20].
Figure 2: Autonomous vehicle stage [21].

This problem is exacerbated by the
unavailability of Level 5 AVs to explore accessible HMI
and vehicle design for older adults and people with
disabilities with participant stakeholders. For instance,
Google and GM are designing and testing Level 5 AVs,
but these vehicles are currently only available for
ridesharing and in limited, controlled environments
[11,25]. These vehicles are unavailable to be brought into
the research lab or interacted with by participants outside
specific locations (e.g., San Francisco, California [7,26]
and Phoenix, Arizona [26]).
Due to the inability to conduct research in a real-world environment, accessible ADS
technology research studies require participants to imagine interacting with the AV and
its HMI systems. This imaginative method can be complicated when exploring technologies
that participants do not have prior experience using [21]. This situation makes user
enactment studies an ideal research method. User enactment (UE) is a process by which
individuals can explore emerging and future technologies through simulated scenarios
[4,19]. The UE process combines brainstorming, in which participants speak aloud their
thoughts and actions, and bodystorming, in which participants physically act out their
decisions [19]. User enactment sessions are built around one or more scenarios, developed
in advance to explore the study’s research questions and to consider the amount of
participant control and the level of fidelity required to set the stage [20]. The
enactment stage provides the setting where the scenarios are acted out, giving
researchers a deeper understanding of the needs, opinions, and limitations participants
experience while attempting to physically and mentally complete the tasks presented in
the scenario [20]. User enactment stages have been used to study topics including a
messy teen’s bedroom (Figure 1) [20], autonomous vehicle design (Figure 2) [21],
smart-home assistants (Figure 3) [6], and understanding burglary (Figure 4) [15]. These
enactment stages showcase how UE can provide researchers with a better understanding of
participants’ needs and interactions across many research areas.

Figure 3: Smart-home stage [6].
Figure 4: Virtual home office stage [15].
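The ingredients of a user enactment session described above (scenarios prepared in advance, a chosen amount of participant control, and a target level of fidelity) can be sketched as a small data structure. This is purely illustrative; the field names are our own and are not part of any published UE framework:

```python
from dataclasses import dataclass, field

@dataclass
class EnactmentScenario:
    """Illustrative plan for one user enactment session."""
    name: str
    phases: list[str]                  # ordered tasks participants act out
    fidelity: str = "low"              # "low" (tape and chairs) .. "high" (built stage)
    participant_control: str = "open"  # how freely participants may alter the stage
    props: list[str] = field(default_factory=list)

# Example drawn from the studies described in this pictorial
ride_scenario = EnactmentScenario(
    name="ADS ridesharing trip",
    phases=["vehicle arrival and ingress",
            "riding in the vehicle",
            "arrival at destination and egress"],
    fidelity="high",
    props=["phone", "purse", "book"],
)
print(ride_scenario.phases[0])  # "vehicle arrival and ingress"
```

Encoding the plan this way simply makes explicit the decisions a research team records before staging a session.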
EVOLUTION OF THE ENACTMENT STAGE
The design and development of our enactment stage evolved throughout a series of user
enactment studies conducted by our research lab. After each study, we held internal
debriefing meetings to discuss our enactment stage’s positive and negative aspects based
on participant feedback and our research staff’s observations. We used the results of
these debriefings to iterate on the design of the enactment stage.
First Enactment Study
For our first user enactment study [13,14], we created a low-fidelity enactment stage. We
outlined an average-sized three-row sport utility vehicle (SUV) using blue painter’s tape
with a length of 190 inches and a width of 75 inches. We completed the stage by
including six chairs representing seating options within the vehicle. Ten participants,
divided into two groups of five, participated in the study. Participants acted and spoke
aloud while interacting with the enactment stage: entering and exiting the vehicle, taking
part in in-vehicle activities (e.g., talking on the phone, listening to music, and sleeping),
and moving and removing chairs from the stage to fit their needs.
Planned enactment stage changes after reviewing the first study:
• Add a method for tracking proposed changes to the enactment stage
• Add front and rear of vehicle sections to delineate the vehicle’s main cabin to
reflect participant preference for a smaller vehicle.
• Add props to represent items that participants may choose to bring along
while traveling.
Second Enactment Study
In preparation for our second enactment study [12,14], we provided a selection of
physical props (e.g., phone, purse, and book) and foam core board pieces to write the
name of items that participants requested but that we did not have (e.g., golf clubs).
Additionally, we used pieces of foam core board to write down changes to the enactment
stage to track these modifications between participant groups. Finally, two tri-fold
display boards were used to represent the front and rear of the ADS vehicle.
We began this study by taping off the same SUV footprint as the first study and placing
six chairs into the enactment stage. Thirty older adult participants (60+) participated in
this study. Participants were divided into seven groups of three to six individuals. The
first group was presented with the final design from the first enactment study and iterated
upon it to develop their design. Each following group began with the previous group’s
ending design and continued the iteration. User enactment for this study focused on three
phases of interacting with the ADS vehicle: the arrival of the vehicle to pick up
passengers and vehicle ingress, riding in the vehicle, and the vehicle’s arrival at the
passengers’ destination and vehicle egress. As in the first study, participants acted out
and spoke aloud about their interactions with the enactment stage.
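The group-to-group handoff described above, in which each group begins from the previous group's ending design, amounts to a simple iterative fold. A minimal sketch, with a hypothetical enact_session standing in for a live enactment session:

```python
# Sketch of the iterative handoff used in the second study: the first group
# starts from the first study's final design, and each later group iterates
# on the previous group's ending design. `enact_session` is hypothetical.

def enact_session(group: int, design: list[str]) -> list[str]:
    # In the real study this is a live enactment; here we only record
    # that the group reviewed and modified the incoming design.
    return design + [f"changes from group {group}"]

design = ["final design from first enactment study"]
for group in range(1, 8):  # seven groups of three to six older adults
    design = enact_session(group, design)

print(len(design))  # 8: the seed design plus one layer of changes per group
```

The accumulated list mirrors how the final stage design carried the layered decisions of all seven groups.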
Planned enactment stage changes after reviewing the second study:
• Rework tracking system to make changes more visible to all participants
• Inclusion of a communication system to allow for more in-depth studies using
the Wizard-of-Oz method
• A better system for delineating the vehicle’s interior spaces
• A way to place a human-machine interface (HMI) for requested features in a
specific location within the enactment stage
HIGH-FIDELITY ENACTMENT STAGE DEVELOPMENT
After discussing participant responses and observations made
during the second study, we began planning the development of
an updated enactment stage. We were inspired to create a high-
fidelity enactment stage by the ridesharing ADS vehicles
currently under development and testing. As these vehicles are
in limited production, there were physical vehicle models on
which to base our design. We compared the vehicle designs of
Google’s Waymo One (Figure 5) [26] and GM’s Cruise series
of AVs (Figure 6 and Figure 7) [7], as each of these companies
was conducting on-road testing of their vehicles at the time of our development.

Figure 5: Waymo One autonomous vehicle [26].

Ultimately, we decided to base our high-fidelity enactment stage model on the GM Cruise Origin
ridesharing AV [7]. Deciding factors were doors that can support
fully motorized wheelchair ingress and egress, the ability to
easily manipulate the seating floor plan, and more surface space
for exploring multiple HMI placements and interactions.
Development Plan
Development of the high-fidelity enactment stage began with
finding reference images of the GM Cruise Origin [7] to
determine its approximate dimensions. Next, rough blueprints
were developed to determine the final layout and plan materials.

Figure 6: First Generation Cruise AV [7].

Figure 7: The Cruise Origin ridesharing autonomous vehicle [7].


HIGH-FIDELITY ENACTMENT STAGE CONSTRUCTION
The main structure of the enactment stage was built from 2-inch by 4-inch wood boards and ¾” plywood sheets and
covered with a foam core board shell. Screen door material was used to represent portions of the vehicle composed of
tinted glass. Structural elements in screened areas were painted black to show this would be glass in a production
vehicle. The overall dimensions of the high-fidelity enactment stage are 14 feet long, 7 feet high (with an internal height
of 6 feet), and 6 feet wide. It has two sets of sliding doors on either side of the vehicle and a working trunk.
Building for Accessibility
This project was designed to ensure that older adults and people with disabilities could interact with the high-fidelity
enactment stage, which dictated certain aspects of our development. First, the base floor was over-engineered to support
the weight of four individuals in powered wheelchairs. A subfloor structure of 2-inch by 4-inch wood boards was set
every foot, with ¾” plywood laid on top to create the floor, providing over 600 pounds per square foot of support [1].
Second, as the enactment stage floor sits 4 ¼” above the ground, a metal ramp can be added to access the interior without
stepping up. The ramp is 5 feet long and 3 feet wide to create the required rise over run and width for access by an
individual in a wheelchair [9].
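The ramp dimensions above can be checked against the ADA guideline of a maximum 1:12 ramp slope (at least 12 inches of run per inch of rise) and a 36-inch minimum clear width. A quick sketch of the arithmetic, using the stage figures from the text and the standard ADA limit values:

```python
# Verify the stage ramp against ADA ramp guidelines (1:12 max slope,
# 36-inch minimum clear width). Stage measurements come from the text.

RISE_IN = 4.25     # stage floor height above the ground, inches
RUN_IN = 5 * 12    # 5-foot ramp length, inches
WIDTH_IN = 3 * 12  # 3-foot ramp width, inches

run_per_inch_of_rise = RUN_IN / RISE_IN
print(round(run_per_inch_of_rise, 1))                 # ~14.1 inches of run per inch of rise
print(run_per_inch_of_rise >= 12 and WIDTH_IN >= 36)  # True: meets slope and width limits
```

At roughly 1:14, the ramp is gentler than the 1:12 maximum, which is why a 5-foot ramp suffices for the 4 ¼-inch rise.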
HIGH-FIDELITY ENACTMENT STAGE FOR RESEARCH
The primary use of the high-fidelity enactment stage is to research automated driving
system technologies for older adults and people with disabilities, so we needed to provide
the ability for our participants to design and modify the stage. The interior walls of the
stage are covered with metallic paper [8], and we developed foam core magnet widgets
in distinctive shapes to represent different interaction devices one might encounter in an
ADS (e.g., levers, buttons, keypads, and microphones). Each widget has its name printed
in both text and Braille. Additionally, we have a supply of blank widgets readily available
if a participant requests a previously uncreated widget; the new widget’s name is added
via a sticky note. The metallic paper and the magnetic widgets allow participants to place,
rearrange, and remove various ADS human-machine interfaces on the stage’s interior
walls to reflect their needs, preferences, and ideas.

In addition to the widgets, our high-fidelity enactment stage has other features that allow
for design modifications in future research studies. The seating is moveable, allowing
researchers to create different scenarios and participants to modify the seating layout
as desired. There is also sufficient space in the front and rear of the vehicle to install
computers or other digital devices to study various systems. Additionally, as the
enactment stage is constructed from lumber and foam core board, modifying the internal
structures is as simple as removing or adding a few screws. Finally, multiple locations
are available within the enactment stage for placing recording equipment for data analysis.

INTERNAL PILOT STUDY
We explored the usability and accessibility of the high-fidelity enactment stage while
researching accessible autonomous vehicles for people with disabilities as a semi-finalist
in the U.S. Department of Transportation’s (DOT’s) Inclusive Design Challenge (IDC)
[24]. We made project-specific modifications to the enactment stage to present a usable
touch-screen HMI that can be interacted with directly or through a wearable device, the
glasses worn in the images below. We also installed a video projector to explore using
the ADS vehicle’s floor as an additional monitor displaying the vehicle’s position on a
map. One member of our research team with low vision who had not previously interacted
with the enactment stage was brought in to interact with the stage and our final IDC
project. We asked a series of qualitative questions about her experience interacting with
the high-fidelity enactment stage, both with and without the IDC glasses. She had no
problems entering or exiting the enactment stage and reported finding the environment
accessible and easy to interact with and use. The results of this internal pilot
demonstrated the usability of the high-fidelity enactment stage and showed that it could
quickly and easily be modified to explore different features and assistive technologies for
different research studies and conditions. While additional testing is required, the initial
findings show the benefit of using a high-fidelity enactment stage to research accessible
ADS technology and vehicles.
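One way to keep widget changes visible between participant groups, as the earlier tracking goal calls for, is a simple placement log. This is an illustrative sketch of our own, not the authors' actual tooling:

```python
# Illustrative sketch: logging where each participant group places, moves,
# or removes magnetic HMI widgets, so modifications stay visible between
# groups. Widget and location names are examples, not study data.

placements = {}  # widget name -> current wall location
history = []     # ordered log of modifications across groups

def place_widget(group: int, widget: str, location: str) -> None:
    placements[widget] = location
    history.append((group, "place", widget, location))

def remove_widget(group: int, widget: str) -> None:
    placements.pop(widget, None)
    history.append((group, "remove", widget, None))

place_widget(1, "door release button", "beside left sliding door")
place_widget(1, "microphone", "front wall")
place_widget(2, "microphone", "ceiling, center")  # group 2 relocates it
remove_widget(2, "door release button")

print(placements)    # {'microphone': 'ceiling, center'}
print(len(history))  # 4 logged modifications
```

Replaying the history reproduces any intermediate stage layout, which is the digital analogue of the foam-core tracking boards used in the second study.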
CONCLUSION
In this pictorial, we described the preliminary user enactment studies that led to the
design and development of the high-fidelity enactment stage for researching the
accessibility of ADSs. The main factor driving the development of this enactment stage
is the unavailability of self-driving ADS technology for research studies. The
high-fidelity enactment stage provides an environment where participants can imagine
interacting with ADS technology and provide their needs and opinions on the features
and design required to make the system accessible for older adults and people with
disabilities.

The high-fidelity enactment stage is far from complete, especially as ADS technology is
still new. Instead, it will continue to morph and change based on future study
requirements, findings from research conducted on the enactment stage, and ADS
technology advancements. Additionally, we plan to explore new high-fidelity add-ons to
the enactment stage (e.g., augmented reality projectors) and human-machine interface
systems and applications.

Finally, this research was best presented as a pictorial due to the highly visual nature
of the studies. The pictorial format allows HCI researchers to see the processes that led
to the high-fidelity enactment stage’s development and how participants can interact
with it.

ACKNOWLEDGMENTS
We want to thank the many participants who have participated in the associated studies
and the US Department of Transportation, which partially funded the development of the
high-fidelity enactment stage through their Inclusive Design Challenge for Autonomous
Vehicle Accessibility.

REFERENCES
[1] Appliance Analysts. 2021. How Much Weight Can Plywood Hold? Free Calculator. Appliance Analysts. Retrieved June 23, 2022 from https://applianceanalysts.com/plywood-weight-capacity/
[2] Robin N. Brewer and Vaishnav Kameswaran. 2018. Understanding the Power of Control in Autonomous Vehicles for People with Vision Impairment. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’18), ACM, New York, NY, USA, 185–197. DOI:https://doi.org/10.1145/3234695.3236347
[3] Julian Brinkley, Earl W. Huff, Kwajo Boateng, and Suyash Ahire. 2020. Autonomous Vehicle Design Anti-Patterns: Making Emerging Transportation Technologies Inaccessible by Design. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 64, 1 (December 2020), 1038–1042. DOI:https://doi.org/10.1177/1071181320641249
[4] Marion Buchenau and Jane Fulton Suri. 2000. Experience prototyping. In Proceedings of the 3rd conference on Designing interactive systems: processes, practices, methods, and techniques (DIS ’00), Association for Computing Machinery, New York City, New York, USA, 424–433. DOI:https://doi.org/10.1145/347642.347802
[5] Oliver Carsten and Marieke H. Martens. 2019. How can humans understand their automated cars? HMI principles, problems and solutions. Cognition, Technology & Work 21, 1 (February 2019), 3–20. DOI:https://doi.org/10.1007/s10111-018-0484-0
[6] Yi-Shyuan Chiang, Ruei-Che Chang, Yi-Lin Chuang, Shih-Ya Chou, Hao-Ping Lee, I-Ju Lin, Jian-Hua Jiang Chen, and Yung-Ju Chang. 2020. Exploring the Design Space of User-System Communication for Smart-home Routine Assistants. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, ACM, Honolulu, HI, USA, 1–14. DOI:https://doi.org/10.1145/3313831.3376501
[7] Cruise LLC. 2022. We’re Cruise, a self-driving car service designed for the cities we love. Cruise. Retrieved June 21, 2022 from https://www.getcruise.com/technology
[8] Drytac. 2021. Ferro metal paper. Drytac. Retrieved June 23, 2022 from https://www.drytac.com/product/ferro-metal-paper/
[9] EZ-ACCESS. 2022. Ramp Incline Calculator | ADA Ramp Slope. EZ-ACCESS. Retrieved June 23, 2022 from https://www.ezaccess.com/?
[10] Daniel J. Fagnant and Kara Kockelman. 2015. Preparing a nation for autonomous vehicles: opportunities, barriers and policy recommendations. Transportation Research Part A: Policy and Practice 77 (July 2015), 167–181. DOI:https://doi.org/10.1016/j.tra.2015.04.003
[11] General Motors. 2018. 2018 Self-Driving Safety Report. Detroit, MI. Retrieved June 22, 2022 from https://www.gm.com/content/dam/company/docs/us/en/gmcom/gmsafetyreport.pdf
[12] Aaron Gluck, Kwajo Boateng, Earl W. Huff Jr., and Julian Brinkley. 2020. Putting Older Adults in the Driver Seat: Using User Enactment to Explore the Design of a Shared Autonomous Vehicle. In 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, ACM, Virtual Event, DC, USA, 291–300. DOI:https://doi.org/10.1145/3409120.3410645
[13] Aaron Gluck, Earl W. Huff, Mengyuan Zhang, and Julian Brinkley. 2020. Lights, Camera, Autonomy! Exploring the Opinions of Older Adults Regarding Autonomous Vehicles Through Enactment. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 64, 1 (December 2020), 1971–1975. DOI:https://doi.org/10.1177/1071181320641475
[14] Earl W. Huff, Mengyuan Zhang, and Julian Brinkley. 2020. Enacting into Reality: Using User Enactment to Explore the Future of Autonomous Vehicle Design. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 64, 1 (December 2020), 1561–1565. DOI:https://doi.org/10.1177/1071181320641373
[15] Amy Meenaghan, Claire Nee, Jean-Louis Van Gelder, Marco Otte, and Zarah Vernham. 2018. Getting Closer to the Action: Using the Virtual Enactment Method to Understand Burglary. Deviant Behavior 39, 4 (April 2018), 437–460. DOI:https://doi.org/10.1080/01639625.2017.1407104
[16] National Federation of the Blind. 2016. 2016 Resolutions | National Federation of the Blind. National Federation of the Blind. Retrieved June 23, 2022 from https://nfb.org/resources/speeches-and-reports/resolutions/2016-resolutions
[17] National Highway Traffic Safety Administration. 2020. Automated Driving Systems | NHTSA. United States Department of Transportation. Retrieved June 18, 2022 from https://www.nhtsa.gov/vehicle-manufacturers/automated-driving-systems
[18] NHTSA. 2022. Automated Vehicles for Safety | NHTSA. United States Department of Transportation. Retrieved June 23, 2022 from https://www.nhtsa.gov/technology-innovation/automated-vehicles-safety
[19] William Odom, John Zimmerman, Scott Davidoff, Jodi Forlizzi, Anind K. Dey, and Min Kyung Lee. 2012. A fieldwork of the future with user enactments. In Proceedings of the Designing Interactive Systems Conference (DIS ’12), ACM Press, Newcastle upon Tyne, United Kingdom, 338. DOI:https://doi.org/10.1145/2317956.2318008
[20] William Odom, John Zimmerman, Jodi Forlizzi, Hajin Choi, Stephanie Meier, and Angela Park. 2012. Investigating the presence, form and behavior of virtual possessions in the context of a teen bedroom. In Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems (CHI ’12), ACM Press, Austin, TX, USA, 327. DOI:https://doi.org/10.1145/2207676.2207722
[21] Ingrid Pettersson and Wendy Ju. 2017. Design Techniques for Exploring Automotive Interaction in the Drive towards Automation. In Proceedings of the 2017 Conference on Designing Interactive Systems (DIS ’17), ACM Press, Edinburgh, United Kingdom, 147–160. DOI:https://doi.org/10.1145/3064663.3064666
[22] Mike Ramsey. 2015. Self-Driving Cars Could Cut 90% of Accidents. The Wall Street Journal. Retrieved June 23, 2022 from https://www.wsj.com/articles/self-driving-cars-could-cut-down-on-accidents-study-says-1425567905
[23] SAE International. 2018. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. SAE International. DOI:https://doi.org/10.4271/J3016_201806
[24] U.S. Department of Transportation. 2022. DOT Inclusive Design Challenge. U.S. Department of Transportation. Retrieved June 23, 2022 from https://www.transportation.gov/accessibility/inclusivedesign
[25] Waymo LLC. 2021. Waymo Safety Report. Retrieved June 22, 2022 from https://storage.googleapis.com/waymo-uploads/files/documents/safety/2021-12-waymo-safety-report.pdf
[26] Waymo LLC. 2022. Waymo One. Waymo. Retrieved June 21, 2022 from https://waymo.com/waymo-one/
Where Are You Taking Me? Reflections from Observing
Ridesharing Use By People with Visual Impairments
Earl W. Huff Jr. Robin N. Brewer Julian Brinkley
School of Information School of Information School of Computing
The University of Texas at Austin University of Michigan Clemson University
Austin, TX, USA Ann Arbor, MI, USA Clemson, SC, USA
ewhuff@utexas.edu rnbrew@umich.edu julianbrinkley@clemson.edu

ABSTRACT
Ridesharing services have become a popular mode of transportation that holds significant
benefits for people with disabilities unable to operate conventional motor vehicles. Prior
work shows how these services enable people with vision impairments to travel
independently without the use of public transportation or walking. We conducted a study
in which we observed 17 blind or visually impaired participants using the Uber
ridesharing service to explore the social and accessibility dynamics they perceived during
their experiences. This paper presents a case study of the process used for our research,
reflecting on aspects that considerably impacted the study. Key takeaways include study
site considerations, recruiting participants, balancing realism with participant safety, and
the unintended side effects of observations in a ridesharing context. The reflection points
provided will help readers consider important study aspects when conducting
observations with participants with disabilities, particularly in a ridesharing setting.

CCS CONCEPTS
• Human-centered computing → Accessibility; • Applied computing → Transportation;
• Social and professional topics → People with disabilities.

KEYWORDS
ridesharing, observation study, people with visual impairments

ACM Reference Format:
Earl W. Huff Jr., Robin N. Brewer, and Julian Brinkley. 2022. Where Are You Taking Me?
Reflections from Observing Ridesharing Use By People with Visual Impairments. In The
24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS
’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages.
https://doi.org/10.1145/3517428.3551355

Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3551355

1 INTRODUCTION
Research has indicated that as many as 20% of people with disabilities in the U.S. face
transportation barriers, with 45% of this group lacking access to a personal passenger
vehicle [16]. This problem especially applies to persons with visual disabilities. Unlike
amputees or persons with partial paralysis, there are no commercially available assistive
technologies that enable persons who are blind [14] or those with significant low vision
to operate conventional motor vehicles [7]. Ridesharing is an increasingly popular form
of transportation that has played an important part in reshaping public transportation.
These real-time services, led by companies like Uber and Lyft, are an increasingly
utilized on-demand mobility option whose use in the United States has grown from 15%
to 36% over the past four years [10]. While these services generally add to the growing
transportation options for society, they also hold significant mobility potential for
disabled persons unable to operate conventional motor vehicles.

Ridesharing, therefore, may be a viable option for people with disabilities to travel
independently without reliance upon friends, family, public transportation, or walking.
However, there is a growing but arguably insufficient body of research focused on the
accessibility of popular ridesharing services for persons with visual disabilities. Existing
work has looked at trust-building, driver-rider collaboration, situational awareness, and
the roles of both driver and passengers in ridesharing [3, 4, 12]. These studies classified
the means through which blind or visually impaired (BVI) people establish trust and
engage in collaborative navigation with the driver for situational awareness and location
verification. While these recent studies provide relevant data about past experiences with
ridesharing services, they primarily rely on self-report research techniques. Such methods
may incur response and recency bias or lack detail about an individual’s experience. Our
paper presents a case study of how we conducted a series of quasi-naturalistic
simulations of the end-to-end process for using a ridesharing service by people with
visual impairments to better understand the real-world experiences of BVI people in the
ridesharing context. The remainder of this paper is organized as follows: section 2
provides some background on ridesharing and people with disabilities; section 3
describes in detail the study setup and data collection process; section 4 details our
observations regarding the study execution, participant behavior, and main takeaways
from our work; section 5 provides future directions and the conclusion.

2 BACKGROUND
2.1 People with Disabilities in Ridesharing
Prior research on ridesharing services has investigated the experience from both driver
and rider perspectives. From the rider’s perspective, prior work has looked at the benefits
of ridesharing, such as providing lower-cost transportation alternatives to taxis, and
challenges in using their services [4, 8, 11, 12]. More recently, research has focused on
ridesharing services for people with visual impairments. The current research on the
topic examined how visually impaired persons can benefit from ridesharing services and
what challenges may affect the adoption of their use. Results revealed that independence
was the most significant positive of using ridesharing services: the increased mobility
without relying on sighted people to take them to their destinations [4, 12]. The biggest
challenges in using ridesharing services were 1) the inaccessibility of the ridesharing
app, 2) locating the right vehicle upon its arrival, 3) getting assistance to the destination
upon arrival, and 4) drivers’ lack of experience working with people with disabilities
[4, 12]. The studies provide a strong foundation for future discussions on ridesharing
experiences for people with vision impairments and serve as the basis for our
quasi-naturalistic approach by collecting observation data as participants use a
ridesharing service in a real-world setting.

2.2 Factors Impacting Adoption of Transportation
Prior work on people’s adoption of emerging transportation technologies reveals the
importance of trust and its effect on their willingness to adopt [5, 6, 8, 9, 13, 15]. Choi
and Ji’s study revealed that trust and perceived usefulness were the most significant
factors motivating people’s adoption of autonomous vehicles [6]. Further studies also
support the findings of Choi and Ji by reporting that the reliability of autonomous
systems is an additional factor for technology adoption [13, 15]. Trust is also a
significant factor in the adoption of ridesharing services. Related works reveal how
transparency and safety assurance are vital to fostering trust between riders and drivers
[8, 9]. Dillahunt’s work reports on mistrust of Uber due to past negative experiences [8].
Although prior work dealt with sighted people, the same concerns and barriers can affect
visually impaired riders. In our study, we wanted to examine the interaction between
drivers and visually impaired riders, with observation as our preferred data collection
method. Using observations, our goal was to understand how BVI participants used
ridesharing in a real-world experience.

3 STUDY METHOD
We conducted a quasi-naturalistic study where participants initiated and completed a
ride to a pre-defined destination using the Uber ridesharing service, after which they
were interviewed about their experience. For the context of this paper, we do not include
the findings of the post-study interviews, as we present an account of the process used
to conduct this study. The purpose of this case study is to present readers with a
narrative of how specific decisions in our methodology impacted the research execution
and produced certain outcomes and side effects. We believe these details are essential
for researchers to know when considering conducting observations in ridesharing
research with people with disabilities.

3.1 Recruitment
We partnered with an organization serving persons with visual impairments in
North-Central Florida to recruit participants. Interested individuals could participate in
the study if they fulfilled the following criteria: 1) age 18 or above and 2) considered
themselves to be a visually impaired person based on an accompanying definition
describing visual impairment as blindness or limited vision not correctable by
conventional correction such as glasses or contact lenses. Additionally, we screened
participants before the consent and scheduling process by telephone. As part of the
screening process, they expressed an absence of physical or cognitive disabilities that
would prevent the use of a mobile device, challenges with ingress or egress from a
passenger vehicle, or a history of motion sickness. Lastly, we required participants to
have used an Android or Apple smartphone within the past six months from the date of
the study session. The first author’s Institutional
riders to understand the social dynamics in a more naturalistic Review Board approved the study, and each participant provided
setting. written informed consent on the day of their study session. We
compensated participants with a $40 prepaid gift card for their
participation.
2.3 People with Disabilities in HCI Research
Within the HCI community, we seek to understand users and their
context to obtain insights into how they use technology. We achieve 3.2 Participant
such goals through user research methods such as surveys and in- We recruited 17 people to participate in the study. In summary,
terviews. However, an area of concern in HCI research is the lack of participants were 20 to 88 years old (mean = 55). Regarding vehicle
inclusion of people with disabilities in such studies. They are often ownership, 70.6% of participants indicated that they own a vehicle.
excluded because, depending on their disability, they would require When asked about modes of transportation, 59% indicated getting
adapting existing methods to accommodate their needs and the “im- a ride from others in diferent forms of passenger vehicles, while
plicit bias against disabled subjects as producers of knowledge”[17]. 41% indicated using public transportation (i.e., bus, taxi, shuttle,
Such exclusion promotes exclusive technology design and further and ridesharing). During the screening and scheduling process,
alienates people with disabilities from participating actively in the we briefy presented participants with functional defnitions of
community. Within the current literature, however, observations blindness and low vision. We asked them to choose the defnition
are a method to include persons with disabilities as producers of that best characterized their degree of vision loss. On the day of
knowledge in a way that researchers can collect in-depth informa- their study session, we again presented them with these defnitions
tion about their personal experiences with technology that cannot and asked them to indicate their medically diagnosed visual acuity
be captured using self-report techniques [1, 2]. The emphasis on in their better-seeing eye with conventional correction if known.
allowing participants to carry out their natural routine in a non- Three participants self-identifed as blind, and 14 self-identifed
laboratory setting while still gaining some understanding of their as low vision. Many of the participants (41%) have been visually
lives using technology was why we decided to use observations impaired for less than ten years, 35.3% have been visually impaired
Where Are You Taking Me? Reflections from Observing Ridesharing Use By People with Visual Impairments ASSETS ’22, October 23–26, 2022, Athens, Greece

between 10 to 25 years, and 23.7% have been visually impaired since birth.
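As a quick arithmetic check on the proportions reported above, the rounded percentages can be reproduced from counts out of N = 17. The counts below are inferred from the rounded percentages in the text, so this is an illustrative sketch rather than the study's actual tabulation:

```python
# Recompute the Section 3.2 participant percentages from raw counts.
# Counts are inferred from the rounded percentages reported in the text
# (N = 17); they are illustrative, not drawn from the study's raw data.
N = 17

counts = {
    "own a vehicle": 12,          # reported as 70.6%
    "self-identified blind": 3,
    "self-identified low vision": 14,
    "impaired < 10 years": 7,     # reported as 41%
    "impaired 10-25 years": 6,    # reported as 35.3%
}

for label, count in counts.items():
    print(f"{label}: {count}/{N} = {100 * count / N:.1f}%")
```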
3.3 Tools

We purchased two smartphones for the study. Due to technical issues, three phones, two Android and one Apple, were ultimately used across the trials. All phones had the Uber ridesharing application installed, and each phone had a preconfigured payment option to cover the anticipated cost of the participant's trip. We asked participants to wear a GoPro Hero 7 Black using a harness accessory to record the interaction between the participant and the ridesharing driver for data collection and safety purposes. We used a set of hand-held communication devices for communication between the research team and the participants. We instructed participants that if they felt ill, unsafe, or otherwise desired to stop participating, they should use the communication device to contact a research team member to request a ride back to the study staging area. None of the participants fell ill or decided to stop participating in our study.

3.4 Procedure

Each trial lasted approximately an hour and a half, with an identical procedure for each session. After the consent form and demographic questionnaire were completed, we provided participants with the following: (1) an Android or Apple smartphone with accessibility features configured to their specifications, depending upon preference and availability; (2) a two-way communication device to communicate with the research team during the ridesharing trip; and (3) a GoPro action camera harnessed to their torso to record their experience during the trip. Before the trip, one research team member trained the participant on how to use the two-way communication device and explained how the GoPro camera worked. We provided participants with details on the study scenario and instructed them to begin their tasks. The two-way communication device and the GoPro camera were provided to 1) maintain communication with the participant, 2) maintain awareness of the participant's current status in terms of health and location, and 3) ensure the participant's safety during the trip. We informed the participant that a research team member would be following them in a separate vehicle for their safety. The participant was instructed to use the two-way communication device at any time during the ride if they wished to stop participating or required medical attention. Participants were asked to attempt a one-way ride using the Uber mobile application as part of a four-task scenario. Participants were asked to: 1) request a ride to a pre-defined location, 2) locate the Uber vehicle upon its arrival at the point of origin, 3) enter the vehicle, and 4) exit the vehicle upon arrival at the destination.

We instructed participants to request the ride by selecting the "UberX" option within the Uber mobile application to start an individual ride (as opposed to carpool options), and we assisted them, as needed, in entering the destination and selecting the UberX vehicle. Once a ride was confirmed, participants were led outside to wait for the Uber driver. Once the Uber vehicle arrived, participants began interacting with the driver and entered the vehicle. During the trip, a member of the research team, equipped with a paired two-way communication device, followed the ridesharing vehicle in a separate vehicle. This measure was to ensure the participant's safety and to aid the Uber driver in navigation in the event they got lost along the route. Upon arrival at the final destination, this research team member sought written informed consent from the Uber driver and asked them to complete a photo/video release form. We collected informed consent from the driver after the trip out of concern that informing the driver about the study beforehand would cause them to act in a manner unnatural to their regular working routine, potentially taking away from the natural interaction between the driver and the participant. The research team member then walked with the study participant to the staging area, where the recording equipment was removed and the post-scenario interview was conducted.

4 OBSERVATIONS

We present an account of our observations during the ridesharing study. We examine the pros and cons of the different aspects of each phase of the study.

4.1 Participants and Study Site

With any accessibility research, recruiting participants with disabilities, especially when focusing on specific disabilities, is a common challenge. An even more significant challenge is recruiting participants with disabilities in one geographic location to conduct a field study of ridesharing experiences. One solution is contacting local advocacy groups or chapters of national organizations serving people with disabilities, who may have a contact list of members from whom they can help recruit. In our case, we reached out to a school for the blind in North Central Florida that could help us recruit blind or visually impaired participants for our study. We selected this school because a team member had prior experience working with them and had built a good rapport. As a result of the arrangement, the team agreed to conduct the study at the school, since the participants would generally be attendees of the school.

The general area in which the school resided provided an ideal site for the study. There were many public landmarks to serve as destinations for the trips; most roads had a speed limit of no more than 45 MPH, and the roads contained multiple lanes, which was ideal for having a team member follow the ridesharing vehicle. In addition to the road conditions, the school offered sufficient space for conducting the pre-ride and post-ride phases of the study. We used four rooms within the school: one for the pre-ride phase, another to prepare the participants with the necessary equipment for their ride, and two to conduct one-on-one post-ride interviews. We used two interview rooms because one session might not end before the next began, in which case two participants would be interviewed concurrently.

4.2 Pre-Ride Setup and Procedure

The initial setup for the study involved preparing the tools the participants would be using. For our study, we used two smartphones, one running the Android operating system and the other Apple's iOS. We set up each phone with the necessary assistive technologies for participants to use. For Apple, VoiceOver is the built-in mobile screen reading feature; for Android-based devices, TalkBack (at the time of the study) is the screen reading
application. We intentionally selected the Uber ridesharing application because of the higher availability of drivers in the area compared with Lyft at the time. We configured accounts with a credit card saved in the Payments section. We also saved the destination location in the app to make it easier for participants to select during the ride-booking process. We instructed participants to go through the process of ordering a ride from the Uber application using their phone of choice. During this phase, we experienced some technical issues, such as inconsistencies in the payment being processed for rides. There was a problem with the account we created for the phones and with using the saved payment options to book the rides. The issue was less prevalent on the Android-based smartphone than on the Apple smartphone (although we believe the issue was not specific to either mobile OS). As a working solution, we decided to use an Android-based smartphone; however, we knew this would be an issue for participants accustomed to Apple smartphones. They were nevertheless able to adjust quickly, with some assistance from the research team. As participants began using the application, there were difficulties navigating the user interface (UI). At times, they could not move from the map interface to the Uber service selection menu at the bottom of the screen. Other times, they had challenges navigating from selecting the Uber vehicle to the payment confirmation menu. These issues were brought up again in the post-study interviews. Some participants mentioned they would have preferred the ability to use voice to schedule the ride rather than interacting with the UI. Another tool was the GoPro Hero 7 Black camera mounted onto a harness that participants wore for the trip. The harness wraps around the user's upper body and sits on their chest. The range of adjustment for the harness was limited, which made it uncomfortable for some participants. While it captured much of the interaction with the research team and the ridesharing driver, the camera positioning did have a limited range of video coverage.

4.3 Consent Process for Driver and Passenger

As is general research practice, we obtained informed consent from the BVI participants before we began each session. However, for the drivers involved in our study, we decided to collect informed consent after they completed the trip with the participant passengers. We needed their consent in order to use the video from the participant's camera and to obtain their acknowledgment that they willfully took part in the experience. We intentionally waited until after the trip to maintain the natural interaction between driver and rider: we felt the drivers would act out of their typical behavior if they discovered the nature of our research before the ride. However, this proved to be a moot point, as a few of the drivers stayed in the area near the study site to accept subsequent ride requests. We believe this was the result of the incentive offered to drivers: a $25 tip (set relative to the tip suggested by the app for the short trip requested) as compensation. This led some drivers to stay close to the study site to increase their chances of being selected for subsequent requests. Further, because the same drivers would pick up the participants and knew the purpose of the study, the natural interaction with the passenger was lost; instead, an incentive-based bias may have influenced the drivers' behaviors toward the visually impaired participants. One way we could have alleviated this issue was to run the sessions at different times of the day to increase the odds of getting new drivers. Similarly, we could have scheduled the sessions on different days rather than on consecutive days to avoid creating a pattern in our ridesharing runs. The challenge lies in not knowing which drivers will be in the vicinity of the site, and when.

4.4 Ridesharing Experience

After being equipped with the tools for the study and booking a ride, participants made their way to the building entrance to wait for their ride. Some participants noted that the screen reader would speak the driver's name but not the make or model of the vehicle, nor the license plate number, which was a concern for them. When the vehicle arrived, a team member approached the driver to inform them that the participant was the intended rider but did not reveal that the ride was part of a research study. A team member followed the ridesharing vehicle in a separate car during the ride. The participant was aware of this, but the driver was not until they reached the destination. We did this so that if the participant no longer wished to participate or experienced motion sickness, they could contact the team member via the two-way radio to end the study, and the team member would take them back to their vehicle. Additionally, if the driver got lost getting to the destination entrance, the team member could assist with navigation. There were instances where the driver had difficulties finding the entrance to the destination, and the participant contacted the driving team member to assist. Once both vehicles had parked at the destination, the team member approached the driver to explain the purpose of the study and their involvement, and asked the driver to sign an informed consent form agreeing to be part of the study and a media release form granting permission to use the recording from the camera. Additionally, the drivers were compensated with an appropriate tip for their participation. No driver declined to participate or refused to have the recording released. Afterward, the participant was assisted into the team member's vehicle and driven back to the study site.

Upon reflection, we believe a possible flaw in the study design was the set of safety measures for the participant, namely the camera given to the participant, the two-way radio, and the team member driving alongside the ridesharing vehicle. Those elements may have given the participant a greater sense of safety and security than a typical ride without them and could have impacted their natural behavior and interaction with ridesharing drivers. Another consideration is the time of day when the study was conducted. All sessions were conducted in the morning and early afternoon, when there was less traffic. However, the time of day may have impacted the availability of drivers for the study, as suggested by the few drivers who repeatedly accepted our ride requests by staying within the area once they knew about the study. We believe staging the sessions during evening or morning rush hours may have yielded more drivers (although not guaranteed different drivers every time).

4.5 Takeaways

We provide several takeaways from our experience conducting our study to inform the community who may be interested in carrying
out a similar study with visually impaired or other persons with disabilities.

4.5.1 Feasibility of Recruitment. The feasibility of our study rested on finding a location where our population of interest would be available; in our case, a school for the blind or visually impaired. When conducting an observation under quasi-naturalistic or naturalistic settings with people with disabilities, being able to stage the study at a location that serves or can help recruit the intended participants is one of the most important considerations. Even though no team member resides in the state of Florida, we agreed that the close ties with the school and the increased availability of participants with visual impairments made the site an ideal area to host our research. We suggest contacting local agencies serving people with disabilities, advocacy groups, or local chapters of disability organizations to find an ideal location to host the research, and asking whether they can assist with recruiting participants.

4.5.2 Technology Barriers. A major obstacle in conducting our study was the tools we used, specifically smartphones. The technical issues we encountered using the Uber application forced us to use one type of smartphone device (Android). While we had multiple Android smartphones, participants preferred using an Apple smartphone. As a result, participants needed an additional period to adjust to using the Android device. However, we did discover participants' perceptions of using the Uber application during the setup process. While some participants found the application generally accessible, others preferred multimodal interaction support, as they found the UI challenging to navigate even with their assistive technologies. We advise readers to 1) thoroughly examine and test all equipment before the study and 2) research and provide options for assistive technologies when working with participants with disabilities. We suggest running a series of mock simulations to ensure the equipment will work as planned.

4.5.3 Considering Roles of Stakeholders. An implication of our research concerns the roles of the stakeholders in our research. While the BVI participants engaged in the entire process for most of the study, we did not fully consider the role of the drivers beyond driving the participants to the destination. We decided to withhold information regarding the study and the participant's role until the trip's conclusion to avoid unnatural interactions between the driver and passenger. However, we did not consider what might happen after we informed the driver about our study and compensated them. For example, one driver remained in the area closest to the study site to increase their chance of receiving subsequent requests and earning additional compensation. While canceling their ride was an option, we could not guarantee another available driver to take the request. Hence, we believe the recurrence of the same drivers, influenced by an incentive-based bias, could have negatively impacted the genuine interaction with other passengers. Consideration should be given to running the study at different times of the day to help find different drivers.

4.5.4 Safety vs Realism. An unforeseen issue arose from the various safety measures used to protect the participant. Using a camera, a two-way radio, and a following vehicle from the team provided a measure of security for the participants; however, it may also have affected their natural behavior and interaction in a ridesharing vehicle. The camera alone could have made the driver hesitant to engage, since they were being recorded. When deciding on the safety precautions, we prioritized the participants' safety at the expense of some of the natural progression of a ridesharing simulation. Another concern was approval from our institution's IRB: it might not have been possible to obtain approval had we not included these measures. When conducting observations of people with disabilities, safety should always be the priority. The challenge becomes ensuring their safety while mitigating the loss of their natural interaction in the studied context. We advise readers to carefully consider all possible side effects of any props, equipment, or processes used in the study and how they may intentionally or unintentionally affect any aspect of the user experience.

5 CONCLUSION

We presented a case study of our experience conducting quasi-naturalistic observations of blind or visually impaired people using the Uber ridesharing service. Reflecting on our experience, the study offered many relevant findings regarding how passengers with visual impairments engage in ridesharing trips with their drivers and how ridesharing generally compares with other modes of transportation they use. The process we used was not without its drawbacks, such as technical issues with the equipment, the unintended side effects of being present during the trip, and the use of certain safety equipment. The lessons learned from our study will help us plan a future study involving observations of people with disabilities using ridesharing services. We present several takeaways for readers to consider when conducting their own observations in a ridesharing context, to help them avoid some of the challenges encountered in our work.

REFERENCES
[1] Khaled Albusays, Stephanie Ludi, and Matt Huenerfauth. 2017. Interviews and Observation of Blind Software Developers at Work to Understand Code Navigation Challenges. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (Baltimore, Maryland, USA) (ASSETS '17). Association for Computing Machinery, New York, NY, USA, 91–100. https://doi.org/10.1145/3132525.3132550
[2] Pranjal Protim Borah and Keyur Sorathia. 2019. Direct Observation of Tactile Geometric Drawing by Visually Impaired and Blind Students. In Proceedings of the 10th Indian Conference on Human-Computer Interaction (Hyderabad, India) (IndiaHCI '19). Association for Computing Machinery, New York, NY, USA, Article 11, 10 pages. https://doi.org/10.1145/3364183.3364185
[3] Robin N. Brewer, Amy M. Austin, and Nicole B. Ellison. 2019. Stories from the Front Seat: Supporting Accessible Transportation in the Sharing Economy. Proc. ACM Hum.-Comput. Interact. 3, CSCW (Nov. 2019), 95:1–95:17. https://doi.org/10.1145/3359197
[4] Robin N. Brewer and Vaishnav Kameswaran. 2019. Understanding Trust, Transportation, and Accessibility through Ridesharing. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM Press, Glasgow, Scotland, UK, 1–11. https://doi.org/10.1145/3290605.3300425
[5] Julian Brinkley, Brianna Posadas, Julia Woodward, and Juan E. Gilbert. 2017. Opinions and Preferences of Blind and Low Vision Consumers Regarding Self-Driving Vehicles: Results of Focus Group Discussions. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17). ACM Press, Baltimore, Maryland, USA, 290–299. https://doi.org/10.1145/3132525.3132532
[6] Jong Kyu Choi and Yong Gu Ji. 2015. Investigating the Importance of Trust on Adopting an Autonomous Vehicle. International Journal of Human–Computer Interaction 31, 10 (Oct. 2015), 692–702. https://doi.org/10.1080/10447318.2015.1070549
[7] Audrey Demmitt. [n. d.]. The Transportation Problem: Finding Rides When You Can't Drive - Visually Impaired: Now What? - VisionAware. https://www.visionaware.org/blog/visually-impaired-now-what/the-transportation-problem-finding-rides-when-you-can%E2%80%99t-drive/12
[8] Tawanna R. Dillahunt, Vaishnav Kameswaran, Linfeng Li, and Tanya Rosenblat. 2017. Uncovering the Values and Constraints of Real-time Ridesharing for Low-resource Populations. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI '17). ACM Press, Denver, Colorado, USA, 2757–2769. https://doi.org/10.1145/3025453.3025470
[9] Tawanna R. Dillahunt and Amelia R. Malone. 2015. The Promise of the Sharing Economy Among Disadvantaged Communities. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (Seoul, Republic of Korea) (CHI '15). ACM, New York, NY, USA, 2285–2294. https://doi.org/10.1145/2702123.2702189
[10] Jingjing Jiang. 2019. More Americans are using ride-hailing apps. https://www.pewresearch.org/fact-tank/2019/01/04/more-americans-are-using-ride-hailing-apps/
[11] Vaishnav Kameswaran, Lindsey Cameron, and Tawanna R. Dillahunt. 2018. Support for Social and Cultural Capital Development in Real-time Ridesharing Services. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal, QC, Canada) (CHI '18). ACM, New York, NY, USA, 342:1–342:12. https://doi.org/10.1145/3173574.3173916
[12] Vaishnav Kameswaran, Jatin Gupta, Joyojeet Pal, Sile O'Modhrain, Tiffany C. Veinot, Robin Brewer, Aakanksha Parameshwar, Vidhya Y, and Jacki O'Neill. 2018. 'We Can Go Anywhere': Understanding Independence Through a Case Study of Ride-hailing Use by People with Visual Impairments in Metropolitan India. Proc. ACM Hum.-Comput. Interact. 2, CSCW (Nov. 2018), 85:1–85:24. https://doi.org/10.1145/3274354
[13] J. Lee, H. Chang, and Y. I. Park. 2018. Influencing Factors on Social Acceptance of Autonomous Vehicles and Policy Implications. In 2018 Portland International Conference on Management of Engineering and Technology (PICMET). 1–6. https://doi.org/10.23919/PICMET.2018.8481760
[14] National Federation of the Blind. 2019. Blindness Statistics. https://nfb.org/resources/blindness-statistics
[15] Arulanandam Jude Niranjan and Geert de Haan. 2018. Public Opinion About Self-Driving Vehicles in the Netherlands. In Proceedings of the 36th European Conference on Cognitive Ergonomics (ECCE '18). ACM, New York, NY, USA, 19:1–19:4. https://doi.org/10.1145/3232078.3232080
[16] U.S. Department of Transportation. [n. d.]. Transportation Difficulties Keep Over Half a Million Disabled at Home | Bureau of Transportation Statistics. https://www.bts.gov/archive/publications/special_reports_and_issue_briefs/issue_briefs/number_03/entire
[17] Anon Ymous, Katta Spiel, Os Keyes, Rua M. Williams, Judith Good, Eva Hornecker, and Cynthia L. Bennett. 2020. "I Am Just Terrified of My Future" — Epistemic Violence in Disability Related Technology Research. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA '20). Association for Computing Machinery, New York, NY, USA, 1–16. https://doi.org/10.1145/3334480.3381828
A Collaborative Approach to Support Medication Management
in Older Adults with Mild Cognitive Impairment Using
Conversational Assistants (CAs)
Niharika Mathur Kunal Dhodapkar Tamara Zubatiy
nmathur35@gatech.edu kunal.dhodapkar2012@gmail.com tzubatiy3@gatech.edu
School of Interactive Computing School of Interactive Computing School of Interactive Computing
Georgia Institute of Technology Georgia Institute of Technology Georgia Institute of Technology
Atlanta, Georgia, USA Atlanta, Georgia, USA Atlanta, USA

Jiachen Li Brian D.Jones Elizabeth D. Mynatt


jli986@gatech.edu brian.jones@imtc.gatech.edu e.mynatt@northeastern.edu
Georgia Institute of Technology Institute for People and Technology Northeastern University
Atlanta, USA Georgia Institute of Technology Boston, USA
Atlanta, USA
ABSTRACT
Improving medication management for older adults with Mild Cognitive Impairment (MCI) requires designing systems that support functional independence and provide compensatory strategies as their abilities change. Traditional medication management interventions emphasize forming new habits alongside the traditional path of learning to use new technologies. In this study, we navigate designing for older adults with gradual cognitive decline by creating a conversational "check-in" system for routine medication management. We present the design of MATCHA - Medication Action To Check-In for Health Application, informed by exploratory focus groups and design sessions conducted with older adults with MCI and their caregivers, alongside our evaluation based on a two-phased deployment period of 20 weeks. Our results indicate that a conversational "check-in" medication management assistant increased system acceptance while also potentially decreasing the likelihood of accidental over-medication, a common concern for older adults dealing with MCI.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility.

KEYWORDS
mild cognitive impairment, older adults, medication management, conversational assistants

ACM Reference Format:
Niharika Mathur, Kunal Dhodapkar, Tamara Zubatiy, Jiachen Li, Brian D. Jones, and Elizabeth D. Mynatt. 2022. A Collaborative Approach to Support Medication Management in Older Adults with Mild Cognitive Impairment Using Conversational Assistants (CAs). In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544830

This work is licensed under a Creative Commons Attribution International 4.0 License.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3544830

1 INTRODUCTION
In this research, we explore the potential of supporting medication management in older adults with Mild Cognitive Impairment (MCI) through the use of Conversational Assistants (CAs), which include smart speakers and assistants such as the Google Assistant, Siri, Amazon Alexa, or Microsoft Cortana. They provide natural language-based support to enable access to information and services for everyday interactions [7]. As of 2019, 20% of people over the age of 60 owned a smart speaker in their homes [18]. While adoption rates by older adults are lower, many older adults have reported feeling "supported" and "empowered" while interacting with any form of smart assistant in their homes [21, 46]. Assistants equipped with distinctive personality characteristics such as voice modulations, learning behaviors, and advanced natural language processing contribute to the anthropomorphization of these devices by users, particularly older adults, who tend to draw upon their experiences and behaviors cultivated over their lifetime. Our work is motivated by these observations, seeking to foster a welcome connection between users and CAs that can positively influence healthy behaviors such as effective medication management.

Most medication management interventions support building habits over time by providing an integrative system for reminders or alerts [5, 31]. An additional challenge we address in our work is designing an interactive medication management system for an aging population diagnosed with gradual cognitive decline. Mild Cognitive Impairment is defined as an intermediate stage of cognition situated between the expected decline due to aging and the more significant decline associated with dementia and Alzheimer's disease [33]. It is estimated that between the years 2012 and 2050, the US population over the age of 60 is expected to double [29],
and about 16.6% of the population above 65 will develop MCI [34]. Symptoms of MCI, which include varying degrees of memory loss, language problems, and loss of attention, among many others, have been shown to gradually worsen, and very few patients show a significant cognitive improvement. It is estimated that approximately 14.9% of individuals with MCI will progress to dementia within 2 years and 38% will progress to dementia within 5 years [16]. Therefore, while we do want to encourage and support habit-building in older adults with MCI through our design, we also attempt to understand the optimum level of persistence and support required. We aim to find a balance between a system that delivers repeated notifications (which may be suitable for people with dementia or a more advanced cognitive decline than MCI) and one that takes into account the varying cognitive strengths of people with MCI and remains optimally attentive to sustain healthy medication goals, including preventing accidentally taking the same medication twice (referred to as "over-medication" throughout this paper). With our work, we aim to ground MCI as a critical period to empower older adults to develop habits and compensatory strategies that they can rely on if or when their cognitive abilities decline further.

A key foundation for our work is existing studies of older adults with MCI and their usage of CAs for daily activities, such as in [46]. CAs have been shown to provide useful and usable support for older adults with MCI across a variety of functions, from information searching to calendaring, especially when sufficient training is provided. For the purpose of our study, we chose Google's CA, the Google Home Hub, a visual assistant with an 8 x 6 inch touch screen and a built-in Google Assistant. A deciding factor in choosing this device was the on-screen display of voice commands and responses, eliminating the need for our participants to remember what they said. In this paper, we first describe our exploratory user research process, which informed our design of an interactive medication assistant system built within the Google Home Hub. We then discuss the deployment of this system in households that included older adults with MCI, focusing on their usage and engagement patterns over the course of 20 weeks divided between 2 phases. In this two-phase research study, we explore the impact of 'learning from use' and inform design revisions based on usage and feedback from Phase 1. Finally, we summarize our findings from both phases and frame opportunities for future work.

1.1 Contributions
With this work, we aim to offer the following contributions to the larger research community:

• We offer insights into the existing medication management strategies and habits of older adults with MCI and their expectations for an interactive medication management system that integrates with these strategies. We also identify opportunities for this system to address gaps in existing strategies as well as limitations of traditional alarm- and reminder-based interventions.

• We present the design of an interactive system built within a CA for flexible medication management and deploy the design in a two-phase research study. Our system design is informed by possible usage scenarios drawn from exploratory user research sessions with older adults with MCI and their carepartners.

• We articulate how building a medication system that prompts reflection from its users and focuses on "checking in" rather than traditional reminders or alarms led to increased engagement with the system over time while also discouraging accidental over-medication in older adults with MCI.

2 RELATED WORK
In this section, we discuss why medication management in older adults with MCI remains a challenging area of research and the various factors that can lead to the ineffectiveness of existing interventions in the context of older adults. We also discuss the additional and peripheral cognitive challenges that they face, and highlight the need for more empathetically designed systems. We then review the adoption of CAs by older adults and identify potential areas of opportunity.

2.1 Medication adherence in older adults: A persistent challenge
Many past studies have sought to reduce barriers to effective medication management in aging populations. In [25], Martin et al. discuss various psychological needs and requirements for medication-related artifacts representing information to older adults, such as text size, language used, and amount of information presented. Fulmer et al. note that cognitive power and mental health conditions such as depression can significantly impact medication behaviors [13]. Studies report that 25 to 59% of older adults above 65 are not able to take medications as prescribed and report higher instances of complications arising as a result [4, 39].

These challenges are further complicated by neurological issues associated with MCI and aging, such as dexterity limitations, vision loss, and memory loss. Some older adults with MCI also struggle with reduced awareness of their circadian rhythms, placing additional stress on their ability to adhere to routines [8, 22]. Low adherence is also attributed to a lack of personalized and alternate ways to keep track of medications and of efficient contextual reminders, despite a felt need for them [14]. Medication management forms an integral component of the daily lives of older adults: they have scheduled medications not only for specific health issues, but also for preventive and maintenance use, such as vitamin and dietary supplements. Remembering to take medications multiple times a day can be challenging for older adults with MCI and also for their caregivers, who often have a parallel medication schedule of their own [2]. Medication management for older adults remains an essential area of research, as poor adherence to prescribed medicines is considered a significant health challenge [15] and one of the leading causes of emergency complications arising from issues related to heart disease, diabetes, hypertension, and mental health [4, 15].

2.2 Role of technology in medication management in older adults
A number of research prototypes and commercial products aim to incorporate technology into systems for medication management. However, the bulk of studies with older adults and their interactions
with technology currently focus on the roadblocks stemming from the technological stereotypes associated with older adults, who are often "defined by their deficits rather than capabilities" [32, 44]. Currently, most commonly available tech-based interventions to support medication management are aided by the use of smartphones [41], alarms, reminders [23], automated pill dispensers [28], and digital calendars [3]. Similar to other technologies oriented to the consumer market, there are many challenges associated with using these interventions with older adults. Some older adults, particularly those with a relatively recent diagnosis of MCI, report feeling distressed by a lack of control over their ability to maintain a schedule. This fear of losing control and independence also becomes a contributing factor in their hesitation to adopt newer technologies [32]. There is a need to develop systems that provide a degree of functional independence to individuals with MCI while also supporting them through the journey of cognitive transition stemming from MCI. Other challenges with technological interventions arise primarily out of the friction associated with their costs, on-boarding, and technical complexity [31]. Patel et al. [30] reviewed these challenges in a usability and workload comparison of 21 commercially available electronic medication adherence products with older adults and caregivers, such as in [9]. They reported that the usability of these products is significantly lower than the national average usability scale score [30]. An analysis of 3 off-the-shelf commercially available automatic pill dispensers - Hero, Pria, and Medacube - revealed an average annual cost of more than $300 with additional on-boarding charges. The setup and maintenance requirements of pill dispensers are also challenging for older adults [37]. Additionally, most of the existing strategies, consisting of reminders and alarms, introduce the risk of accidental over-medication by simply reminding users to take medication. When reminded about taking medication, older adults with MCI have a high chance of not being able to recall whether they have already taken the medication, and may take it again [2]. Reminders and alarms also run the risk of becoming too persistent by requiring the user to keep snoozing them or asking them to stop [38], leading to a drop in engagement levels over time resulting from alarm fatigue [32]. While there is some research about specialized medication management systems specific to older adults [31], very little, if any, research proposes the integration of such systems within the existing practices of older adults with cognitive deficits, while also involving them in the design process. There is potential to develop systems which integrate and utilize the connection that older adults form with CAs and provide them with a more coherent experience [36]. Given this, our work strongly advocates and aims to represent an assets-based design approach [19] which effectively integrates with the existing strategies used by older adults to manage medications.

2.3 Adoption and use of CAs by older adults
In recent years, there has been a substantial amount of research that explores the potential of CAs, either embedded within smartphones or as standalone devices, in helping older adults with health and well-being, as well as for entertainment purposes, given their interactive and multimodal input capability [6, 11, 45, 46]. In [27], Morrow et al. develop a framework adapting the use of CAs from pedagogical purposes to promoting self-care for older adults. Their framework highlights the assistant's need to function dynamically through the evolving stages of any type of cognitive deficit, from acceptance to establishing routines and finally sustaining engagement with the system. While most of the existing work focuses on older adults and their interactions with CAs, the challenge of designing for compensatory cognitive support is underexplored, with the exception of a few studies that address cognitive behaviors as a result of MCI [20, 45, 46]. These studies emphasize following a user-centered approach for personalization given the diversity in the MCI spectrum, accounting for hearing and dexterity limitations, and incorporating pleasant interactions that can sustain engagement beyond the study. Based on analyses from studies that have explored long-term interactions between older adults and CAs [10, 17, 46], it is observed that the initial interactions were mainly related to entertainment (music, searching, etc.), cooking-related reminders, and alarms related to healthcare monitoring, highlighting the potential for CAs to aid the maintenance of routine tasks. An important aspect of older adults and their usage of CAs is their tendency to anthropomorphize the device as a companion within their home environment [6, 35]. They are likely to interact positively with the system through socially engendered responses as a result of the voice-based interaction, which can facilitate a longer retention of healthcare-related information, as demonstrated by Azevedo et al. in [1]. Waterschoot et al. [42] highlight the need to think about multi-stakeholder interactions with the CA, such as among the older adults, their caregivers, and healthcare professionals, with controlled sharing of information. Designing for gradual behavioral change as a result of aging through multimodal interactions, in the form of a conversational coach and a text-based system, is explored in [12], and the results indicate that a system that provides multiple interaction affordances was positively accepted by the participants. Finally, a critical aspect of designing an interactive system for older adults with MCI is to retain a degree of autonomy and functional independence related to personal health routines and decisions through customizable interaction options [40].

We define our research and design goals to use these insights from existing studies, and the gaps identified within them, to design a medication management system which reduces the complexity of the setup and maintenance process, works with the existing strategies that older adults are used to, takes the cognitive decline and the specific nature of the decline into account while also retaining a degree of user control over the system, is optimally persistent and not too repetitive, and finally, is designed to reduce the instances of over-medication to the extent possible.

3 OVERVIEW OF RESEARCH APPROACH
This research study is conducted within the context of a larger comprehensive cognitive program at a hospital for older adults diagnosed with MCI. This program provides lifestyle and therapeutic interventions with a focus on exercise, nutrition, functional independence, group therapy, and compensatory cognitive strategies, working with older adults with MCI and their caregivers, who are mostly their spouses or adult children. The program aims to "empower" older adults through an array of restorative activities such as yoga, meditation, group sessions, nutrition counseling, smart home installations, etc., with the aim of slowing cognitive decline
and protecting overall brain health. The program refers to individuals with MCI as "members", not patients, and calls their caregivers "carepartners" to signify a mutual partnership and an active commitment to each other. Within this context, the program has ongoing studies about the use of CAs to support members and carepartners in their daily lives. All the participants in our study were enrolled in this year-long program and had an existing diagnosis of MCI. More details about participant characteristics are mentioned in sections 4 and 6.

To gain a better understanding of the day-to-day experience of older adults living with MCI, we attended 4 weekly group sessions offered by the program. During these sessions, "dyads" of members and their carepartners interact with each other and the program staff via a Zoom call, keeping each other updated on their progress and occasionally sharing personal life events. During two of these sessions in Month 1, we introduced them to our study goals and scheduled user research sessions with interested participants. All the participating dyads were located in the same city; however, the research sessions were conducted remotely via Zoom due to COVID-19 research protocols. All members had a diagnosis of MCI from their neurologists and were actively enrolled in the program activities. As active participants, all the dyads in the study had the Google Home Hub installed in their homes for over a month and were aware of the keywords required to initiate conversation with it. We also worked through a Privacy Impact Assessment to understand the privacy questions and concerns addressed by MacLeod et al. in [24] and provided a summary of data collection and management protocols to each participant. The purpose of these research sessions was to gain a better understanding of the existing strategies for medication management as well as the expectations from CAs in the context of medication management. In the next section, we describe these research sessions and the insights gathered, which informed the design goals for our system. We then deployed the system with 7 dyads, in 2 phases, for a total period of 20 weeks. We evaluated usage mid-deployment and iterated on our design between the two phases. The Phase 1 deployment of our study lasted for 4 weeks. Following initial evaluation of usage and incorporating design revisions based on our findings, we then deployed the system for an additional period of 16 weeks (Phase 2). Finally, at the end of Phase 2, we again conducted interaction log analysis and interviews with the participants, which informed our final study takeaways. A visual timeline of the study is shown in Fig. 1.

4 EXPLORATORY USER RESEARCH
To understand existing medication practices and expectations for the system, we conducted two focus group sessions and one scenario-based design session with groups of older adults with MCI and their carepartners. The insights gathered from these user research sessions helped us to formulate concrete design goals for our system. In the following sections, we describe these sessions and our analysis. The participants for these sessions were all recruited and sampled from the cognitive program introduced in section 3. All participating dyads had a daily medication frequency greater than 2 and used a variety of strategies to manage medications that we were interested in learning more about.

4.1 Focus Group Sessions
We conducted 2 focus groups with our participants to understand their existing medication management strategies and to introduce them to our study goals. The sessions were conducted remotely via Zoom and adhered to the COVID-19 research protocols in place at the time. Verbal consent was obtained from all the participants to record the session and share the data with the researchers on the team, as described in the approved Institutional Review Board (IRB) protocol. The participants were all recruited from the cognitive program introduced in section 3. The 1st session had an attendance of 18 dyads, out of which 17 members had their respective spouses as carepartners and 1 member was in a member-daughter dyad structure (36 total participants). The second session was conducted in a similar format with 12 dyads, with 11 members with their respective spouses as carepartners and 1 member in a member-child structure with 3 adult children (1 male; 2 female), located in different cities (26 total participants). We divided the 45-minute sessions into 3 sections, shown to the participants through a shared screen presentation. The 1st section had multiple choice questions that prompted the participants to indicate how often they took medications, where in their home medications were located, and where the CA was situated in their house. In the 2nd section, we focused on facilitating a discussion among the participants around their personal habits or tips for remembering medications. This section also prompted a dialog among the carepartners about their involvement in the medication management routine of the members. In the 3rd section, we informally recruited members to be involved in subsequent design activities as well as the eventual deployment.

After the sessions, we made an affinity diagram from our notes and transcripts to extract key insights from the data gathered. This inductive approach helped us to group the learnings from the session into four categories (Expectations, Habits, Concerns, Current Techniques), presented in Fig. 2. This analysis then helped us to formulate the following design goals from this session:

(1) Traditional alarms or reminders that lack context or labels signifying their purpose, often accompanied by a repetitive sound, are considered unreliable and ineffectual. Participants reported that they often had trouble recollecting what an alarm or a reminder was originally set for. This insight helped us understand the need for specific and unambiguous messaging associated with an alarm or reminder.

(2) Physical or digital spreadsheets, pillboxes, and paper calendars were the most common ways of keeping track of medications. Some members also reported relying on their carepartners to be in control of their medications. While these methods provided a certain level of robustness for the dyads, they also expressed interest in exploring how their CAs can help them streamline this process by reducing the need to remember to check their calendars or sheets every day, often multiple times a day, and to also reduce the carepartner burden.

(3) Participants did not want to completely abandon their existing medication management methods, such as using pillboxes, sticky notes, etc., since these had been a part of their routines for many years. This suggested that any new
Figure 1: Month-wise research study timeline
digital medication management tools that we design for them should work with, and not replace, these existing methods.

4.2 Scenario-Based Design Session
To probe deeper into the learnings from the focus groups and to understand the expectations of the participants for CAs supporting medication tasks, we conducted a 45-minute design session with the goal of involving the participants in the design process. The session was conducted remotely via Zoom due to COVID-19 protocols and had 18 participants, all of whom had also participated in the focus groups previously. The participants for this session were also recruited and sampled from the cognitive program and had a daily medication frequency greater than 2. 8 members attended the session with their carepartners and 2 members attended unaccompanied. While designing the narrative structure of the session, we decided to partially anthropomorphize the CA as a humanoid robot called Rosey, inspired by the popular 60s animated sitcom The Jetsons, in which Rosey worked as the Jetson family's housekeeper. This depiction helped spur engagement with participants during the session by grounding the discussions in a playful, familiar metaphor [43]. We divided the session into 3 parts. In the 1st part, we introduced the session structure and presented example scenarios, such as Rosey reminding them to call their kids. Building on this example, in the second part, we presented the participants with a storyboarded scenario of Rosey interacting with them at home. We kept the conversational aspects of the storyboard interactive using empty speech bubbles, giving the participants the chance to design the conversation between them and Rosey by talking about their dialog preferences and typing them in the speech bubbles. This exercise helped us to understand their expectations for the system. In the 3rd part, we touched upon the degree of involvement that they expect from the system by presenting multiple scenarios through storyboards in which the robot varies its level of persistence with the reminders, asking once, twice, or continually until it receives an answer. We then asked the participants to think aloud and discuss their preferences for these interaction levels. This exploratory design session helped us understand their attitude and perception towards the involvement of CAs. We present the codes developed based on the analysis of insights from this session in Table 1, which later informed our system design. Using the codes in Table 1, we extracted the following tangible design goals from this session:

(1) In most scenarios, the participants hinted at their preference for a system that checks in with them as opposed to an alarm or reminder. Traditional alarms and reminders generally do not prompt the members to answer, since they are worded as "Time for medicine/take medicine" or similar. An individual with MCI might not remember that they have taken the medication already and, in the moment, might unknowingly take it again. They could also unknowingly take the medication for the next day as a result of being told by the system to take it. Given that "over-medication" is also a challenge for them, our system needs to "check in" and ask the member whether they took the medication, not "remind" them. This check-in introduces the possibility of prompting the member to think about whether they took the medication, to go and check their pillbox or calendar if they need to, and then to report back to the system.

(2) Another crucial expectation from the system is for it to have an understanding of the member's routines and dialog preferences. If the timing of initiation conflicts with an ongoing activity, it should have the functionality to check in again later, or allow the member to schedule the check-in for a later time. If the assistant receives no response after initiating, there should be multiple channels of notification to confirm the check-in. Suggestions for these channels included phone notifications, text messages, emails, and calls.

(3) When the member has already taken the medication, and if they choose to notify the system, it should generate positive feedback or reinforcement to motivate the user. This positive recognition can reinforce feelings of achievement at having accomplished the task beforehand.
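Taken together, these goals describe a small interaction policy: check in rather than instruct, let the member defer, escalate across notification channels when unanswered, and reinforce success. The following Python sketch illustrates that policy only; it is not the deployed MATCHA implementation, and the channel order and response phrasings are our own assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical escalation order, drawn from participants' suggestions
# (phone notifications, text messages, emails, and calls).
CHANNELS = ["device", "phone_notification", "text_message", "email", "call"]


@dataclass
class CheckIn:
    member: str
    attempts: int = 0
    notifications: List[str] = field(default_factory=list)

    def respond(self, answer: Optional[str]) -> str:
        """Advance the check-in given the member's answer (None = no reply)."""
        if answer is None:
            # No response: escalate to the next notification channel.
            self.attempts += 1
            if self.attempts < len(CHANNELS):
                channel = CHANNELS[self.attempts]
                self.notifications.append(channel)
                return f"escalated:{channel}"
            return "unanswered"
        if answer == "taken":
            # Positive reinforcement rather than a bare acknowledgement.
            return f"Great job, {self.member}! You already took your medication."
        if answer == "later":
            # The member defers; reschedule instead of repeating immediately.
            return "Okay, I will check in again at the time you choose."
        # Any other answer (e.g. "not sure"): prompt reflection, do not instruct.
        return "No problem - please check your pillbox and let me know."
```

In the deployed system this behavior lived in the Action's dialog flow rather than in standalone code; the sketch only makes the ask-defer-escalate structure of the goals explicit.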
Figure 2: Insights from the focus group sessions categorized into expectations, concerns, habits and current techniques
Table 1: Codes and sample quotes from the scenario-based design session conducted with participants ("She" refers to Rosey)

Positive feedback:
- "Maybe it can play my favorite song after I answer it back.."
- "How about a movie star's voice to applaud me? Maybe James Earl Jones!"

Lacking in current systems:
- "Sometimes when I hear my phone alarm, I take the medicine and then later my wife tells me that I'd taken it before already"
- "I have to remember to check the calendar every day, my wife helps me with that.."
- "We set alarms for our medicines but it's hard to remember what the alarm was set for because we have many alarms"

"Check-in" and not remind:
- "She should ask me whether I have taken the pills because that way I can think about it or ask my wife, rather than telling me firmly that I should"
- "It should ask me to go check, its hard but it can help me making sure that I'm just not taking the same pill again"

Dialog expectation:
- "I hope she does not talk like the reminder in my phone, it's dull.."
- "She should be respectful and not nosey like my alarm I keep snoozing"

Multiple channels of notification:
- "If we're not at home, can she send a message to our phone?"
- "What if I'm outside walking the dog when she wakes up?"
- "Maybe it can send a message to my daughter's mobile saying that I have taken the medicine for the morning"

Check again:
- "I expect her to come back inside and check again after we're done with yoga"
- "But I hope she knows when is a good time. Can I tell her when to come back?"
5 DESIGNING A GOOGLE HOME ACTION FOR MEDICATION
From the insights generated during the exploratory user research sessions, we designed our Google Action, "Medication Action To Check-In for Health Application" (MATCHA), using Google's Action Console, a web-based tool to manage the development, registration, configuration, and analysis of Google Actions. A Google Action is an applet for Google Assistant that provides additional and extended functionality. Google Actions are coded in the Action Console and can be integrated into any device that supports Google Assistant.

We listed the scenarios and structured the dialog flow based on the design goals informed by the discussions during the focus group and exploratory design sessions, with appropriate paths for each scenario. Table 2 presents the possible scenarios and the resulting MATCHA responses, as well as the sources of the insights that led to each designated response. Our system recognizes more than 200 possible unique responses from the dyads. The assistant's volume of
Table 2: Possible scenarios, MATCHA response and informing insight

S1. Possible scenario: Taken the medication before MATCHA checks in.
    MATCHA response: Plays cheering sounds and applauds them with positive affirmation.
    Source of insight: The need to foster positive engagement with MATCHA and reinforcement of medication management.

S2. Possible scenario: Not taken the medication before MATCHA checks in.
    MATCHA response: Asks if they would like to repeat the check-in at a later time and guides them to specify a time period after which MATCHA should check in again.
    Source of insight: The need for the check-in to be persistent and let the member specify a later time for check-in.

S3&4. Possible scenario: Does not remember whether they have taken the medication before MATCHA checks in, and/or needs to check the pillbox to confirm.
    MATCHA response: Asks them to check, waits for them to respond again, and responds further according to scenario 1 or 2.
    Source of insight: The expectation to reduce the cognitive burden on the member of remembering whether they have taken the medication. The goal with the pillbox is to catch missed medications but also reduce the risk of over-medication through specific reminders.

S5. Possible scenario: Says something Google does not understand (unknown response).
    MATCHA response: Lets them know that it has sent the feedback to the research team to rectify the error and that we will be in touch soon.
    Source of insight: The preference to utilize the existing connection between the members and the conversational assistant to foster support and feedback.

S6. Possible scenario: Not around to respond, or MATCHA receives no response until it times out.
    MATCHA response: Sends a notification to the member's and the carepartner's phones to notify them that the routine has started.
    Source of insight: The expectation to engage multiple channels of notification to confirm the check-in.
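Read as a dispatch table, Table 2 maps categories of member responses (or a timeout) to MATCHA behaviors. A rough Python sketch of that mapping follows; it is illustrative only — the deployed Action recognizes more than 200 unique utterances through the Action Console, and the keyword cues below are hypothetical:

```python
# Keyword cues per intent; the real Action matches 200+ unique utterances.
INTENTS = {
    "taken": ["i took", "already taken", "yes i did"],
    "not_taken": ["not yet", "no i haven't", "later"],
    "not_sure": ["don't remember", "not sure", "need to check"],
}

RESPONSES = {
    "taken": "cheer and applaud (S1)",
    "not_taken": "offer to repeat the check-in at a chosen later time (S2)",
    "not_sure": "ask the member to check the pillbox and report back (S3&4)",
    "unknown": "log feedback for the research team (S5)",
    "timeout": "notify member's and carepartner's phones (S6)",
}


def dispatch(utterance):
    """Map a member utterance (or None on timeout) to a scenario response."""
    if utterance is None:
        return RESPONSES["timeout"]
    text = utterance.lower()
    for intent, cues in INTENTS.items():
        if any(cue in text for cue in cues):
            return RESPONSES[intent]
    return RESPONSES["unknown"]
```

The design choice mirrored in the sketch is that every branch either closes the loop with the member (S1, S2, S3&4) or routes the failure somewhere accountable (S5 to the research team, S6 to the carepartner), so no check-in silently disappears.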
operation as well as the physical location of the Google Home Hub was influenced by house layout and preferences. Some members keep their medications closer to their Google Home Hub, while others keep the two on different levels of the house. Our data showed that the kitchen and the bathroom were the most common locations for the Google Home Hub and the medications, respectively. These specifics of location factored into the medication assistant's pitch, volume and time functions for the dyads. The assistant also addressed the member with a greeting and their name at every check-in.

We conducted the deployment in two successive phases. We first introduced MATCHA to the participating dyads by conducting training sessions at their houses, and also provided them with a set of printed training materials explaining the purpose and interaction flow of MATCHA, and ways to contact us in case of questions. We then personalized the medication schedules for each dyad by obtaining medication routines during a pre-deployment interview and incorporated these individual routines into MATCHA through the Google Assistant backend on the Google Home app. After Phase 1, we conducted interviews and interaction log analysis to inform design revisions for Phase 2. At the end of Phase 2, we conducted a final set of interviews and interaction log analysis to inform overarching takeaways from the study.

6 LEARNING FROM USE: DEPLOYMENT PHASE 1

In Phase 1, we deployed MATCHA to the Google Home Hubs of 7 dyads for a period of 4 weeks. These 7 dyads were a subset of the dyads from the research sessions and were sampled on the basis of their willingness to participate in the study and their level of engagement with the CA. The dyads primarily belonged to suburban communities and had diverse professional backgrounds, with 3 members having retired from technical professions, 2 from an art background, 1 from a clinical support background and 1 from academia. MATCHA triggered at specific times each day depending on the member's medication schedule. The average frequency of MATCHA reminders for the 7 dyads was twice a day, with the lowest frequency being once a day and the highest being 5 times a day. The medication frequency is the total number of times per day that a member takes a set of medicines; for example, if once in the morning and once in the evening, the medication frequency is 2. A summary of the demographic data, including sex, age and medication frequency, is provided in Table 3.

After Phase 1, we collected interaction log data from the 4 weeks and manually transcribed them from the "My Activity" toolbar in the Google Home app. We also sent a modified version of the System Usability Scale (SUS) to the members and carepartners to respond to, reworded to better represent our study context, with the goal of calibrating our assessment of the interaction log data. The SUS results gave us a usability score of 84.66 from the members and 86.16 from the carepartners. Overall, the log data and SUS indicated sustained use for most participants but presented potential areas for improvement and revisions. Based on that input, we conducted qualitative interviews with the 7 dyads via Zoom, lasting 30-45 minutes, to gain an understanding of their experience with MATCHA so far and to contextualize the patterns seen in the interaction logs and SUS.
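For context on the reported usability scores, the standard SUS scoring rule (Brooke's original; we assume the reworded instrument used here keeps the same rule) converts ten 1-5 Likert responses into a 0-100 score:

```python
def sus_score(responses: list) -> float:
    """Score a ten-item System Usability Scale questionnaire (standard rule).

    Odd-numbered items are positively worded and contribute (response - 1);
    even-numbered items are negatively worded and contribute (5 - response).
    The summed contributions are scaled by 2.5 onto a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    contributions = [r - 1 if i % 2 == 0 else 5 - r  # i == 0 is item 1 (odd-numbered)
                     for i, r in enumerate(responses)]
    return sum(contributions) * 2.5

# Best possible answers on every item give the maximum score:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # → 100.0
```

On this scale, the reported means of 84.66 (members) and 86.16 (carepartners) sit well above the commonly cited 68-point average for SUS.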
ASSETS ’22, October 23–26, 2022, Athens, Greece Mathur, et al.

6.1 Phase 1 Analysis

The insights from the interviews, interaction logs and the SUS score after Phase 1 resulted in a set of design revisions that we incorporated before the deployment of Phase 2. Below, we present some interview quotes from the Phase 1 analysis that, along with the log and SUS analysis, resulted in the design revisions described in Section 6.2:

(1) M6: "To be honest, the loud cheers were a little overwhelming. It kept going on and on."
(2) CP1: "Sometimes Google doesn't understand what I am saying, she keeps asking me to repeat."
(3) CP2: "I would like to get a text or email whenever he has taken a medicine."
(4) M5: "There are days when I take the pill after Google tells me, does she know that?"

A breakdown of scenario-wise interactions, including the "Taking Now" scenario and the engagement rate for Phase 1, is discussed further in the Results section.

6.2 Design revisions after Phase 1

Major design revisions informed by the interaction log analysis, SUS results and the interviews included tamping down the level of positive feedback from MATCHA: while some members found the length and nature of the positive reinforcement uplifting, others found it "overwhelming". To be specific, the first positive feedback started with 4 seconds of loud cheering and celebratory sounds, followed by a 4-second clapping sound, and ended with the assistant verbally appreciating the user, saying "Yay! Good Job! I'll check-in at the next medication time". The feedback was overwhelming for some members, particularly those with multiple medications through the day; therefore, in Phase 2, we revised the positive feedback to a more moderate level, with a 2-second cheer accompanied by the same verbal praise from the assistant. We also included a touch-based response option in which members can press the corresponding button on the touch screen in lieu of answering verbally. We added 4 touch buttons - "Yes I did", "No I did not", "I don't remember" and "Taking Now" - in Phase 2 based on the insights from the interviews. This modification also provided an alternate way of providing a response in the case of muffled speech or other speech-based issues. To distinguish responses recorded through touch buttons from those recorded through verbal response in the interaction logs, we added an emoji to every button text to recognize this response type in the logs.

Additionally, we increased the system time-out period for receiving a response, and added an additional "Taking Now" scenario in which the member informs MATCHA of taking the medication right now as a result of the check-in. We noticed a large number of such responses, separated them from the unknowns, and counted them manually through the backend under a new category called "Taking Now". We also revised the settings for phone notifications to be delivered every time MATCHA is triggered, as opposed to only when it receives no response. This was informed by the carepartners expressing the need to know every time the member has responded to the system. We further explain and contextualize these revisions with participant quotes in the results section (Section 8).

7 LEARNING FROM USE: DEPLOYMENT PHASE 2

For Phase 2, the revised version of MATCHA was deployed to the Google Home Hubs of 5 dyads for a period of 16 weeks, with 2 dyads wishing to drop out of the study between Phases 1 and 2. One dyad with a medication frequency of 5 times a day found the frequent check-ins tiresome, and the other dyad was unavailable due to travel through the summer. We plan to address the inactivity of the 2 dyads that dropped out of the study by exploring more about personal preferences and their incorporation into the system in our future work. A summary of the demographic data, including whether each dyad continued to Phase 2, is provided in Table 3.

We analyzed the interaction logs for the 16 weeks of the Phase 2 deployment to examine the continuing engagement patterns of the dyads resulting from the design revisions. Finally, we conducted interviews with the 5 dyads via Zoom for 45 minutes to understand their experience and contextualize patterns in the interaction logs. We used an inductive coding process to develop major themes across our interviews and usage data, grounded by our initial exploratory design sessions and their findings. For each interview, a team of 3 coders, all researchers with a background in Human-Computer Interaction, clustered the codes together and, as patterns began to emerge, developed distinctive themes for these clusters. After initial coding, an inter-rater reliability process was conducted to agree on the final themes. At the end of Phase 2, we verified and refined the existing themes by looking across the interviews from both phases and generated concise findings. We discuss the results from the interaction log analysis and the interviews from the two phases in the next section.

8 RESULTS

8.1 Interaction Log and Engagement Analysis

During Phase 1, MATCHA generated 476 initiations and we recorded a total of 84 interactions (responses) over the course of 4 weeks. The remaining initiations were left unanswered. This ratio results in an engagement rate of around 18% for Phase 1. These interactions are mapped on a normalized scale of 30 interactions per week for each scenario in Fig. 3. This normalization per week was done to account for the difference in the length of the phases. During the analysis, we noted that there were some responses in which the members indicated that they had not yet taken the medication but would take it now as a result of the MATCHA check-in. This scenario extended the existing functionality of MATCHA to act as a reminder in addition to the original check-in function. This scenario, which we call the "Taking Now" scenario, was not incorporated into MATCHA before Phase 1, and it prompted an "I don't understand what you just said" response from the assistant. This also accounts for the significant number of unknown responses in Phase 1. We manually counted these interactions from the logs to define an additional 7th scenario for the design revision in Phase 2. To be clear, the "Taking Now" responses formed a part of the unknown responses by the dyads, and as we had access to the backend of the system, we were able to count those separately. The unknown interactions reported here do not include the responses that pertain to the "Taking Now" scenario, and hence the separated unknowns are labelled as "other unknowns" in Fig. 3.
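The engagement rate used in this section is simply recorded responses divided by generated initiations. A minimal sketch with the counts reported for the two phases (the helper name is ours):

```python
def engagement_rate(responses: int, initiations: int) -> float:
    """Fraction of MATCHA-generated check-in initiations that drew a response."""
    return responses / initiations

phase1 = engagement_rate(84, 476)    # 4 weeks, 7 dyads
phase2 = engagement_rate(760, 1120)  # 16 weeks, 5 dyads (reported in Section 8.1)
print(f"Phase 1: {phase1:.1%}, Phase 2: {phase2:.1%}")  # → Phase 1: 17.6%, Phase 2: 67.9%
```

The weekly engagement rate of Fig. 5 applies the same ratio per week, with the expected initiations for each week derived from every member's daily medication frequency.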

Figure 3: Normalized number of interactions by scenario in Phase 1 for 7 dyads (normalized per week)

Figure 4: Normalized number of interactions by scenario in Phase 2 for 5 dyads (normalized per week)

Figure 5: Weekly Engagement Rate Trend for all Dyads for 20 Weeks

Table 3: Summary table of participant demographics.

Member & Carepartner ID | Member Age | Member Sex | Daily Medication Frequency | Relation to Member | Carepartner Age | Continued to Phase 2?
M1, CP1 | 62.6 | Male | 3 | Spouse | 60.9 | Yes
M2, CP2 | 78 | Male | 2 | Spouse | 75 | Yes
M3, CP3 | 75.4 | Male | 5 | Spouse | 70.6 | No
M4, CP4 | 85.3 | Male | 2 | Spouse | 76.3 | No
M5, CP5 | 70.8 | Female | 2 | Daughter | 45.7 | Yes
M6, CP6 | 75.4 | Male | 1 | Spouse | 76.7 | Yes
M7, CP7 | 74.3 | Female | 2 | Spouse | 74.6 | Yes

During Phase 2, MATCHA generated 1120 initiations and we recorded a total of 760 interactions (responses) over the course of 16 weeks. This ratio of interactions to all initiations results in an engagement rate of around 67% for Phase 2, an increase from the 18% engagement rate of Phase 1. We attribute this increase in engagement to the design revisions we incorporated to improve the user experience, and also to an increase in comfort and familiarity as MATCHA became part of the dyads' daily routine within their CA over time. If we treat dyads 3 and 4 (from Phase 1) as outliers that did not adopt the use of MATCHA, we nevertheless still see an uptick in engagement from Phase 1 to Phase 2, from 27% (without outliers) to 67%. These interactions are mapped on a normalized scale of 30 interactions per week for each scenario in Fig. 4.

To account for the variability in study phase duration and medication frequency for each dyad, we calculated the weekly engagement rate for the dyads over the course of the study, shown in Fig. 5. This weekly engagement rate was calculated by adding all the times that MATCHA initiated for all dyads in 1 week (using each member's medication frequency per day), adding all the interactions counted for all dyads in that week from the log data, and then calculating the engagement rate for every week of the study. It can be seen from the figure that the weekly engagement rate increased significantly right at the start of Phase 2, was sustained during Phase 2, and then increased towards the end of Phase 2, indicating an overall positive attitude and a sustained level of comfort with the system over the course of the study. The slight dip in engagement around Week 10 coincides with a holiday weekend (4th of July); some dyads were away from home and only getting reminders on their phones, which they may or may not have had the time to respond to. The slight uptick in engagement in the final weeks also correlates with scheduling interviews at the conclusion of the study.

8.2 Findings from Interviews

In addition to analyzing data from interaction logs, we also conducted interviews with the members and carepartners at the conclusion of the study with the goal of understanding their perception of the system. We asked questions that prompted them to talk through their experience of using MATCHA and whether they would like to continue using it. We present the following findings from synthesizing our design insights, observed usage and mid-deployment interviews, and integrating our final set of participant interviews with the 5 dyads at the end of Week 20.

8.2.1 The system induced feelings of "confidence" and support in the dyads while being sufficiently persistent in its interactions.
The dyads reported feeling "confident" about their medication routines as a result of the medication assistant being integrated into their existing Google Home Hub routines. This kind of support also reduced the need to set up external systems such as automatic medication dispensers, which are significantly more expensive and have a complex set-up, as highlighted in section 2.2. The integration of the medication system within the CA reduced the on-boarding friction our participants had experienced with other technologies. The conversational and unambiguous nature of the check-in, as opposed to the non-contextual and repetitive nature of alarms and reminders, led to a relatively faster adoption of MATCHA through the course of the study.

• M5: "I like that Google kind of feels like a part of the house, and not something that I have to keep answering to all the time like my morning alarm.."
• M7: "I feel assured that if I were to forget my pill, she [MATCHA] will ask me and that's important because I feel less pressure to remember my routine.."

Some members also recollected instances when MATCHA prompted them to think about or confirm the medication status before responding as a result of the check-in functionality. In some cases, they went and checked their pillboxes before reporting completion if they did not remember taking the medication. The positive affirmation from MATCHA after reporting completion also generated assurance of having a degree of control over their own memory and of subsequently getting rewarded for it.

• M5: "When I hear Google praise me, I feel good and like I have accomplished something and that she is happy with me.."

All the participating dyads in Phase 2 expressed the desire to continue using MATCHA in their Google Home Hubs after the completion of the study. Their use, now at around 55 weeks, continues through the writing of this paper, which is a significant result for us given the need to sustain long-term engagement of technology with older adults, as mentioned in section 2.2 in [38]. We discuss the long-term engagement requirement further in the discussion section.

8.2.2 The medication assistant provided an alternate way for the carepartner to monitor the member's medication schedule.
We noted that most members reported being supported by their carepartner in managing their medication schedules. During the

interviews, the carepartners reported having MATCHA as a way for them to use its capabilities to support their goals for the medication management of their partner. They also reported instances when they were not around or were unavailable. In such cases, MATCHA supported an effective integration with external channels, such as phone notifications and calls at the medication time, by sending notification alerts to the member's as well as the carepartner's phones. While carepartners expressed that they would still like to sometimes physically check in with the member as they do regularly, they also followed it up by feeling a sense of peace that if they were to someday forget checking in, there was a system in place to do it for them.

• CP6: "I'm usually outside when it's his medication time in the evening but I get the alert on my phone which is good because then I can know that he has taken the medications.."
• CP5: "It is great that I get mom's medication notifications on my phone and can keep a check on her. I also double check with her sometimes just in case.."

8.2.3 Members and carepartners liked that the system was aware of their preferences and individual medication schedules.
Based on the diversity in medication schedules and preferences revealed during pre-deployment interviews with the dyads, we attempted to personalize some parameters of the assistant, such as the volume, location, personalized greeting for the time of day, etc. This personalization was appreciated because it took into account individual cognitive behaviors to some extent, and incorporating these individual variations into the system helped make the experience more pleasant for the dyads. For example, while some dyads preferred a lively and playful interaction, others wanted a more straightforward and simple check-in and positive feedback. In addition, some dyads also expressed their desire to have access to their reported medication data and to have it shared with their clinical teams at differing intervals.

• M2: "I like that it's kind of playful and talks to me but also not too much or too loud that it starts to annoy me.."
• M1: "I think it's helpful because it reminds me every night before I go upstairs that I need to get the medicine out. Sometimes I keep the medicine out on the counter for the next day too. In the morning, I'm usually sitting right next to it in the kitchen so I can respond directly to it while I'm having coffee."
• M2: "Our doctor always asks us if we have been taking our medicines regularly and we always feel like we have to say yes whether we did or not because it's hard to recall perfectly, so it would be useful if Google could tell him that directly.."

8.2.4 Adding touch-buttons to provide an alternate way of interaction in addition to voice interaction was appreciated.
After adding touch-buttons on the Google Home Hub screen as an alternate response mode to account for Phase 1 feedback, we saw that the number of touch-based interactions hovered in a range of 20-35% of total interactions for each dyad through the study. While some dyads used the buttons more than others, a combination of the two forms of responses was appreciated and used consistently by the dyads. It also helped to alleviate the requirement to speak clearly during every interaction.

• CP1: "We did notice that you added the buttons there, they weren't on there before. I quite like them, specially with the smileys. We can now touch the buttons to answer and I do that a lot because I sit very close to it mostly.."
• M7: "Sometimes Google does not catch what I'm saying, I don't know if it's my fault or hers, but when that happens, I go and touch the button and she can hear me fine again. So I definitely like having the buttons.."

9 DISCUSSION

In this section, we reflect upon our results and discuss their importance in the context of the gaps and opportunities identified at the end of section 2 and from user research.

9.1 Personalization Matters

A key insight from related work and from the user research conducted with older adults with MCI is that the lived experience of MCI is highly diverse. It is shaped and characterized by various factors such as age, personality, caregiving network and the existence of a supportive environment, both physically and emotionally. MCI is a different and unique experience for everyone, and incorporating individual preferences for interaction to the extent possible is critical for improving user experience and system acceptance. The result of these individual circumstances is a customized functionality incorporated into the assistant that works in accordance with individual preferences identified through a user-centered approach to research. As highlighted in [20], user research should be conducted in a way that facilitates this level of personalization in volume, medication routines for different types of medications, preferred mode of greeting, etc. Before deploying MATCHA, we conducted a series of pre-deployment interviews with the dyads to understand their medication schedules and behaviors and the location of their Google Home Hubs and pillboxes within the house. These schedules revealed the diversity of medication habits based on personal preferences. Integrating these preferences into the system helped provide coherent support for their existing medication habits and strategies, eliminating the need to abandon them entirely. Adequate personalization also helped in navigating the initial hesitation of older adults to adopt new technologies [32], and provided a compensatory support system that resulted in faster acceptance during the course of the study. With respect to the need for personalization, as the system deployment expands, we plan to study the interaction patterns based on frequency, and also the scenarios of usage based on the type of interactions, with the aim of informing frameworks that can be used for groups of users. This can be a way to ensure practical scalability while also retaining personalization to the extent possible.

9.2 Sustaining long-term engagement

Any technical intervention aiming to provide routine-based support to older adults during MCI needs to be effective in the long term and also account for the often-worsening decline. In most cases, the primary caregiver supporting their partner with MCI is also of an advanced age and has medical issues of their own. This can induce feelings of anxiety related to their

ability to continue providing support to their partner. As identified in [32, 38], sustaining long-term engagement beyond the study duration is a critical issue that leads to the failure of most interventions deployed for older adults. These failures can result from a poor user experience that does not provide error prevention, from alarm fatigue caused by frequent reminders, or from an inadequate amount of training and functional support for the older adults and their partners while introducing the system. Increased technical complexity and difficulty in understanding the instructions also lead to a drop in engagement after a while for most commercial medication management devices [37]. Our study worked towards addressing these issues by incorporating the medication system within the Google Home Hubs, CAs towards which the participants already felt a degree of comfort, and by reducing the interaction complexity by having an interactive voice as the primary affordance. We also highlight the need to provide comprehensive written and printed training materials and personal support to older adults at every step in the process of deploying such systems, and to create a safe environment for them to ask questions and have their concerns addressed. Error prevention was addressed through transmission of unknown responses to the research team, and a feedback response from the system requesting the user to repeat their last sentence. Given the high-risk nature of medication mishaps, we also made sure that the system had alternate ways to notify the carepartner of the medication time (phone notifications), and we regularly checked the backend to make sure the check-ins were generated at the right time for each dyad. Finally, the most pivotal aspect of MATCHA that contributed towards long-term engagement was the clear and unambiguous check-in functionality, as opposed to traditional reminders and alarms. As discussed in section 4, over-medication is often a result of non-contextual reminders and is a critical concern for older adults with MCI; the check-in functionality of the assistant helped to alleviate this concern by prompting reflection at the medication time, notifying the carepartner, and incorporating existing strategies such as prompting members to check their pillbox as part of the interaction.

9.3 User autonomy and freedom

Given the occurrence of MCI at the juncture of normal aging and dementia, its diagnosis can be accompanied by a feeling of lack of control [26], a result of altered memory capacity and functional abilities alongside retained cognitive power. As a result, the cognitive program as well as the individuals with MCI and their carepartners feel a significant desire to maintain a certain level of autonomy and functional independence, while also being open to compensatory support. Interventions need to support flexibility of medication routines as a result of changing life circumstances. Dyads, specifically the carepartners, expressed a willingness to get more training to be able to convey the needs of their partner to the assistant in a way that makes the interaction more effective. There are days when they know in advance that their schedule is going to be different from usual, and having the control to schedule their own check-ins a day before would make the experience more independent. We demonstrated this to them through the Google Home App. Any intervention for medication management needs to strike a balance between an identical basic structure of the conversational flow for all participants and the option for participants to personalize check-in timings and notification preferences for their individual uses. Another way to facilitate more autonomy in the system is by generating transparent explanations regarding the check-in purpose and the nature of the specific medication information given to the assistant [40]. Additionally, it is also important to provide users the functionality to record, share and review medication data. While this ability is not currently supported by Google's Actions console, there are workarounds to incorporate this feature into the medication assistant that our team is currently exploring, discussed more in section 9.5. Recording and sharing medication routine history over a period of time can also be beneficial to clinical staff in monitoring adherence and the efficiency of a new medication routine in order to make informed clinical decisions.

9.4 Multiple interaction modalities for improved compensatory support

While this forms a part of designing for personalized interactions in 9.1, it is important to address it separately as an important construct in providing individualized support to older adults with MCI. Multimodal interactions were positively accepted by participants in [12]; as a result, we explored the addition of touch-based feedback in the system in addition to voice response. The Google Home Hub has a screen to provide visual feedback to the user, and the members have a strong mental model, stemming from the touchscreen capabilities of their smartphones and tablets, that they extended to the CA. The touch-buttons also provided an alternate interaction option to the participants in cases when their speech was not recognized by the assistant or they had issues with verbal communication. However, we highlight the need to make these multiple interaction modalities clear to the users by designing them to be adequately visible, to avoid multiple responses and the resulting confusion.

9.5 Adaptive functionality and external integrations

While MATCHA relied primarily on the check-in functionality, there were also instances in Phase 1 when the participants used it as a reminder, informing the assistant of taking the medication as a result of the check-in, effectively making the check-in work as a reminder to take medication. Recurring instances of this adaptive functionality, although initially prompting an unknown response from the assistant, were then incorporated into Phase 2 as a separate interaction stream, and the assistant provided positive feedback by playing cheering sounds, leading to increased interaction success between the member and the assistant.

Finally, as a closing contribution of the study and to set up future directions for this work, we recommend incorporating external integrations with the CA to effectively extend its use for the health and well-being of older adults with MCI. A proposed external integration for recording medication behavior, as well as for verifying the ground truth of medication ingestion, is smart pillboxes. These pillboxes would integrate with the medication assistant in a way that the presence of a pill in a pillbox compartment would be known to the assistant through sensors. This will also lead to

fewer medication check-ins throughout the day for individuals with more than the average number of medications in a day. The smart pillbox and the CA would work together in the same ecosystem to notify the users in the case of a delayed or missed medication and report it whenever asked. There are also other ways in which the pillbox status could be verified with the medication assistant, such as smart buttons pinned to the pillbox, an integration that we hope to explore in future studies. Added integrations could include smart home devices, smartphones, tablets, sensors, etc., which would provide more robust functionality to the CA by creating a networked interaction ecosystem.

10 LIMITATIONS

While we are encouraged to see our study results, specifically the sustained engagement with the medication system, we would like to address some limitations of our work. The development of MATCHA was a contributing part of a larger ongoing study aimed at understanding the usage of CAs to support functional independence and provide compensatory support to individuals with MCI and their carepartners. As a result, most of our participants had been interacting with a CA for over a month before our study. It would require more research to analyze the engagement and usage when the system is newly introduced to participants. Additionally, we would also like to point out that recruitment of participants for this study was limited to the participating dyads in the cognitive program, who are able to afford long-term therapeutic healthcare and primarily belong to upper-middle and high-income households. While we are currently working towards extending the study with CAs to other healthcare institutions serving a broader population, the study results at the time of this paper are only representative of the dyads within this cognitive program. In the future, we also plan on reaching out to assisted living facilities and

and positive engagement. Working from this foundation, participants saw additional avenues for future work including integration with smart devices and augmented pillboxes, and the creation of medication records to share with clinical care teams.

REFERENCES

[1] Renato FL Azevedo, Dan Morrow, James Graumlich, Ann Willemsen-Dunlap, Mark Hasegawa-Johnson, Thomas S Huang, Kuangxiao Gu, Suma Bhat, Tarek Sakakini, Victor Sadauskas, et al. 2018. Using conversational agents to explain medication instructions to older adults. In AMIA Annual Symposium Proceedings, Vol. 2018. American Medical Informatics Association, 185.
[2] Renée L Beard and Tara M Neary. 2013. Making sense of nonsense: experiences of mild cognitive impairment. Sociology of Health & Illness 35, 1 (2013), 130–146.
[3] William S Bond and Daniel A Hussar. 1991. Detection methods and strategies for improving medication compliance. American Journal of Hospital Pharmacy 48, 9 (1991), 1978–1988.
[4] Marie T Brown and Jennifer K Bussell. 2011. Medication adherence: WHO cares?. In Mayo Clinic Proceedings, Vol. 86. Elsevier, 304–314.
[5] Noll L Campbell, Malaz A Boustani, Elaine N Skopelja, Sujuan Gao, Fred W Unverzagt, and Michael D Murray. 2012. Medication adherence in older adults with cognitive impairment: a systematic evidence-based review. The American Journal of Geriatric Pharmacotherapy 10, 3 (2012), 165–177.
[6] Chen Chen, Janet G Johnson, Kemeberly Charles, Alice Lee, Ella T Lifset, Michael Hogarth, Alison A Moore, Emilia Farcas, and Nadir Weibel. 2021. Understanding Barriers and Design Opportunities to Improve Healthcare and QOL for Older Adults through Voice Assistants. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–16.
[7] Leigh Clark, Nadia Pantidi, Orla Cooney, Philip Doyle, Diego Garaialde, Justin Edwards, Brendan Spillane, Emer Gilmartin, Christine Murad, Cosmin Munteanu, et al. 2019. What makes a good conversation? Challenges in designing truly conversational agents. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
[8] Andy Cochrane, Ian H Robertson, and Andrew N Coogan. 2012. Association between circadian rhythms, sleep and cognitive impairment in healthy older adults: an actigraphic study. Journal of Neural Transmission 119, 10 (2012), 1233–1239.
[9] David M Cutler and Wendy Everett. 2010. Thinking outside the pillbox—medication adherence as a priority for health care reform. New England Journal of Medicine (2010).
[10] Melisa Duque, Sarah Pink, Yolande Strengers, Rex Martin, and Larissa Nicholls. 2021. Automation, Wellbeing and Digital Voice Assistants: Older People and Google devices. Convergence 27, 5 (2021), 1189–1206.
establishing connections with the community partners who work [11] Mira El Kamali, Leonardo Angelini, Maurizio Caon, Giuseppe Andreoni,
Omar Abou Khaled, and Elena Mugellini. 2018. Towards the NESTORE e-Coach:
in those facilities. The commercial nature of the system, i.e., being a tangible and embodied conversational agent for older adults. In Proceedings of
a Google product, ensures that the technological infrastructure is the 2018 ACM International Joint Conference and 2018 International Symposium
well supported and can be accessed through a central Google email. on Pervasive and Ubiquitous Computing and Wearable Computers. 1656–1663.
[12] Mira El Kamali, Leonardo Angelini, Denis Lalanne, Omar Abou Khaled, and
Finally, we would also like to point out that all of our participating Elena Mugellini. 2020. Multimodal conversational agent for older adults’ behav-
dyads included a 2-person team - the member and the carepartner ioral change. In Companion Publication of the 2020 International Conference on
(spouse or adult child). Our goal for future research is to understand Multimodal Interaction. 270–274.
[13] Terry T Fulmer, Penny Hollander Feldman, Tae Sook Kim, Barbara Carty, Mark
multi-carepartner teams consisting of more than one adult child or Beers, Maria Molina, and Margaret Putnam. 1999. An intervention study to
extended family members in diferent locations. enhance medication compliance in community-dwelling elderly individuals.
[14] Lynn Gitlow. 2014. Technology use by older adults and barriers to using technol-
ogy. Physical & Occupational Therapy in Geriatrics 32, 3 (2014), 271–280.
11 CONCLUSION [15] Bradi B Granger and Hayden Bosworth. 2011. Medication adherence: emerging
use of technology. Current opinion in cardiology 26, 4 (2011), 279.
As older adults with MCI continue to age and advance through [16] D Heister, James B Brewer, Sebastian Magda, Kaj Blennow, Linda K McEvoy,
varying levels of cognitive progression, and with increasing num- Alzheimer’s Disease Neuroimaging Initiative, et al. 2011. Predicting MCI outcome
with clinically available MRI and CSF biomarkers. Neurology 77, 17 (2011), 1619–
bers of new diagnoses of MCI every year, it is crucial to design 1628.
new technologies and adapt existing ones to support aging adults [17] Sunyoung Kim et al. 2021. Exploring How Older Adults Use a Smart Speaker–
in important tasks such as medication management. In this study, Based Voice Assistant in Their First Interactions: Qualitative Study. JMIR mHealth
and uHealth 9, 1 (2021), e20427.
we explored the idea of supporting these tasks using CAs and de- [18] Bret Kinsella. 2019. Voice Assistant Demographic Data-Young Consumers More
signed a medication management system while keeping the needs Likely to Own Smart Speakers While Over 60 Bias Toward Alexa and Siri.
[19] Deborah Klee, Marc Mordey, Steve Phuare, and Cormac Russell. 2014. Asset
and cognitive behaviors of our participant population central to based community development–enriching the lives of older citizens. Working
our design. We observed that a design centered on specifc medi- with Older People (2014).
cation management scenarios, personalized to individual routines, [20] Kathrin Koebel, Martin Lacayo, Madhumitha Murali, Ioannis Tarnanas, and Arzu
Çöltekin. 2021. Expert Insights for Designing Conversational User Interfaces as
and broadly focused on "checking-in" regarding medication actions Virtual Assistants and Companions for Older Adults with Cognitive Impairments.
in contrast to narrow alerts and reminders, generated sustained In International Workshop on Chatbot Research and Design. Springer, 23–38.
ASSETS ’22, October 23–26, 2022, Athens, Greece Mathur, et al.

Designing Post-Trauma Self-Regulation Apps for People with Intellectual and Developmental Disabilities

Krishna Venkatasubramanian∗
The University of Rhode Island
Kingston, RI, USA
krish@uri.edu

Tina-Marie Ranalli∗
Unaffiliated
Providence, RI, USA
ranalli@gmx.edu

ABSTRACT
In the US people with intellectual and developmental disabilities (I/DD) comprise one of the most likely groups to experience traumatic life events. These experiences often produce negative effects (e.g., stress, anxiety, grief, numbing, etc.) that need to be managed. Methods such as emotional self-regulation are often used to help people cope when these effects present themselves post-trauma. In recent years mobile-computing-device-based apps have been increasingly used to help the general population with autonomous self-regulation. However, none of these is designed for people with I/DD or is cognizant of the trauma they experience in their lives. We interviewed eight (8) practitioners at a trauma services organization that, among other things, helps people with I/DD learn and practice post-trauma self-regulation. The goal of the interviews is to understand what it would take to build post-trauma self-regulation apps for people with I/DD. Based on the interview responses we argue for a set of guidelines, based on the social work practice of trauma-informed care, to design post-trauma self-regulation apps for people with I/DD.

CCS CONCEPTS
• Human-centered computing → Accessibility technologies.

KEYWORDS
trauma-informed care, intellectual disability, developmental disability, trauma, design

ACM Reference Format:
Krishna Venkatasubramanian and Tina-Marie Ranalli. 2022. Designing Post-Trauma Self-Regulation Apps for People with Intellectual and Developmental Disabilities. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3517428.3544798

∗ Both authors contributed equally to this paper.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9258-7/22/10. . . $15.00
https://doi.org/10.1145/3517428.3544798

1 INTRODUCTION
Trauma is a sudden, potentially deadly experience that often leaves lasting, troubling memories [25]. In the US, a significant number of people with intellectual and developmental disabilities (I/DD)1 are suspected to have experienced trauma in their lifetime [51]. The negative effects of trauma (e.g., anxiety, depression, complicated grief) experienced by people with I/DD are often misdiagnosed because of what is known as diagnostic overshadowing, where health practitioners assume that the negative effects of trauma are merely an aspect of the person's I/DD and/or other disability [50]. Therefore, the majority of people with I/DD who experience trauma may not get appropriate help to cope with the negative effects of the trauma and to improve their quality of life [86]. Moreover, this disregard for, and negation of, the post-trauma reality of people with I/DD may also constitute another traumatic event, thereby retraumatizing them [12]. Untreated trauma can result in severe and long-term impairments that negatively affect one's general well-being and emotional, social, academic, and physical development [51]. Therapeutic treatment often involves processing the traumatic experience. The methods to do so include cognitive behavior therapy (CBT) and eye movement desensitization and reprocessing (EMDR), both of which have been shown to be effective for people with I/DD [50], especially EMDR [51]. However, there are often substantive systemic and personal barriers that prevent or impede people with I/DD from accessing the relevant mental health services [84]. Engaging in self-regulation - coping with the negative effects of trauma (more on this in Section 1.1) - outside of therapy has been shown to play a major role in facilitating the process of recovering from trauma in people with I/DD [51]. It is, thus, important to empower people with I/DD by giving them the means to autonomously cope with the negative effects of trauma in their daily lives, that is, self-regulate, whether or not they also pursue therapeutic treatment.

1 Based on the definition from the American Association of Intellectual and Developmental Disabilities, I/DD can be thought of as a set of disabilities that negatively affect the trajectory of an individual's intellectual, emotional, and/or physical development. I/DD appear in childhood and are likely to be present throughout life [1].

The last two decades have seen the near-pervasive availability of mobile computing devices (e.g., smartphones, tablets, and wearables). Such mobile computing devices are increasingly being used to help people who have experienced trauma autonomously cope with the day-to-day negative effects of trauma. In the past decade, the US federal government has developed several trauma-focused apps that help military veterans and US-Department-of-Defense-affiliated military and civilian personnel seek help, understand, and manage the negative trauma effects they experience [58, 61, 68, 81]. However, to the best of our knowledge, no post-trauma self-regulation (PTSR) app exists that has been specifically designed for the needs of people with I/DD. Recent years have seen the regular use of mobile computing devices (like smartphones, tablets, etc.) by people
with I/DD in the US [56, 80]. Therefore a mobile-computing-device-based app would provide an excellent mechanism for empowering people with I/DD to self-regulate post-trauma. In this collaborative paper, which was written by an HCI and a humanities scholar, we aim to explore the design of post-trauma self-regulation apps for people with I/DD that can complement therapeutic interventions.

Consequently, in this paper we first define the notion of self-regulation in the context of its use by people with I/DD. We then present an interview study where we interviewed eight (8) practitioners at an organization in the United States (US) that provides trauma services specifically for people with I/DD. As part of their services, among other things, they teach people with I/DD how to use self-regulation to cope with trauma. Such efforts are intended to be complementary to any therapeutic interventions that the survivor may pursue. Our aim with the user study is to understand the following: (1) What does post-trauma self-regulation for people with I/DD entail? (2) Do people with I/DD use apps/technologies for post-trauma self-regulation and, if so, how? (3) What should designers keep in mind when designing post-trauma self-regulation apps for someone with I/DD?

We then present a new framework for designing for post-trauma self-regulation that leverages the notion of trauma-informed care [46]. Trauma-informed care was originally developed in the field of social work for support organizations that help trauma survivors. The practice of trauma-informed care avoids pathologizing trauma and focuses on validating the post-trauma experiences of the individual. This prevents retraumatization of the individual while empowering them to develop a sense of safety and reducing any shame or guilt surrounding their past trauma [39]. In this paper, we reflect on how the notion of trauma-informed care can be applied to the design of PTSR apps for people with I/DD and propose several design guidelines to this end.

This paper makes the following contributions: demonstrating the lack of existing self-regulation solutions that meet the PTSR needs of people with I/DD (Section 2); an interview study with stakeholders to understand what to keep in mind when designing PTSR apps (Section 3); and proposed guidelines for designing PTSR apps for individuals with I/DD (Section 5).

1.1 Trauma and post-trauma self-regulation for people with I/DD
Before we delve into the details of designing post-trauma self-regulation apps, we present a short overview of how trauma affects people with I/DD. Trauma affects people with I/DD in profound ways. After trauma, people with I/DD can experience: (1) impaired working memory for tasks at hand; (2) a variety of negative effects from the trauma (e.g., depression, anxiety, etc.) that vary from individual to individual; (3) excessive distraction, even more so than non-I/DD individuals experience; (4) limited ability to avoid triggers2 because of having less control over their environment than those without I/DD (e.g., a group living situation); and (5) limited ability to get appropriate help to manage the negative effects of their trauma [86].

2 Triggers are things like sights, sounds, smells, situations, or depictions that are similar to and in some way evoke a person's traumatic experience and which cause the person to uncontrollably recall and relive the trauma [42].

The negative effects of trauma on an individual with I/DD need to be managed on an ongoing basis. In this work, we do not look at the negative effects of trauma from a medical, diagnostic standpoint that seeks to determine the presence/absence of post-traumatic stress disorder (PTSD). Rather we approach the post-trauma state from the standpoint of what some humanities scholars term dis-ease [14, 72, 77]. This state of dis-ease, or a lack of ease, can present itself in many ways, including: a general sense of grief, depression, anxiety, sleep disturbance, an exaggerated tendency to startle, self-harming behavior, and being stressed by triggers [51]. We do not view the state of dis-ease as something caused by triggers alone, though triggers can provoke or exacerbate it.

Note that we do not claim that self-regulation should be the only means of dealing with post-trauma dis-ease for people with I/DD. Therapeutic approaches, such as cognitive behavior therapy (CBT) and eye-movement desensitization and reprocessing (EMDR), are powerful, have been shown to work for people with I/DD [51], and are indeed useful. That being said, when it comes to the I/DD population, there is also a need to think more holistically beyond therapeutic approaches alone. This is because of: (1) a lack of therapists who will work with people with I/DD; (2) frequent diagnostic overshadowing by therapists and providers; (3) a lack of knowledge about mental health issues and the stigmas associated with people with I/DD; and (4) logistical problems related to the lifestyles of people with I/DD (in the US), such as a lack of suitable transportation and a lack of privacy in group living situations, which often preclude them from pursuing therapy [83].

Therefore, we define post-trauma self-regulation (PTSR) as engaging in one or more activities that regulate one's emotions to help cope with the state of dis-ease in the moment, outside of (but complementary to) any long-term therapy [55]. Broadly speaking, self-regulation usually requires performing activities that focus one's attention to temper the negative emotions and feelings in a given situation [71]. This paper does not concern itself with the design of specific PTSR activities within an app environment to be used by people with I/DD. Rather, this paper is about the larger design considerations of an app that provides PTSR for people with I/DD. The decision of whether or not to engage in any particular type of PTSR activity is up to the individual with I/DD. Given that people with I/DD are avid users of mobile computing devices [80], apps are one of the easiest ways of providing people with I/DD the ability to perform PTSR.

2 RELATED WORK
In this section, we provide an overview of the work done in research and commercial spaces around mobile-computing-based self-regulation apps. None of the prior work has focused on the use of technology for PTSR for people with I/DD. We categorize the extant work on self-regulation into four broad groups, which we describe below.

2.1 Mobile technologies for mental health
Recent years have seen an increasing number of technologically oriented mental health solutions become available. These include: connecting people with help online [16, 45], conversational
agents [26, 57, 73], chatbots [30, 62, 78], and virtual reality (VR) systems [27]. Often these health technologies focus on helping people with issues such as depression and anxiety by providing treatment and therapies (such as cognitive behavioral therapy (CBT) or exposure therapy) [88]. Such technological approaches to therapy have become necessary due to a lack of easy access to mental health services, stigma, and low mental health literacy in the general population [53] (much like the I/DD community). Consequently, considerable design work is now underway to systematically understand the users of these technologies and build better and more usable mental health technologies [8–10, 22, 41, 45, 76, 88]. These technologies, though important, differ from our work in important ways. (1) The focus of these technologies is often on therapeutic approaches, which are different from, though complementary to, self-regulation. (2) These technologies have not been designed for people with I/DD, whose lived experiences, motivations, and struggles can be quite different from those of non-I/DD individuals.

2.2 Research apps for non-trauma self-regulation
Mobile-computing-device-based apps for self-regulation have been investigated in the academic research space as well. These include using a device to record one's emotions, location-tagging them, and sharing them (anonymously, if needed) with others [35]. Other approaches include using slow, repetitive actions to maintain the user's attention [4], providing gentle feedback via soft fascinations3 [2], and promoting an effortless reflection on any loss of attention [69]. These ideas of repetitive actions and soft fascinations have been extended from monitoring one's thoughts to performing mindful physical activities (e.g., yoga) as well [60]. These apps, not having been designed for someone with I/DD, provide a limited repertoire of often abstract self-regulation activities. Someone with limited working memory (as many people with I/DD have post-trauma) would likely find them difficult to use. Further, if any of the activities/actions/sounds in these apps constitutes a trigger for users who have experienced trauma, the apps do not offer any alternative activities or any form of recourse to deal with the ramifications.

3 Soft fascinations are feedback, such as nature sounds, that are meant to be effortless to observe. These are often used in Attention Restoration Theory as a way to recover from mental fatigue [38].

2.3 Commercial apps for non-trauma self-regulation
Numerous mobile-computing-device-based apps are now available that provide a means of self-regulation, primarily through guided meditation practices (e.g., Calm and Headspace). We looked at 28 popular self-regulation apps listed on websites like Healthline [33], Positive Psychology [64], and Women's Health Weekly [85]. We provide the list of the 28 apps, along with the URLs where we found them, in the appendix. Usually designed to manage stress, these apps require their users to observe their current state of attention and adjust it, often based on spoken instructions. Most of these apps appear not to be cognizant of the lives of people with I/DD or those who have experienced trauma. These apps tend to blithely offer activities unproblematically, unaware that certain characteristics of the activities and/or the way in which they are presented can include aspects that trigger trauma survivors. For instance, just the act of asking a trauma survivor to close their eyes for meditation can be triggering and make them feel unsafe and potentially retraumatized [66]. These apps impose a very high cognitive load on the user and require significant working memory, as they often contain an enormous number of self-regulation exercises from which to choose. For instance, the app Insight Timer boasts of having tens of thousands of videos alone, not to mention all of its live content [36]. Moreover, these apps also provide limited instructions and often poor recommendation support. Many of these apps have substantial subscription fees, which can be difficult for many to afford (e.g., Calm [15] and Headspace [32]) [40]. As mentioned above, most of these apps rely heavily on some form of meditation practice involving listening to someone guide the activity through spoken cues, which are usually vague and tailored to users who already have experience with meditation and do not require explicit instructions. Not surprisingly, these meditations have been found to be difficult to use for many people, who find it hard to keep up with the pace of the instructions [69]. On the whole, these apps constitute part of what is referred to as "self-tracking culture" in [47], which is only fully available to those with sufficient socioeconomic privilege. Thus, those who lack adequate privilege (e.g., older adults, those with less education or lower incomes, people with disabilities and chronic health problems, people living in rural and remote areas, etc.) cannot fully benefit from these technologies and are left behind. In short, whether consciously or not, these apps are designed for highly motivated, non-traumatized, privileged people who have the means to pay for them; they are not intersectionally inclusive.

2.4 Non-commercial trauma-focused self-regulation apps
In recent years, an entire class of free apps has been developed by the US federal government primarily for veterans, members of the military, civilian personnel, and their families to help them cope with the negative effects of trauma. These include apps that: teach mindfulness and meditation practices and routines over time [49], provide a variety of ways to deal with the negative effects of trauma [61, 81], help healthcare providers deal with secondary traumatic stress [67], help families of trauma survivors [24], and help survivors of sexual assault [58, 68]. These apps have many useful features that could be leveraged for people with I/DD (as we shall see from time to time in this paper). However, as none of these apps has been designed for people with I/DD, they have several drawbacks in terms of being accessible for our target population. Most of these apps are quite reading-heavy in nature (often necessarily so, as they want their audience to understand trauma, take it seriously, and get help), which can be difficult for people with I/DD, many of whom have limited literacy [34]. Moreover, these apps typically lack effective recommendation services for activities to do, given the often large set of available activities. We believe that these aspects, among others, make these apps not only difficult for people with I/DD to navigate but also difficult for them to operate, especially when used autonomously.
ID Freq. of interaction w/ people with I/DD Duration working w/ people with I/DD
P1 2 days/week 4 years
P2 Daily 25 years
P3 Daily 3 years
P4 Weekly 4 years
P5 Weekly 3 months
P6 Daily 5 years
P7 3 days/week 10 years
P8 Daily 17 years

Table 1: Demographic information for the practitioners surveyed in our study. All participants identified as female.
2.5 Trauma-informed computing
In a very recent paper [17], the idea of trauma-informed computing was introduced. The authors, coming from the perspective of intimate partner violence (IPV) and the role of technology in perpetuating it, describe the need for trauma-informed computing via three fictional scenarios they constructed, which reflect their experience working with IPV survivors. The authors then turn to the notion of trauma-informed care, which we explore in more detail below. In basic terms, trauma-informed care asks organizations providing services to trauma survivors to treat these survivors in a way that conveys respect and compassion [21, 31, 46]. This recent paper adapts the notion of trauma-informed care into a broad framework they refer to as trauma-informed computing. The scope of the paper is broad and discusses applying their framework to topics such as UX research and design, security and privacy, AI and machine learning, and corporate culture. Therefore, the guidelines provided in the paper are designed to be broadly applicable rather than to directly relate to the needs of a specific class of technology or to address the needs of a specific community (such as the I/DD community). We similarly apply the notion of trauma-informed care to the development of technology (in the present paper): specifically, PTSR apps for people with I/DD. We consider our contemporaneous work complementary to the framework proposed in [17]. We adapt the idea of trauma-informed care specifically to meet the targeted needs of designing PTSR apps for people with I/DD, based on interviews with stakeholders. Consequently, our view of how trauma-informed care maps to technology design differs somewhat from the view presented in [17].

3 INTERVIEW STUDY
Our aim in this paper is to explore the design of post-trauma self-regulation (PTSR) apps for people with I/DD that complement therapeutic interventions. In this regard, we wanted to develop a broad understanding of PTSR within the I/DD community. To this end, we interviewed practitioners at an organization in the US that provides trauma services to people with I/DD. As part of their services, they help and teach people with I/DD to use self-regulation to cope with their past trauma. Our aim with the interviews was to answer three core research questions. (RQ1) What does post-trauma self-regulation for people with I/DD entail? (RQ2) Do people with I/DD use apps/technologies for post-trauma self-regulation and, if so, how? (RQ3) What should designers be mindful of when they design post-trauma self-regulation apps for someone with I/DD? Below we describe the study methods of our interview.

3.1 Study methods
We interviewed practitioners who are affiliated with an adult protective services agency in the US and who help people with I/DD with trauma services. Among the services they provide is helping people with I/DD learn how to engage in PTSR. These practitioners have a global perspective about coping with trauma specifically for the I/DD community. They can therefore provide us with the larger context regarding self-regulation practices within the community. All interviews were conducted over Zoom because of the COVID-19 pandemic. The interview protocol was approved by the institutional review board (IRB), the ethics board, at the University of Rhode Island.

3.1.1 Study design. We conducted semi-structured interviews with participants recruited for our study. We had a script with open-ended questions, and interviewees were allowed to wander in their responses. We opened with a brief introduction of our aims, followed by questions in four categories: the participant's background, the services their organization provides related to enabling self-regulation for individuals with I/DD, their understanding of the types of PTSR practiced by individuals with I/DD, and their thoughts on ways to more effectively facilitate PTSR among individuals with I/DD. A total of eight (8) practitioners participated in the interviews. All eight interact with people with I/DD on a regular basis and have experience helping them learn and practice self-regulation. The individuals with I/DD with whom our practitioners work have mild to moderate I/DD. Table 1 shows the demographic information for the practitioners.

3.1.2 Study analysis. After the user study, the collected Zoom recordings were transcribed. We applied Braun and Clarke's six-step recursive approach to thematic analysis, as described in [13]. The coding and analysis were completed in a collaborative manner between the two authors, aiming to achieve a richer interpretation of meaning than attempting to achieve consensus would produce. The coding and theme development were done inductively and evolved throughout the analytic process. Table 2 lists the codes that were generated during the thematic analysis. The results of our analysis are summarized in the findings below.

3.1.3 Limitations. The methodology of our study had three main limitations that we briefly discuss. First, all practitioners we interviewed identify as female, though we did not set out to solicit opinions exclusively from female practitioners. We do not believe that this affected the observations in the paper; however, a more
Designing Post-Trauma Self-Regulation Apps for People with I/DD ASSETS ’22, October 23–26, 2022, Athens, Greece

Code Name Definition


Current self-regulation activities Self-regulation activities in which individuals with I/DD engage
Potential self-regulation activities Categories of self-regulation activities that should work for individuals with I/DD
Ersatz apps for self-regulation Apps that individuals with I/DD turn to for self-regulation (since there is no existing PTSR app for them to use)
Suitability of ersatz apps Appropriateness of the ersatz self-regulation apps for PTSR for individuals with I/DD
Designing for PTSR What the PTSR apps for individuals with I/DD should consider in their design

Table 2: The codes used in our thematic analysis of the interviews.
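Purely as an illustration (this is not part of the authors' analysis pipeline, and the segment data is invented), a codebook like the one in Table 2 can be represented programmatically, for example to tally how often each code is applied across coded transcript segments while guarding against codes outside the codebook:

```python
from collections import Counter

# Code names taken from Table 2.
CODES = [
    "Current self-regulation activities",
    "Potential self-regulation activities",
    "Ersatz apps for self-regulation",
    "Suitability of ersatz apps",
    "Designing for PTSR",
]

def tally(coded_segments):
    """coded_segments: list of (segment_text, [codes applied]) pairs.
    Returns a Counter of code frequencies; rejects codes not in the codebook."""
    counts = Counter()
    for _, applied in coded_segments:
        unknown = set(applied) - set(CODES)
        if unknown:
            raise ValueError(f"unknown codes: {unknown}")
        counts.update(applied)
    return counts
```

Such frequency counts are only a bookkeeping aid; in an inductive analysis like the one described above, the codebook itself evolves as coding proceeds.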


diverse practitioner population could have provided additional perspectives that we might have missed here. Second, all of our participants were from the United States and, therefore, their perspectives and experience may differ from those of practitioners from other regions. Third, we did not talk to individuals with I/DD about how they self-regulate on a day-to-day basis. This is because, at this point in our work, the aim was to get a community-level picture of what self-regulation for people with I/DD entails. This is something that we plan to work on in the immediate future.

3.2 Study findings 1: PTSR for people with I/DD should promote empowerment and include diverse activities
In our interviews, we began by trying to understand what PTSR for people with I/DD entails. We found two main themes in this regard, which we describe next. We provide verbatim quotations, which are edited for brevity and clarity using ellipses and brackets, respectively.

3.2.1 PTSR should enable people with I/DD to gain control over their lives as a way to mitigate some of the effects of their trauma. People with I/DD often lead, as P2 puts it, a highly “managed life.” For instance, they live in group homes where they share personal space and resources with others: “I also think, acknowledging the disability component... that’s something that’s unique to [people with I/DD]. Like a [person with I/DD] will talk about, in one way or another, how... they might not be able to access certain things because of their disability, like privacy [or lack thereof, in their daily lives].” (P3). Furthermore, the opinions of people with I/DD are often discounted by others: “... people don’t believe people with [I/DD] are reliable narrators of their experiences.” (P3).

Moreover, if people with I/DD are in distress, they are often blamed for displaying such behavior: “if someone [with I/DD] is distressed, it’s like they’re doing something wrong or they’re having a behavior that’s wrong.” (P2). These experiences often mean that they lack power over their words, body, and surroundings, which contributes to their feeling traumatized on a day-to-day basis, which is in line with extant literature [52, 86]. Given the pervasive presence of traumatizing events in the lives of people with I/DD, the participants felt that one of the main tenets of PTSR should be to allow people with I/DD to regain control over their lives. They brought up three broad ways of empowering individuals with I/DD in this regard:

• Helping people with I/DD to observe and determine their current feelings/emotions in the moment: “being able to empower them and teach them the skills that they need to be able to [check in and see how they are feeling] in a practical way in their life would be a really amazing goal.” (P7). This is essential because knowing one’s feelings or emotions in the moment is often healing in itself: “[people with I/DD] enjoy learning [about and] identifying what their feelings are or maybe what they’re going through. Even the difficult feelings but they’ll come to terms with [them]... in my understanding and observation, it’s very healing and it’s very empowering.” (P8).
• Enabling people with I/DD to make decisions about their lives, including the type of self-regulation activities that work for them: “So it’s important to make sure that they’re taking charge in... any of the decisions being made... allowing people to pick their own [self-regulation activities].” (P6).
• Providing a means of supporting them as they make these decisions: “... decrease any barriers or any complexity to them getting whatever support that they might need.” (P8). The idea is to enable people with I/DD to lead a healthy and positive life by taking into account things like “what are their goals, their dreams, their interests, their preferences, and how can they get those things.” (P2).

3.2.2 Activities used for PTSR should be diverse to give them more choices. Every person has different needs when it comes to self-regulating to deal with the negative effects of trauma: “I think it’s good to have choices, like a variation of choices because that’s really a big thing too, is that one coping skill might not work for [one] person [while for another person] it might be the best coping skill ever.” (P7). Consequently, when it comes to PTSR, a variety of activities should be offered. Our participants listed a large variety of options that they have seen work for people with I/DD to cope with their trauma. These can be broadly categorized into seven categories (next to each categorical entry, we list the number of participants who suggested that category): audio-visual activities (e.g., music, video games, and video clips) (N = 4); outdoor activities (e.g., gardening/yard work, walks, and traveling) (N = 4); establishing social connections (N = 3); identifying one’s emotions/feelings (N = 3); breathing, yoga, and mindfulness practices (N = 2); viewing positive affirmations and messages (N = 2); and activities that engage a person’s creativity (e.g., coloring books and crafts) (N = 1). These categories are, of course, not comprehensive, but they do give a sense of how diverse PTSR activities can be. Interestingly, one participant stated that, for some people with I/DD, activities involving abstract imagery or visualizations can be distressing instead of calming: “[People with I/DD] feel like [if] it’s visually too imaginary [it’s] distressing rather than calming. In that same vein, it makes me think that something that can be useful is being a bit more concrete... being less abstract and less imaginary or visual. Like the idea of visualizations is probably not the best route, usually.” (P2).
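The practitioner-reported activity categories above could serve a design team as a starting taxonomy for a user-selectable activity menu, in keeping with the "choice" theme. The sketch below is our own illustration (category names, structure, and the `menu` helper are ours, not an artifact of this study); the counts and example activities come from the interviews:

```python
# Seven activity categories practitioners reported as working for PTSR,
# with the number of practitioners (n) who suggested each.
PTSR_CATEGORIES = {
    "audio-visual": {"n": 4, "examples": ["music", "video games", "video clips"]},
    "outdoor": {"n": 4, "examples": ["gardening/yard work", "walks", "traveling"]},
    "social connection": {"n": 3, "examples": []},
    "identifying emotions": {"n": 3, "examples": []},
    "breathing/yoga/mindfulness": {"n": 2, "examples": []},
    "positive affirmations": {"n": 2, "examples": []},
    "creativity": {"n": 1, "examples": ["coloring books", "crafts"]},
}

def menu(categories):
    """Category names ordered by how many practitioners suggested them,
    preserving insertion order among ties (sorted() is stable)."""
    return sorted(categories, key=lambda c: categories[c]["n"], reverse=True)
```

An app built on such a taxonomy would present the full menu and let the user pick, rather than ranking or prescribing a single activity for them.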

3.3 Study findings 2: People with I/DD use existing apps for PTSR but these are not well-suited to their needs
Next, we wanted to understand what apps or technologies people with I/DD use for PTSR purposes. Further, we also wanted to understand how well-suited these solutions were for our population. We describe our findings below. Again, we provide verbatim quotations, which are edited for brevity and clarity using ellipses and brackets, respectively.

3.3.1 People with I/DD use a variety of ersatz apps for PTSR. According to our participants, people with I/DD use a variety of apps for PTSR purposes. These can be grouped into five broad categories (as before, we list the number of participants who suggested each category). They include the use of: audio-visual apps (e.g., YouTube, Netflix, Spotify, video games) (N = 5); social media apps (e.g., Instagram, TikTok, Facebook, and Pinterest) (N = 3); creativity apps (e.g., coloring books, puzzles, creating music playlists) (N = 2); a digital companion (e.g., Care.Coach) (N = 1); and viewing pictures of cute animals (N = 1).

These data show that the apps used by people with I/DD, according to our participants, roughly match the diversity of the PTSR activities that they recommended as potentially useful. The only categories not found here, as compared to the section above, are outdoor activities and observing one’s feelings/emotions. It is unsurprising that the former is not included here, since outdoor activities are not suited to implementation in an app modality. It is also unsurprising that the latter is not present here, since all of the apps in this section are essentially ersatz PTSR apps. This means that they were not explicitly designed for self-regulation, which often involves reflecting on how one is feeling in the moment. Going forward, we refer to the app categories listed in this section as ersatz apps. Interestingly, none of our participants mentioned people with I/DD using any of the trauma-focused self-regulation apps developed for conditions like PTSD (as mentioned in Section 2).

3.3.2 None of the ersatz apps is well-suited for people with I/DD. Although people with I/DD were using a variety of apps for PTSR purposes, these apps were neither explicitly designed for this population nor designed for coping with trauma. When it came to these ersatz apps, the participants felt that the apps were not well suited for PTSR for people with I/DD. The practitioners explicitly mentioned three categories of ersatz apps when discussing their unsuitability for PTSR for people with I/DD:

• Commercial non-trauma self-regulation apps: The practitioners felt that commercial apps were not designed to be usable for people with I/DD because of the cognitive load they impose: “I would not recommend any of those [commercial self-regulation] apps to any of my clients. I would never. The meditations are weird... You need to know how to do too much independent navigation to use any of those apps. It’s too much... There were no pictures on anything... everything should be written at most a fifth-grade reading level. If my client, who’s never meditated before, is going to use a guided meditation, like, ’No!’ ... I need what’s an intro level. I need like a video or something. My clients, they don’t use this stuff on a regular basis.” (P3).
• Non-commercial trauma-focused apps: The practitioners felt that these apps were not designed to be accessible for people with I/DD, despite being designed for helping with trauma: “I remember the military app like defining PTSD. That’s like a clinical definition of PTSD, right? I don’t need that... it’s also there’s not a lot of personalization, which I think is like the core part of the work we do... I think you just can’t personalize any of those apps the way that they need to be personalized for our clients.” (P3).
• Social media apps: These apps were viewed as being too risky to be used for PTSR purposes: “God no! Social media isn’t designed for anybody dealing with trauma. No one.” (P4).

3.4 Study findings 3: The design of PTSR apps for people with I/DD needs to promote autonomy, be accessible, and enable social connections
We next asked our participants what designers of PTSR apps should be mindful of when designing for someone with I/DD. The participants’ responses can be categorized into three broad categories, which we describe below. Once again, we provide verbatim quotations, which are edited for brevity and clarity using ellipses and brackets, respectively.

3.4.1 Support the autonomy of people with I/DD in design to make decisions about their self-regulation. As mentioned before, one of the most significant sources of trauma in the lives of people with I/DD was not having control over their body, words, and lives in general. Therefore, one of the main design elements that PTSR apps should espouse is the provision of control, that is, helping people with I/DD control the type of self-regulation that they want to do: “I think people want to have the option of what type of self care that they engage in and what it’s relating to... I’ve noticed having topics people like... food, TV, exercise, and/or trauma and coping.” (P1). In this regard, the app should not try to tell people with I/DD what to do. Rather, it should guide them to identify what they need: “I guess listing or being able to identify what it is their needs are, for one, and that’s going to vary from individual to individual. So a section... for them to be able to pinpoint or identify, ‘okay, I need this’ or... ’these are my basic needs’....” (P8).

Furthermore, supporting autonomy also includes understanding that some (perhaps most) people will not want to talk about or engage with the trauma when doing PTSR: “... framing the app in a way where the individual, if they want to use and talk about their trauma, they can but it’s not a requirement to use the app... sometimes somebody might, say, want a self-care activity, might want to do something but doesn’t want to specifically think about their sexual assault, doesn’t want to specifically think about why they need that self care, just they need a self care activity.” (P6). Moreover, there is a common misconception that PTSR requires engaging with the trauma: “but we talk about skills... you don’t have to talk about trauma to respond to it or heal from it. I think that’s a big misconception that people have.” (P3). This notion that PTSR need not involve engaging with the trauma at all builds on our earlier idea from this section where the person with I/DD ultimately gets to decide what works for them when it comes to PTSR.

Finally, supporting the autonomy of a person with I/DD also means being aware that what triggers someone in a given moment is often unpredictable to designers beforehand: “I think you never know what [someone’s trigger] is... because I have a client who was triggered by the word ‘guys,’ right?... and saying the word ‘guys’ and being perceived as not feminine was a big trigger for that client and I would mess up. I would have to write myself a note to... remember not to say the word but that’s just a small example of what it can look like.” (P3). Since triggers cannot be predicted, the design should include appropriate provisions to mitigate the effects of such triggers by: (1) making sure that users do not feel blamed for their triggers: “You don’t want something that would inadvertently make a person feel blamed or shamed.” (P2), and (2) allowing the person using the app to individualize (customize) the app to their personal situation and needs.

3.4.2 Be accessible and intersectionally inclusive in design. In terms of the eventual design of the app, given that it is ultimately being designed for someone with I/DD, it must be accessible to them. Our participants specified several key properties for accessibility. These included the use of simple, easy-to-understand language that is concrete rather than abstract. The idea is to tend toward over-explaining things within the app: “I think [the activities] would have to be really broken down. So I know sometimes if I will go on like meditation apps, if I listen to people talk, they sometimes will use very large words, talk about like the space, the heavens. But for some people, that’s really not accessible and so just something that’s very simple, very clear. If we’re breathing in: ‘breathe in for three seconds. I’m going to count with you: one, two, three,’ things like that. Over-explaining to where you think it’s not necessary but for someone with a disability, that could be really helpful because again there’s a difference between telling somebody ‘breathe in’ and ‘we’re going to breathe in for three seconds through our nose and we’re going to try and feel it in our chest or in our stomach.”’ (P6).

An interesting point on this topic raised by one of the participants was that the app should be accessible to people with I/DD without calling obvious attention to that fact. In other words, people with I/DD want to be treated like everyone else, as a person first instead of being approached in terms of their disability: “an app that’s accessible without screaming, ‘I’m accessible,’... So not necessarily off the bat, you sign on to the app and it’s like, do you need large font?... I just think that puts folks with disabilities off; they’ve told me it has. So I just think it needs to look like an app that someone without a disability could use but creatively having [the team] who’s creating it know that it does need to be accessible and these are some particular things for folks with intellectual and developmental disabilities that you need to know....” (P1).

The participants also suggested that the design of the material used in the app should be more inclusive, in terms of the types of people shown, than apps have traditionally tended to be. This means reflecting and respecting differences in lived experience in terms of things like language, ethnicity, culture, gender expression, sexuality, religion, and more of the target I/DD community: “I think inclusion in making sure the app represents folks on a lot of different identities and skin colors. I think also [that] the disability work can be really whitewashed.” (P3). In other words, participants suggested that we take an intersectional approach to the problem of designing for our target community. When we use the term intersectional approach to design, we mean design that is aware of the multitude of factors that affect the lived experience of a person’s life: “People [with I/DD] haven’t just experienced ableism but have also experienced sexism, racism, and homophobia. I think it’s important to [consider this intersectionality] for how we [think about] helping somebody.” (P2). Therefore, an intersectionally inclusive design of PTSR apps should include content that: accommodates different literacy levels (to promote I/DD accessibility); employs gender-neutral language and avoids both gender binarism and gender stereotypes (to avoid biases like transphobia, homophobia, and sexism); uses images of people with diverse skin tones and body types (to avoid typical content whitewashing and fixations on certain notions of beauty); etc.

3.4.3 Include some form of social connection in the design. An interesting pattern that we observed in the use of the ersatz apps was that people with I/DD were using them to try to engage in social connection in some form or another. This is particularly true for social media apps: “I know a lot of [people with I/DD]... [use] social media for self [regulation]. A lot of them will do that... for social connection, talking to people, feeling like they’re connected, they’re not isolated.” (P6). Additionally, the participants also mentioned social connection in the context of using music for PTSR by sharing playlists with others: “... the person I was talking about who has a communication device... his device connects to YouTube and he shares music videos all the time.” (P2). In the same vein, a recurring theme in the design of PTSR apps was the need to promote social connection for people with I/DD: “I just envision an app more of like a dialog between people with I/DD as opposed to some of [the] apps that you would see now.... [For PTSR, people with I/DD]... really like... the dialog and the socialization, especially during a time like this. So if they have an app that... really fills the void of the socialization piece that none of [the clients] have right now because they’re so isolated. They want to feel like they’re talking to somebody and if they feel like they’re talking to their peer, that’s, to me, even better.” (P1). The participants suggested several alternatives for establishing social connection for people with I/DD as part of PTSR.

• Sending and receiving positive affirmations among people with I/DD as a way to support one another: “People [with I/DD] really like words of wisdom and hearing things from their peers... I’m wondering if they could have like a daily positive quote from someone who’s been through something that they have, that they just check it out because that kind of gets addicting, right? Like I’m waking up this morning, I want to hear what one of my peers has to say about this morning and it’s like a positive quote to get their day going. I think things like that would be cool just coming from their peer or if you wake up having a crappy day, ’here’s what I’ve done. If you click on this, you can do it for yourself too.”’ (P1).
• Utilizing voice-based interactive systems that help the person feel connected (even if that’s not with another person but rather with a bot): “If there’s voice activation, you don’t want something that’s bossing you around... You want something that makes you feel positive about yourself and also helps you feel like you’re in control.” (P2).

• The words and lived experience of people with I/DD should be conveyed as part of the interaction: “If [people with I/DD’s] words and experiences are somehow captured in the app, I just think that’s a catch... I think the folks are more inclined to tune in when it’s... coming from somebody that’s lived the experience and has a suggestion that worked for them.” (P1).

4 TRAUMA-INFORMED CARE (TIC)
Based on the findings of our interview study, we posit that one way of approaching the design of PTSR for people with I/DD is based on the notion of trauma-informed care (TIC). TIC is an idea obtained from social work practice that is used by organizations to support traumatized individuals. Before we delve into how this can be done (in the next section), in this section we provide a brief overview of TIC, its suitability for helping trauma survivors with I/DD, and its five main criteria.

4.1 Overview of trauma-informed care
The main idea behind TIC is that organizations providing trauma services should consider the roles of trauma and its lingering effects on the lives of individuals who have experienced trauma. Hence, the idea is for the entire culture of an organization that deals with trauma survivors to be aware of the trauma their clients have experienced [21, 46]. Organizations providing TIC services rely on their staff’s knowledge about trauma in responding to clients in ways that convey respect and compassion, foster the autonomy of the survivor, and aid in building coping strategies [21, 46]. TIC is different from trauma-focused therapy (e.g., exposure therapy), as its primary goal is not to directly address or process the past trauma (i.e., the root of the problem per se) but to provide a process to deal with what the client presents as a result of their traumatic experience(s) [46].

In recent years, there has been an increased understanding of the need for TIC for people with I/DD. Given that people with I/DD often need organizational support to manage their daily lives and require such support throughout their lives, TIC has been seen as a natural way of helping people with I/DD manage their situation [39]. It has been recognized that, through TIC practice, people with I/DD can: develop a sense of safety; feel reduced shame and guilt; and be empowered and engaged to minimize the impact of the traumatic experience [21].

4.2 Trauma-informed care and the postmodern model of disability
Even though TIC did not necessarily develop in the context of helping people with disabilities, it aligns with the postmodern model of disability as described in [48].

4.2.1 Models of disability. Mankoff et al. provided a beautiful overview of models of disability in their seminal paper [48]. In it, they described three models of disability, which we quickly summarize here for context. Traditionally, disability has been conceptualized first as something to cure, as part of what is known as the medical model of disabilities. This model conceives of people with disabilities as lacking and in need of being “fixed” to bring them to some notion of “normal”. However, the medical model rests uncomfortably on an assumption that the goal is normality, that is, the elimination of disability. Conceptualizing disability using the medical model can have drastic negative consequences on the autonomy and decision-making power of people with disabilities [48]. This was followed in the 1970s by the social model, which shifts the narrative from a need for cure to a need for care where patients, not clinicians, become the leaders in managing their conditions. The social model and the associated independent living movement promote self-advocacy and peer support as the first steps toward full participation in society, citizenship, and leadership development. The social model, however, also has some limitations. For example, if disability is only defined at the level of society, then the individual’s experience is to some extent invalidated and the question of accommodations at the level of the individual sidelined [48]. Among the models that have been proposed to move beyond the medical and social models, the postmodern model is, to us, most apt. Based on the concept of postmodernity from the humanities, this model incorporates some aspects of the medical and social models that can still be considered relevant. Rather than simply claiming that these former approaches no longer have anything to offer, it includes a focus on the individual’s experience of disability (similar to the medical model, though significantly expanded) as well as a consideration of how society constructs disability (from the social model). The postmodern model privileges each individual’s unique lived experience, complete with the complexity and nuance of everyday life [48].

4.2.2 Trauma-informed care and the postmodern model of disability. Trauma-informed care embodies the postmodern model of disability for many reasons. First, TIC does not view trauma purely as a medical issue to be cured. It achieves this by viewing negative effects of trauma as coping mechanisms for the trauma survivor as opposed to seeing them as some form of pathology, thus highlighting the survivor’s resilience. This reframing of negative trauma effects as adaptations ultimately allows support organizations to help the survivor develop healthier coping mechanisms without stigma or guilt [39]. Second, TIC considers traumatic experiences (e.g., a feeling of loss of bodily autonomy) not as past events but as defining experiences that shape the core of the survivor’s current identity. Therefore, TIC fosters the survivor’s autonomy, control, and say over their own well-being [39]. Third, TIC is careful not to ascribe the cause of negative trauma effects to existing disabilities. Consequently, organizations implementing TIC take extreme care not to use language, behaviors, scenarios, or practices that can potentially retraumatize a trauma survivor [39]. Overall, TIC, being cognizant of the negative effects of trauma on the lives of people, is by its very nature an ideal framework to meet the needs of people with I/DD affected by trauma.

4.3 The five criteria of trauma-informed care
The Substance Abuse and Mental Health Services Administration (SAMHSA) qualifies that any organization seeking to provide TIC services must satisfy five core criteria: safety, trust, choice, collaboration, and empowerment [31]. These criteria were originally defined for the general population of trauma survivors. However, in recent years these criteria have been seen as applicable to the I/DD community as well [39, 86]. Based on the work by John Keesler
Designing Post-Trauma Self-Regulation Apps for People with I/DD ASSETS ’22, October 23–26, 2022, Athens, Greece

[39], in this section we define the five criteria of TIC in terms of people with I/DD:

Safety: The organization should recognize that people with I/DD are likely to have a history that includes trauma. Consequently, it should explicitly provide an environment where the person with I/DD feels both physically and emotionally safe. Another way of putting it is that the environment should be such that it does not retraumatize the person with I/DD.

Trust: Trust is essential to the care process to ensure that the person with I/DD is able to develop the autonomy, initiative, and competence required to empower themselves to manage the effects of their trauma. From an organizational standpoint, this criterion relates to a clarity of expectations for the person with I/DD in terms of receiving consistent service delivery across the organization.

Choice: Oftentimes trauma causes a person with I/DD to lose their sense of control over their lives and their bodies. The concept of choice within TIC is an attempt to enable the person with I/DD to make their own decisions and gain a sense of control over their recovery and life.

Collaboration: When a person with I/DD receives services for their trauma from a support organization, a power differential can exist in that relationship. If not carefully equalized, such a power imbalance (often subtle and insidious) can increase feelings of vulnerability in the trauma survivor. An essential feature of TIC is equalizing the potential power differential between the survivor and any support organization (and their staff) in decision-making around care. For instance, people with I/DD are often prone to seek the acceptance of others and are particularly vulnerable to instinctive compliance with authority. Therefore, the collaboration criterion requires that we remind trauma survivors with I/DD that they have the right to ask questions, decline services, and make requests.

Empowerment: This criterion essentially does two things. First, it recognizes the behavior of a trauma survivor with I/DD as a legitimate way for them to deal with past trauma. It therefore validates their current experience, thus empowering trauma survivors to take their well-being into their own hands. Second, it recognizes the individual strengths, skills, and abilities of the person with I/DD and aids them every step of the way. This focus on the survivor's existing strengths and skills is seen as an integral way to help them realize that they may already possess the resources necessary to develop a solution for any obstacles they face in dealing with the effects of the trauma.

5 INCORPORATING TRAUMA-INFORMED CARE INTO THE DESIGN OF PTSR APPS FOR PEOPLE WITH I/DD

In this section, we explore the design space for PTSR apps for people with I/DD using the notion of trauma-informed care (TIC) discussed in Section 4. To the best of our knowledge, no prior work has attempted to apply TIC to app design. In the rest of this section we discuss how each of the TIC criteria can be adapted to the design of PTSR apps by exploring important design guidelines related to each criterion. We derived these design guidelines from the responses from the interviews we conducted as well as our years of experience designing technologies for people with I/DD. Next we describe each of these design guidelines, which are summarized in Table 3. In this section, we use the term user(s) to specifically mean individual(s) with I/DD.

5.1 Designing for safety

The first criterion we consider is that of safety, where the idea is to provide users with a sense of physical and emotional safety. Accordingly, we define safety within PTSR apps as designing with the intent to minimize any retraumatization of users and to provide a means for them to cope with negative trauma effects. We suggest three guidelines for ensuring the safety criterion for PTSR apps.

Mitigate known negative trauma effects. The design of PTSR apps should be cognizant of the dis-ease that its users may be experiencing and ensure that it does not exacerbate it. Consequently, when designing PTSR apps, efforts must be made to help users manage known negative trauma effects. For instance, the app should be cognizant of the fact that, for some people with I/DD, self-regulation activities that are abstract and involve the imagination, like visualization activities, can be distressing rather than calming. Similarly, closing one's eyes and going right into a meditation with no preparatory safety measures is not ideal for those who have experienced trauma (including people with I/DD). Thus, the app should give users options for various safety measures they can take before and/or during meditation (like keeping their eyes open or first checking that the doors are locked) to make sure they feel safe and in control in their surroundings.

Be actively supportive of the lived experience of people with I/DD. The I/DD community is diverse. Thus the design of PTSR apps should be considerate of the differences in lived experience of the target I/DD community in terms of things like language, ethnicity, culture, gender expression, sexuality, religion, and more. An app that does not actively represent a worldview that is supportive of the diversity of people with I/DD essentially rejects non-normative body and mental states, which is oppressive and traumatizing, thus negatively impacting user safety [59]. For instance, any voice communication in PTSR apps designed to be inclusive of trans and nonbinary people with I/DD should include more than just stereotypically cis male and female genders in its vocal elements.

Assume that "potentially anything can trigger": For someone who has experienced trauma, anything that reminds them of their traumatic experience can trigger negative effects. Therefore, designers need to be aware of the fact that even self-regulation activities can potentially trigger users. Triggers are inherently personal and are not the same for any two people [42]. It is thus often very difficult for designers to know a priori what can trigger a user when using PTSR apps. We thus believe that PTSR apps should be designed with the "potentially anything can trigger" mindset, which means that they should: (1) use non-judgmental language and features in the app and (2) include a variety of self-regulation activities from which to choose, including offline activities that people with I/DD like (some examples that participants in our study mentioned include: sitting with a pet, gardening, coloring, taking a walk, mowing the lawn, being outdoors, yard work, and crafting).
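One way to operationalize the "potentially anything can trigger" mindset is to let users quietly opt out of any activity, or any class of activities, and have the app filter its catalog accordingly, with no judgmental framing of the omission. The sketch below is a minimal illustration in Python; all activity names, tags, and data structures here are our own hypothetical assumptions, not taken from any existing app.

```python
from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    tags: frozenset  # e.g., {"imagination", "eyes-closed", "offline"}

@dataclass
class UserProfile:
    # Activities or tags the user has marked as "not for me" -- stored
    # without judgment and editable at any time.
    hidden_tags: set = field(default_factory=set)
    hidden_activities: set = field(default_factory=set)

# Hypothetical catalog mixing in-app and offline activities.
CATALOG = [
    Activity("Deep breathing", frozenset({"breathing"})),
    Activity("Guided visualization", frozenset({"imagination", "eyes-closed"})),
    Activity("Take a walk", frozenset({"offline", "outdoors"})),
    Activity("Coloring", frozenset({"offline", "creative"})),
]

def available_activities(user: UserProfile) -> list:
    """Return only activities the user has not opted out of.

    Hidden items are silently filtered -- the app never labels an
    avoided activity as "skipped" or "failed" (non-judgmental language).
    """
    return [
        a for a in CATALOG
        if a.name not in user.hidden_activities
        and not (a.tags & user.hidden_tags)
    ]

# A user who finds eyes-closed activities distressing sees a catalog
# with "Guided visualization" removed, and no mention of the removal.
user = UserProfile(hidden_tags={"eyes-closed"})
print([a.name for a in available_activities(user)])
```

The design point is that the filter is driven entirely by the user's own declarations rather than by designer guesses about what triggers whom, which is exactly what the personal nature of triggers [42] requires.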
ASSETS ’22, October 23–26, 2022, Athens, Greece Venkatasubramanian and Ranalli

TIC criterion   Guidelines

Safety          1. Mitigate known negative trauma effects
                2. Be actively supportive of the lived experience of people with I/DD
                3. Assume that potentially anything can trigger

Trust           4. Ensure authentication to protect access to user-generated personal data within the app
                5. Never take away recognition for prior successes

Choice          6. Provide a diversity of choices by designing for a variety of abilities
                7. Guide users through the choice-making process
                8. Give users an easy-to-understand way to express their current feelings/emotions

Collaboration   9. Provide the ability to interact with others
                10. Provide tools that converse directly with users

Empowerment     11. Provide accessible information on trauma
                12. Provide customized, pithy messages of support

Table 3: A summary of guidelines obtained for incorporating trauma-informed care into the design of PTSR apps
5.2 Designing for trust

The second criterion is to develop trust with people with I/DD. When it comes to PTSR apps, we define trust as an app's ability to foster confidence in users, such that they keep using the app to the fullest extent. We suggest two guidelines for ensuring the trust criterion for PTSR apps.

Ensure authentication to protect access to user-generated personal data within the app. PTSR apps provide services and features that are intensely personal and should not be available for anyone else to see without explicit permission from the user. For instance, one could imagine an app that allows a user to journal or check in about their emotions and feelings at different points in time. For a user to be able to avail themselves of the benefits of journaling/check-ins, they have to feel safe in doing so honestly. Providing a way to secure this within an app from any unauthorized access is a powerful way to incentivize users to engage with such features. In this regard, one of the main features an app should have is authentication, especially for any personal data. Authentication would ensure that only the user can access their personal information. Indeed, password-based authentication is used by the Safe Helpline app [68] to control access to any user-generated information. However, it frequently requires one's password to be retyped (approximately once every five minutes), which can make using the app tedious and, for people with I/DD, difficult. Designing an authentication solution that works for people with I/DD in the context of protecting the content of PTSR apps is a challenging problem, given that people in this population often need to share their passwords with close family members [80].

Never take away recognition for prior successes. Having incentives within the app is a powerful way to encourage users to use PTSR apps over time. The idea of giving people incentives for using the app came up during the interviews: "So maybe having some sort of incentive or like... 'Oh, you completed this series of coping skills, you get a badge'... because I also have a background in applied behavior analysis, so I think creating incentive to show that they accomplished something, it would be helpful for them to wanting to be engaged in it." (P7). Often incentives within apps (in general) are designed to be negative in nature, where prior rewards are rescinded if certain goals are not met. This can be clearly seen in apps where an anthropomorphic virtual being becomes happier and sadder with app use and disuse, respectively [18, 29, 63]. However, in our prior work we found that incentives designed for people with I/DD should never penalize and should only add to prior successes [34]. Consequently, any incentive elements in PTSR apps should accumulate in-app successes. We believe that this would allow users to trust the motives of PTSR apps (e.g., that the app is not negatively judging the users' performance) and therefore use it more fully.

5.3 Designing for choice

The third criterion is that of choice, where people with I/DD are given the chance to make choices and gain control over their recovery and life. PTSR apps that incorporate the choice criterion should provide their users with a variety of ways to achieve their self-regulation goals. We suggest three guidelines for ensuring the choice criterion for PTSR apps.

Provide a diversity of choices by designing for a variety of abilities. One way to increase the choices for users of PTSR apps is to ensure that a given self-regulation activity is available at different levels of difficulty to accommodate the different cognitive abilities of people with I/DD. For instance, a jigsaw puzzle activity designed for PTSR could provide several different versions of the same puzzle with differing degrees of difficulty. In addition to allowing users to use hints or not, this activity could also offer things like different numbers of pieces or different degrees of intricacy in the image. Offering several difficulty levels for an activity would allow users to select a level not only based on their general abilities but also according to what they feel capable of doing on a given day or in a given situation.

Guide users through the choice-making process. So far we have argued for the need for a diversity of choices for self-regulation activities. However, as the number of self-regulation activities within an app increases, it is not enough for an app simply to provide a list of all of the options. PTSR apps should include a recommender system to offer users self-regulation activity suggestions in order to help facilitate the task of selecting an activity and thus reduce the cognitive load imposed on users. However, given the negative effects of trauma on individuals with I/DD (as described in Section 1.1), such as impaired working memory and a tendency to be easily distracted, the recommender system may have to considerably limit the number of options shown at a time. That being said, any PTSR app's recommender system should be careful only to suggest options and not dictate what the user should do.
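As a minimal illustration of such a limited-option recommender, consider the following Python sketch. The function names, the familiarity-based ranking heuristic, and the activity data are our own hypothetical assumptions, not a system described in this paper; the two properties it does take from the guideline above are the cap on the number of options shown at once and the fact that it only suggests, leaving the actual choice to the user.

```python
def suggest_activities(catalog, completed_counts, max_options=3):
    """Suggest a short list of self-regulation activities.

    - Shows at most `max_options` items at a time, to limit the
      cognitive load on users with impaired working memory or a
      tendency to be easily distracted.
    - Ranks by familiarity (activities the user has completed most
      often come first) -- a deliberately simple stand-in heuristic;
      a real system could also use the user's current emotional
      check-in as input.
    - Returns suggestions only: the caller must still present them
      as options the user is free to ignore (suggest, never dictate).
    """
    ranked = sorted(catalog, key=lambda a: -completed_counts.get(a, 0))
    return ranked[:max_options]

# Hypothetical usage: a user who most often chooses coloring and walks.
catalog = ["breathing", "coloring", "walk", "music", "puzzle"]
history = {"coloring": 5, "walk": 2}
print(suggest_activities(catalog, history))
```

Because Python's sort is stable, activities the user has never tried keep their catalog order, so a small amount of novelty still surfaces within the capped list.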

Give users an easy-to-understand way to express their current feelings/emotions. How to design a recommender system in PTSR apps for people with I/DD is an open question. One crucial input for the recommender system is the current emotional state of the user, which is something that extant trauma-focused self-regulation apps often offer. For instance, PTSD Coach [66] provides a barometer from 1 to 10 that can be used to indicate one's level of distress4. This is a good approach for populations without I/DD, but it involves an abstract way of expressing how one is feeling in the moment, which can be difficult for people with I/DD to determine about themselves [74]. Consequently, an app designed for people with I/DD needs to: (1) explicitly guide the user through the process of determining how they are feeling in that moment, perhaps through an illustrated, guided emotional check-in process [37] and (2) make the manner of expressing the user's level of distress less abstract, such as through the use of a scale with faces to represent the different emotions, as in [65].

5.4 Designing for collaboration

Traditionally, the collaboration criterion is about equalizing the power differential between the individual with I/DD and the staff at a support organization. When it comes to PTSR apps, it is clear that collaboration needs to be defined differently. Given the nature of mobile computing devices, in that they are typically devices for individual use, working with mobile apps is usually a solitary activity. Furthermore, self-regulation, which is the focus of this paper, can increase the user's sense of isolation even more. As P6 pointed out (in Section 3.4.3), one of the biggest sources of difficulty comes from being isolated from others. Therefore, PTSR apps, as part of the collaboration criterion, should actively allow the user to communicate with others in a meaningful fashion and to engage in collaborative activities and community formation [6]. We suggest two guidelines as part of the collaboration criterion for PTSR apps.

Provide the ability to interact with others. We believe that PTSR apps for our population should provide a means for users to interact with others for social support. This can be friends and family, allies, a designated trusted person, therapy or support groups, and other people with I/DD. This support can take many forms, including informational support, emotional support, personal network support, self-esteem support, and even tangible in-person support [19]. For instance, several trauma-focused self-regulation apps, such as those designed for US Department of Defense personnel [61, 68], provide their users with a means to call someone to talk. However, these apps primarily focus on phone-call-based communication. One way of promoting the ability to communicate with others for people with I/DD is to leverage the paradigm of social VR. Social VR would allow individuals with I/DD, who otherwise live highly managed lives, to move around and communicate with others in a simulated, immersive space in imaginative ways. VR would thus allow them not only to perform PTSR but also to interact with others on their own terms: the virtual environment would allow them to socialize without having to wait for care assistants to arrange transportation, etc., as they often do when physically socializing with others. Giving this community unmediated access to socializing would reduce some of the very sources of repeated trauma that they endure.

Provide tools that converse directly with users. PTSR apps need not only promote social collaboration involving other individuals. One can also imagine PTSR apps interacting directly with users to simulate social interactions, similar to AI-assisted chatbots or digital voice assistants. Such interactive tools can help people with I/DD feel less isolated in the absence of others with whom to interact. However, for interactive tools to be truly beneficial to individuals with I/DD, they should react and talk like a peer (i.e., another person with I/DD). At a minimum, existing chatbots and voice assistants have to become more inclusive and be trained on data emanating from people with I/DD.

5.5 Designing for empowerment

In an organizational context, the empowerment criterion is about validating the individual's trauma experience and focusing on their abilities5. In terms of PTSR app design, we leverage the notion that knowledge is power and define empowerment as accessibly providing the user with information about trauma, its manifestations and effects, and different ways to cope and heal. The availability of information surrounding trauma is a powerful way to validate users' experiences, remove any stigma surrounding trauma [28], and better foster their ability to engage in self-regulation. We suggest two guidelines for ensuring the empowerment criterion for PTSR apps.

Provide accessible information on trauma. Apps focused on trauma for people without I/DD often provide a lot of information (in the form of reading material) about trauma and why it is important to address it [61, 68]. Many non-trauma commercial self-regulation apps similarly provide a lot of information about the importance and benefits of meditation practices [75, 82]. However, such a reading-intensive approach is not appropriate for an I/DD population. Prior work in HCI/accessibility has identified several approaches for teaching, informing, and skill-building for the I/DD community. Some of the major findings include recommendations to: (1) use images [11, 54], interactive content, and videos [5, 23]; (2) use concrete, large, differentiated icons and symbols [7]; (3) make the content accessible by supporting audio descriptions of images [3, 79] and accommodating different levels of literacy [20]; (4) reduce information overload [20]; (5) use positive reinforcement for successful engagement with the content [11, 43]; and (6) provide self-paced learning capabilities [43, 44, 54, 70]. These approaches should be adapted to convey information about trauma and self-regulation to people with I/DD within PTSR apps.

Provide customized, pithy messages of support. Another way to empower users via PTSR apps is to provide short affirmations that validate their current experience and reduce any stigma surrounding trauma. Once again, these messages have to be designed with the capabilities of users in mind. For instance, in our prior work [34], we developed an abuse-recognition tool for people with I/DD

4 PTSD Coach uses the catchall term distress to signify one or more of a host of psychological and physiological states ("anger, sadness, fear, pain, stress, worry, or anything negative they are feeling").
5 The empowerment criterion in this context lends itself naturally to the notion of ability-based design in HCI [87]. We believe that an entire app, in order to be accessible to people with I/DD, has to be designed with empowerment in mind. Therefore, we do not list ability-based design as a specific guideline in this section, as we view it as a meta-guideline that holds for any app designed for people with I/DD.
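The less abstract, faces-based emotion check-in suggested in Section 5.3 could be sketched as follows in Python. The particular face icons, emotion labels, and the mapping to coarse distress levels below are illustrative assumptions on our part, not taken from PTSD Coach [66] or from [65].

```python
# A concrete (less abstract) emotion check-in: instead of a 1-10
# distress barometer, the user taps one of a few labeled face icons.
# Each face maps to an emotion word and a coarse distress level that
# can feed an activity recommender. All mappings here are hypothetical.
FACES = [
    ("happy face icon",   "happy",   0),
    ("neutral face icon", "okay",    1),
    ("worried face icon", "worried", 2),
    ("sad face icon",     "sad",     2),
    ("angry face icon",   "angry",   3),
]

def check_in(choice_index: int) -> dict:
    """Map a tapped face to an emotion label and a coarse distress level.

    The app can echo the label back in guiding language ("It sounds
    like you feel worried. Would you like to try a calming activity?"),
    which walks the user through determining how they feel rather than
    asking for an abstract number.
    """
    icon, label, distress = FACES[choice_index]
    return {"icon": icon, "emotion": label, "distress": distress}

# Hypothetical usage: the user taps the third face on the screen.
print(check_in(2))
```

Keeping the set of faces small and the distress levels coarse mirrors the earlier guidance on limiting cognitive load: the user picks one familiar picture rather than locating themselves on a ten-point scale.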

that frequently reminds users that "abuse is not their fault." It is understood that people with I/DD often need repetition to fully appreciate a concept [34]. Consequently, such empowering, easy-to-understand messages should appear throughout an app and should be repeated often. Moreover, for such messages to be effective for a variety of people with I/DD, the messages should be presented via multiple modalities (such as text, pictures/symbols, audio, and video) and should be in the voices of people with I/DD.

6 CONCLUSION

People with I/DD are some of the most traumatized individuals in the US. We posit that mobile-computing-based apps have the potential to help people with I/DD deal with the negative effects of trauma. In recent years a plethora of research, commercial, and non-commercial self-regulation apps have become available, but they have not been designed for people with I/DD. Thus this community has turned to cobbling together aspects of ersatz apps for self-regulation. However, the apps this population has been using were never designed for them, nor are they appropriate for coping with trauma. Consequently, we interviewed eight (8) practitioners at a trauma services organization in the US to better understand what it would take to develop post-trauma self-regulation apps for people with I/DD. Based on the interview responses, we then developed a set of guidelines, grounded in the social work practice of trauma-informed care, for designing post-trauma self-regulation apps for people with I/DD. In the future, we plan to instantiate a prototype of a PTSR app based on our design guidelines and then assess the app through user studies involving members of the I/DD community.

ACKNOWLEDGMENTS

We would like to thank our reviewers and participants for their invaluable help with this paper. We would also like to thank Nancy Alterio, Mariah Freark, Priyankan Kirupaharan, Andrew Laraw Lama, and Brittany Lewis for their help with this work. This work was made possible by generous funding from the Massachusetts Disabled Persons Protection Commission.

REFERENCES
[1] AAIDD 2020. American Association of Intellectual and Developmental Disabilities. AAIDD. Retrieved Feb 27, 2020 from https://www.aaidd.org/intellectual-disability/definition
[2] Jesper J Alvarsson, Stefan Wiens, and Mats E Nilsson. 2010. Stress recovery during exposure to nature sound and environmental noise. International Journal of Environmental Research and Public Health 7, 3 (2010), 1036–1046.
[3] Andrew A. Bayor. 2019. HowToApp: Supporting Life Skills Development of Young Adults with Intellectual Disability. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, PA, USA) (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 697–699. https://doi.org/10.1145/3308561.3356107
[4] Herbert Benson and Miriam Z Klipper. 1975. The Relaxation Response. Morrow, New York, NY.
[5] Petra Boström and Eva Eriksson. 2015. Design for self-reporting psychological health in children with intellectual disabilities. In Proceedings of the 14th International Conference on Interaction Design and Children. Association for Computing Machinery, New York, NY, USA, 279–282.
[6] Alice V. Brown and Jaz Hee-jeong Choi. 2017. Towards Care-Based Design: Trusted Others in Nurturing Posttraumatic Growth Outside of Therapy. In Proceedings of the 8th International Conference on Communities and Technologies (Troyes, France) (C&T '17). Association for Computing Machinery, New York, NY, USA, 56–63. https://doi.org/10.1145/3083671.3083703
[7] Erin Buehler, William Easley, Amy Poole, and Amy Hurst. 2016. Accessibility Barriers to Online Education for Young Adults with Intellectual Disabilities. In Proceedings of the 13th Web for All Conference (Montreal, Canada) (W4A '16). Association for Computing Machinery, New York, NY, USA, Article 27, 10 pages. https://doi.org/10.1145/2899475.2899481
[8] Eleanor R. Burgess. 2019. Collaborative Self-Management of Depression. In Conference Companion Publication of the 2019 on Computer Supported Cooperative Work and Social Computing (Austin, TX, USA) (CSCW '19). Association for Computing Machinery, New York, NY, USA, 38–42. https://doi.org/10.1145/3311957.3361851
[9] Eleanor R. Burgess, Kathryn E. Ringland, Jennifer Nicholas, Ashley A. Knapp, Jordan Eschler, David C. Mohr, and Madhu C. Reddy. 2019. "I Think People Are Powerful": The Sociality of Individuals Managing Depression. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 41 (Nov 2019), 29 pages. https://doi.org/10.1145/3359143
[10] Eleanor R. Burgess, Alice Renwen Zhang, Jessica L. Feuston, Madhu C. Reddy, Sindhu Kiranmai Ernala, Munmun De Choudhury, Stephen Schueller, Adrian Aguilera, and Mary Czerwinski. 2020. Technology Ecosystems: Rethinking Resources for Mental Health. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA '20). Association for Computing Machinery, New York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3375166
[11] Luke Buschmann, Lourdes Morales, and Sri Kurniawan. 2014. Online Learning System for Teaching Basic Skills to People with Developmental Disabilities. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (Rochester, New York, USA) (ASSETS '14). Association for Computing Machinery, New York, NY, USA, 271–272. https://doi.org/10.1145/2661334.2661391
[12] Lisa D Butler, Filomena M Critelli, and Elaine S Rinfrette. 2011. Trauma-informed care and mental health. Directions in Psychiatry 31, 3 (2011), 197–212.
[13] David Byrne. 2022. A worked example of Braun and Clarke's approach to reflexive thematic analysis. Quality & Quantity 56, 3 (2022), 1391–1412.
[14] Lucille Cairns. 2015. Bodily Dis-ease in Contemporary French Women's Writing: Two Case Studies. French Studies 69, 4 (2015), 494–508.
[15] Calm 2022. Calm. Calm. Retrieved June 20, 2021 from https://www.calm.com/
[16] Prateek Chanda, Amogh Wagh, Jemimah A. Johnson, Swaraj Renghe, Vageesh Chandramouli, George Mathews, Sapna Behar, Poornima Bhola, Girish Rao, Paulomi Sudhir, T. K. Srikanth, Amit Sharma, and Seema Mehrotra. 2021. MINDNOTES: A Mobile Platform to Enable Users to Break Stigma around Mental Health and Connect with Therapists. In Companion Publication of the 2021 Conference on Computer Supported Cooperative Work and Social Computing (Virtual Event, USA) (CSCW '21). Association for Computing Machinery, New York, NY, USA, 213–217. https://doi.org/10.1145/3462204.3482895
[17] Janet X. Chen, Allison McDonald, Yixin Zou, Emily Tseng, Kevin A Roundy, Acar Tamersoy, Florian Schaub, Thomas Ristenpart, and Nicola Dell. 2022. Trauma-Informed Computing: Towards Safer Technology Experiences for All. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 544, 20 pages. https://doi.org/10.1145/3491102.3517475
[18] Simone Colombo, Franca Garzotto, Mirko Gelsomini, Mattia Melli, and Francesco Clasadonte. 2016. Dolphin Sam: A Smart Pet for Children with Intellectual Disability. In Proceedings of the International Working Conference on Advanced Visual Interfaces (Bari, Italy) (AVI '16). Association for Computing Machinery, New York, NY, USA, 352–353. https://doi.org/10.1145/2909132.2926090
[19] Constantinos K Coursaris and Ming Liu. 2009. An analysis of social support exchanges in online HIV/AIDS self-help groups. Computers in Human Behavior 25, 4 (2009), 911–918.
[20] Vagner Figueredo de Santana, Rodrigo Laiola Guimarães, and Andrea Britto Mattos. 2016. Identifying Challenges and Opportunities in Computer-Based Vocational Training for Low-Income Communities of People with Intellectual Disabilities. In Proceedings of the 13th Web for All Conference (Montreal, Canada) (W4A '16). Association for Computing Machinery, New York, NY, USA, Article 2, 8 pages. https://doi.org/10.1145/2899475.2899480
[21] Denise E Elliott, Paula Bjelajac, Roger D Fallot, Laurie S Markoff, and Beth Glover Reed. 2005. Trauma-informed or trauma-denied: Principles and implementation of trauma-informed services for women. Journal of Community Psychology 33, 4 (2005), 461–477.
[22] Hayley Evans, Udaya Lakshmi, Hue Watson, Azra Ismail, Andrew M. Sherrill, Neha Kumar, and Rosa I. Arriaga. 2020. Understanding the Care Ecologies of Veterans with PTSD. Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3313831.3376170
[23] Anya S Evmenova and Michael M Behrmann. 2014. Enabling access and enhancing comprehension of video content for postsecondary students with intellectual disability. Education and Training in Autism and Developmental Disabilities 49, 1 (2014), 45–59.
[24] Family 2021. PTSD Family Coach. Family. Retrieved August 20, 2021 from https://www.ptsd.va.gov/appvid/mobile/familycoach_app.asp
[25] Charles R Figley. 2012. Encyclopedia of Trauma: An Interdisciplinary Guide. Sage Publications, Los Angeles, CA.
[26] Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering cognitive behavior therapy to young adults with symptoms of depression and

anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR Mental Health 4, 2 (2017), e7785.
[27] Eivind Flobak, Jo D. Wake, Joakim Vindenes, Smiti Kahlon, Tine Nordgreen, and Frode Guribye. 2019. Participatory Design of VR Scenarios for Exposure Therapy. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI '19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300799
[28] Edna B Foa. 1997. Psychological processes related to recovery from a trauma and an effective treatment for PTSD. Annals of the New York Academy of Sciences 821 (1997), 410–424.
[29] Stuart Iain Gray, Tom Metcalfe, Kirsten Cater, Peter Bennett, and Chris Bevan. 2020. The Sugargotchi: An Embodied Digital Pet to Raise Children's Awareness of Their Dental Health and Free Sugar Consumption. In Extended Abstracts of the 2020 Annual Symposium on Computer-Human Interaction in Play (Virtual Event, Canada) (CHI PLAY '20). Association for Computing Machinery, New York, NY, USA, 242–247. https://doi.org/10.1145/3383668.3419874
[30] Hee Jeong Han, Sanjana Mendu, Beth K Jaworski, Jason E Owen, and Saeed Abdullah. 2021. PTSDialogue: Designing a Conversational Agent to Support Individuals with Post-Traumatic Stress Disorder. Association for Computing Machinery, New York, NY, USA, 198–203. https://doi.org/10.1145/3460418.3479332
[31] M Harris and R Fallot. 2001. Creating Cultures of Trauma-Informed Care (CCTIC): A Self-Assessment and Planning Protocol. Jossey-Bass, New York, NY.
[32] HeadSpace 2022. HeadSpace. HeadSpace. Retrieved June 20, 2021 from https://www.headspace.com/
[33] Healthline 2021. 20 Best Meditation Apps of 2021. Healthline. Retrieved June 2, 2021 from https://www.healthline.com/health/mental-health/top-meditation-iphone-android-apps
[34] Thomas Howard III, Krishna Venkatasubramanian, Jeanine L. M. Skorinko,

5 (Feb 2014), 26 pages. https://doi.org/10.1145/2513179
[46] Jill Levenson. 2017. Trauma-informed social work practice. Social Work 62, 2 (2017), 105–113.
[47] Deborah Lupton. 2014. Self-Tracking Cultures: Towards a Sociology of Personal Informatics. In Proceedings of the 26th Australian Computer-Human Interaction Conference on Designing Futures: The Future of Design (Sydney, New South Wales, Australia) (OzCHI '14). Association for Computing Machinery, New York, NY, USA, 77–86. https://doi.org/10.1145/2686612.2686623
[48] Jennifer Mankoff, Gillian R. Hayes, and Devva Kasnitz. 2010. Disability Studies as a Source of Critical Inquiry for the Field of Assistive Technology. In Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (Orlando, Florida, USA) (ASSETS '10). Association for Computing Machinery, New York, NY, USA, 3–10. https://doi.org/10.1145/1878803.1878807
[49] MC 2021. Mindfulness Coach. MC. Retrieved August 20, 2021 from https://www.ptsd.va.gov/appvid/mobile/mindfulcoach_app.asp
[50] L Mevissen and A De Jongh. 2010. PTSD and its treatment in people with intellectual disabilities: A review of the literature. Clinical Psychology Review 30, 3 (2010), 308–316.
[51] Liesbeth Mevissen, Robert Didden, and Ad de Jongh. 2016. Assessment and Treatment of PTSD in People with Intellectual Disabilities. Springer International Publishing, Cham. 1–15 pages. https://doi.org/10.1007/978-3-319-08613-2_95-2
[52] A Mitchell and Jennifer Clegg. 2005. Is post-traumatic stress disorder a helpful concept for adults with intellectual disability? Journal of Intellectual Disability Research 49, 7 (2005), 552–559.
[53] David C Mohr, Joyce Ho, Jenna Duffecy, Kelly G Baron, Kenneth A Lehman, Ling Jin, and Douglas Reifler. 2010. Perceived barriers to psychological treatments and their relationship to depression. Journal of Clinical Psychology 66, 4 (2010), 394–409.
[54] Lourdes M. Morales-Villaverde, Karina Caro, Taylor Gotfrid, and Sri Kurniawan.
Pauline Bosma, John Mullaly, Brian Kelly, Deborah Lloyd, Mary Wishart, Emi- 2016. Online Learning System to Help People with Developmental Disabilities
ton Alves, Nicole Jutras, Mariah Freark, and Nancy A. Alterio. 2021. De- Reinforce Basic Skills. In Proceedings of the 18th International ACM SIGACCESS
signing an App to Help Individuals with Intellectual and Developmental Dis- Conference on Computers and Accessibility (Reno, Nevada, USA) (ASSETS ’16).
abilities to Recognize Abuse. In Proceedings of the 23rd International ACM Association for Computing Machinery, New York, NY, USA, 43–51. https://doi.
SIGACCESS Conference on Computers and Accessibility (ASSETS’21). Associa- org/10.1145/2982142.2982174
tion for Computing Machinery, New York, NY, USA, Article 373, 13 pages. [55] Neema Moraveji, Athman Adiseshan, and Takehiro Hagiwara. 2012. Breath-
https://doi.org/10.1145/3441852.3471217 Tray: Augmenting Respiration Self-Regulation without Cognitive Defcit. In CHI
[35] Yun Huang, Ying Tang, and Yang Wang. 2015. Emotion Map: A Location- ’12 Extended Abstracts on Human Factors in Computing Systems (Austin, Texas,
Based Mobile Social System for Improving Emotion Awareness and Regula- USA) (CHI EA ’12). Association for Computing Machinery, New York, NY, USA,
tion. In Proceedings of the 18th ACM Conference on Computer Supported Co- 2405–2410. https://doi.org/10.1145/2212776.2223810
operative Work & Social Computing (Vancouver, BC, Canada) (CSCW ’15). As- [56] John T Morris, Michael L Jones, and W Mark Sweatman. 2016. Wireless Technology
sociation for Computing Machinery, New York, NY, USA, 130–142. https: Use by People with Disabilities: A National Survey. Technical Report. California
//doi.org/10.1145/2675133.2675173 State University, Northridge.
[36] Insight Timer 2021. Insight Timer. Insight Timer. Retrieved September 5, 2021 [57] Nasim Motalebi and Saeed Abdullah. 2018. Conversational Agents to Provide
from https://insighttimer.com/ Couple Therapy for Patients with PTSD. In Proceedings of the 12th EAI Interna-
[37] Jen Ryan and Eve Ashwood and Shannan Puckeridge. 2010. The Development of tional Conference on Pervasive Computing Technologies for Healthcare (New York,
Anger Management Skills in Adults with Moderate Intellectual Disability. ASID. NY, USA) (PervasiveHealth ’18). Association for Computing Machinery, New York,
Retrieved August 20, 2021 from https://www.asid.asn.au/Portals/0/Conferences/ NY, USA, 347–351. https://doi.org/10.1145/3240925.3240933
45thBrisbane/Conference%20Papers/Ryan_et_al_THU_1555_Behaviour77.pdf [58] MST 2021. Beyond MST. MST. Retrieved August 20, 2021 from https://www.ptsd.
[38] Stephen Kaplan. 1995. The restorative benefts of nature: Toward an integrative va.gov/appvid/mobile/beyondMST.asp
framework. Journal of environmental psychology 15, 3 (1995), 169–182. [59] Andrea Nicki. 2001. The abused mind: Feminist theory, psychiatric disability,
[39] John M Keesler. 2014. A call for the integration of trauma-informed care among and trauma. Hypatia 16, 4 (2001), 80–104.
intellectual and developmental disability organizations. Journal of Policy and [60] Kavous Salehzadeh Niksirat, Chaklam Silpasuwanchai, Peng Cheng, and Xiangshi
Practice in Intellectual Disabilities 11, 1 (2014), 34–42. Ren. 2019. Attention Regulation Framework: Designing Self-Regulated Mind-
[40] M Kellen and Deepak Saxena. 2020. Calm my headspace: Motivations and barriers fulness Technologies. ACM Trans. Comput.-Hum. Interact. 26, 6, Article 39 (Nov.
for adoption and usage of meditation apps during times of crisis. In Proceedings 2019), 44 pages. https://doi.org/10.1145/3359593
of The 20th International Conference on Electronic Business. Elsevier, Amsterdam, [61] Jason E Owen, Beth K Jaworski, Eric Kuhn, Kerry N Makin-Byrd, Kelly M Ramsey,
Netherlands, 5–8. and Julia E Hofman. 2015. mHealth in the wild: using novel data to examine the
[41] Rachel Kornfeld, Renwen Zhang, Jennifer Nicholas, Stephen M. Schueller, Scott A. reach, use, and impact of PTSD coach. JMIR mental health 2, 1 (2015), e3935.
Cambo, David C. Mohr, and Madhu Reddy. 2020. "Energy is a Finite Resource": [62] SoHyun Park, Anja Thieme, Jeongyun Han, Sungwoo Lee, Wonjong Rhee, and
Designing Technology to Support Individuals across Fluctuating Symptoms of De- Bongwon Suh. 2021. “I Wrote as If I Were Telling a Story to Someone I Knew.”:
pression. Association for Computing Machinery, New York, NY, USA, 1–17. Designing Chatbot Interactions for Expressive Writing in Mental Health. Association
https://doi-org.uri.idm.oclc.org/10.1145/3313831.3376309 for Computing Machinery, New York, NY, USA, 926–941. https://doi-org.uri.
[42] Francesca Laguardia, Venezia Michalsen, and Holly Rider-Milkovich. 2017. Trig- idm.oclc.org/10.1145/3461778.3462143
ger warnings. Journal of Legal Education 66, 4 (2017), 882–903. [63] Nintendo 2020. The Pocket Pikachu. Nintendo. Retrieved March 18, 2021 from
[43] Rodrigo Laiola Guimarães and Andrea Britto Mattos. 2015. Exploring the Use of https://www.nintendo.co.jp/n09/pokepika/t
Massive Open Online Courses for Teaching Students with Intellectual Disability. [64] Positive Psychology 2021. Top 14 Apps For Meditation and Mindfulness (+ Reviews).
In Proceedings of the 17th International ACM SIGACCESS Conference on Computers Positive Psychology. Retrieved June 2, 2021 from https://positivepsychology.
and Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Ma- com/mindfulness-apps/
chinery, New York, NY, USA, 343–344. https://doi.org/10.1145/2700648.2811370 [65] Therapists Aid 2021. Printable Emotion Faces. Therapists Aid. Retrieved August 20,
[44] Rodrigo Laiola Guimarães, Andrea Britto Mattos, and Carlos Henrique Cardonha. 2021 from https://www.therapistaid.com/therapy-worksheet/printable-emotion-
2016. Investigating Instructional Pacing Supports for Teaching Students with faces
Intellectual Disability. In Proceedings of the 2016 CHI Conference Extended Abstracts [66] ptsdcoach 2021. ptsdcoach. ptsdcoach. Retrieved Sept 4, 2021 from https:
on Human Factors in Computing Systems (San Jose, California, USA) (CHI EA //mobile.va.gov/app/ptsd-coach
’16). Association for Computing Machinery, New York, NY, USA, 2171–2177. [67] Resilience 2021. Provider Resilience. Resilience. Retrieved August 20, 2021
https://doi.org/10.1145/2851581.2892342 from https://www.health.mil/Military-Health-Topics/MHS-Toolkits/Provider-
[45] Reeva Lederman, Greg Wadley, John Gleeson, Sarah Bendall, and Mario Álvarez Resilience-Toolkit
Jiménez. 2014. Moderated Online Social Therapy: Designing and Evaluating [68] safehelpline 2021. safehelp. safehelpline. Retrieved August 20, 2021 from
Technology for Mental Health. ACM Trans. Comput.-Hum. Interact. 21, 1, Article https://www.safehelpline.org/app
ASSETS ’22, October 23–26, 2022, Athens, Greece Venkatasubramanian and Ranalli

[69] Kavous Salehzadeh Niksirat, Chaklam Silpasuwanchai, Mahmoud Mohamed [83] Erin Louise Whittle, Karen R Fisher, Simone Reppermund, Rhoshel Lenroot, and
Hussien Ahmed, Peng Cheng, and Xiangshi Ren. 2017. A Framework for Interac- Julian Trollor. 2018. Barriers and enablers to accessing mental health services for
tive Mindfulness Meditation Using Attention-Regulation Process. In Proceedings people with intellectual disability: a scoping review. Journal of Mental Health
of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Research in Intellectual Disabilities 11, 1 (2018), 69–102.
Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, [84] Erin Louise Whittle, Karen Raewyn Fisher, Simone Reppermund, and Julian
USA, 2672–2684. https://doi.org/10.1145/3025453.3025914 Trollor. 2019. Access to mental health services: The experiences of people with
[70] Maria Saridaki and Michalis Meimaris. 2018. Digital Storytelling for the Em- intellectual disabilities. Journal of Applied Research in Intellectual Disabilities 32,
powerment of People with Intellectual Disabilities. In Proceedings of the 8th 2 (2019), 368–379.
International Conference on Software Development and Technologies for Enhanc- [85] Women’s Health Weekly 2021. The 12 Best Meditation Apps For 2020, According
ing Accessibility and Fighting Info-Exclusion (Thessaloniki, Greece) (DSAI 2018). To Experts. Women’s Health Weekly. Retrieved June 2, 2021 from https://www.
Association for Computing Machinery, New York, NY, USA, 161–164. https: womenshealthmag.com/health/g25178771/best-meditation-apps/
//doi.org/10.1145/3218585.3218664 [86] Sarah Wigham and Eric Emerson. 2015. Trauma and life events in adults with
[71] Corina Sas and Rohit Chopra. 2015. MeditAid: A Wearable Adaptive intellectual disability. Current Developmental Disorders Reports 2, 2 (2015), 93–99.
Neurofeedback-Based System for Training Mindfulness State. Personal Ubiquitous [87] Jacob O Wobbrock, Shaun K Kane, Krzysztof Z Gajos, Susumu Harada, and Jon
Comput. 19, 7 (Oct. 2015), 1169–1182. https://doi.org/10.1007/s00779-015-0870-z Froehlich. 2011. Ability-based design: Concept, principles and examples. ACM
[72] Susan Scheckel. 2017. Home-Sickness, Nostalgia, and Therapeutic Narrative in Transactions on Accessible Computing (TACCESS) 3, 3 (2011), 1–27.
Poe’s" The Fall of the House of Usher". Poe Studies 50, 1 (2017), 12–25. [88] Renwen Zhang, Kathryn E. Ringland, Melina Paan, David C. Mohr, and Madhu
[73] Jessica Schroeder, Chelsey Wilkes, Kael Rowan, Arturo Toledo, Ann Paradiso, Reddy. 2021. Designing for Emotional Well-Being: Integrating Persuasion and
Mary Czerwinski, Gloria Mark, and Marsha M. Linehan. 2018. Pocket Skills: Customization into Mental Health Technologies. In Proceedings of the 2021 CHI
A Conversational Mobile Web App To Support Dialectical Behavioral Therapy. Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21).
Association for Computing Machinery, New York, NY, USA, 1–15. https://doi. Association for Computing Machinery, New York, NY, USA, Article 542, 13 pages.
org/10.1145/3173574.3173972 https://doi.org/10.1145/3411764.3445771
[74] Nokuthula Shabalala and Aimee Jasson. 2011. PTSD symptoms in intellectually
disabled victims of sexual assault. South African Journal of Psychology 41, 4 (2011),
APPENDIX
The following are the 28 commercial apps that we looked at in 2021 as part of this work. Most of these apps are paid (21) or freemium (5). The names of the apps and the URLs where we obtained them are as follows: Calm (https://www.calm.com), Breethe (https://breethe.com), Headspace (https://www.headspace.com), My Life (https://my.life), Buddhify (https://buddhify.com), Inspace (https://inscape.life), Breathe+ (https://dynamicappdesign.com/#Breathe), Oak (https://www.oakmeditation.com), Whil (https://www.whil.com), Simple Habit (https://www.simplehabit.com), Petit Bambou (https://www.petitbambou.com/en/), Waking up (https://wakingup.com), Prana Breath (https://pranabreath.info), The Mindfulness App (https://themindfulnessapp.com), Sattva (https://www.sattva.life), Insight Timer (https://insighttimer.com), Meditation Studio (https://meditationstudioapp.com), Let's Meditate (https://play.google.com/store/apps/details?id=com.meditation.elevenminute), Happy Not Perfect (https://happynotperfect.com/), Omvana (https://www.omvana.com/), Welzen (https://welzen.app/), Relaxing Melodies (https://www.relaxmelodies.com/), 10% Happier (https://www.tenpercent.com/), Simply Being (https://apps.apple.com/us/app/simply-being-guided-meditation/id347418999), Aura (https://www.aurahealth.io), Unplug (https://www.unplug.com), Enso (https://www.ensomeditationtimer.com), and Meditation Nest (https://appadvice.com/app/meditation-nest/1460053458).
"I Should Feel Like I'm In Control": Understanding Expectations, Concerns, and Motivations for the Use of Autonomous Navigation on Wheelchairs

JiWoong Jang (jiwoongj@cs.cmu.edu), Yunzhi Li (yunzhil@cs.cmu.edu), and Patrick Carrington (pcarrington@cmu.edu)
Human-Computer Interaction Institute, Carnegie Mellon University, USA

Figure 1: The relationship between user-centric Factors (and their respective qualities), Attitudes, and Intent to Use Autonomous Navigation on wheelchairs.
ABSTRACT
Autonomous navigation on wheelchairs promises to be a significant frontier in the evolution of the power wheelchair as an assistive enabling device, and is increasingly explored among researchers for its potential to unlock more accessible navigation for wheelchair users. While development of path-planning methods for wheelchairs is ongoing, there is a relative paucity of research on autonomous wheelchair navigation experiences which accommodate potential users' needs. In this work, we present preliminary design considerations for the user experience of autonomous wheelchair navigation, derived from semi-structured interviews with ten (10) current wheelchair users about their willingness to use, and the applicability of, an autonomous navigation function. Nine (9) expressed a willingness to use autonomous navigation in the near future in a range of contexts, while expressing attitudes like expectations, concerns, and motivations for intent to use. To better understand the impetus for such attitudes, we conducted thematic analysis to reveal three high-order factors and associated qualities which together serve as a framework to help understand participants' intent. Finally, we highlight three critical areas of focus that present opportunities and challenges for developers of a user-centered autonomous navigation experience for wheelchairs.

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility.

KEYWORDS
wheelchairs, assistive technology, social navigation, interaction design

ACM Reference Format:
JiWoong Jang, Yunzhi Li, and Patrick Carrington. 2022. "I Should Feel Like I'm In Control": Understanding Expectations, Concerns, and Motivations for the Use of Autonomous Navigation on Wheelchairs. In ASSETS '22: The 24th International ACM SIGACCESS Conference on Computers and Accessibility, October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550380

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550380

1 INTRODUCTION AND RELATED WORK
Autonomous navigation on wheelchairs arguably furthers the promise of powered wheelchairs. Powered wheelchairs have enabled increased independence from both the necessity of a companion and from the physical toll of manual movement. Meanwhile, power wheelchairs generate their own challenges due to larger wheelchair sizes and inaccessibility concerns, and a dependence on battery power which has led to concerns of "range anxiety" from users [16].
ASSETS ’22, October 23–26, 2022, Athens, Greece Jang, et al.

Autonomous wheelchairs have the potential of addressing chal- encouraged participants to communicate and envision any expec-
lenges arising from power wheelchair use by unburdening users tations, concerns, or potential uses about a future autonomous
from managing the minutiae of navigation, and allowing the user navigation function in a think-aloud session. To ground the dis-
to automate difcult maneuvering, such as turning tight corners cussion, the interviewer devised a scenario where the participant
[12]. was to consider whether or not to purchase a wheelchair in the
Exploration of autonomous navigation function in wheelchairs near future with autonomous navigation built in. Finally, the in-
is timely due to advancements in LIDAR-based technology[7] and terviewer asked for participants’ opinions about issues regarding
reinforcement-learning based navigational techniques in contexts social navigation, shared/cooperative control, and expectation for
like household robots [9][10]. It is accompanied by signifcant de- the autonomous system to exhibit identical behavior to participants’
velopments in autonomous vehicle navigation, which are poten- current driving habits.
tially transferable to the wheelchair navigation context [7] [15].
Self-driving wheelchairs are already in development by several 2.2 Interview Analysis
commercial wheelchair makers and technology companies, includ- Two researchers conducted thematic analysis [6] over the text
ing WHILL [8], Hitachi [11], and Panasonic [3], as well as by re- transcript generated from participant interviews. Each researcher
search eforts at Massachusetts Institute of Technology [14] and independently generated a codebook from three (3) transcripts,
the United Kingdom’s National Institute of Health [1]. With con- whereafter the codebooks were merged after agreement was found
tinued development and proliferation of "smart" features in today’s between both researchers. The remaining transcripts were then
power wheelchairs [2] [16] [17], autonomous navigation serves as evenly divided amongst the researchers to independently code.
a convergence point for both trends. Each code and its associated part of the interview transcript were
Prior research sought to address technical challenges of devel- then afnity diagrammed and organized until larger patterns and
oping autonomous function for wheelchairs [1][7][15], and some relationships became apparent.
dealt with human-robotics interaction-related problems of social
navigation [9] and shared control [4][5]. Despite analogous explo- 3 FINDINGS AND EMERGENT FACTORS
rations in the smart wheelchair context [13][17], there has yet to be
Overall, we found that 9 out of 10 participants expressed some in-
an exploration into an autonomous wheelchair experience which
tention to use a wheelchair with autonomous navigation function -
caters to users’ preferences and expectations. In this paper, we
8 (P1-P4, P6, P7, P9, P10) reported willingness to use in an everyday
motivate such eforts by demonstrating what contexts and factors
fashion, while P8 expressed conditional willingness to use the func-
motivate adoption of autonomous wheelchair navigation, as well
tion in limited circumstances where involuntary hand tremors from
as provide guidance for future work for developing user-centered
her cerebral palsy would otherwise inhibit her ability to operate her
autonomous wheelchair navigation experiences.
wheelchair. Despite almost all of our participants communicating a
willingness to use, each expressed difering attitudes like expecta-
2 STUDY METHODOLOGY FOR tions, concerns, and motivations, about the overall user experience
UNDERSTANDING ATTITUDES TOWARD of an autonomous navigation function. This motivated the thematic
analysis, in order to better understand the basis of such attitudes
AUTONOMOUS NAVIGATION
and to generalize our participants’ thoughts.
To better understand motivating use cases, and attitudes like con- As part of analysis, 46 codes were used to label the text tran-
cerns and expectations for autonomous navigation functionality for scripts. Overall, the codes were best organized by grouping into
wheelchairs, we conducted a semi-structured interview study with three diferent themes: intrinsic, extrinsic, and chair-centered
ten (n = 10) current wheelchair users over a period of three months. factors.
Participants were recruited from public messages on Reddit, par- In general, each high-level factor is associated with respective
ticularly in specifc subreddits (e.g. r/wheelchairs, r/cerebralpalsy, qualities. We further observed that these qualities infuence at-
and r/spinalcordinjuries). We required that participants be over 18 titudes like expectation, concern, and motivation to use, where
years of age, and to be able to demonstrate prior experience with attitudes collectively shaped the overall willingness of a user to
wheelchairs of any kind for a period of at least 3 months to guaran- adopt autonomous navigation functionality. Here, we defne the
tee some minimum familiarity with operating a wheelchair. The three primary factors and provide examples of associated quali-
study procedure was done in agreement with an Institutional Re- ties to illustrate their connection to attitudes and ultimately to
view Board (IRB). Of our participants, 3 were manual wheelchair participants’ intent to use.
user, 4 had not had prior exposure to autonomous navigation in any
context, and overall had an average of 9.4 years of experience with 3.1 Intrinsic Factors
signifcant variance (σ = 10.02). We present specifc demographics
Intrinsic factors refected innate, personal characteristics of poten-
of our participants in Table 1 of the Appendix.
tial users of the autonomous wheelchair. From our analysis from
our interviews, we highlight particpants’ prior awareness of
2.1 Interview Procedure automation, extent of disability, and risk appetite as notable
Each participant engaged in an hour-long, recorded, semi-structured intrinsic qualities.
interview in video-conferencing conditions, in which they were • Prior awareness of automation, opinions of AI-based fea-
compensated at the conclusion of the study period. The interviewer tures, and openness to novel technology writ large had direct
"I Should Feel Like I’m In Control":
Understanding Expectations, Concerns, and Motivations for the Use of Autonomous Navigation on Wheelchairs ASSETS ’22, October 23–26, 2022, Athens, Greece

links to attitudes and intent to use. For example, P2 brought sense of control, and system transparency needs. This directly
up that he was inspired by Tesla’s autonomous vehicle navi- led to how participants perceived they could beneft from develop-
gation and expressed that he expected the riding experience ments in autonomous navigation functionality in wheelchairs.
of the autonomous wheelchair to be as "perfect" as Tesla’s • Participants’ difculties with their current wheelchairs,
Autopilot equipped cars. Additionally, he expressed hesita- such as P9’s difculty with maneuvers requiring fne-motor
tion to adopt if the wheelchair could not provide a similar control, or physical efort needed to move for manual wheelchair
quality of experience. users (P1, P3, P5), directly led to motivation to potentially
• Disability provided a unique point of view which afected resolve these challenges with autonomous navigation, and
attitudes towards autonomous navigation. For example, P3 then to willingness to use such a function.
relayed that due to his difculties with spatial memory and • Participants’ desire for sense a sense of control, and the
cognition, this provided him motivation for adopting au- implementation of the control scheme were a source of con-
tonomous navigation in contexts where he would need to cern for many participants, with many (8) stating that they
return to a previously visited location. would strongly prefer to have traditional joystick controls
• Risk appetite also appeared to afect attitudes, with the on autonomous chairs. This concern also manifested as a
propensity to engage with the maximum speed allowed by desire for the ability to immediately override the planned
the wheelchair, as one example. While many participants path or next action of the wheelchair. Eight (8) out of ten
reported adjusting their risk tolerance depending on extrin- participants showed strong reservations against shared or
sic factors, several (P4, P7, P9) reported engaging with the cooperative control schemes in which they did not have over-
maximum speed setting frequently while several others (P1, riding capability, with P4 describing a lack of such a feature
P2, P3, P8) rarely needed to do so, showcasing a range of "a huge problem", and P7 calling it a "show stopper."
diferent innate preferences to actions which could carry risk. • System transparency-related qualities dealt with partic-
This was associated with varying levels of concern at the pos- ipants’ expectations to view the system’s planned path, as
sibility that the chair would go faster than what participants well as the system’s understanding of the world, such as with
would attempt in their current wheelchairs. a planner displaying the intended path and perceived obsta-
cles overlayed on a map. Participants relayed that knowing
3.2 Extrinsic Factors both elements were crucial in scenarios with many elements
Extrinsic factors relates to elements and contexts in the environ- of concern as to determine whether to course-correct. Vi-
ment around the wheelchair. We found physical location and sual/touch, and sound/voice-based interfaces were cited most
social interaction with other people to be important qualities often (6 and 3 participants, respectively), including both in
throughout interviews. conjunction (2), as the preferred method to give input and
• Physical location encapsulates the characteristics of the receive communication to and from the system. Several par-
built environment which the wheelchair is occupying - this ticipants (P2, P3, P6, P7), however, admitted that they would
includes potential obstacles which impact path-planning and the presence of safe wheelchair-accessible routes. For instance, at least three participants (P1, P2, P10) remarked that the surface on which a wheelchair drives has a large effect on the riding experience, and that circumstances where road conditions are hazardous, such as with large potholes, can present severe navigational and safety challenges. These participants expressed a mix of concern and expectation that an autonomous system would be capable of successfully recognizing and adjusting to such scenarios.
• Social interaction represents the need to account for those around the wheelchair who may exhibit a range of different navigational behaviors, including a personal care assistant (PCA) following closely behind or beside the wheelchair, or unrelated passersby or groups going in a different direction from the wheelchair user. Depending on the social context, participants expressed their expectation that an autonomous navigation function would be capable of distinguishing these roles, and of navigating environments like a shopping mall with a lot of people.

3.3 Chair-Centered Factors
Chair-centered factors include participants’ difficulties stemming from operating their current wheelchairs, desire for a sense of

likely not consult the system often, as they would rather pay attention to their smartphones or surroundings. In this case, participants expected a reliable alert system which would provide haptic or sonic feedback in scenarios where the wheelchair user would need to intervene. P2 even went so far as to suggest that the reliability of the alert system, and that it consistently provided sufficient time for response, mattered more than the reliability of the planner itself.

4 DESIGN IMPLICATIONS AND FUTURE WORK
We believe findings from the previous section’s thematic analysis highlight three immediate areas of focus to be explored in future research, which we explain briefly.
(1) Create Adaptable Intercommunication Between the System and User
The chair-centered quality of desire for a sense of control places emphasis on the intercommunicative abilities of the user interface and input implementation. The extent participants felt they needed to know about the system state varied on intrinsic qualities like disability, as well as situational extrinsic qualities like the complexity of the environment. These implicate research questions designing system UI and input modalities which are adaptable to these variables.
ASSETS ’22, October 23–26, 2022, Athens, Greece Jang, et al.

(2) Incorporate Differences in Ability, Environment, and Social Preferences in Navigation
Likewise, variations arising from intrinsic, chair-centered, and extrinsic factors give rise to a need for developing navigation systems which are responsive to these potential changes. For example, participants expressed the expectation that navigation in or around moving groups of people be handled smoothly and in accordance with their current driving behavior.
(3) Scaffold and Ease In to Build Trust
Given the variation in awareness of autonomous navigation, we encourage inquiry into opportunities for potential users to learn about the capabilities and limitations of an autonomous system prior to and after deployment to engender grounded trust. To help users form mental models for use of autonomous navigation, we suggest exploring discretization of functions, such as a separate mode for autonomously aligning into a tight hallway. Finally, given that driving speed was expressed by participants as an important intermediating variable to build trust, we advocate for exploring the possibility of acclimating users by navigating at relatively slower speeds until the user has developed confidence in the system’s ability.

5 CONCLUSION
This work aims to provide a preliminary understanding of wheelchair users’ motivations and potential uses for autonomous navigation functions. It delineates intrinsic, extrinsic, and chair-centered factors as a guiding framework to understand the source of observed attitudes. It highlights notable qualities for each factor, and connects them to attitudes toward autonomous navigation like expectation, concern, and motivation. By outlining areas of focus, it seeks to guide future efforts of designers, engineers, and researchers in developing nuances in autonomous navigation on wheelchairs. Finally, as preliminary work, we acknowledge limitations to the study methodology, including the limited sample size interviewed, which precludes any claims of statistical significance.

ACKNOWLEDGMENTS
We gratefully acknowledge Franklin Mingzhe Li’s feedback while preparing this work.
“I Should Feel Like I’m In Control”: Understanding Expectations, Concerns, and Motivations for the Use of Autonomous Navigation on Wheelchairs
ASSETS ’22, October 23–26, 2022, Athens, Greece

A APPENDIX: PARTICIPANT DEMOGRAPHICS TABLE

ID | Years of Experience | Type of Wheelchair | Prior Exposure to Autonomous Navigation
P1 | 1 | manual | Yes
P2 | 0.5 | power | Yes
P3 | 2.5 | manual | No
P4 | 18 | power | Yes
P5 | 8 | manual | Yes
P6 | 1 | power | No
P7 | 1 | power | No
P8 | 29 | power | Yes
P9 | 18 | power | No
P10 | 15 | power | Yes

Table 1: Self-reported Participant Demographics. Prior exposure relates to participants’ awareness of autonomous navigation in any context prior to the interview.
“What Makes Sonification User-Friendly?” Exploring Usability
and User-Friendliness of Sonified Responses
Ather Sharif∗ Olivia H. Wang∗ Alida T. Muongchan
asharif@cs.washington.edu wang4@cs.washington.edu alidatm@uw.edu
Paul G. Allen School of Computer Paul G. Allen School of Computer Human Centered Design and
Science & Engineering | DUB Group, Science & Engineering, Engineering,
University of Washington University of Washington University of Washington
Seattle, Washington, USA Seattle, Washington, USA Seattle, Washington, USA

ABSTRACT
Sonification is a commonly used technique to make online data visualizations accessible to screen-reader users through auditory means. While current sonification solutions provide plausible utility (usefulness) to screen-reader users in exploring data visualizations, they are limited in exploring the quality (usability) of the sonified responses. In this preliminary exploration, we investigated the usability and user-friendliness of data visualization sonification for screen-reader users. Specifically, we evaluated the Pleasantness, Clarity, Confidence, and Overall Score of discrete and continuous sonified responses generated using various oscillator waveforms and synthesizers through user studies with 10 screen-reader users. Additionally, we examined these factors using both simple and complex trends. Our results show that screen-reader users preferred distinct non-continuous responses generated using oscillators with square waveforms. We utilized our findings to extend the functionality of Sonifier—an open-source JavaScript library that enables developers to sonify online data visualizations. Our follow-up interviews with screen-reader users identified the need to personalize the sonified responses per their individualized preferences.

CCS CONCEPTS
• Human-centered computing → Empirical studies in visualization; Accessibility systems and tools; Empirical studies in accessibility.

KEYWORDS
sonification, audio graphs, waveforms, visualizations, screen-reader users

ACM Reference Format:
Ather Sharif, Olivia H. Wang, and Alida T. Muongchan. 2022. “What Makes Sonification User-Friendly?” Exploring Usability and User-Friendliness of Sonified Responses. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550360

* These authors contributed equally to this work.

1 INTRODUCTION
With the increasing data representation through visualizations comes the critical need to make these visualizations accessible to people who may not be able to extract information using visual means (e.g., screen-reader users) [6, 16, 18, 21]. According to recent findings from Sharif et al., due to the inaccessibility of visualizations, screen-reader users extract information 62% less accurately and spend 211% more interaction time with online data visualizations compared to non-screen-reader users [21]. Researchers and developers have implemented several approaches and strategies to make online data visualizations accessible [8, 10, 15, 22, 24, 25]. Among these techniques is sonification, often referred to as “audio graphs,” a widely-used approach in conveying data through auditory channels to screen-reader users.

Several prior works have utilized sonification to improve the accessibility of online data visualizations [2, 3, 7, 13, 14, 19, 24, 26, 29]. However, current solutions are focused on the utility (usefulness) of sonification to screen-reader users and provide limited insights into the quality (usability and user-friendliness) of the sonified responses. Therefore, in this work, we sought to examine and improve the usability and user-friendliness of sonified responses generated from online data visualizations created using JavaScript libraries (e.g., D3 [4]). Prior research has explored the “pleasantness” of sonified responses [1, 9, 20]. However, the most relevant research to our work is the recent exploration by Wang et al. [27], in which they examined the impact of various auditory channels (e.g., pitch, volume) on users’ perception of data and visualization. We build on their work by investigating the effects of different oscillator waveforms and synthesizers on the pleasantness and users’ confidence in interpreting simple and complex sonified responses.

To perform our investigation, we developed several sonification prototypes using the Tone.js library [17], incorporating different configurations for Sound Types (oscillator waveforms and synthesizers), Continuity Levels (interval for sounds between data points), and Trend Types (simple and complex). We finalized six prototypes through Wizard-of-Oz [5, 12] and pilot studies with users, employing a user-centered iterative design process. We evaluated these prototypes by collecting subjective ratings for Pleasantness, Clarity, Confidence, and Overall Score through user studies with 10 screen-reader users. We utilized our findings to extend the functionalities of Sonifier—a recently-introduced open-source JavaScript library that generates sonified responses for two-dimensional single-series data [23, 24]. Additionally, we conducted follow-up interviews with our participants to solicit feedback on improving their experiences with sonified responses.
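To make the “audio graph” idea above concrete, here is a minimal sketch of one common mapping: data values are scaled linearly onto a frequency range, so that a rising trend is heard as rising pitch. The function name and the frequency bounds are our own illustrative assumptions, not the mapping used by Sonifier or any other specific library.

```javascript
// Minimal sketch of a linear value-to-pitch mapping, the core idea
// behind "audio graphs". The frequency bounds (100-1000 Hz) and the
// function name are illustrative assumptions only.
function toFrequencies(values, fMin = 100, fMax = 1000) {
  const lo = Math.min(...values);
  const hi = Math.max(...values);
  const span = hi - lo || 1; // guard against a flat series
  return values.map((v) => fMin + ((v - lo) / span) * (fMax - fMin));
}

// A rising trend maps to rising pitch:
toFrequencies([0, 5, 10]); // → [100, 550, 1000]
```

In a browser, each resulting frequency could then be scheduled on an oscillator (for example, with Tone.js); the sketch only computes the mapping itself.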
ASSETS ’22, October 23–26, 2022, Athens, Greece Ather Sharif, Olivia H. Wang, and Alida T. Muongchan

Figure 1: Average scores per Continuity Level and Trend Type for each Sound Type for (a) Pleasantness, (b) Clarity, (c) Confidence, and (d) Overall Scores. Baseline was a sawtooth waveform OmniOscillator, Prototype 1 was a square waveform OmniOscillator, and Prototype 2 was a square waveform MonoSynthesizer. Subjective ratings were collected using a Likert scale ranging from 1 (worst) to 7 (best).
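The three sound types in the caption can be restated in code. The settings objects below simply summarize Figure 1's conditions; the commented Tone.js calls sketch how such an oscillator might be built (based on Tone.js's documented OmniOscillator API, though the authors' exact prototype code may differ):

```javascript
// The three Sound Type conditions from Figure 1, restated as data.
const soundTypes = [
  { name: "Baseline",    source: "OmniOscillator",  waveform: "sawtooth" },
  { name: "Prototype 1", source: "OmniOscillator",  waveform: "square" },
  { name: "Prototype 2", source: "MonoSynthesizer", waveform: "square" },
];

// In a browser with Tone.js loaded, Prototype 1 might be built like:
//   const osc = new Tone.OmniOscillator(440, "square").toDestination();
//   osc.partialCount = 8; // number of harmonics, as in Section 2.1
//   osc.start();
```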

In this work, we contribute the empirical findings from our preliminary explorations. Specifically, we provide the quantitative and qualitative results from our user studies with 10 screen-reader users. Additionally, we provide enhancements to the open-source Sonifier library and identify avenues for future work.

2 USER STUDY
To assess the usability and user-friendliness of sonified responses, we conducted a user study with 10 screen-reader users, subsequently interviewing them to gather insights on further improving the sonified responses.

2.1 Prototypes
We created several sonification prototypes employing different combinations of oscillator waveforms, synthesizers, and partial counts (number of harmonics to generate the waveform). We developed these prototypes using the Tone.js JavaScript library [17]—a widely-used framework to generate sounds in the browser. Then, we eliminated the prototypes that users found undesirable through Wizard-of-Oz [5, 12] and pilot studies. After elimination, our sound types comprised an oscillator (termed “OmniOscillator” by Tone.js) and a synthesizer (termed “MonoSynthesizer” by Tone.js), with both using the square waveform and a partial count of 8. We created a continuous (minimal time interval between sounds for each data point) and discrete (clear time interval between sounds for each data point) prototype for each sound type. We used the default settings from the Sonifier library as our baseline measure. Since the baseline only supports discrete responses, we developed our fifth prototype that generated a continuous response for the baseline to account for balanced conditions. Our final set contained 12 prototypes (including the baseline), six for each Trend Type (simple and complex).

2.2 Participants & Procedure
Our participants were 10 screen-reader users (Appendix A, Table 1). Four identified as women, five as men, and one as non-binary. Their average age was 48.4 (SD=14.4) years. We compensated our participants with a $10 Amazon gift card for 30 minutes of their time. We supervised our user studies online using Zoom.
For each participant, we played six sonified responses (five prototypes + baseline) for simple trends, randomizing the order to account for the learning effect. At the end of each sonification, we collected their subjective ratings for Pleasantness (timbre), Clarity (assessment of the sound to identify the trend clearly), Confidence (user’s confidence in understanding the overall trend), and Overall Score (subjective assessment of the sound overall, including any
factors not mentioned above). We used a Likert scale for subjective ratings, ranging from 1 (worst) to 7 (best). Then, we asked follow-up questions from our users to gather insights on the areas of improvement. We followed the same steps for complex trends. Our study sessions, on average, took approximately 30 minutes from start to finish.

2.3 Design & Analysis
The experiment was a 3×2×2 within-subjects design with the following factors and levels:
• Sound Type (S): {Baseline, OmniOscillator, MonoSynthesizer}
• Continuity Level (C): {discrete, continuous}
• Trend Type (T): {simple, complex}
We used Pleasantness (PL), Clarity (CL), Confidence (CF), and Overall Score (OS) as our dependent variables. In our analysis, all our dependent variables were ordinal (on a scale of 1-7). Therefore, we conducted our analysis using an ordinal logistic regression [11, 28]. We also included Subject as a random factor to account for repeated measures. We tested our participants over 3×2×2=12 conditions, resulting in 12×10=120 total trials.

2.4 Results
We present the results of our user studies assessing Pleasantness (PL), Clarity (CL), Confidence (CF), and Overall Score (OS) of sonified responses. Additionally, we present our findings on areas of improvement for sonified responses from our follow-up interviews.

2.4.1 Pleasantness (PL). Our results show a significant main effect of S (χ²(2, N=10)=12.15, p<.05, Cramér’s V=.42) on PL overall, indicating that PL was significantly different between the three S groups. Overall, the sonified responses from OmniOscillator with square waveform and discrete continuity level outperformed the other prototypes (M=4.65). C (p≈.844) and T (p≈.122) did not have a significant main effect on PL. Figure 1 and Table 2 show average PL scores for each independent variable.

2.4.2 Clarity (CL). We found a significant main effect of C (χ²(1, N=10)=11.03, p<.001, Cramér’s V=.40) and T (χ²(1, N=10)=22.83, p<.001, Cramér’s V=.57) on CL overall. This result indicates that CL was significantly different between discrete and continuous sounds and also between simple and complex trends. On average, the Baseline had the best scores (M=5.15). S (p≈.476) did not have a significant main effect on CL. We show the average scores for CL in Figure 1 and Table 2 for each independent variable.

2.4.3 Confidence (CF). C (χ²(1, N=10)=6.36, p<.05, Cramér’s V=.30) and T (χ²(1, N=10)=42.34, p<.001, Cramér’s V=.78) had a significant main effect on CF, indicating that CF significantly differed between simple and complex trends as well as discrete and continuous sounds. Similar to PL, the sonified responses from OmniOscillator with square waveform and discrete continuity level had the best overall average scores (M=5.20). We did not find a significant effect of S on CF (p≈.726). The average scores for CF are shown in Figure 1 and Table 2 for each independent variable.

2.4.4 Overall Score (OS). Our results show a significant main effect of T on OS overall (χ²(1, N=10)=7.16, p<.05, Cramér’s V=.32). This finding indicates that OS significantly varied between the simple and complex trends. OmniOscillator with square waveform and discrete continuity level performed the best compared to the other prototypes (M=4.65) on average, similar to PL and CF. We only found a marginal effect of S (p≈.090) and C (p≈.098) on OS. We display the average scores for OS in Figure 1 and Table 2 for each independent variable.

2.5 Areas of Improvement
Our participants recognized three areas of improvement for sonification: (1) Personalization (customizing the auditory output for factors including speed and frequency); (2) Identification of extrema/outliers (identifying the maximum and minimum data points); and (3) Multimodality (using a combination of different instrument sounds together with different frequencies to amplify the distinctions between data points). Future work can incorporate our findings to improve the usability and user-friendliness of sonified responses for screen-reader users.

3 SONIFIER LIBRARY ENHANCEMENTS
Sonifier is an open-source JavaScript library that generates a sonified response from two-dimensional single-series data, recently developed by Sharif et al. [23, 24]. Utilizing the findings from our user studies with screen-reader users, we enhanced the Sonifier library by (1) improving its scalability to support several sound types; and (2) modifying its default settings to those from our best-performing prototype.
We refactored and modularized the code to enable developers to create sonified responses using additional sound types, including MonoSynthesizers and Envelopes. (Currently, the Sonifier library only supports the creation of sonification using an OmniOscillator.) We further improve the customization for OmniOscillator by adding more configuration options, including “sourceType” (source of the oscillator; e.g., am), “baseType” (waveform of the oscillator; e.g., sawtooth), and “partialCount” (number of harmonics to generate the waveform, ranging from 1-32).
Additionally, we modified the default settings of the library based on our findings. Specifically, we changed “sourceType” to am, “baseType” to square, and “partialCount” to 8. We made our code publicly available at the Sonifier library’s open-source repository [23].

4 DISCUSSION & CONCLUSION
In this preliminary exploration, we investigated the usability and user-friendliness of sonified responses for screen-reader users via user studies. Our results show that screen-reader users preferred distinct non-continuous sonified responses generated using oscillators with square waveforms. Additionally, our follow-up interviews showed that our participants identified the need to personalize the sonified responses. We utilized our findings to enhance the capabilities of Sonifier [23]—an open-source library that generates sonified responses from two-dimensional data.
While the prototype generated using an oscillator with square waveform overall outperformed the rest of the prototypes, interestingly, the Baseline (using a sawtooth waveform) had the best average scores for Clarity. (However, the effect of Sound Type on Clarity was not statistically significant.) Our results identify avenues for future work to investigate the effects of different waveforms
on different aspects of user experiences. Additionally, our findings revealed that our participants preferred to personalize the sonified responses, depending on factors such as trend type and data cardinality. Future work can utilize our findings to generate a central configuration system, enabling screen-reader users to customize the responses per their needs.
Our work identifies avenues for future research to improve the quality of sonified responses. Specifically, our work highlights that utilizing knowledge from domains outside of Human-Computer Interaction, such as Music Theory, can improve the user experiences for people who rely on auditory channels to extract information from online data visualizations in a personalized manner. We hope our work will inspire researchers to conduct an in-depth investigation of the usability and user-friendliness of sonified responses and improve the experiences of screen-reader users with the sonification of online data visualizations.

ACKNOWLEDGMENTS
This work was supported by the University of Washington Center for Research and Education on Accessible Technology and Experiences (CREATE). We thank Jacob O. Wobbrock and Katharina Reinecke for their helpful comments and suggestions. Finally, we thank and remember our recently-departed team member Zoey for her feline support, without which the purrusal of this work would not have been as effective. May she cross the rainbow bridge in peace and find her way to cat heaven.

REFERENCES
[1] Dragan Ahmetovic, Federico Avanzini, Adriano Baratè, Cristian Bernareggi, Gabriele Galimberti, Luca A Ludovico, Sergio Mascetti, and Giorgio Presti. 2019. Sonification of rotation instructions to support navigation of people with visual impairment. In 2019 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 1–10.
[2] Dragan Ahmetovic, Cristian Bernareggi, João Guerreiro, Sergio Mascetti, and Anna Capietto. 2019. Audiofunctions.web: Multimodal exploration of mathematical function graphs. In Proceedings of the 16th International Web for All Conference. 1–10.
[3] Apple. n.d. Audio Graphs | Apple Developer Documentation. https://developer.apple.com/documentation/accessibility/audio_graphs. (Accessed on 08/01/2021).
[4] Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011. D3 data-driven documents. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2301–2309.
[5] Nils Dahlbäck, Arne Jönsson, and Lars Ahrenberg. 1993. Wizard of Oz studies—why and how. Knowledge-Based Systems 6, 4 (1993), 258–266.
[6] Joel J Davis. 2002. Disenfranchising the Disabled: The Inaccessibility of Internet-Based Health Information. Journal of Health Communication 7, 4 (2002), 355–367. https://doi.org/10.1080/10810730290001701
[7] Danyang Fan, Alexa Fay Siu, Wing-Sum Adrienne Law, Raymond Ruihong Zhen, Sile O’Modhrain, and Sean Follmer. 2022. Slide-Tone and Tilt-Tone: 1-DOF Haptic Techniques for Conveying Shape Characteristics of Graphs to Blind Users. In CHI Conference on Human Factors in Computing Systems. 1–19.
[8] Connor Geddes, David R Flatla, Garreth W Tigwell, and Roshan L Peiris. 2022. Improving Colour Patterns to Assist People with Colour Vision Deficiency. In CHI Conference on Human Factors in Computing Systems. 1–17.
[9] Andrea Gerino, Lorenzo Picinali, Cristian Bernareggi, Nicolò Alabastro, and Sergio Mascetti. 2015. Towards large scale evaluation of novel sonification techniques for non visual shape exploration. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility. 13–21.
[10] Nicholas A. Giudice, Hari Prasath Palani, Eric Brenner, and Kevin M. Kramer. 2012. Learning Non-Visual Graphical Information Using a Touch-Based Vibro-Audio Interface. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (Boulder, Colorado, USA) (ASSETS ’12). Association for Computing Machinery, New York, NY, USA, 103–110. https://doi.org/10.1145/2384916.2384935
[11] Pedro Antonio Gutiérrez, Maria Perez-Ortiz, Javier Sanchez-Monedero, Francisco Fernandez-Navarro, and Cesar Hervas-Martinez. 2015. Ordinal regression methods: survey and experimental study. IEEE Transactions on Knowledge and Data Engineering 28, 1 (2015), 127–146.
[12] Melita Hajdinjak and France Mihelic. 2004. Conducting the Wizard-of-Oz Experiment. Informatica (Slovenia) 28, 4 (2004), 425–429.
[13] Highcharts. n.d. Sonification | Highcharts. https://www.highcharts.com/docs/accessibility/sonification. (Accessed on 08/01/2021).
[14] Leona M Holloway, Cagatay Goncu, Alon Ilsar, Matthew Butler, and Kim Marriott. 2022. Infosonics: Accessible Infographics for People who are Blind using Sonification and Voice. In CHI Conference on Human Factors in Computing Systems. 1–13.
[15] Edward Kim and Kathleen F McCoy. 2018. Multimodal deep learning using images and text for information graphic classification. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility. 143–148.
[16] Bongshin Lee, Arjun Srinivasan, Petra Isenberg, John Stasko, et al. 2021. Post-WIMP Interaction for Information Visualization. Foundations and Trends® in Human-Computer Interaction 14, 1 (2021), 1–95.
[17] Yotam Mann. n.d. Tone.js. https://tonejs.github.io/. (Accessed on 08/02/2021).
[18] Kim Marriott, Bongshin Lee, Matthew Butler, Ed Cutrell, Kirsten Ellis, Cagatay Goncu, Marti Hearst, Kathleen McCoy, and Danielle Albers Szafir. 2021. Inclusive data visualization for people with disabilities: a call to action. Interactions 28, 3 (2021), 47–51.
[19] David K. McGookin and Stephen A. Brewster. 2006. SoundBar: Exploiting Multiple Views in Multimodal Graph Browsing. In Proceedings of the 4th Nordic Conference on Human-Computer Interaction: Changing Roles (Oslo, Norway) (NordiCHI ’06). Association for Computing Machinery, New York, NY, USA, 145–154. https://doi.org/10.1145/1182475.1182491
[20] Giorgio Presti, Dragan Ahmetovic, Mattia Ducci, Cristian Bernareggi, Luca A. Ludovico, Adriano Baratè, Federico Avanzini, and Sergio Mascetti. 2021. Iterative Design of Sonification Techniques to Support People with Visual Impairments in Obstacle Avoidance. ACM Trans. Access. Comput. 14, 4, Article 19 (Oct 2021), 27 pages. https://doi.org/10.1145/3470649
[21] Ather Sharif, Sanjana Shivani Chintalapati, Jacob O Wobbrock, and Katharina Reinecke. 2021. Understanding Screen-Reader Users’ Experiences with Online Data Visualizations. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–16.
[22] Ather Sharif and Babak Forouraghi. 2018. evoGraphs — A jQuery plugin to create web accessible graphs. In 2018 15th IEEE Annual Consumer Communications Networking Conference (CCNC). IEEE, Las Vegas, NV, USA, 1–4. https://doi.org/10.1109/CCNC.2018.8319239
[23] Ather Sharif, Olivia H. Wang, Alida T. Muongchan, Katharina Reinecke, and Jacob O. Wobbrock. 2022. Sonifier: JavaScript library that converts two-dimensional data into a sonified response. https://github.com/athersharif/sonifier. (Accessed on 06/12/2022).
[24] Ather Sharif, Olivia H. Wang, Alida T. Muongchan, Katharina Reinecke, and Jacob O. Wobbrock. 2022. VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-In. In CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 478, 19 pages. https://doi.org/10.1145/3491102.3517431
[25] Lei Shi, Idan Zelzer, Catherine Feng, and Shiri Azenkot. 2016. Tickers and Talker: An Accessible Labeling Toolkit for 3D Printed Models. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 4896–4907. https://doi.org/10.1145/2858036.2858507
[26] Alexa Siu, Gene SH Kim, Sile O’Modhrain, and Sean Follmer. 2022. Supporting Accessible Data Visualization Through Audio Data Narratives. In CHI Conference on Human Factors in Computing Systems. 1–19.
[27] Ruobin Wang, Crescentia Jung, and Yea-Seul Kim. 2022. Seeing Through Sounds: Mapping Auditory Dimensions to Data and Charts for People with Visual Impairments. In The 24th EG/VGTC Conference on Visualization (EuroVis ’22), Rome, Italy, 13–17 June 2022. Eurographics-European Association for Computer Graphics.
[28] Christopher Winship and Robert D Mare. 1984. Regression models with ordinal variables. American Sociological Review (1984), 512–525.
[29] Haixia Zhao, Catherine Plaisant, Ben Shneiderman, and Jonathan Lazar. 2008. Data sonification for users with visual impairment: a case study with georeferenced data. ACM Transactions on Computer-Human Interaction (TOCHI) 15, 1 (2008), 1–28.
A PARTICIPANT DEMOGRAPHICS

ID | Gender | Age | Screen Reader | Vision-Loss Level | Diagnosis
P1 | M | 57 | JAWS | Lost vision gradually | Retinitis Pigmentosa
P2 | F | 38 | JAWS | Blind since birth | Leber Congenital Amaurosis
P3 | F | 65 | JAWS | Lost vision gradually | Retinitis Pigmentosa
P4 | F | 69 | Fusion | Lost vision gradually, Partial vision | Juvenile Macular Degeneration
P5 | M | 33 | NVDA | Blind since birth | Peters Anomaly
P6 | M | 37 | JAWS | Blind since birth | Leber Congenital Amaurosis
P7 | F | 52 | JAWS | Blind since birth | Retinopathy
P8 | M | 58 | JAWS | Lost vision gradually | Cataracts and Glaucoma
P9 | M | 49 | JAWS | Lost vision gradually | Leber Congenital Amaurosis
P10 | NB | 26 | VoiceOver | Partial vision | Corneal damage

Table 1: Screen-reader participants, their gender identification, age, screen reader, vision-loss level, and diagnosis. Under the Gender column, M = Male, F = Female, and NB = Non-binary.

B SUBJECTIVE RATINGS

| Sound Type (S) | Continuous | PL | CL | CF | OS |
|---|---|---|---|---|---|
| OmniOscillator with sawtooth waveform (Baseline) | No | 3.35 | **5.15** | 5.10 | 3.90 |
| OmniOscillator with sawtooth waveform (Baseline) | Yes | 3.95 | 4.55 | 4.70 | 3.75 |
| OmniOscillator with square waveform | No | **4.65** | 5.05 | **5.20** | **4.65** |
| OmniOscillator with square waveform | Yes | 4.30 | 4.50 | 4.85 | 4.05 |
| MonoSynthesizer with square waveform | No | 3.75 | 4.95 | 5.15 | 4.20 |
| MonoSynthesizer with square waveform | Yes | 3.20 | 4.15 | 4.58 | 3.80 |

Table 2: Overall average scores for each sound type per continuity level. PL represents Pleasantness, CL represents Clarity, CF represents Confidence, and OS represents Overall Score. Highest average scores for PL, CL, CF, and OS are shown in bold.
“What’s going on in Accessibility Research?” Frequencies and Trends of Disability Categories and Research Domains in Publications at ASSETS

Ather Sharif (asharif@cs.washington.edu)
Paul G. Allen School of Computer Science & Engineering | DUB Group, University of Washington, Seattle, Washington, USA

Ploypilin Pruekcharoen* (ploypp@uw.edu)
Human Centered Design and Engineering | DUB Group, University of Washington, Seattle, Washington, USA

Thrisha Ramesh* (thrisha@cs.washington.edu)
Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, USA

Ruoxi Shang (rxshang@uw.edu)
Human Centered Design and Engineering | DUB Group, University of Washington, Seattle, Washington, USA

Spencer Williams (sw1918@uw.edu)
Human Centered Design and Engineering | DUB Group, University of Washington, Seattle, Washington, USA

Gary Hsieh (garyhs@uw.edu)
Human Centered Design and Engineering | DUB Group, University of Washington, Seattle, Washington, USA

ABSTRACT
ACM SIGACCESS Conference on Computers and Accessibility (ASSETS) is considered one of the premium forums for research on accessibility. Recently, Mack et al. shed light on the demographics, goals, research methodologies, and evolution of accessibility research over time. We extend their work by exploring the frequencies and trends of disability categories and computer science research domains in publications at ASSETS (N=1,678). Our results show that disability categories and research domains varied significantly across the publication years. We found that in the past 10 years, publications targeting Mental-Health-Related disabilities and the research domain of AR/VR show an increasing trend. In opposition, Gaming, Input Methods/Interaction Techniques, and User Interfaces domains portray a decreasing trend. Additionally, our results show that the majority of the publications utilize the AI/ML/CV/NLP domain (19%) and focus on people with visual disabilities (42%). We share our preliminary exploration results and identify avenues for future work.

CCS CONCEPTS
• Human-centered computing → Accessibility theory, concepts and paradigms; Accessibility design and evaluation methods; • Social and professional topics → People with disabilities.

KEYWORDS
accessibility, disability categories, research domains, frequency, trend, assets

ACM Reference Format:
Ather Sharif, Ploypilin Pruekcharoen, Thrisha Ramesh, Ruoxi Shang, Spencer Williams, and Gary Hsieh. 2022. “What’s going on in Accessibility Research?” Frequencies and Trends of Disability Categories and Research Domains in Publications at ASSETS. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550359

* These authors contributed equally to this work.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550359

1 INTRODUCTION
With the increase in disability awareness, research focusing on accessible and assistive technology for disabled people has drastically increased over the past few years [3, 11]. While several venues exist for accessibility-related publications, the ACM SIGACCESS Conference of Computing and Accessibility (ASSETS) is amongst the premium forums for publications focusing on the design, evaluation, use, and education related to computing for disabled people [1]. Given its reputation for publishing top-tier work, ASSETS is the go-to conference for both seasoned and new researchers.

Understanding the evolution of accessibility research over time at ASSETS can reveal crucial information, including norms, gaps, and adoptions of technical and societal concepts in academia [6, 8, 11]. Such information is beneficial for researchers to reflect on the growth of accessibility as a research field and can guide them in identifying avenues for future work. Prior work has conducted literature surveys for sub-demographics within the disability demographic [4, 5, 10, 11, 13, 16–18]. Most relevant to our work is the recent exploration by Mack et al. [11] that seeks to understand the demographics, goals, research methodologies, and evolution of accessibility research over time. We build on their work by (1) extending their identified communities of focus to incorporate more nuanced disability categories; (2) investigating the utilization of computer science research domains over time; and (3) conducting
ASSETS ’22, October 23–26, 2022, Athens, Greece Ather Sharif, Ploypilin Pruekcharoen, Thrisha Ramesh, Ruoxi Shang, Spencer Williams, and Gary Hsieh

Table 1: Overview of keyword counts and percentages, and trend analyses for each disability category and research domain in publications at ASSETS overall (2000-2021) and in recent years (2012-2021). N is the total keyword count and % is the percentage compared to the total keyword count for all the categories. τ is the measure of the ordinal association between categories/domains and their ratios (positive τ values mean an increasing trend and negative τ values mean a decreasing trend). p shows the statistical significance of the trend (α=.05). The first four value columns are Overall (2000-2021); the last four are Past 10 Years (2012-2021).

| Category / Domain | N | % | τ | p | N | % | τ | p |
|---|---|---|---|---|---|---|---|---|
| **Disability Category** | | | | | | | | |
| Auditory | 256 | 17% | .57 | <.001 | 183 | 18% | .32 | .243 |
| Chronic Illness | 11 | 1% | -.07 | .740 | 6 | 1% | -.08 | .838 |
| Cognitive | 188 | 12% | .03 | .871 | 116 | 11% | -.07 | .858 |
| Learning | 89 | 6% | .39 | .019 | 67 | 7% | -.09 | .788 |
| Mental-Health-Related | 13 | 1% | .60 | <.001 | 12 | 1% | .60 | .026 |
| Mobility | 175 | 11% | .08 | .650 | 116 | 11% | -.11 | .721 |
| Neurological | 55 | 4% | .01 | .998 | 37 | 4% | -.32 | .243 |
| Older Adults | 98 | 6% | -.08 | .626 | 58 | 6% | .09 | .788 |
| Visual | 643 | 42% | .15 | .381 | 419 | 41% | .20 | .474 |
| **Research Domain** | | | | | | | | |
| 3-D Representation | 71 | 3% | .54 | <.001 | 62 | 4% | -.07 | .858 |
| AI/ML/CV/NLP | 498 | 19% | -.32 | .051 | 280 | 18% | .33 | .211 |
| AR/VR | 75 | 3% | .32 | .055 | 61 | 4% | .54 | .039 |
| Educational/Methodological/Theoretical | 401 | 16% | .22 | .194 | 256 | 16% | .47 | .074 |
| Gaming | 103 | 4% | .23 | .163 | 75 | 5% | -.51 | .049 |
| Hardware Tools | 281 | 11% | -.11 | .516 | 163 | 10% | -.07 | .858 |
| Input Methods/Interaction Techniques | 252 | 10% | -.39 | .019 | 139 | 9% | -.56 | .032 |
| Media/Graphics/Visualizations | 283 | 11% | .28 | .098 | 180 | 11% | .24 | .371 |
| Security/Privacy | 44 | 2% | .31 | .067 | 35 | 2% | .11 | .721 |
| Software Tools | 299 | 12% | -.10 | .559 | 180 | 11% | -.09 | .788 |
| User Interfaces | 159 | 6% | -.55 | <.001 | 68 | 4% | -.51 | .049 |
| Wearables | 89 | 3% | .55 | <.001 | 78 | 5% | .45 | .088 |

Figure 1: Trends for each disability category in ASSETS publications overall (2000-2021).
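The per-year ratio described in §2.1 (papers containing at least one of a category's keywords, divided by papers with at least one author keyword that year) can be sketched as below. This is a minimal illustration, not the authors' code: the paper records and the keyword-to-category mapping here are hypothetical stand-ins for the scraped ACM Digital Library data and the manual categorization.

```python
from collections import defaultdict

# Hypothetical records: (year, author_keywords) per paper, plus a manual
# keyword -> categories mapping, mirroring the categorization in Section 2.1.
papers = [
    (2019, {"blind", "sonification"}),
    (2019, {"deaf", "captions"}),
    (2020, {"blind", "screen reader"}),
]
keyword_categories = {
    "blind": {"Visual"}, "screen reader": {"Visual"}, "sonification": {"Visual"},
    "deaf": {"Auditory"}, "captions": {"Auditory"},
}

def category_ratios(papers, keyword_categories):
    """Ratio per (year, category): papers containing >= 1 keyword of the
    category, divided by papers with >= 1 author keyword that year."""
    totals = defaultdict(int)   # year -> papers with any keyword
    counts = defaultdict(int)   # (year, category) -> paper count
    for year, kws in papers:
        if not kws:
            continue            # papers without author keywords are excluded
        totals[year] += 1
        cats = set().union(*(keyword_categories.get(k, set()) for k in kws))
        for c in cats:          # a keyword may map to multiple categories
            counts[(year, c)] += 1
    return {yc: n / totals[yc[0]] for yc, n in counts.items()}

ratios = category_ratios(papers, keyword_categories)
# e.g., ratios[(2019, "Visual")] -> 0.5 (one of two 2019 papers)
```

A paper counts at most once per category even if several of its keywords map to that category, which matches the "total papers containing a given category's keywords" phrasing.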

empirical statistical analyses to explore the frequencies and trends of each disability category and research domain.

To shed more light on the frequencies and trends in accessibility research, first, we scraped the keywords from ASSETS papers published since 2000, including poster and demonstration papers. Then, we manually categorized the keywords into nine disability categories and 12 research domains to analyze the frequencies and trends for each category and domain over time. We further analyzed our data by filtering the publications to only include those published recently (past 10 years; 2012-2021). Overall, we extracted 3,234 keywords from 1,678 papers.

We found that disability categories and research domains vary significantly across publication years. Visually disabled people were the largest targeted audience (42% overall and 41% in the past 10 years), and publications on Mental-Health-Related disabilities show

Figure 2: Trends for each research domain in ASSETS publications overall (2000-2021).
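The Mann-Kendall test used for the trend analyses in §2.2 is equivalent to computing Kendall's τ between the yearly ratios and time. A minimal pure-Python sketch (normal approximation, assuming no tied ratios; this is an illustration, not the authors' implementation) is:

```python
from itertools import combinations
from math import erf, sqrt

def mann_kendall(series):
    """Mann-Kendall trend test over a time-ordered series.

    Returns (tau, p): Kendall's tau against time and a two-sided p-value
    from the continuity-corrected normal approximation (no-ties variance).
    """
    n = len(series)
    # S = number of concordant minus discordant pairs, taken in time order.
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = (s - (1 if s > 0 else -1)) / sqrt(var_s) if s else 0.0
    p = 1 - erf(abs(z) / sqrt(2))   # = 2 * (1 - Phi(|z|)), two-sided
    return tau, p

# A strictly increasing ratio series gives tau = 1.0 with a small p.
tau, p = mann_kendall([0.10, 0.20, 0.25, 0.30, 0.40, 0.45])
```

Applied per category to its ratios across publication years, positive τ with p < α indicates an increasing trend, matching the sign convention used in Table 1.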

an increasing trend overall and in recent publications. Additionally, the majority of the publications employed the research domain of AI/ML/CV/NLP (19% overall and 18% in the past 10 years), with publications on AR/VR showing an increasing trend in recent years. In contrast, Gaming, Input Methods/Interaction Techniques, and User Interfaces domains showed a decreasing trend in recent years.

We contribute the empirical findings from our preliminary explorations. Specifically, we provide the results from our statistical analyses of frequencies and trends of (1) disability categories and (2) research domains over the past (1) 20 years (2000-2021; ASSETS skipped publications in 2001 and 2003) and (2) 10 years (2012-2021). Additionally, we identify avenues for future work.

2 FREQUENCIES AND TRENDS IN PUBLICATIONS AT ASSETS
We studied the frequencies and trends of disability categories and computer science research domains by extracting 3,234 keywords from 1,678 publications at ASSETS. We present our methodology, analysis, and quantitative results.

2.1 Data Collection & Procedure
We queried the ACM Digital Library to collect the author keywords of all the papers published at ASSETS (N=1,678) since its inception in 1994, similar to prior work [10, 11]. However, unlike these works, our data set included the poster, panel, and demonstration papers. Due to missing author keywords in publications from years before 2000 (only 59% had author keywords), we reduced our data set to publications from 2000 onward (90% had author keywords). After reduction, our data set included 3,234 keywords from 1,603 papers. We provide our collected keywords with their overall counts in the supplementary materials.

First, we defined the factor levels for disability categories and research domains. We selected the disability categories from prior work [15]. For the computer science research domains, we finalized the set through discussions with well-published researchers at ASSETS. Then, we manually categorized each keyword into appropriate disability categories and research domains (a keyword could belong to multiple categories and research domains). For research domains, categorization qualifications included applied work. We eliminated broad (e.g., “accessibility”) and ambiguous (e.g., “Germany”) keywords from our data set. After categorization, we calculated ratios for each category and domain per publication year. (Ratio was the frequency of the total papers containing a given category/domain’s keywords divided by the total number of publications with at least one author keyword per year.) Our final set comprised a total of 1,885 keywords.

At least three researchers participated in the keyword categorization process to account for accurate categorization. We resolved any disagreements through mutual discussions.

2.2 Analysis & Results
Our goal was to examine the frequencies and trends of disability categories and research domains in overall (2000-2021) and recent (2012-2021) publications at ASSETS. Our preliminary analysis using Anderson-Darling [2] tests of normality showed that the ratios were conditionally non-normal. Therefore, we used a generalized linear model [7, 14] with Gamma distribution and log link function to investigate frequencies, as our data was positive and right-skewed. Category (C) and Domain (D) were the independent variables for analyzing disability categories and research domains, respectively, whereas Ratio (R) was the dependent variable. Additionally, we used the Mann-Kendall [9, 12] test to evaluate temporal trends for each category and their respective ratios across the publication years.

2.2.1 Disability Categories. Our results show a significant main effect of Category (C) on R (χ²(1, N=180)=363.83, p<.001, Cramer’s V=.50), with 42% of the publications on the disability category Visual. Filtering the publication years to include only the past 10 years (2012-2021) yields similar results, showing a significant main effect of C on R (χ²(1, N=90)=253.32, p<.001, Cramer’s V=.59) and 41% focusing on people with Visual disabilities. These results indicate that the disability categories vary significantly in ASSETS publications. Figure 3 (Appendix A) and Table 1 show the percentages.

Our trend analysis for each disability category identifies an increasing trend for disability categories Auditory (τ=.57, p<.001), Learning (τ=.39, p<.05), and Mental-Health-Related (τ=.60, p<.001) overall. However, in recent publications, only the Mental-Health-Related category shows an increasing trend (τ=.60, p<.05). Figure 1 shows the trends and Table 1 displays the statistical results.

2.2.2 Research Domains. Domain (D) also had a significant main effect on R for both overall (χ²(1, N=240)=334.89, p<.001, Cramer’s V=.36) and in recent publications (χ²(1, N=120)=207.43, p<.001, Cramer’s V=.44). Similar to C, these results indicate that the research domain employed in publications at ASSETS varied significantly. The majority of the publications employed the domain of AI/ML/CV/NLP overall (19%) and in recent years (18%). Figure 4 (Appendix A) and Table 1 show the percentages per domain.

The results from our trend analysis for each research domain reveal that 3-D Representation (τ=.54, p<.001) and Wearables (τ=.55, p<.001) had an increasing trend overall, whereas only AR/VR had an increasing trend in recent publications. In contrast, Gaming (τ=-.51, p<.05) had a decreasing trend in recent publications. Input Methods/Interaction Techniques (τ=-.39, p<.05) and User Interfaces (τ=-.55, p<.001) had a decreasing trend both overall and in recent publications. We show the trends in Figure 2 and statistical results in Table 1 for each research domain.

3 DISCUSSION & CONCLUSION
In this preliminary exploration, we examined the frequencies and trends of disability categories and computer science research domains in overall and recent publications at ASSETS, extending the findings from Mack et al. [11]. To achieve this goal, we extracted 3,234 keywords from 1,678 papers, including the poster and demonstration papers, and manually categorized them into nine disability categories and 12 research domains. Our results show that the focus on disability categories and research domains significantly varies across publication years. Additionally, we conducted trend analyses to identify the trends for each category and domain overall and in recent publications at ASSETS.

Similar to the findings by Mack et al. [11], our results showed that publications focusing on visually disabled people are disproportionately higher than those focused on other disability categories. Interestingly, our analyses did not identify a statistical increase or decrease in the trend for publications focused on this demographic, likely indicating consistency in the higher focus. On the other hand, the Mental-Health-Related category shows an increasing trend in both overall and recent publications, possibly attributable to the recently increasing awareness of mental-health-related matters. Future work can explore the correlation between public awareness of specific disability categories and their respective focus in publications at ASSETS to investigate knowledge diffusion.

Similarly, our analyses revealed an increasing trend of AR/VR in publications at ASSETS over the recent years, likely attributable to its growing technological focus within and outside academia. Surprisingly, the adoption of 3-D Representation in publications shows an overall increasing trend but not for the recent publications. Our work provides avenues for researchers to explore and gather further insights into technology diffusion in accessibility research.

Since our work is a preliminary exploration, we only performed categorization using the author-identified keywords. Additionally, while ASSETS is considered the top forum for accessibility research, other venues for accessibility-related research, including the ACM Conference on Human Factors in Computing Systems (CHI), Web for All Conference (W4A), and other academic journals are also prevalent. Therefore, future work can employ our methodology and extend our work to analyze publications from other venues. We hope our work will inspire researchers to explore frequencies and trends in accessibility research, providing further insights into the growth of this field.

ACKNOWLEDGMENTS
We thank the anonymous reviewers for their helpful comments and suggestions. Additionally, we thank and remember our recently-departed team member Zoey for her feline support, without which the purrusal of this work would not have been as effective. May she cross the rainbow bridge in peace and find her way to cat heaven.

REFERENCES
[1] ACM. n.d. ASSETS Conference - Home. https://dl.acm.org/conference/assets. (Accessed on 06/08/2022).
[2] Theodore W. Anderson and Donald A. Darling. 1954. A test of goodness of fit. Journal of the American Statistical Association 49, 268 (1954), 765–769.
[3] GVU Center at Georgia Tech. n.d. Workbook: Weaving CHI - Top Keyword Topics. https://public.tableau.com/views/WeavingCHI-TopKeywordTopics/TopKeywordTopics. (Accessed on 06/08/2022).
[4] Alexy Bhowmick and Shyamanta M Hazarika. 2017. An insight into assistive technology for the visually impaired and blind people: state-of-the-art and future trends. Journal on Multimodal User Interfaces 11, 2 (2017), 149–172.
[5] Emeline Brulé, Brianna J Tomlinson, Oussama Metatla, Christophe Jouffrais, and Marcos Serrano. 2020. Review of Quantitative Empirical Evaluations of Technology for People with Visual Impairments. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–14.
[6] Faye Ginsburg and Rayna Rapp. 2013. Disability worlds. Annual Review of Anthropology 42, 1 (2013), 53–68.
[7] John M Grego. 1993. Generalized linear models and process variation. Journal of Quality Technology 25, 4 (1993), 288–295.
[8] Alan M Jette, Marilyn J Field, et al. 2007. The future of disability in America. (2007).
[9] Maurice George Kendall. 1948. Rank correlation methods. (1948).
[10] Lior Levy, Qisheng Li, Ather Sharif, and Katharina Reinecke. 2021. Respectful Language as Perceived by People with Disabilities. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). Association for Computing Machinery, New York, NY, USA, Article 83, 4 pages. https://doi.org/10.1145/3441852.3476534
[11] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by “Accessibility Research”? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 371, 18 pages. https://doi.org/10.1145/3411764.3445412
[12] Henry B Mann. 1945. Nonparametric tests against trend. Econometrica: Journal of the Econometric Society (1945), 245–259.
[13] Aboubakar Mountapmbeme, Obianuju Okafor, and Stephanie Ludi. 2022. Addressing Accessibility Barriers in Programming for People with Visual Impairments: A Literature Review. ACM Transactions on Accessible Computing (TACCESS) 15, 1 (2022), 1–26.
[14] John Ashworth Nelder and Robert WM Wedderburn. 1972. Generalized linear models. Journal of the Royal Statistical Society: Series A (General) 135, 3 (1972), 370–384.
[15] Ather Sharif, Aedan L. McCall, and Kianna R. Bolante. 2022. Should I Say “Disabled People” or “People with Disabilities”? Language Preferences of Disabled People Between Identity- and Person-First Language. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (Athens, Greece) (ASSETS ’22). Association for Computing Machinery, New York, NY, USA, To Appear.
[16] Katta Spiel, Christopher Frauenberger, Os Keyes, and Geraldine Fitzpatrick. 2019. Agency of autistic children in technology research—A critical literature review. ACM Transactions on Computer-Human Interaction (TOCHI) 26, 6 (2019), 1–40.
[17] Mette Warburg. 2001. Visual impairment in adult people with intellectual disability: literature review. Journal of Intellectual Disability Research 45, 5 (2001), 424–438.
[18] Rua M Williams and Juan E Gilbert. 2020. Perseverations of the academy: A survey of wearable technologies applied to autism intervention. International Journal of Human-Computer Studies 143 (2020), 102485.

A PUBLICATIONS PER DISABILITY CATEGORY AND RESEARCH DOMAIN PER YEAR

Figure 3: Percentage of ASSETS publications per year for each disability category.

Figure 4: Percentage of ASSETS publications per year for each research domain.
A Participatory Design Approach to Explore Design Directions for Enhancing Videoconferencing Experience for Non-signing Deaf and Hard of Hearing Users

Yeon Soo Kim (ykim18@kaist.ac.kr), Sunok Lee (sunoklee@kaist.ac.kr), and Sangsu Lee (sangsu.lee@kaist.ac.kr)
Industrial Design, KAIST, Daejeon, Republic of Korea

ABSTRACT
The breakout of the COVID-19 pandemic shifted people’s daily activities from in-person to video-mediated ones. Many people with hearing loss encounter cognitive overload due to ineffective visuals of the videoconferencing interface and therefore find meeting contents difficult to comprehend. This research incorporates a participatory design methodology to investigate the Deaf and Hard of Hearing (DHH) users’ tacit needs. DHH users demonstrated ways of mitigating their hardships in the workshop, such as emphasizing the visual hierarchy or assigning visual cues to fixed positions. These findings are used in developing design directions for creating a more inclusive online environment.

CCS CONCEPTS
• Human-centered computing → Graphical user interfaces.

KEYWORDS
Hard of hearing, Accessibility, Videoconferencing

ACM Reference Format:
Yeon Soo Kim, Sunok Lee, and Sangsu Lee. 2022. A Participatory Design Approach to Explore Design Directions for Enhancing Videoconferencing Experience for Non-signing Deaf and Hard of Hearing Users. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550375

1 INTRODUCTION
The breakout of the COVID-19 pandemic pressured communities and businesses around the globe to physically isolate and familiarize themselves with the virtual world, enabling various professional work to be done without having to relocate. The current work-from-home nature of jobs seems to offer a potential solution for the work limitations in the DHH community, such as the barriers in the daily commute, which can lead to job termination [4, 12]. However, the sound-dependent system of current videoconferencing services is not properly designed for people with hearing loss and does not support the inclusive and compelling work environment that DHH workers need, leaving most DHH workers in distress [1, 13]. This work focuses on Deaf and hard of hearing (DHH) people’s use of videoconferencing tools in a work environment to address these restraints.

There are ongoing efforts to make videoconferencing inclusive of the DHH community, whose members have learned to adapt to it in a few ways, like reading captions, interpreting lip movements, or having sign-language interpreters [7, 24]. However, these approaches do not resolve the fundamental restraints. Advances in auto-captioning technology [16] attempt to assist DHH users in some ways but are often incomplete, incorrect, or delayed. Lip-readers find the digital screen size and quality inadequate for lip-reading [15]. DHH users are frequently left with fractional information inadequate for complete assistance, and their efforts to gather these fragmental data for comprehension leave them with an excessive cognitive load [18]. These technical problems are not always solved by having signers available. Merely providing signers in an online meeting is counterproductive to making a videoconferencing environment accessible because only 1% of DHH people know sign language, and even those who know sign language may prefer to understand hearing people without a signer [11].

This research seeks to address an area that has not been discussed: DHH people’s videoconferencing occasions without sign language. We aim to create an inclusive environment for the DHH community with varying preferences of communication strategies. Our research goals are: 1) Understand DHH users’ pain points and needs for communicating with coworkers when sign language is not the main communication method, and 2) Explore design directions to enhance these users’ video-mediated working experiences.

For our study, we applied a participatory design method [3] with a close interview to present qualitative results showing pain points and the possible design directions to enhance videoconferencing experiences for DHH users in non-signing situations.

2 RELATED WORKS
Prior research has pursued accessible communication services for DHH people. There are lists of common difficulties in DHH people’s videoconferencing experiences and guidelines for mitigating those, such as providing live captions and transcripts [10] or providing visual and haptic feedback [19]. While these studies provide important findings for the DHH community and suggest some foundational guidelines for making videoconferencing accessible, these works consider sign language a primary communication method. On the other hand, our work aims to pinpoint DHH people’s experiences in a non-signing context. Another study integrates ASR (automated speech recognition) to
ASSETS ’22, October 23–26, 2022, Athens, Greece Yeon Soo Kim, Sunok Lee, and Sangsu Lee

Table 1: List of Participants

| Participant | Gender | Hearing Loss Level (dB HL) | Affiliation | Communication Method (In-person) | Communication Method (Online) |
|---|---|---|---|---|---|
| P1 | Female | Moderately severe (56-70) | Student | Lip-reading | ASL, Lip-reading, Reading captions |
| P2 | Male | Moderately severe (56-70) | Store manager | Partial hearing, Lip-reading | Partial hearing, Lip-reading, Reading captions |
| P3 | Other | Moderately severe (56-70) | Music producer | Lip-reading, Reading facial expressions and body language | Partial hearing, Lip-reading, Reading captions, Typing |
| P4 | Male | Severe (56-70) | Engineer | ASL, Lip-reading, Writing | ASL, Lip-reading, Reading captions |
aid conversation between Deaf and hearing pairs [20], primarily focusing on text-based conversation, which is one of many necessary communication strategies for DHH users. This study intends to look at multiple conversational techniques combined as a whole (e.g., lip-reading, reading facial expressions and body language, using text, etc.), for many DHH people use a combination of communication methods for fuller comprehension. There were also approaches using live captions for DHH people’s comprehension in real-life situations by using a digital aid [25] or smart glasses [17]. These works illustrate valuable data on DHH people’s cognitive interaction with live captions; however, they are confined to real-life situations and may not hold in virtual settings.

Moreover, current studies of DHH users’ videoconferencing experience largely depend on verbal interviews, and an approach to discovering their tacit needs is uncommon. It is difficult for researchers to relate to the experiences of users with impairments because their experiences widely vary [21] and the experiences relating to their disabilities may be hard to articulate [5]. Understanding end-user experiences is crucial to align with the actual needs of the user [9] when the issues are driven by their limitations [8]. Although some studies incorporate co-design workshops with DHH users to do so [17, 19], common ability bias during co-design workshops can bias end-users’ real feedback and alter their authentic insights [2]. Therefore, conducting a participatory design activity with DHH users is central when we lack knowledge about their interaction with digital technologies [14].

3 METHOD
The user study is divided into two consecutive sessions. First, the interview session explores the real-life difficulties users face, discovers the essential areas for improvement, and triggers users’ memories from past videoconferencing experiences. Participants are asked to point out the aspects of current video-mediated services that need improvement from their perspectives. Second, a participatory design workshop is conducted over Google Slides [22] to have users directly involved in rearranging the main UI components of Zoom. In this session, the users discover their ways of optimizing their comprehension in an online environment and seek solutions to make it accessible for themselves. We anticipate these design decisions in the workshop to uncover users’ tacit needs that may be difficult to verbalize.

The study is conducted one-on-one in a private setting for each participant to enhance freer discussion on disability. The user study takes place on Zoom [26], the most commonly used and preferred videoconferencing tool for all recruited participants. The users participated in the remote user study in their usual settings, which also acted to provoke their past memories of video-meeting experiences for the user study.

3.1 Participants
We recruited four participants with moderately severe to severe hearing loss who were frequently involved (more than three times a week) in a videoconference. The participants were often involved in work-related video meetings without using sign language (refer to Table 1 for detailed information on recruited participants).

3.2 Procedure
For effective discussion, we first show a short video of a real corporate Zoom meeting with six workers to provoke users’ past experiences. We then proceed with our interview session and ask the users questions regarding the general use of online meetings, their change in communication techniques between in-person and online conversation, and their approaches to understanding others better online. Our questions are solely used to guide the discussion, and participants are encouraged to express anything they like.

For the participatory design workshops, users design iterations based on their personal experiences to create the desired interpretation of Zoom. The participants are provided Zoom components on Google Slides for modification (Fig. 1a). The components include a GIF image of the speaker, GIFs of five non-speakers, text for captions, and a shared screen. The meeting scene used for the workshop comprises six total members; according to statistics, most work-related meetings include 4-6 people [23]. Possible modifications of these components are resizing, relocating, cropping, changing brightness or contrast, or recoloring them (Fig. 1b, 1c). There are three tasks for designing: 1) to optimize understanding with captions, 2) to optimize understanding with lip-reading, and 3) to optimize understanding with additional shared material.

4 FINDINGS AND DISCUSSION
All data were collected and analyzed by coding and affinity techniques to extract current problems and the corresponding solutions found from the users. The users’ approaches for mitigating barriers and optimizing their comprehension were discovered by looking at their design decisions: the components they decided to change and how they changed them. We grouped the users’ solutions and our
A Participatory Design Approach to Explore Design Directions for Enhancing Videoconferencing
Experience for Non-signing Deaf and Hard of Hearing Users ASSETS ’22, October 23–26, 2022, Athens, Greece

Figure 1: a) Google Slides set-up for the workshop, including a GIF of the speaker (pink dashed outline), GIFs of other meeting members (blue fill), and subtitles (yellow solid outline); participants are given a common Zoom interface image as their workspace (green dotted outline). b) Tutorial slide for creating and decorating text and inserting shapes. c) Tutorial slide for adjusting the GIF.

Figure 2: a) P2’s placement of the presenter (pink dashed outline) and the live captions (yellow solid outline). b) P4’s placement of the presenter and live captions with a shared screen (blue fill). c) P2’s design of a zoomed-in screen next to the speaker (green dotted outline).

design directions under the corresponding problems: 1) the limitations of lip-reading leave users highly dependent on captions, and 2) unclear identification of the speaker makes users feel neglected from the meeting.

4.1 Limitations of Lip-reading Leave Users Highly Dependent on Captions
The difference in lip-reading experience between online and in-person communication was considered the biggest challenge of video-mediated communication. For face-to-face conversations, all participants preferred to lip-read, and "there is no need to use text to communicate in-person" (P2). In a videoconference, on the other hand, captions are a necessity (P2, P3), forcefully shifting the users’ attention from looking at the speaker to focusing on the captions and confining the DHH users to a text-oriented communication method. However, these participants emphasized the importance of keeping eye contact and watching the speaker’s gestures, facial expressions, and precise lip movements to follow the conversation as a whole, and many preferred not to rely on captions as the only communicative method.

Users’ Approach: Emphasized Visual Hierarchy for Diverse Communication Techniques. We observed the users setting a visual hierarchy and emphasizing information based on its significance. For example, the speaker’s screen is usually considered the most important and is enlarged to be the largest component on the screen (Fig. 2a). However, in lip-reading scenarios, the zoomed-in portion of the speaker becomes the largest (Fig. 2c). The captions come next in the hierarchy, occupying ample space and located next to the speaker. Non-speaking participants are placed smaller than the speaker but large enough for users to see their facial expressions (Fig. 2), because users consider them important for suggesting the overall ambiance of the meeting (P1). In the case of meetings with an additional screen being shared, the shared screen and the speaker take up the majority of the screen (Fig. 2b), and "those who don’t talk can be eliminated on occasion to reduce visual distraction" (P4). The amount of captions is reduced in this case because having more text causes confusion when other material is being shared (P2). Our users’ comprehension methods seemed far more complicated than those of signers; we observed non-signing DHH individuals switch communication methods within the meeting according to meeting context, speech rate, or their momentary capacity for cognitive load.

Design Directions: An accessible design should not confine the user to a certain type of communication method and should offer all possible ones. Additional screens of the speaker’s face or lips should be available to support a clearer view, and the visual information must be prioritized accordingly. The size of the videoconferencing screen is often limited, and securing the right amount of space is vital when designing for DHH users. A videoconferencing interface should offer different communication modes: for lip-reading, for reading body language or facial expressions, for caption reading only, etc. This feature is especially vital for non-signers because signing DHH people tend to focus on the signer and the speaker throughout the whole meeting, whereas non-signers need more freedom in choosing the right source of information to meet their dynamic needs.

4.2 Unclear Identification of the Speaker Makes Users Feel Neglected From the Meeting
Video meeting participants often come across occasions when multiple people are speaking at the same time or when the speaker change is abrupt. It is difficult for our DHH users to see other participants
ASSETS ’22, October 23–26, 2022, Athens, Greece Yeon Soo Kim, Sunok Lee, and Sangsu Lee

when their eyes are focused on the current speaker. The inability to identify the speaker is problematic because being unsure of the current speaker leads to difficulty finding the right time to speak up, and most of the time they decide not to speak at all (P3, P4). The participants described assistance from the host as unhelpful because "it’s very hard for the presenter to realize, understand, or get to know how many people ... have been left out" (P4) in an online setting.

Users’ Approach: Dedicated Locations for Each Piece of Information for Instant Identification. Our DHH users naturally found two solutions to mitigate their cognitive load and keep up with meeting contents: 1) placing essential components closer together and 2) placing the essential components consistently. Our participants assigned the crucial components close to each other and in a place where they could easily focus. Moreover, all participants asked for the speaker, captions, and other components to be "always fixed in their position" (P1, P2, P3, P4), with captions always close to the speaker screen (P1, P2, P3, P4) (Fig. 2a). Although the participants had different preferences for visual arrangement (e.g., bottom-left for P2, top-center for P3), they expressed the common need to have a specified location for these elements, preventing them from chasing the speaker around as they currently have to.

Design Directions: Decreasing split visual attention [6] to reduce cognitive overload is commonly discussed; however, we recommend a novel implication, which is to provide a designated location for informative elements. The current "speaker mode" in Zoom attempts to fix the speaker in one place. However, this feature makes other non-speakers’ screens very small and is thus considered unsuitable; it does not consider that these users often want to see other participants clearly during the meeting. Also, the speaker screen is not the only important element during videoconferences, and all elements must be carefully and coherently placed. By doing so, users can effortlessly access all visual information in one picture and not lose optic focus by having to hunt for wanted elements on a given screen.

5 CONCLUSION
Through our user study, we aimed to understand videoconferencing experiences from DHH people’s perspectives and pinpointed critical insights that need to be considered to resolve the current issues. The study tackles an unexplored perspective: the online communication of the non-signing DHH community. We propose how their techniques for better comprehension can be utilized to provide design directions for making videoconferencing interfaces more accessible.

REFERENCES
[1] 2020. Deaf at work during the pandemic: Making gains with flexibility, optimism, and resilience. https://www.nationaldeafcenter.org/news/deaf-work-during-pandemic-making-gains-flexibility-optimism-and-resilience
[2] Ashley Ashley. 2022. The difference between co-design and participatory design. https://uxdesign.cc/difference-between-co-design-participatory-design-df4376666816
[3] Liam J Bannon and Pelle Ehn. 2012. Design: design matters in Participatory Design. In Routledge International Handbook of Participatory Design. Routledge, 57–83.
[4] Gary R Bettger and Timothy J Pearson. 1989. Accommodating deaf and hard-of-hearing persons on public transportation systems in Massachusetts. Transportation Research Record 1209 (1989), 16–18.
[5] David Braddock, Mary C Rizzolo, Micah Thompson, and Rodney Bell. 2004. Emerging technologies and cognitive disability. Journal of Special Education Technology 19, 4 (2004), 49–56.
[6] Matthew W Dye, Peter C Hauser, and Daphne Bavelier. 2008. Visual attention in deaf children and adults. Deaf Cognition: Foundations and Outcomes (2008), 250–263.
[7] Richard S Hallam and Roslyn Corney. 2014. Conversation tactics in persons with normal hearing and hearing-impairment. International Journal of Audiology 53, 3 (2014), 174–181.
[8] Shawn Lawton Henry, C Law, and K Barnicle. 2001. Adapting the design process to address more customers in more situations. In UPA (Usability Professionals’ Association) 2001 Conference.
[9] Ilpo Koskinen, Katja Battarbee, and Tuuli Mattelmäki. 2003. Empathic Design. IT Press.
[10] Raja S. Kushalnagar and Christian Vogler. 2020. Teleconference Accessibility and Guidelines for Deaf and Hard of Hearing Users. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, Greece) (ASSETS ’20). Association for Computing Machinery, New York, NY, USA, Article 9, 6 pages. https://doi.org/10.1145/3373625.3417299
[11] Susan Lacke. [n.d.]. Do all deaf people use sign language? https://www.accessibility.com/blog/do-all-deaf-people-use-sign-language
[12] Petrus Ng and Angela Tsun. 1999. Work Experience of People Who are Deaf or Hard of Hearing in Hong Kong. 32, 3 (1999), 35–49.
[13] Kiri O’Brien. 2020. How coronavirus is making life harder for deaf workers. https://www.drutherssearch.com/how-coronavirus-is-making-life-harder-for-deaf-workers/
[14] Sushil K Oswal. 2014. Participatory Design: Barriers and Possibilities. (2014), 14–19.
[15] John Oswald. 2020. I’m deaf, and this is what happens when I get on a Zoom call. https://www.fastcompany.com/90565930/im-deaf-and-this-is-what-happens-when-i-get-on-a-zoom-call
[16] Otter.ai. 2021. Otter.ai. https://Otter.ai.
[17] Yi-Hao Peng, Ming-Wei Hsi, Paul Taele, Ting-Yu Lin, Po-En Lai, Leon Hsu, Tzu-chuan Chen, Te-Yen Wu, Yu-An Chen, Hsien-Hui Tang, and Mike Y. Chen. 2018. SpeechBubbles: Enhancing Captioning Experiences for Deaf and Hard-of-Hearing People in Group Conversations. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/3173574.3173867
[18] Filipa M Rodrigues, Ana Maria Abreu, Ingela Holmström, and Ana Mineiro. 2022. E-learning is a burden for the deaf and hard of hearing. Scientific Reports 12, 1 (2022), 9346. https://doi.org/10.1038/s41598-022-13542-1
[19] Jazz Rui Xia Ang, Ping Liu, Emma McDonnell, and Sarah Coppola. 2022. “In This Online Environment, We’re Limited”: Exploring Inclusive Video Conferencing Design for Signers. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 609, 16 pages. https://doi.org/10.1145/3491102.3517488
[20] Matthew Seita, Sooyeon Lee, Sarah Andrew, Kristen Shinohara, and Matt Huenerfauth. 2022. Remotely Co-Designing Features for Communication Applications Using Automatic Captioning with Deaf and Hearing Pairs. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 460, 13 pages. https://doi.org/10.1145/3491102.3501843
[21] Karin Slegers, Pieter Duysburgh, Helma van Rijn, and Niels Hendriks. 2012. Participatory Design for Users with Impairments Affecting Cognitive Functions and Communication Skills. In Proceedings of the 12th Participatory Design Conference: Exploratory Papers, Workshop Descriptions, Industry Cases - Volume 2 (Roskilde, Denmark) (PDC ’12). Association for Computing Machinery, New York, NY, USA, 141–142. https://doi.org/10.1145/2348144.2348190
[22] Google Slides. 2021. Google Slides. https://www.google.com/slides/.
[23] Louis Turmel. 2020. Meeting statistics - stats on costs & time spent in meetings. https://bettermeetings.expert/meeting-statistics/
[24] Nancy Tye-Murray, Suzanne C Purdy, and George G Woodworth. 1992. Reported use of communication strategies by SHHH members: Client, talker, and situational variables. Journal of Speech, Language, and Hearing Research 35, 3 (1992), 708–717.
[25] Elissa Weeden and Sharon Mason. 2020. An Initial Survey of Deaf and Hard-of-Hearing Student Use of a Composite Screen Solution Utilizing Web Conferencing Software. In Proceedings of the 21st Annual Conference on Information Technology Education (Virtual Event, USA) (SIGITE ’20). Association for Computing Machinery, New York, NY, USA, 46–49. https://doi.org/10.1145/3368308.3415459
[26] Zoom. 2021. Video conferencing. http://zoom.us/.
Co-designing a Bespoken Wearable Display for People with
Dissociative Identity Disorder
Patricia Piedade∗ (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) patricia.piedade@tecnico.ulisboa.pt
Nikoletta Matsur∗ (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) nikoletta.matsur@tecnico.ulisboa.pt
Catarina Rodrigues∗ (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) catarina.rebelo.rodrigues@tecnico.ulisboa.pt
Francisco Cecilio∗ (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) franciscocecilio@tecnico.ulisboa.pt
Afonso Marques∗ (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) afonso.m.marques@tecnico.ulisboa.pt
Rings of Saturn (Lisbon, Portugal)
Isabel Neto∗ (INESC-ID, ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) isabel.neto@tecnico.ulisboa.pt
Hugo Nicolau (ITI/LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal) hugo.nicolau@tecnico.ulisboa.pt

Figure 1: Stages of the co-design process of creating a bespoken technology with a person living with DID. Left: unfinished storyboard that the co-designer had to fill in (last three slots). Middle: paper prototype used during the experience prototyping activity and MoSCoW matrix resulting from the card sorting activity. Right: final concept, a pendant and necklace that displays a representation of the fronting personality and pronouns.
ABSTRACT
Dissociative Identity Disorder (DID) is characterized by the presence of at least two distinct identities in the same individual. This paper describes a co-design process with a person living with DID. We first aimed to uncover the main challenges experienced by the co-designer as well as design opportunities for novel technologies. We then engaged in a prototyping stage to design a wearable display (WhoDID) to facilitate in-person social interactions. The prototype aims to be worn as a necklace and enable the user to make their fronting personality visible to others, thus facilitating social encounters or sudden changes of identity. We reflect on the design features of WhoDID in the broader context of supporting people with DID. Moreover, we provide insights on co-designing with someone with multiple (sometimes conflicting) personalities regarding requirement elicitation, decision-making, prototyping, and ethics. To our knowledge, we report the first design process with a DID user within the ASSETS and CHI communities. We aim to encourage other assistive technology researchers to design with DID users.

∗ Authors contributed equally to this research.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550369

CCS CONCEPTS
• Human-centered computing → Accessibility.
ASSETS ’22, October 23–26, 2022, Athens, Greece Piedade, et al.

KEYWORDS
Dissociative Identity Disorder, Co-design, Wearable, Social Encounters

ACM Reference Format:
Patricia Piedade, Nikoletta Matsur, Catarina Rodrigues, Francisco Cecilio, Afonso Marques, Rings of Saturn, Isabel Neto, and Hugo Nicolau. 2022. Co-designing a Bespoken Wearable Display for People with Dissociative Identity Disorder. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550369

1 INTRODUCTION
Dissociative Identity Disorder (DID), formerly known as multiple personality disorder, is a clinical condition involving a "disruption of identity characterized by two or more distinct personality states" [1]. It consists of a sudden and temporary discontinuity in the sense of agency and self [1]. DID is a complex and controversial diagnosis [2, 6], often associated with childhood trauma and abuse [4], that allows the host, who may be unable to deal with overwhelming stress, to pass control of their body (i.e., switch) to an alternative identity, the alter. The term ‘alter’ is often used to describe other identities within the body. Alters can differ "in affect, behavior, cognition, consciousness, memory, perception, and sensory-motor functioning" [1].

People living with DID experience daily challenges such as stigma, fear, and maintaining relationships [9]. Although previous work has proposed adopting a lens from disability studies and incorporating the expertise of assistive technology researchers to support people with psychosocial disabilities [7], to the best of our knowledge, there is no previous research exploring the role of technology in supporting people living with DID.

In this paper, we describe the process of designing a bespoke technology to enhance social interactions by engaging in a co-creation process with a person living with DID. We contribute 1) a report on the design process, which offers preliminary practical and ethical considerations for designing with people living with DID; and 2) the design of a wearable display (WhoDID) to be used by the sixth author of this paper.

2 THE DESIGN OF WHODID
This section describes two 2-hour in-person co-design sessions with a person living with DID. We took inspiration from the Double Diamond design process by first aiming to understand and define the challenge and then developing and testing different solutions. The co-designer is a five-personality DID system (i.e., a group of host and alters). The host, Onyx, is 20 years old and has been diagnosed with DID for a year. They share their body with Ana, a 20-year-old protector (an alter who handles stressful situations); Oliver, a 17-year-old caretaker (an alter responsible for assuring the physical and mental needs of the body); Thalia, a 26-year-old caretaker; and Charlotte, a 7-year-old alter. This system lives in relative harmony, communicating and making decisions collectively.

2.1 Exploring the Problem Space
We began our co-design process by investigating the daily struggles of living with DID. We conducted a semi-structured interview in a relaxed environment to create a safe space for learning and getting to know our co-designer. For added comfort, only two team members were present: an interviewer and a notetaker.

We inquired about demographics, DID, and technology. We started by explaining the project’s goal and then used directed storytelling [3] to harness detailed descriptions of our co-designer’s experiences. We then asked follow-up open-ended questions.

This session enlightened us on many of the challenges faced by our co-designer. Social encounters were one of the main emergent discussion topics. The co-designer expressed awkwardness and embarrassment in communicating which alter is fronting (i.e., in charge of the body). Moreover, having people misidentify alters was a source of frustration and misunderstandings. We thereby focused on facilitating in-person communication by externalizing who in the system is fronting. Our co-designer showed great interest in using a wearable device to indicate the fronting alter.

2.2 Exploring the Solution Space
Having a clear problem statement, we conducted a co-design workshop to craft the details of the wearable. We documented this session through video recordings, photos, and notetaking.

We crafted two unfinished storyboards representing challenging in-person communication scenarios referenced by our co-designer (Figure 1). The first storyboard illustrated the co-designer hanging out with a friend and being disrupted by a switch. The second storyboard portrayed meeting a friend on the street when the host was not fronting. We discussed how we might facilitate these situations using technology and asked the co-designer to finish filling in both storyboards accordingly.

We also conducted a card-sorting activity using the MoSCoW prioritization [8] technique to understand the co-designer’s views on collectively brainstormed features catering to their needs, determining which should be included in (or excluded from) the prototype. The cards represented the features, and the co-designer categorized them into one of the four MoSCoW priority groups: Must have, Should have, Could have, and Will not have. The resulting MoSCoW matrix informed the prototyping of a wearable necklace display and its companion mobile app, WhoDID.

To simulate the use of WhoDID, we crafted a low-fidelity prototype of our wearable using cardboard and post-its (Figure 1). We recruited a recent friend of the host, unacquainted with the whole system, to conduct an experience prototyping activity [5]. We simulated a social interaction between the co-designer and their friend, during which a switch occurred. When the co-designer touched the "screen", a Wizard would switch the post-it to display the now-fronting alter. During this session, an actual switch occurred. This helped solidify the robustness of the WhoDID concept, as we avoided mentioning the switch, yet the friend was aware with whom they were interacting.

Finally, to complement the wearable, we prototyped a high-fidelity companion app using Figma, which enabled the setup of alters’ profiles and control of the appearance of the wearable display. We demoed the app and discussed its features with the co-designer.

3 FINDINGS AND DISCUSSION
In this section, we highlight the main findings regarding the design of WhoDID and lessons learned about designing with someone living with DID.

3.1 WhoDID Design
Supporting In-person Social Interactions. One of the co-designer’s main challenges was their difficulty conveying the fronting alter after a switch without awkwardness and unpleasant feelings. While conveying a switch through digital technologies is relatively straightforward using existing features of communication platforms (e.g., social media profiles, the PluralKit Discord bot), there is no similar tool for in-person communication. Thus, the co-designer explicitly focused on this challenge and on finding new ways of enhancing in-person communication by easing the externalization of the alter in charge.

Designing a Wearable Display. We explored several alternative wearable form factors (e.g., bracelet, watch, rings) for displaying the fronting alter. The co-designer strongly preferred a pendant on a necklace due to its flexibility to be highly visible or hidden at will. During the experience prototyping activity, we observed that it was natural for the co-designer to perform swipe gestures on the display due to their familiarity and robustness to accidental taps.

Privacy Concerns. The main safety concern emerging during the design process was related to using biometric sensing to detect a switch automatically. While this feature could reduce interaction with the device and seamlessly transition between alters, the co-designer reached a consensus that each alter should have the freedom to choose when to identify themselves. This identification process should not attract attention from onlookers; flashing lights or sound cues should not be used.

Companion App Features. Being able to customize the visual representation of each alter is a key feature. WhoDID supports the personalization of background color and image. Interestingly, the co-designer preferred to display the alter’s pronouns over their name as a balance to provide enough information for a respectful social interaction without disrupting the norms of meeting someone for the first time (i.e., asking for their name). A central concern was the possibility of a given alter making destructive changes to another alter’s profile (e.g., deleting it). As a risk mitigation measure, WhoDID supports an edit window of 15 days, which allows the user to undo any action. Although this feature does not prevent destructive behaviors, it is a coping mechanism. We believe these challenges are fertile ground for usable privacy/security research.

Lastly, to tackle the awkwardness when a friend of the system first encounters an unknown alter by chance, WhoDID features proximity notifications, allowing the system to choose which trusted individuals should be informed of who is fronting when facing a possible chance encounter.

3.2 Lessons Learned from the Design Process
Wait for Consensus. As the system contains multiple alters within the body, our co-designer expressed that it is vital to allow them enough time to reach a consensus when decision-making, as different alters may have different opinions. Thus, design processes must account for time (possibly even over several days) and allow for "internal" discussions.

Consider Diversity. Throughout the design process, our co-designer stressed the diversity of DID systems, from the number of alters to the ease of communication among alters. Though we focused on a solution tailored to our co-designer, the final design is highly customizable in terms of the number of alters and displays. For instance, from our co-design process until the time of writing this paper, a new alter appeared within the co-designer’s system. When discussing initial research with our co-designer, even in simple aspects such as DID-specific terminology, we found that our co-designer had their own unique perspective and preferences. The diversity in the experiences of people with DID and in the information relating to this diagnosis calls for double-checking all researched information with the specific co-designer.

Ponder Ethics. Triggering switches is one of the main ethical concerns when designing with individuals with DID. Switches produce both physical and mental changes, possibly even pain. Thus, there are critical implications for designers trained in participatory practices of including stakeholders in the design process. Participatory activities that require decision-making, discussion, and consensus (e.g., prioritizing features) can be highly problematic as they can trigger several switches. Moreover, testing prototypes with the various alters may require a longer time frame and the use of asynchronous/remote methods to leverage naturally occurring switches rather than trigger them.

4 CONCLUSION AND FUTURE WORK
We have presented the co-design of a bespoken wearable display to support in-person social interactions for people with DID. We highlight findings related to the design of WhoDID and the design process itself. Results show that, to co-design with people with DID, it is not feasible to make decisions based on the opinions of one alter; we should aim for consensus among the different alters and double-check whether all voices are heard. We also reflect on the ethical concerns of this activity, as switches impact individual comfort and well-being. Future work will develop a functional prototype and evaluate its effectiveness in supporting our co-designer’s needs in the wild.

ACKNOWLEDGMENTS
This work was supported through FCT’s projects UIDB/50009/2020 and SFRH/BD/06452/2021.

REFERENCES
[1] American Psychiatric Association. 2013. Diagnostic and Statistical Manual of Mental Disorders: DSM-5. Vol. 5. American Psychiatric Association, Washington, DC.
[2] David Blihar, Elliott Delgado, Marina Buryak, Michael Gonzalez, and Randall Waechter. 2020. A systematic review of the neuroanatomy of dissociative identity disorder. European Journal of Trauma & Dissociation 4 (9 2020), 100148. Issue 3. https://doi.org/10.1016/j.ejtd.2020.100148
[3] Shelley Evenson. 2006. Directed storytelling: Interpreting experience for design. Design Studies: Theory and Research in Graphic Design (2006), 231–240.
[4] Eva Irle, Claudia Lange, Godehard Weniger, and Ulrich Sachsse. 2007. Size abnormalities of the superior parietal cortices are related to dissociation in borderline personality disorder. Psychiatry Research: Neuroimaging 156 (11 2007), 139–149. Issue 2. https://doi.org/10.1016/j.pscychresns.2007.01.007
[5] Ken Keane and Valentina Nisi. 2014. Experience Prototyping. 224–237. https://doi.org/10.4018/978-1-4666-4623-0.ch011

[6] M. M. McAllister. 2000. Dissociative identity disorder: a literature review. Journal of Psychiatric and Mental Health Nursing 7 (2 2000), 25–33. Issue 1. https://doi.org/10.1046/j.1365-2850.2000.00259.x
[7] Kathryn E Ringland, Jennifer Nicholas, Rachel Kornfield, Emily G Lattie, David C Mohr, and Madhu Reddy. 2019. Understanding Mental Ill-Health as Psychosocial Disability: Implications for Assistive Technology. The 21st International ACM SIGACCESS Conference on Computers and Accessibility, 156–170. https://doi.org/10.1145/3308561.3353785
[8] Kelly Waters. 2009. Prioritization using MoSCoW. Agile Planning 12 (2009), 31.
[9] Melissa Zeligman, Jennifer H. Greene, Gulnora Hundley, Joseph M. Graham, Sarah Spann, Erin Bickley, and Zachary Bloom. 2017. Lived Experiences of Men With Dissociative Identity Disorder. Adultspan Journal 16 (10 2017), 65–79. Issue 2. https://doi.org/10.1002/adsp.12036
Co-designing the automation of Theatre Touch Tours
Alexandra Tzanidou
a.tzanidou2@newcastle.ac.uk
Open Lab, Newcastle University
Newcastle upon Tyne, UK

Sami Abosaleh
a.h.s.abosaleh2@newcastle.ac.uk
Open Lab, Newcastle University
Newcastle upon Tyne, UK

Vasilis Vlachokyriakos
Vasilis.Vlachokyriakos1@newcastle.ac.uk
Open Lab, Newcastle University
Newcastle upon Tyne, UK

Figure 1: Co-design WS - Phase 1: Creating Tangible Objects for the performance


ABSTRACT
This paper presents the five-step co-design process of a prototype system for the automation of the Touch Tours (TTs) service in theatre performances.

By grounding our research on the Social Model of Disability, we motivate our initial design by examining the outputs of our collaboration with the first inclusive professional theatre ensemble in Greece. We present our reflections on the discourse created during the co-design process, and we contribute some initial findings of the process.

KEYWORDS
Co-design, Inclusion, Accessibility, Social Model of Disability, Touch Tours

ACM Reference Format:
Alexandra Tzanidou, Sami Abosaleh, and Vasilis Vlachokyriakos. 2022. Co-designing the automation of Theatre Touch Tours. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550370
CCS CONCEPTS
• Human-centered computing → Accessibility technologies; • General and reference → Design; • Applied computing → Performing arts;

1 INTRODUCTION
Individuals who are experiencing disabilities have not enjoyed equal access to traditional theatre. Modern theatre has challenged this, and is trying to accommodate access with equity to the theatre experience. As a part of an ongoing research project that examines inclusion and accessibility in theatre based on the social model of disability [17], this paper focuses, specifically, on the service of Touch/Sensory Tours.

Touch/Sensory Tours (TTs) services are offered by theatre venues that offer the service of audio description during the performances [16], to create stimuli that substitute for the visual ones for the blind or visually impaired (B/VI) members of the audience, through the sense of touch [19]. Although the value of the TTs is recognised
[19, 20], the availability of the service is not common according to our research, mostly because of organisational and timing reasons that we will briefly mention.

2 RELATED WORK
This work approaches accessible design based on the social model of disability (SMoD) as described by Oliver [17]. Oliver's model states that 'impairment' is dysfunction, while 'disability' is the limitation created by the dysfunction because society fails to provide the infrastructure needed to include people with dysfunctions. According to this distinction, the problem is created by society, which emphasises the weaknesses of people with dysfunctions [16, 17]. In this work, inclusion is defined based on the disabled rights movement's slogan 'Nothing about us, without us', which requests the inclusion of disabled people in all the steps of the decision-making process [10]. This slogan, in our case, is interpreted as the inclusion of disabled people in all the steps of the design process.

Inclusive theatre is defined as the form of artistic expression that includes artists of all colours, genders, races, religions, nationalities and ages who are experiencing disability or not, while having as its main ideological approach the equal participation and integration of all people without exclusions in art and consequently in society [6, 21]. The inclusive theatre approach offers inclusion in participation and, furthermore, aims to enhance the accessibility offered on and off the stage [2]. Theatre accessibility includes all the factors connected to physical and content accessibility [3].

2.1 Disability - The Greek Context
Accessibility is a painful issue in Greece, due to societal barriers. The lack of funding and relevant legislation prevents inclusion with equity and independent living of people experiencing disability [5, 14, 18]. The Hellenic Parliament voted for the adoption of the United Nations Convention on disabled rights in September 2017. This vote took place just a few months after the 'able-bodied' requirement for access to tertiary-level drama education was removed by the Greek Ministry of Culture in February 2017 [1].

2.2 Our partners THEAMA & ISON
To explore our research questions, based on desk research for inclusive theatre groups supporting the social model of disability in Greece, we chose to work with the THEAMA ensemble, founded in 2010 by Vasilis Oikonomou¹. THEAMA's target was to fill the void that existed in the sector of professional training and the career-oriented approach for people experiencing disability who are actively involved in the performing arts. ISON² is the inclusive education workshop founded by THEAMA and the inclusive dance group EXIS.

3 CO-DESIGN
In this project, we employ a constructivist approach towards the co-designing of systems to support inclusion in theatre. The existence and use of TTs services in theatre were unknown to the researchers at the beginning of this project (other than using touch to enhance accessibility in museums or art installations [4, 11, 13]), due to the lack of visibility of such work in Greece, where this research has taken place. While the co-design process is still ongoing, this paper presents the initial stages of a co-design process with a community of experts in inclusive theatre.

¹https://theamatheater.gr/en/intro/
²https://ison.com.gr/

3.1 Ethnographic Research
The idea to automate TTs services was one of the outcomes of the 18-month ethnographic research engagement of the first author with the THEAMA ensemble. The scope of those engagements was focused on the observation and the understanding of the inclusive methodologies and patterns that the group is following, going a step outside of the traditional HCI comfort zone.

3.2 Focus Group (FG) - Understanding co-designers' needs
Our initial step was to share our findings and ideas about the TTs' automation with the director of the group, who saw our idea as an opportunity to enhance the accessibility services already provided by the THEAMA ensemble. Thus, we initiated this process by discussing the topic with the members of THEAMA. They shared with us their thoughts, perspectives and ideas about introducing such a service in Greece. The most important factors of this focus group were that 1) all the participants (five professional actors – referred to as FGP1–5) knew and had already experienced theatre TTs at least in one performance as spectators; and 2) they were not informed about our idea of incorporating the TTs' service in their work, which offered us the opportunity to collect their very first reactions and thoughts.

Figure 2: Pre-workshop activity package

3.3 Pre-workshop Activity (PWA)
For the PWA, we prepared ten paper bags with materials connected to the senses and experiences discussed by the participants of the FG. Those materials allowed the participants to create artefacts for the workshop. The bags were distributed to actors of THEAMA and students of ISON who had an interest in the project. A flyer explaining the scope and the description of the activity, along with our contact information, was included. On the outer side of the bag, we placed a sticker with a QR code, which contained a link to an online version of the flyer, to be leveraged by screen reader users. The phrase 'Scan here' was written around the QR code in English, Greek, English Braille and Greek Braille, creating in this way a 'Braille frame' around the QR code.
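The accessible 'Scan here' labelling can be illustrated with a short sketch. This is our own hypothetical example, not the authors' artefact: it transcribes a lowercase English label into Grade 1 (uncontracted) English Braille using the Unicode Braille Patterns block, where each cell is U+2800 plus a bitmask of the raised dots. The dot table covers only the letters needed for this demo, and the Greek Braille side of the frame would need its own, different letter table.

```python
# Hypothetical illustration (not from the paper): render an uncontracted
# Grade 1 English Braille label such as 'scan here'. A Unicode Braille
# cell is U+2800 plus a bitmask in which bit (d - 1) stands for raised dot d.
BRAILLE_DOTS = {
    "a": (1,), "c": (1, 4), "e": (1, 5), "h": (1, 2, 5),
    "n": (1, 3, 4, 5), "r": (1, 2, 3, 5), "s": (2, 3, 4),
    " ": (),  # blank Braille cell U+2800
}

def to_braille(text):
    """Transcribe lowercase text letter-by-letter into Unicode Braille cells."""
    cells = []
    for ch in text.lower():
        dots = BRAILLE_DOTS[ch]  # raises KeyError for letters outside this demo table
        mask = sum(1 << (d - 1) for d in dots)
        cells.append(chr(0x2800 + mask))
    return "".join(cells)

print(to_braille("scan here"))  # ⠎⠉⠁⠝⠀⠓⠑⠗⠑
```

A sketch like this only produces the visual cells for the artwork; the tactile usefulness of the actual frame of course depends on how the sticker is produced.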
Figure 3: The first prototype developed for TTs automation.

3.4 TTs Research Workshop - Research through making together
The activity was aimed at the people who received the PWA bag, hoping to gather the perspectives of the co-designers. This was a double-scoped workshop, aiming to collect data regarding the important elements of the TTs from the point of view of the artists and the different ways of communicating them, but also to collectively observe the benefits we could get out of the procedure and the experience [7]. In the workshop, nine people participated (referred to as WP1–9); two of them were B/VI. We first held a discussion on our previous work and findings on TTs and the participants presented and communicated their creations. Afterwards, we reflected on the previous parts of the workshops and the way that the sense of touch enriched our experience; while in the final part, the participants were expected to experience and reflect on the prototype.
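Section 3.5 below describes how touched objects, wired to a Raspberry Pi via conductive ink, trigger a corresponding piece of description text. The sketch below is a hypothetical reconstruction of that mapping logic, not the authors' code: the pin numbers and description texts are illustrative, and the `speak` callback stands in for the real output channel (Bluetooth speaker, smartphone, or console). On the real device each pin would be read from the Pi's GPIO; here the hardware layer is abstracted away so the mapping stands alone.

```python
# Hypothetical sketch (not the authors' code): map touch-sensed elements
# to the description texts read out by the prototype.

ELEMENTS = {
    # pin: (element label, description text) -- all values illustrative
    17: ("Einstein bust", "A 3D-printed bust standing in for a character."),
    27: ("mannequin head", "A barber's mannequin head used for costume details."),
    22: ("scarf", "A scarf from one of the costumes."),
}

def description_for(pin):
    """Return the description bound to a touched element, or None if unknown."""
    entry = ELEMENTS.get(pin)
    return entry[1] if entry else None

def on_touch(pin, speak):
    """Dispatch a touch event to an output channel and return the spoken text."""
    text = description_for(pin)
    if text is not None:
        speak(text)  # e.g. hand the text to a TTS engine or Bluetooth speaker
    return text
```

For example, `on_touch(17, print)` would print the bust's description; the paper notes that the same text was also reachable in print and via a QR code on a smartphone.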

3.5 Prototype System
The prototype system was developed based on data gathered during the FG, along with prior related work [11]. Thus, the system combined different elements to provide a diversity of stimuli to the participants. The prototype system consisted of a 3D-printed bust of Einstein, a barber's mannequin head, a scarf, two candles of different fragrances, and a 3D diorama of the hypothetical stage with small-scale 3D-printed objects in it. All the aforementioned elements were connected to a Raspberry Pi with the help of conductive ink. Each of the elements was connected to a specific piece of text that could be read through the printed text, through a smartphone using the QR code, or through a Bluetooth speaker when the corresponding element was touched.

Figure 4: Performing and experiencing the prototype during the final part of the workshop

4 ANALYSIS
From the engagements mentioned above, we collected qualitative data in the form of audio recordings from interviews and workshops that we later transcribed, in combination with ethnographic notes. To analyse such a corpus of data, we used Braun and Clarke's thematic analysis [8]; we initially coded the data descriptively (and inductively) [9] and we later created categories of codes and formed initial themes reflectively, with respect to the social model of disability (and thus deductively).

4.1 Reasons for Incorporating TTs and Prerequisites
Our initial aim was to understand if the automation of such a service, usually conducted by human professionals, would make sense. During the FG, [P1] and [P3], who also have directorial duties within the team, mentioned instances of deaf, hard of hearing and B/VI people contacting them in the past, and asking them for short descriptions of the characters or other materials that could help them have a better theatre experience. We also learned about the importance of bringing the spectator into a specific experience, and the connection that touch and the incorporation of other senses have with experience creation. As [P2] expressed, "If the actor wears a winter coat and the spectator touches a piece of the coat, this automatically means [for them] winter, cold ...". All the participants
shared the same thoughts on the introduction of TTs, and the significant added value that TTs would bring to the accessibility services currently offered. However, three of them were particularly worried about letting other people touch them, their clothes or their faces for various reasons, including the fear of COVID [P4], or the fact that some of the actors might feel upset due to such close contact with strangers [P2, P5]. Finally, all the actors agreed that the last hour before the show is important for them to calm down and concentrate, while it is also the time in which the TTs usually take place.

4.2 False Understandings – No need for accuracy
Throughout the workshop, the two groups chose to name themselves based on performances of their preference, Troades and Macbeth. Team Troades chose to create objects that the actors held while performing, and parts of their costumes, and to provide semantic meanings for their creations. Team Macbeth created miniature dolls that they held while they were performing together. When asked about their choices and the procedure they followed, the actors told us that, in art, fidelity or accuracy are not important in the creation of objects. Thus, we conclude that, in theatre, the right approach is the utilisation of the accessibility services experts for the transfer of semiotics and meanings, triggering the imagination rather than the image. As [WP1] notes: [we should] "use stimuli to let imagination free, the description shouldn't reveal the plot, we watch the same play, we don't see the same thing, the details frame the imagination."

4.3 Experience Enhancement, Activism
Based on the notion of creating stimuli rather than providing details, participants noticed that our representations were based on realistic details and expressed that they might not choose this approach, now that they knew that they could prepare objects by themselves while rehearsing. As [WP3] said, a service like this would benefit all the people involved in a performance, as "Some people have this special ability to connect tangible objects with their own personal experiences and create new meanings out of them".

While experiencing the prototypes, there was a lot of discourse around them and their connection with the activist approach of the group. As they explained, a big priority for them is to make a statement by incorporating this accessibility service. Their approach to using accessibility services is twofold: for practical reasons (theatre access with equity), as well as to bring visibility to an issue, as "Accessibility is not a thing of our everyday life, thus activism is a need right now" [WP1].

5 FUTURE WORK
In the next steps of this research, we will focus on the understanding of stimuli creation through technology, respecting the restrictions that can be created by the visual culture [11]. We plan to do so by following a reverse commissioning process (examples of such a process are also talked about in [12, 15]). By reverse commissioning we refer to the community (in this case the inclusive theatre group) being put in the position of 'commissioning' research and design/development work from the university. This reverses the conventional power relations between academia and community (or 'research participants'), by giving control to the community to tell us what they need to know or make. In that sense, the knowledge we collected from ethnographic and participatory work with inclusion experts (the community) will be used to start conversations with HCI and design experts about what can be made, what the limits of current technology are and how we can creatively configure it. Then, these conversations and design ideas will be brought back to the community to adjust and configure. As such, we, as the researchers involved in both community and academia, have the role of mediating and facilitating inclusive design across expertise (inclusion and technology design) and space (remote collaboration across the UK and Greece) [10].

ACKNOWLEDGMENTS
This research was funded by the EPSRC CDT in Digital Civics (EP/L016176/1). The authors thank our collaborators THEAMA and ISON for the time and expertise they devoted to this research study.

REFERENCES
[1] 2019. https://www.disabilityartsinternational.org/resources/greece-country-profile/
[2] Talleri Adkins Mcrae and Mickey Rowe. 2019. The Future of Theatre is Accessible. https://howlround.com/series/future-theatre-accessible
[3] Austin Allison. 2018. Accessibility in the Theatre. Honors Thesis. Ouachita Baptist University.
[4] Vassilios S. Argyropoulos and Charikleia Kanari. 2015. Re-imagining the museum through "touch": Reflections of individuals with visual disability on their experience of museum-visiting in Greece. Alter 9, 2 (2015), 130–143. https://doi.org/10.1016/j.alter.2014.12.005
[5] Stathis Balias and Pandelis Kiprianos. 2014. Disability in Greece: Social Perception and Educational Policies. The Review of Disability Studies: An International Journal 1 (2014).
[6] Stephanie Barton Farcas. 2017. Disability and Theatre (1 ed.). Routledge.
[7] Cynthia L. Bennett, Burren Peil, and Daniela K. Rosner. 2019. Biographical Prototypes: Reimagining Recognition and Disability in Design. In Proceedings of the 2019 on Designing Interactive Systems Conference (San Diego, CA, USA) (DIS '19). Association for Computing Machinery, New York, NY, USA, 35–47. https://doi.org/10.1145/3322276.3322376
[8] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101. https://doi.org/10.1191/1478088706qp063oa
[9] Virginia Braun and Victoria Clarke. 2021. Can I use TA? Should I use TA? Should I not use TA? Comparing reflexive thematic analysis and other pattern-based qualitative analytic approaches. Counselling and Psychotherapy Research 21, 1 (2021), 37–47. https://doi.org/10.1002/capr.12360
[10] James I. Charlton. 1998. Nothing About Us Without Us: Disability Oppression and Empowerment. University of California Press. https://doi.org/10.1525/9780520925441
[11] Han-Xing Chen and Wen Huei Chou. 2020. Exploratory Design Research for the Blind and Visually Impaired Visitor in Exhibitions. The Design Journal 23, 3 (2020), 395–417. https://doi.org/10.1080/14606925.2020.1744257
[12] Andrew Garbett, Rob Comber, Edward Jenkins, and Patrick Olivier. 2016. App Movement: A Platform for Community Commissioning of Mobile Applications. Association for Computing Machinery, New York, NY, USA, 26–37. https://doi.org/10.1145/2858036.2858094
[13] Gunnar Jansson, Massimo Bergamasco, and Antonio Frisoli. 2003. A new option for the visually impaired to experience 3D art at museums: manual exploration of virtual copies. Visual Impairment Research 5, 1 (2003), 1–12. https://doi.org/10.1076/vimr.5.1.1.15973
[14] Lefkothea Kartasidou, Ioanna Dimitriadou, Elisavet Pavlidou, and Panagiotis Varsamis. 2013. Independent living and interpersonal relations of individuals with intellectual disability: The perspective of support staff in Greece. International Journal of Learner Diversity and Identities 19, 1 (2013), 59–73.
[15] Kellie Morrissey, Andrew Garbett, Peter Wright, Patrick Olivier, Edward Ian Jenkins, and Katie Brittain. 2017. Care and Connect: Exploring Dementia-Friendliness Through an Online Community Commissioning Platform. In Proceedings of the
2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI '17). Association for Computing Machinery, New York, NY, USA, 2163–2174. https://doi.org/10.1145/3025453.3025732
[16] Mala D Naraine, Margot R Whitfield, and Deborah I Fels. 2018. Who's devising your theatre experience? A director's approach to inclusive theatre for blind and low vision theatregoers. Visual Communication 17, 1 (2018), 113–133. https://doi.org/10.1177/1470357217727678
[17] Michael Oliver. 1990. Politics of disablement. Macmillan International Higher Education.
[18] Maria Poli. 2020. Greece, Tourism and Disability. In Strategic Innovative Marketing and Tourism, Androniki Kavoura, Efstathios Kefallonitis, and Prokopios Theodoridis (Eds.). Springer International Publishing, Cham, 667–675.
[19] J.P. Udo and D.I. Fels. 2010. Enhancing the entertainment experience of blind and low-vision theatregoers through touch tours. Disability & Society 25, 2 (2010), 231–240. https://doi.org/10.1080/09687590903537497
[20] Margot Whitfield and Deborah I. Fels. 2013. Inclusive Design, Audio Description and Diversity of Theatre Experiences. The Design Journal 16, 2 (2013), 219–238. https://doi.org/10.2752/175630613X13584367984983
[21] Roger Wooster. 2009. Creative inclusion in community theatre: a journey with Odyssey Theatre. Research in Drama Education: The Journal of Applied Theatre and Performance 14, 1 (2009), 79–90. https://doi.org/10.1080/13569780802655814
Creating Personas for Signing User Populations: An Ability-Based Approach to User Modelling in HCI

Amelie Nolte
amelie.nolte93@gmail.com
nolte@imis.uni-luebeck.de
Ergosign GmbH, Hamburg, Germany
Institute for Multimedia and Interactive Systems, University of Luebeck, Luebeck, Germany

Karolin Lueneburg
Ergosign GmbH, Berlin, Germany
Edinburgh College of Art, The University of Edinburgh, Edinburgh, United Kingdom

Dieter Wallach
dieter.wallach@ergosign.de
Ergosign GmbH, Saarbruecken, Germany

Nicole Jochems
jochems@imis.uni-luebeck.de
Institute for Multimedia and Interactive Systems, University of Luebeck, Luebeck, Germany

ABSTRACT
Personas allow designers to build empathy towards users, avoiding reliance on stereotypes or assumptions. When it comes to creating personas for people with disabilities, a common procedure is to integrate impairments as dedicated characteristics. However, when considering Deaf and Hard of Hearing (DHH) individuals as target group, we argue that such an approach is incompatible with the latter's own ability-based identification concepts. Thus, we propose a modification of personas for DHH users by defining them in terms of ability ranges and driving or keeping factors within specific contexts. Starting with assumption personas based on previous research, we conducted focus groups with DHH individuals from which we derived the data and perspectives that formed our modification. With this approach, we want to promote an adequate representation of DHH users among designers while also demonstrating an application of Ability-Based Design to HCI methods.

CCS CONCEPTS
• Human-centered computing → User models.

KEYWORDS
Personas, Ability-Based Design, Ability-based user modelling, Deaf, Hard of Hearing, Sign language users

ACM Reference Format:
Amelie Nolte, Karolin Lueneburg, Dieter Wallach, and Nicole Jochems. 2022. Creating Personas for Signing User Populations: An Ability-Based Approach to User Modelling in HCI. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3517428.3550364

1 INTRODUCTION
Personas as a method within HCI design [7, 8] offer a way of addressing and reflecting upon user needs, as the method aims to "create empathic descriptions of the user" [39, p. 5]. As such, personas are intended to prevent designers from having to rely on their own assumptions or stereotypes and, instead, see things from the user's perspective [17, 39]. A common procedure for applying this approach to people with disabilities is to integrate medical impairments as a dedicated characteristic. However, considering Deaf and Hard of Hearing (DHH) sign language users, research shows that they do not understand themselves to be impaired or disabled, but instead view their Deafness as cultural identification [14, 18, 46]. Therefore, in order to achieve adequate representations of this target group, personas should reflect their ability-based perspective—a requirement, as we argue, that cannot be met by integrating impairments but demands a shift in the creation of personas towards a more explicit and dynamic modelling of abilities, contexts, and influential factors. Thus, this paper proposes a modification of the persona method by defining users in terms of ability ranges and driving or keeping factors within specific contexts. The result is a persona design for DHH sign language users that pursues an ability-based perspective, as put forth by Wobbrock et al. [49, 50]. Although initially defined for the context of travelling, we believe that our approach can be applied and adapted to other contexts or target groups as needed. The contribution of this paper lies, therefore, not only in a novel perspective on the design of personas for DHH users but in the creation of ability-based personas altogether.

2 RELATED WORK
Personas are proposed by previous research as one solution to address the needs and requirements of either diverse disability groups [16, 19–21, 27, 37, 44, 47] or specific groups, e.g., elderly users [5, 31, 41], dementia patients [24], or DHH individuals [22].

A common approach among these studies in regard to creating personas for specific, accessibility-concerned target users is to integrate medical impairments into the otherwise unchanged persona structure, i.e., by modelling aspects such as degree of vision loss, diagnosed illnesses or hearing status. These aspects and, thus, the resulting persona designs are captured in static, determined descriptions—a procedure that is criticised by Edwards and colleagues [12, 13] as being insufficiently representative for any type of user group.

The relevance of this criticism in regard to DHH individuals is highlighted by a variety of studies [1, 2, 10, 23, 29, 30, 34, 35, 45]. These address cultural and ethical aspects of working with or designing for sign language users, emphasising repeatedly the dilemma within DHH communities of being treated as disabled persons, but not identifying as such, i.e., receiving rigid, inflexible attributions that do not correspond to their own ability-based, diverse self-concept. Recent approaches apprehend such gaps between designers and users by demanding further consideration of the concepts of empathy [3] and identity [33] within personas, but fail to provide solutions that model the complexity of ability-based diversity among sign language users. Microsoft [36] has also addressed the demand for diversity in personas by developing the Persona Spectrum method, which offers a way of overcoming designers' ability bias by considering situational, temporary, and permanent impairments. However, it treats the absence of certain abilities as a deficit, i.e., as something caused by or linked to the individual instead of its surrounding environments. Therefore, our approach seeks to challenge this perspective on personas by building on the Persona Spectrum as well as on the discussed findings from previous research on DHH identification concepts and diversity, and applying an ABD perspective [49, 50] to the creation of personas for DHH sign language users.

3 OUR APPROACH
To establish a first understanding of the target group within the team of our research project AVASAG¹, we designed a common assumption persona [43] based on the experiences and perspectives offered by the Deaf sign language experts within the team while also considering the results of previous studies on the evaluation of sign language avatars with DHH users [11, 25, 26, 28]. This assumption persona contained a list of characteristics that were grouped into tasks, goals, communication preferences, impairments, pain points and social needs. When comparing our modelled assumptions with the statements and discussions we witnessed in the sign-language-accessible focus groups we conducted with DHH individuals (n=10)², we realised that integrating the category of impairment descriptions did not comply with the target group's self-concept, seeing that it attributed them with a "disabled" label which they are determined to avoid and reject [29, 30, 34]. Thus, we modified our assumption persona according to the perspectives and insights gained throughout the focus group sessions and created an ability-based persona for DHH sign language users in travelling contexts, as depicted in Figure 1 (see Appendix A). This persona consists of three components which are integrated into one coherent template³:

(1) Persona type: The persona is characterised through a descriptive label, an image, and a quote of something the persona would say about themselves. We deliberately discarded the idea of specifying demographic aspects (e.g., name, age, profession, marital status) to minimise the risk of stereotypical assumptions connected with specific aspects. In addition, we chose a sketch as an image, seeing that sketches have proven to evoke fewer stereotypes than photographs [42]. In this way we formed the Enthusiastic Traveller persona type, for whom travelling is a substantial and important part of life and who is concerned mostly with being independent and unrestricted in their travelling activities. We chose this type based on our focus groups, which indicated a high passion for travelling throughout all participants.

(2) Environmental drivers and keepers: The middle part of the persona contains information on relevant contexts or environments, each complemented with factors (external or internal) which either motivate (i.e., drivers) or block (i.e., keepers) the persona. When considering technology design for travel information and services, both the travel context and social environments should be analysed in order to understand where a design can offer support or compensation. For the Enthusiastic Traveller, we identified independence and adequate technology design as drivers when travelling, with lack of information and untrained staff as respective keepers. Within social environments the persona is motivated through personal relations in- and outside of DHH communities, while being frustrated by hearing people's lack of consideration, their focus on hearing loss or their own fear of appearing helpless in front of them.

(3) Ability ranges: The third component focuses on contextually relevant abilities of the users and how these are affected by external factors. For this, we defined per ability a set of enabling and disabling factors as well as a range indicator, positioning the individual ability spectra of the persona. Relevant abilities of the Enthusiastic Traveller thus contain 1) Sign language abilities, which can be emphasised by skilled communication partners and good lighting conditions, while being inhibited by being restricted to fingerspelling or one-handed signing; 2) Spoken language abilities, enabled by lip-reading, but disabled through a general dismissal of using spoken languages instead of sign language; 3) Written language abilities, which are enhanced by short and well-structured texts, but are reduced through unknown words or their own fear of making mistakes; and 4) Technology proficiency, enhanced through well-structured content or video calls as an available channel of communication, but weakened by visually cluttered interfaces.

Depending on the defined persona type, the types of factors and their influence on the individual ability ranges can vary.

¹The research project AVASAG [4, 40, 48] focuses on the development of an algorithm that automatically translates German text into German Sign Language, with the translation being carried out by an avatar; the initial project scope focused on the topic of travel information and services.
²We conducted three focus group sessions, each with three to four native signers, in order to identify experienced barriers when travelling as a Deaf person and also when using different technologies. A more detailed documentation of our conducted focus groups and their results can be found in [48].
³We chose to deviate from the classical narrative format [7, 39] and instead inserted bullet points containing either full or partial sentences in order to focus on the essentials while keeping the overall artifact as clear as possible—an approach that has also been suggested by Constantine [6], who criticised the narrative of personas for their risk of potentially distracting details, confusing the reader on "what matters and what does not" [6, p. 505].
Creating Personas for Signing User Populations ASSETS ’22, October 23–26, 2022, Athens, Greece

parallelism of allowing individual differences between persona types, but also within one single persona, while at the same time ensuring an ability-based perspective, is what we argue is the main contribution of our modification.

4 DISCUSSION & FUTURE WORK

Persona design for DHH sign language users poses critical challenges to HCI designers in terms of capturing DHH perspectives and needs in a way that corresponds to the latter's own ability-based self-concepts. Previous studies [13] demand that designers find ways to create personas that are not static or finite, but instead address the complex and diverse nature of users. We argue that our approach of an ability-based persona for DHH sign language users takes a first step in this direction by recognising dynamics in the users' abilities, understanding the latter to be dynamic areas influenced by various factors in the physical, social, and technological environment. Such an approach, besides complying with the users' own ability-based identification, allows designers to recognise opportunity areas for their solutions by integrating and making use of enabling factors while avoiding disabling factors. If one wants, for example, to integrate communication mechanisms into a solution for the Enthusiastic Traveller, one can apprehend from the persona design which communication abilities are pronounced and how different abilities can be enhanced.

Even though we have drafted this persona for DHH sign language users, we consider it possible to transfer and apply the template to other target groups as well, pursuing, thereby, the goals set out by the Ability-Based Design approach [49, 50]. This approach argues that if a shift in the designer's attention can be achieved and abilities are considered in the early conceptual phases of development processes, the resulting solutions should be more accommodating to such abilities of users. However, despite being cited as influential on various projects, the concept still lacks practical methods and characterisations of how to implement its required focus, i.e., how to perceive and model users' abilities within an articulated design process. Our modified persona approach addresses this methodological gap by providing an example of how the general notion of ABD can be applied to existing HCI design methods.

Although the persona itself was designed by two hearing project members, a tight integration of feedback from members of DHH communities into the design workflows was constantly ensured (i.e., through the conduct of focus groups as well as regular feedback sessions with DHH project team members). However, a thorough evaluation of our finalised approach has yet to be carried out. Seeing that such a summative evaluation could address different hypotheses, we argue that future research should explicitly address the following issues:

Adequate representation format. Considering accurate representations of DHH sign language users as the major aim of this paper, researchers should examine the degree to which a broad spectrum of target group members feels adequately represented by the proposed design format, especially in regard to the idea of not modeling disabilities at all. Although our approach has been approved by Deaf project members, potential drawbacks, such as the risk of diminishing lived experiences and knowledge, or oppressing someone's identification with and advocacy for a disability, need yet to be discussed with a larger number of target group members. A respective study should inspect both the general characterisation scheme (e.g., discarding medical aspects in terms of the hearing status completely to focus, instead, on skills) and the selected abilities, enabling factors, and disabling ones. Researchers examining the latter (i.e., the selection of abilities) could, in addition, analyse the question of the appropriate amount, degree, and types of abilities within such a user representation. An adequate evaluation format addressing such issues could include a comparative study, as suggested above, evaluating the novel concept in contrast to existing approaches. Alternatively, only the novel approach could be presented to participants, eliciting feedback and opinions on its core design and components. The results of both approaches, if indicating a preference for the novel design, could help the approach gain momentum and receive support from within DHH communities and, thereby, potentially also decrease the mistrust that DHH users often show towards hearing designers or researchers [1, 34].

Impacts on stereotyping. In addition, seeing that the avoidance of false or negative stereotyping is another aim of the approach brought forth in this paper, future work should explore if and in what ways this is actually achieved. Studies could apply a research design similar to [42] or [32], who evaluated stereotypical perception of personas through the Stereotype Content Model (SCM), a framework based on research on social perception [9, 15] that divides stereotyping into two dimensions: competence and warmth. Regarding the evaluation of our persona design, researchers could present participants with a specific persona designed in both ways (i.e., the novel design as proposed in this paper and existing designs as defined by [7] or [38, 39]), in order to measure and compare stereotypical perception of both designs.

Practical applicability. Finally, to evaluate the actual feasibility of the approach, researchers should explore if and how designers are able to apply the proposed persona design within projects. Qualitative measurements could be used to compare the usage of this novel persona design to traditional approaches [7, 38, 39]. Such an evaluation should aim to investigate the ways in which both approaches either support or inhibit the design work, i.e., allow designers to build empathy towards users and identify adequate, optimised design solutions for their target group. For this, a variety of testing scenarios should be considered so as to not only highlight advantages or disadvantages of the novel approach but also identify whether specific types or scopes of projects are better suited to one or the other approach.

Methodological perspectives. In addition to these evaluative challenges, future research should also examine the idea of applying the novel approach to other target groups and contexts, in order to understand how much of its potential is motivated by the unique nature of DHH community members and to what degree this potential can be exploited for other user groups as well. In this context, it would also be helpful to discuss whether, or to which degree, the approach presented here can actually be defined as a (modified) persona, or whether it should be labeled as a different method to better justify or explain the deviations it takes from common persona models.
ASSETS ’22, October 23–26, 2022, Athens, Greece Amelie Nolte, Karolin Lueneburg, Dieter Wallach, and Nicole Jochems

5 CONCLUSION

In this paper, we have presented a novel approach to designing ability-based personas for DHH sign language users. Having identified the necessity of coming up with modelling solutions that capture the ability-based identification concepts of DHH users as well as their diversity, we propose a modification of common persona templates that includes abilities instead of disabilities, together with the external factors influencing the persona. The template recognises ability as a positive attribute and a dynamic area that is influenced by a variety of factors from physical, social, and technological environments. Although defined for DHH sign language users in the context of travelling, we believe that our approach can be applied and adapted to other contexts or target groups as needed. Thus, the contribution of this paper lies not only in a novel perspective on the design of personas for DHH users but also in the integration of Ability-Based Design into common HCI methods altogether.

ACKNOWLEDGMENTS

The work was conducted as part of the research project AVASAG and, as such, funded by the German Federal Ministry of Education and Research. In addition, the work was in part supported by the Edinburgh College of Art, University of Edinburgh. The authors wish to thank all projects, partners, and people who were involved in and supported this work. Any opinions, findings, conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect those of any supporter.

REFERENCES
[1] Melissa L. Anderson, Timothy Riker, Stephanie Hakulin, Jonah Meehan, Kurt Gagne, Todd Higgins, Elizabeth Stout, Emma Pici-D'Ottavio, Kelsey Cappetta, and Kelly S. Wolf Craig. 2020. Deaf ACCESS: Adapting Consent Through Community Engagement and State-of-the-Art Simulation. The Journal of Deaf Studies and Deaf Education 25, 1 (2020), 115–125.
[2] H-Dirksen L. Bauman. 2004. Audism: Exploring the Metaphysics of Oppression. Journal of Deaf Studies and Deaf Education 9, 2 (2004), 239–246.
[3] Cynthia L. Bennett and Daniela K. Rosner. 2019. The Promise of Empathy: Design, Disability, and Knowing the "Other". In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, Glasgow, Scotland, UK, 1–13. https://doi.org/10.1145/3290605.3300528
[4] Lucas Bernhard, Fabrizio Nunnari, Amelie Unger, Judith Bauerdiek, Christian Dold, Marcel Hauck, Alexander Stricker, Tobias Baur, Alexander Heimerl, Elisabeth André, Melissa Reinecker, Cristina España-Bonet, Yasser Hamidullah, Stephan Busemann, Patrick Gebhard, Corinna Jäger, Sonja Wecker, Yvonne Kossel, Henrik Müller, Kristoffer Waldow, Arnulph Fuhrmann, Martin Misiak, and Dieter Wallach. 2022. Towards Automated Sign Language Production: A Pipeline for Creating Inclusive Virtual Humans. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '22). Association for Computing Machinery, New York, NY, USA, 260–268. https://doi.org/10.1145/3529190.3529202
[5] Roberto Casas, Rubén Blasco Marín, Alexia Robinet, Armando Roy Delgado, Armando Roy Yarza, John McGinn, Richard Picking, and Vic Grout. 2008. User Modelling in Ambient Intelligence for Elderly and Disabled People. In Computers Helping People with Special Needs, Klaus Miesenberger, Joachim Klaus, Wolfgang Zagler, and Arthur Karshmer (Eds.). Lecture Notes in Computer Science, Vol. 5105. Springer Berlin Heidelberg, Berlin, Heidelberg, 114–122. https://doi.org/10.1007/978-3-540-70540-6_15
[6] Larry Constantine. 2006. Users, Roles and Personas. In The Persona Lifecycle: Keeping People in Mind Throughout Product Design, John Pruitt and Tamara Adlin (Eds.). Morgan Kaufmann, San Francisco, 498–519.
[7] Alan Cooper. 1999. The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity (2nd ed.). Sams Publishing, Indianapolis.
[8] Alan Cooper, Robert Reimann, Dave Cronin, and Reinhard Engel. 2010. About Face: Interface- und Interaction-Design (1st ed.). mitp, Heidelberg München Landsberg Frechen Hamburg.
[9] Amy J. C. Cuddy, Susan T. Fiske, Virginia S. Y. Kwan, Peter Glick, Stéphanie Demoulin, Jacques-Philippe Leyens, Michael Harris Bond, Jean-Claude Croizet, Naomi Ellemers, Ed Sleebos, Tin Tin Htun, Hyun-Jeong Kim, Greg Maio, Judi Perry, Kristina Petkova, Valery Todorov, Rosa Rodríguez-Bailón, Elena Morales, Miguel Moya, Marisol Palacios, Vanessa Smith, Rolando Perez, Jorge Vala, and Rene Ziegler. 2009. Stereotype content model across cultures: Towards universal similarities and some differences. British Journal of Social Psychology 48, 1 (2009), 1–33. https://doi.org/10.1348/014466608X314935
[10] Tatiany X. de Godoi, Deógenes P. da Silva Junior, and Natasha M. Costa Valentim. 2020. A Case Study About Usability, User Experience and Accessibility Problems of Deaf Users with Assistive Technologies. In Universal Access in Human-Computer Interaction. Applications and Practice, Margherita Antona and Constantine Stephanidis (Eds.). Lecture Notes in Computer Science, Vol. 12189. Springer International Publishing, Cham, 73–91. https://doi.org/10.1007/978-3-030-49108-6_6
[11] Sarah Ebling and John Glauert. 2016. Building a Swiss German Sign Language avatar with JASigning and evaluating it among the Deaf community. Universal Access in the Information Society 15, 4 (Nov. 2016), 577–587. https://doi.org/10.1007/s10209-015-0408-1
[12] Emory Edwards, Kyle Lewis Polster, Isabel Tuason, Emily Blank, Michael Gilbert, and Stacy Branham. 2021. "That's in the eye of the beholder": Layers of Interpretation in Image Descriptions for Fictional Representations of People with Disabilities. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. ACM, Virtual Event, USA, 1–14. https://doi.org/10.1145/3441852.3471222
[13] Emory James Edwards, Cella Monet Sum, and Stacy M. Branham. 2020. Three Tensions Between Personas and Complex Disability Identities. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, Honolulu, HI, USA, 1–9. https://doi.org/10.1145/3334480.3382931
[14] Johannes Fellinger, Daniel Holzinger, Rudolf Schoberberger, and Gerhard Lenz. 2005. Psychosoziale Merkmale bei Gehoerlosen: Daten aus einer Spezialambulanz fuer Gehoerlose. Der Nervenarzt 76, 1 (2005), 43–51.
[15] Susan T. Fiske, Amy J. C. Cuddy, and Peter Glick. 2007. Universal dimensions of social cognition: warmth and competence. Trends in Cognitive Sciences 11, 2 (Feb. 2007), 77–83. https://doi.org/10.1016/j.tics.2006.11.005
[16] Kristin Skeide Fuglerud, Trenton Schulz, Astri Letnes Janson, and Anne Moen. 2020. Co-creating Persona Scenarios with Diverse Users Enriching Inclusive Design. In Universal Access in Human-Computer Interaction. Design Approaches and Supporting Technologies, Margherita Antona and Constantine Stephanidis (Eds.). Lecture Notes in Computer Science, Vol. 12188. Springer International Publishing, Cham, 48–59. https://doi.org/10.1007/978-3-030-49282-3_4
[17] Kim Goodwin. 2011. Designing for the Digital Age: How to Create Human-Centered Products and Services. John Wiley & Sons Ltd., West Sussex, UK.
[18] Iris Groschek. 2008. Unterwegs in eine Welt des Verstehens. Gehörlosenbildung in Hamburg vom 18. Jahrhundert bis in die Gegenwart. Hamburger Historische Forschungen, Vol. 1. Hamburg University Press, Hamburg, Germany.
[19] Jo E. Hannay, Kristin Skeide Fuglerud, and Bjarte M. Østvold. 2020. Stakeholder Journey Analysis for Innovation: A Multiparty Analysis Framework for Startups. In Universal Access in Human-Computer Interaction. Applications and Practice, Margherita Antona and Constantine Stephanidis (Eds.). Lecture Notes in Computer Science, Vol. 12189. Springer International Publishing, Cham, 370–389. https://doi.org/10.1007/978-3-030-49108-6_27
[20] Alexander Henka and Gottfried Zimmermann. 2017. PersonaBrowser. In Human-Computer Interaction - INTERACT 2017 (Lecture Notes in Computer Science), Regina Bernhaupt, Girish Dalvi, Anirudha Joshi, Devanuj K. Balkrishan, Jacki O'Neill, and Marco Winckler (Eds.). Springer International Publishing, Cham, 54–63. https://doi.org/10.1007/978-3-319-67684-5_4
[21] Shawn Lawton Henry. 2007. Just Ask: Integrating Accessibility Throughout Design. Lulu.com, North Carolina, USA.
[22] Dar'ya Heyko. 2021. Supporting d/Deaf and Hard of Hearing Employees in Their Workplaces Through Technology, Design, and Community. Ph.D. Dissertation. University of Guelph, Guelph, Ontario, Canada.
[23] Paul C. Higgins. 1979. Outsiders in a Hearing World: The Deaf Community. Urban Life 8, 1 (1979), 3–22.
[24] Charlotte Jais. 2018. Designing for dementia: personas to aid communication between professionals developing built environments for people with dementia. Thesis. Loughborough University. https://doi.org/10.26174/thesis.lboro.8248934.v1
[25] Hernisa Kacorri, Matt Huenerfauth, Sarah Ebling, Kasmira Patel, Kellie Menzies, and Mackenzie Willard. 2017. Regression Analysis of Demographic and Technology-Experience Factors Influencing Acceptance of Sign Language Animation. ACM Transactions on Accessible Computing 10, 1 (April 2017), 3:1–3:33. https://doi.org/10.1145/3046787
[26] Hernisa Kacorri, Matt Huenerfauth, Sarah Ebling, Kasmira Patel, and Mackenzie Willard. 2015. Demographic and Experiential Factors Influencing Acceptance of Sign Language Animation by Deaf Users. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '15). Association for Computing Machinery, Lisbon, Portugal, 147–154. https://doi.org/10.1145/2700648.2809860
[27] Sebastian Kelle, Alexander Henka, and Gottfried Zimmermann. 2015. A Persona-based Extension for Massive Open Online Courses in Accessible Design. Procedia Manufacturing 3 (2015), 3663–3668. https://doi.org/10.1016/j.promfg.2015.07.772
[28] Michael Kipp, Quan Nguyen, Alexis Heloir, and Silke Matthes. 2011. Assessing the deaf user perspective on sign language avatars. In The Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility. ACM Press, Dundee, Scotland, UK, 107–114.
[29] Jessica Korte, Leigh Ellen Potter, and Sue Nielsen. 2017. The impacts of deaf culture on designing with deaf children. In Proceedings of the 29th Australian Conference on Computer-Human Interaction. ACM, Brisbane, Queensland, Australia, 135–142. https://doi.org/10.1145/3152771.3152786
[30] Harlan Lane. 2005. Ethnicity, Ethics, and the Deaf-World. Journal of Deaf Studies and Deaf Education 10, 3 (2005), 291–310.
[31] Joong Hee Lee, Yong Min Kim, Ilsun Rhiu, and Myung Hwan Yun. 2021. A Persona-Based Approach for Identifying Accessibility Issues in Elderly and Disabled Users' Interaction with Home Appliances. Applied Sciences 11, 1 (Jan. 2021), 368. https://doi.org/10.3390/app11010368
[32] Nicola Marsden and Maren Haag. 2016. Stereotypes and Politics: Reflections on Personas. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, San Jose, California, USA, 4017–4031. https://doi.org/10.1145/2858036.2858151
[33] Nicola Marsden and Monika Proebster. 2019. Personas and Identity: Looking at Multiple Identities to Inform the Construction of Personas. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, Glasgow, Scotland, UK, 1–14. https://doi.org/10.1145/3290605.3300565
[34] Michael McKee, Deirdre Schlehofer, and Denise Thew. 2013. Ethical Issues in Conducting Research With Deaf Populations. American Journal of Public Health 103, 12 (2013), 2174–2178.
[35] Tan Ching Ying Michelle. 2017. Exploring inclusive design partnerships through an IDEA framework to support deaf or hard of hearing Australian children in design process participation. In Proceedings of the 29th Australian Conference on Computer-Human Interaction. ACM, Brisbane, Queensland, Australia, 433–437. https://doi.org/10.1145/3152771.3156151
[36] Microsoft. 2018. Inclusive Activities. https://www.microsoft.com/design/inclusive/
[37] Timothy Neate, Aikaterini Bourazeri, Abi Roper, Simone Stumpf, and Stephanie Wilson. 2019. Co-Created Personas: Engaging and Empowering Users with Diverse Needs Within the Design Process. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, Glasgow, Scotland, UK, 1–12. https://doi.org/10.1145/3290605.3300880
[38] Lene Nielsen. 2004. Engaging Personas and Narrative Scenarios. Doctoral Thesis. Copenhagen Business School, Frederiksberg.
[39] Lene Nielsen. 2019. Personas - User Focused Design. Springer London, London. https://doi.org/10.1007/978-1-4471-7427-1
[40] Fabrizio Nunnari, Judith Bauerdiek, Lucas Bernhard, Cristina España-Bonet, Corinna Jäger, Amelie Unger, Kristoffer Waldow, Sonja Wecker, Elisabeth André, Stephan Busemann, Christian Dold, Arnulph Fuhrmann, Patrick Gebhard, Yasser Hamidullah, Marcel Hauck, Yvonne Kossel, Martin Misiak, Dieter Wallach, and Alexander Stricker. 2021. AVASAG: A German Sign Language Translation System for Public Services (short paper). In Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL). Association for Machine Translation in the Americas, Virtual, 43–48. https://aclanthology.org/2021.mtsummit-at4ssl.5
[41] Lyydia Pertovaara. 2021. Framework for Creating Ethical Older People Personas for the Development of Ambient Assisted Living Technologies. Master's thesis. Laurea University of Applied Sciences, Vantaa, Finland.
[42] Monika Proebster, Julia Hermann, and Nicola Marsden. 2019. Personas and Persons - An Empirical Study on Stereotyping of Personas. In Proceedings of Mensch und Computer 2019. ACM, Hamburg, Germany, 137–145. https://doi.org/10.1145/3340764.3340771
[43] John Pruitt and Tamara Adlin. 2006. The Persona Lifecycle. Elsevier, San Francisco.
[44] Trenton Schulz and Kristin Skeide Fuglerud. 2012. Creating Personas with Disabilities. Lecture Notes in Computer Science 7383 (2012), 145–152. https://doi.org/10.1007/978-3-642-31534-3_22 arXiv:2003.11875
[45] Jenny Singleton. 2014. Toward Ethical Research Practice With Deaf Participants. Journal of Empirical Research on Human Research Ethics 9, 3 (2014), 59–66.
[46] Andrew Solomon. 2012. Far From the Tree: Parents, Children and the Search for Identity. Simon and Schuster, New York, USA.
[47] Nicky Sulmon, Karin Slegers, Karel Van Isacker, Maria Gemou, and Evangelos Bekiaris. 2010. Using Personas to Capture Assistive Technology Needs of People with Disabilities. CSUN, Northridge.
[48] Amelie Unger, Dieter P. Wallach, and Nicole Jochems. 2021. Lost in Translation: Challenges and Barriers to Sign Language-Accessible User Research. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21). Association for Computing Machinery, New York, NY, USA, 1–5. https://doi.org/10.1145/3441852.3476473
[49] Jacob O. Wobbrock, Krzysztof Z. Gajos, Shaun K. Kane, and Gregg C. Vanderheiden. 2018. Ability-based design. Commun. ACM 61, 6 (May 2018), 62–71. https://doi.org/10.1145/3148051
[50] Jacob O. Wobbrock, Shaun K. Kane, Krzysztof Z. Gajos, Susumu Harada, and Jon Froehlich. 2011. Ability-Based Design: Concept, Principles and Examples. ACM Transactions on Accessible Computing 3, 3 (April 2011), 9:1–9:27. https://doi.org/10.1145/1952383.1952384
A APPENDIX: ABILITY-BASED PERSONA

Figure 1: Ability-based persona of sign language users in the context of travel information and services. The template consists of (1) the persona type (with label, image, and quote), (2) a list of context-specific needs and pains, and (3) a display of ability ranges, each with respective enabling and disabling factors. (Illustration credits: Severin Arnold)
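For readers who want to operationalise the template, the ability-range component summarised in the caption above can be viewed as a small data model. The following Python sketch is our own illustration: the class and field names are invented, and the numeric level is a hypothetical value, not one defined by the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AbilityRange:
    """One contextually relevant ability of a persona (illustrative only)."""
    name: str
    level: float  # range indicator: 0.0 (low) to 1.0 (high), hypothetical scale
    enabling: List[str] = field(default_factory=list)   # factors that raise the ability
    disabling: List[str] = field(default_factory=list)  # factors that lower it


# The Enthusiastic Traveller's sign language ability, using the enabling and
# disabling factors named in Section 3; the level value is invented.
sign_language = AbilityRange(
    name="Sign language abilities",
    level=0.9,
    enabling=["skilled communication partners", "good lighting conditions"],
    disabling=["restriction to fingerspelling", "one-handed signing"],
)

print(f"{sign_language.name}: {sign_language.level:.1f}")
```

Modelling each ability this way keeps the persona's strengths in the foreground while still recording the contextual factors that widen or narrow the range.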
Designing a Data Visualization Dashboard for Pre-Screening Hong Kong Students with Specific Learning Disabilities

Ka Yan Fung, Individualized Interdisciplinary Program, The Hong Kong University of Science and Technology, Hong Kong SAR, China, kyfungag@connect.ust.hk
Zikai Alex Wen*, Computational Media and Arts Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China, zikaiwen@ust.hk
Haotian Li, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China, haotian.li@connect.ust.hk
Xingbo Wang, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China, xingbo.wang@connect.ust.hk
Shenghui Song*, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China, eeshsong@ust.hk
Huamin Qu, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China, huamin@cse.ust.hk

ABSTRACT

Students with specific learning disabilities (SLDs) often experience reading, writing, attention, and physical movement coordination difficulties. However, in Hong Kong, it takes years for special education needs coordinators (SENCOs) and special-ed teachers to pre-screen and diagnose students with SLDs. As a result, many students with SLDs miss the golden time for special interventions (i.e., before six years old). In addition, although there are screening tools for students with SLDs in Chinese and Indo-European languages (e.g., English and Spanish), they do not provide a student data visualization dashboard that could help teachers speed up the pre-screening process. Therefore, we designed a new visualization dashboard for Hong Kong SENCOs and special-ed teachers to assist them in pre-screening students with SLDs. Our formative study showed that our current design met teachers' need to quickly identify a student's specific under-performing tasks and effectively collect evidence about how the student was affected by SLDs. Future work will further test the efficacy of our design in real life.

ACM Reference Format:
Ka Yan Fung, Zikai Alex Wen, Haotian Li, Xingbo Wang, Shenghui Song, and Huamin Qu. 2022. Designing a Data Visualization Dashboard for Pre-Screening Hong Kong Students with Specific Learning Disabilities. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550361

* Zikai Alex Wen and Shenghui Song are the corresponding authors.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550361

1 INTRODUCTION AND BACKGROUND

Specific learning disabilities (SLDs) affect how people process and learn language-based information [21], but they do not affect people's intelligence quotient [20]. As a result, SLDs can significantly impact learning skills and the acquisition of literacy skills [30, 33]. The term SLD covers a range of frequently co-occurring learning difficulties, most commonly known as dyslexia (reading), dysgraphia (writing), and dyspraxia (physical coordination) [2]. Research [6] suggested that students with SLDs who receive special interventions during the golden time (i.e., before six years old) can develop better learning skills as they grow up.

To help special education students, including those with SLDs, the Hong Kong Education Bureau has recruited special education needs coordinators (SENCOs) and special-ed teachers for public schools since 2017 [4]. However, because of the limited number of SENCOs and special-ed teachers, one special-ed faculty member has to support nearly one thousand students. In this case, many students with SLDs in Hong Kong have to wait over two to three years to finish their assessment [9, 22].

To help the Hong Kong special-ed faculty speed up assessing their students, Fung et al. [10] developed an automatic pre-screening tool for dyslexia in Chinese, dysgraphia in Chinese, and dyspraxia. Fung et al.'s pre-screening tool collected valuable student data, including correctness rate, answer time, handwriting process, and the video of the student interacting with the touchscreen. Teachers can analyze the data to understand why the student's performance is below average and what difficulties the student encountered. However, it is still quite time-consuming for special-ed teachers to analyze the raw data.

Currently, the student data dashboards of existing pre-screening tools [3, 8, 12, 29] only provide student performance statistics. For example, the dashboard of the Dyscreen Dyslexia Screener [29] only provides individual student performance scores and scores compared with the student's related age group. If we were to adopt the existing student data dashboard designs, teachers would have
ASSETS '22, October 23–26, 2022, Athens, Greece Ka Yan Fung, Zikai Alex Wen, Haotian Li, Xingbo Wang, Shenghui Song, and Huamin Qu

faced the same problem: they cannot efficiently analyze and understand the performance of students with SLDs.

Therefore, our work aims to design a new student data visualization dashboard that processes the raw data collected by Fung et al.’s pre-screening tool [10]. Without collecting the raw data of handwriting and videos, special-ed teachers cannot discover how exactly the student’s performance was affected by SLDs. Therefore, although our dashboard visualizes the raw data provided by Fung et al.’s tool, it is potentially helpful for any pre-screening tool designed to meet special-ed teachers’ needs. Our tool allows special-ed teachers to quickly narrow down the list of students who might have SLDs in Chinese from thousands of student records and to efficiently gather evidence that indicates how the students were affected by SLDs. To verify whether our design meets our goals and to improve our design, we conducted a formative study with two SENCOs, one special-ed teacher, and one special-ed student-teacher. The study showed that our current design met special-ed teachers’ need to quickly identify a student’s specific under-performing tasks and to effectively collect evidence about how the student was affected by SLDs. Our future work will test the efficacy of our dashboard design in special-ed teachers’ daily jobs. We will also explore how our design may benefit SLDs pre-screening for other languages (e.g., Indo-European languages such as English).

2 DASHBOARD DESIGN

The design of our tool consists of 4 panels (as shown in Figure 1): (1) an overview of all students’ test performance (Panel 1) for the teacher to quickly narrow down the list of students who might have SLDs; (2) an overview of a selected student’s test performance (Panel 2) for the teacher to quickly identify the specific under-performing tests; (3) performance statistics of individual questions in a test (Panel 3) for the teacher to identify which question to look into; and (4) student performance when the student answered the question (Panel 4) for the teacher to investigate how the student was affected by SLDs. Each panel is designed based on a design requirement from our target users. In the following paragraphs, we elaborate on each design requirement.

Design Requirement 1: The system should help teachers narrow down the list of students who might have SLDs in Chinese. SENCOs and special-ed teachers in Hong Kong need to care for around a thousand students. They need an assistive tool to quickly narrow down the list of students who might have SLDs. Many pre-screening tools [1, 7, 12, 18, 19, 32] provide information about which students potentially face mild, moderate, or severe learning difficulties. Therefore, we designed Panel 1, which allows teachers to filter the student records by choosing among four learning difficulty levels: acceptable, mild, moderate, and severe. We also added a red line that denotes the average score of all students so that teachers can explicitly compare how much a student lags behind the average student performance.

Design Requirement 2: The system should help teachers identify the specific under-performing tasks. After the teacher decides which student profile to look into, they want to determine which testing tasks are under-performing. The existing SLDs pre-screening tools [11, 14, 15, 17, 27, 28] provide information about the performance of each task in three testing categories: word recognition, reading, and writing. Therefore, we designed Panel 2, which presents all of one student’s task scores. Panel 2 also provides information about how much a student lags behind the average student performance in each task. Reviewing Panel 2 allows teachers to quickly decide which testing tasks to investigate.

Design Requirement 3: The system should help teachers decide which sub-question to investigate. In many SLDs pre-screening tools [13, 24–26, 31], students need to answer multiple sub-questions to complete one task. As a result, students potentially affected by SLDs may spend too much time on one sub-question (even though they might answer it correctly). Therefore, teachers need a convenient way to determine whether the student was in the above situation and to further investigate how the student answered this sub-question.

We designed Panel 3 to meet the teachers’ needs. Panel 3 presents the following student data for each sub-question: the correctness rate (including correct, incorrect, and no answer), the time spent on a sub-question, the average time spent on all sub-questions of the same task, and the average time that all students spend on the same sub-question. In addition, we designed a new feature for handwriting tasks: animating the playback of the entire handwriting process. Previous work for pre-screening dysgraphia in English only provided the final screenshot of a student’s handwriting [23]. However, research [10] suggested that analyzing the Chinese character stroke sequence is essential to pre-screen dysgraphia in Chinese. Furthermore, it is also critical to pre-screen dyspraxia in Chinese by identifying a slow and curly handwriting process [36]. Therefore, our handwriting animation (as shown in Figure 1, 3.1) helps teachers pre-screen dysgraphia and dyspraxia in Chinese.

Design Requirement 4: The system should help teachers investigate how the student was affected by SLDs. Fung et al.’s pre-screening tool [10] collected more student data than many pre-screening tools for SLDs in an Indo-European language (e.g., English, Spanish) [3, 8, 12, 29, 34]: screenshots of every handwritten Chinese character stroke and videos of the student interacting with the pre-screening tool. As with any disability, no two individuals experience the same difficulties, and some may exhibit signs of more than one SLD. Therefore, this information is helpful for teachers to collect explicit evidence about the common characteristics of SLDs (i.e., short attention span and inadequate memory retrieval). That being said, it is time-consuming for teachers to go through all the videos, especially for Hong Kong special-ed teachers, because they have to manage many student cases.

Special-ed teachers care most about students’ specific spatial movement patterns (e.g., starting to look around the room). However, because the temporal information is rich, we faced a design challenge: breaking down all the temporal information to reduce visual clutter and avoid visual occlusion.

Inspired by previous timeline designs [16, 35] for analyzing the temporal evolution of body movement, we broke down the student movement information into three dimensions (i.e., limb, head, and body movements), as shown in Figure 1, 4.1. We also visualized the average student movement range and put it alongside the individual student’s movement visualization, as shown in Figure 1, 4.2. The special-ed teachers can then capture the signals in the timeline
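The Panel 2 and Panel 3 comparisons described for Design Requirements 2 and 3 are simple aggregates over per-sub-question records: the student's own correctness and timing, the average over that student's sub-questions in the same task, and the class average on the same sub-question. A minimal sketch of that computation, assuming a hypothetical flat record schema (the paper does not publish the tool's actual data format):

```python
from statistics import mean

def subquestion_stats(records, student_id, task):
    """Summarize one student's sub-question data against class averages.

    `records` is a list of dicts with hypothetical keys: student_id,
    task, subq, correct ("correct"/"incorrect"/"no answer"), and
    seconds spent. The real tool's schema is not published.
    """
    task_rows = [r for r in records if r["task"] == task]
    own = [r for r in task_rows if r["student_id"] == student_id]
    stats = []
    for r in own:
        # times of all students on this same sub-question
        peers = [p["seconds"] for p in task_rows if p["subq"] == r["subq"]]
        stats.append({
            "subq": r["subq"],
            "correctness": r["correct"],
            "seconds": r["seconds"],
            # average over this student's sub-questions in the same task
            "own_task_avg": mean(p["seconds"] for p in own),
            # average over all students on the same sub-question
            "class_avg": mean(peers),
        })
    return stats
```

Rendering these rows, rather than computing them, is then the only remaining concern for a Panel 3-style view.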
Designing a Dashboard for Pre-Screening Hong Kong Students with Specific Learning Disabilities ASSETS ’22, October 23–26, 2022, Athens, Greece

Figure 1: Student Data Visualization Dashboard. Panel 1 shows the overview of student performance. Panel 2 shows the overview of one student’s testing tasks. Panel 3 shows the student data in answering all sub-questions in one task. Panel 3.1 shows the playback of the handwriting recording. Finally, Panel 4 shows the visualization of the movement data (Panel 4.1), the comparison between individual and average movements (Panel 4.2), each stroke of the Chinese character in the writing tasks (Panel 4.3), and the video recording of the student movement (Panel 4.4).
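The handwriting playback (Figure 1, 3.1) and the per-stroke view (Figure 1, 4.3) both need the raw touch data segmented into strokes before anything can be animated. As a rough sketch of that pre-processing step, the following groups timestamped (t, x, y) touch samples into strokes whenever the gap between samples suggests the pen left the screen; the tuple format and the 0.3-second pen-up threshold are illustrative assumptions, not the authors' implementation:

```python
def split_strokes(samples, pen_up_gap=0.3):
    """Group timestamped touch samples (t, x, y) into strokes.

    A new stroke starts whenever the time since the previous sample
    exceeds `pen_up_gap` seconds (i.e., the pen left the screen).
    The 0.3 s threshold is an illustrative assumption.
    """
    strokes = []
    current = []
    last_t = None
    for t, x, y in sorted(samples):
        if last_t is not None and t - last_t > pen_up_gap:
            strokes.append(current)  # pen-up gap: close the stroke
            current = []
        current.append((t, x, y))
        last_t = t
    if current:
        strokes.append(current)
    return strokes
```

An animation loop can then replay the strokes in order, which also preserves the stroke sequence that [10] identifies as essential for pre-screening dysgraphia in Chinese.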

visualization that indicate a student might have a short attention span compared with general education students.

Panel 4 also visualizes when and how the student wrote each Chinese character stroke in the writing tasks (as shown in Figure 1, 4.3). This visualization can highlight the situation when a student spent a significant amount of time finishing one Chinese character stroke, which informs the teacher that the student might experience inadequate memory retrieval. The teacher can then go to the corresponding video playback time (as shown in Figure 1, 4.4) to watch how exactly the student behaved.

3 FORMATIVE STUDY: TEACHER FEEDBACK

To form our design, we conducted a formative study with two SENCOs, one special-ed teacher, and one special-ed student-teacher for students with SLDs (3 females, 1 male; between 3 and 30 years of experience teaching students with SLDs, from primary school students to undergraduates; aged from 20 to 55 years old). They all have either an education certificate or a bachelor’s degree in special education. We started by interviewing the participants because they are the key users of the learning analytic visualization tool and have experience in using relevant systems to pre-screen students with SLDs. We anticipated that the participants would be able to use their SLDs pre-screening experience to offer valuable insights into our current design. During the study, we demonstrated our prototype to the participants. Our participants’ working languages are Chinese and English, so the system language of our prototype supports these two languages.

The data presented in our dashboard prototype are real data collected by Fung et al. [10] when they invited students to play a game for pre-screening SLDs in Chinese. We post-processed the video
recordings of student actions using the OpenPose AI model [5] to highlight the students’ heads, arms, and hands. We also wrote a script to visualize how a student writes each stroke of a Chinese character by processing the handwriting data (i.e., the coordinates of screen touching).

As we showed the dashboard prototype to our study participants, we asked them to give feedback on the design of the data visualization dashboard to help them pre-screen SLDs. We screen-recorded and then transcribed all interviews. We coded the transcriptions and found two design highlights and one design improvement summarized from the teacher feedback.

All SENCOs and special-ed teachers mentioned that the student data visualization dashboard could facilitate their pre-screening job. In addition, they appreciated the design of Panel 3 for analyzing dysgraphia and dyspraxia in Chinese and the design of Panel 4 for analyzing the issues of short attention span or inadequate memory retrieval. One special-ed teacher explained why animating the order of handwriting strokes helps: “The stroke order is a characteristic of reading and writing [Chinese words]. We can [use it to] analyze the student’s sense of spacing. It can provide more evidence for teachers to judge students’ reading and writing problems.” Another special-ed teacher shared a similar opinion: “Children with SLDs might not be aware of the writing grid. [So] their handwriting is sometimes big or small.”

One special-ed teacher and one SENCO pointed out that Panel 4 helped them quickly and explicitly collect the evidence of students having SLDs because, according to the special-ed teacher, “it can record students’ problematic situations like easily distracted, and sitting awkwardly.” All participants agreed that the time range of a student interacting with the touchscreen (as shown in Figure 1, 4.2) in Panel 4 helped them understand the student’s thinking process. One special-ed teacher explained, “If the student thinks for a long time but can write a word correctly, the teacher will wonder if the question is too difficult, or the student cannot extract the word because of inadequate memory retrieval.”

According to one SENCO and one special-ed teacher, we can improve our tool to help them better analyze student behaviours when the student looks left and right while halfway finishing the writing task. Noticing that our tool provides both handwriting animation (as shown in Figure 1, 3.1) and video recording (as shown in Figure 1, 4.4), they suggested we synchronously display these two types of playback side by side in Panel 4.

4 CONCLUSION AND FUTURE WORK

We presented a new design of a student data visualization dashboard for SLDs pre-screening. Our design aims to assist Hong Kong SENCOs and special-ed teachers with their SLDs pre-screening job so their students can receive SLDs assessment and special interventions in time. We invited two SENCOs, one special-ed teacher, and one special-ed student-teacher for a formative study. Our design met their needs for quick pre-screening of students with SLDs in Chinese, especially our designs that fulfill their design requirements of identifying a student’s specific under-performing tasks quickly and collecting evidence about how the student was affected by SLDs. The study participants also suggested a minor design improvement of the visualization dashboard.

In the future, we will recruit more SENCOs and special-ed teachers to test the efficacy of our dashboard in their daily job. We will also explore how our dashboard design may benefit SLDs pre-screening for other languages (including Indo-European languages). Finally, our work will inspire educational technology designers and developers to leverage our design to provide a better student data visualization dashboard for teachers of students with SLDs.

ACKNOWLEDGMENTS
We would like to thank the anonymous reviewers for their suggestions and our study participants for their valuable feedback.

REFERENCES
[1] GL Assessment. 2022. Dyslexia and Dyscalculia Screeners Digital. (2022). https://support.gl-assessment.co.uk/media/1764/gla176-dyslexia-and-dyscalculia-screener-sample-report.pdf
[2] The Dyslexia Association. [n.d.]. Specific Learning Difficulties (SpLDs). https://www.dyslexia.uk.net/specific-learning-difficulties/
[3] Nathanael Bucher, Mirjam Voegeli, and Rebecca Gretler. [n.d.]. Welcome to Dybuster Calcularis. https://dybuster.com/wp-content/uploads/2018/12/WorkingCards_Calcularis_2018.pdf
[4] The Hong Kong Education Bureau. 2017. 關愛基金《向普通學校撥款以安排特殊教育需要統籌主任》試驗計劃成效檢討報告 (Report on the Effectiveness Review of the Community Care Fund’s Pilot Scheme on “Appropriation of Funds to Ordinary Schools to Arrange Special Educational Needs Coordinators”). (2017). https://sense.edb.gov.hk/uploads/page/professional-support/special-educational-needs-coordinator/Evaluation_LegCo_Chi.pdf
[5] Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2021. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields. IEEE Transactions on Pattern Analysis & Machine Intelligence 43, 01 (2021), 172–186.
[6] Helen Arkell Dyslexia Charity. [n.d.]. When should a Child be Assessed for Dyslexia? https://www.helenarkell.org.uk/about-dyslexia/parents/faq-for-parents.php
[7] The Psychological Corporation. 2008. DAS-II Sample Report - Pearson Clinical. (2008). https://www.pearsonclinical.com.au/files/DASII_Sample_Report(1).pdf
[8] Houston Independent School District. [n.d.]. Dyslexia Universal Screener 2021-2022. https://www.houstonisd.org/Page/178674
[9] The Hong Kong Society for Community Organization. 2018. 《學前有特殊教育需要兒童過渡至學齡階段的服務需要》質性研究調查報告 (A Qualitative Research Report on the Service Needs of Preschool Children with Special Educational Needs in Transition to School Age). (2018). https://www.legco.gov.hk/yr17-18/chinese/panels/ed/papers/ed20180302cb4-679-1-c.pdf
[10] Ka Yan Fung, Fung Kwong Chiu, Chan Aidan, and Yu Yi Ching. 2021. A Digital Tool to Provide Pre-Screening to Dyslexia in Hong Kong. In Proceedings of the 2021 IEEE International Conference on Engineering, Technology & Education. IEEE, 755–761.
[11] Louise Ho. 2000. Hong Kong Writing and Writing Hong Kong. World Englishes 19, 3 (2000), 381–386.
[12] Children’s Learning Institute. [n.d.]. Dyslexia Screeners on CLI Engage. https://public.cliengage.org/tools/assessment/dyslexia-screener/
[13] Ruchira Kariyawasam, Madhuka Nadeeshani, Tuan Hamid, Inisha Subasinghe, Pradeepa Samarasinghe, and Pasangi Ratnayake. 2019. Pubudu: Deep Learning based Screening and Intervention of Dyslexia, Dysgraphia and Dyscalculia. In Proceedings of the 2019 14th Conference on Industrial and Information Systems. IEEE, 476–481.
[14] Catherine Lam. 2018. Children Assessment Service Epidemiology and Research Bulletin. (2018). https://www.dhcas.gov.hk/file/caser/CASER15.pdf
[15] Che Kan Leong, Pui Wan Cheng, and Li Hai Tan. 2005. The Role of Sensitivity to Rhymes, Phonemes and Tones in Reading English and Chinese Pseudowords. Reading and Writing 18, 1 (2005), 1–26.
[16] Haotian Li, Min Xu, Yong Wang, Huan Wei, and Huamin Qu. 2021. A Visual Analytics Approach to Facilitate the Proctoring of Online Exams. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–17.
[17] Catherine McBride-Chang, Fanny Lam, Catherine Lam, Sylvia Doo, Simpson WL Wong, and Yvonne YY Chow. 2008. Word Recognition and Cognitive Profiles of Chinese Pre-school Children at Risk for Dyslexia through Language Delay or Familial History of Dyslexia. Journal of Child Psychology and Psychiatry 49, 2 (2008), 211–218.
[18] A Jockey Club Learning Support Network. 2010. The Hong Kong Behaviour Checklist of Specific Learning Difficulties in Reading and Writing for Primary School Students (Second Edition) Manual (BCL-P(II)). (2010). https://hksld.eduhk.hk/%E8%B3%87%E6%BA%90/%E8%AD%98%E5%88%A5%E5%B7%A5%E5%85%B7
[19] NWEA. 2022. Dyslexia Screener Matrix Report. (2022). https://teach.mapnwea.org/impl/maphelp/Content/ReadFluency/Reports/DyslexiaScreenerMatrix.htm
[20] Learning Disabilities Association of America (LDA). [n.d.]. Types of Learning Disabilities. https://ldaamerica.org/types-of-learning-disabilities/
[21] University of Kent. [n.d.]. What are Specific Learning Difficulties? https://www.kent.ac.uk/teaching/networks/ltn/documents/2014-15/18-Sept-2014-SpLD-handout-L-Regan.pdf
[22] Legislative Council of the Hong Kong Special Administrative Region. 2014. 立法會十題: 教育心理學家 (LCQ10: Educational psychologists). (2014). https://www.info.gov.hk/gia/general/201405/14/P201405140403.htm
[23] Md Abdur Rahman, Elham Hassanain, Md Mamunur Rashid, Stuart J Barnes, and M Shamim Hossain. 2018. Spatial Blockchain-based Secure Mass Screening Framework for Children with Dyslexia. IEEE Access 6 (2018), 61876–61885.
[24] Maria Rauschenberger, Christian Lins, Noelle Rousselle, Andreas Hein, and Sebastian Fudickar. 2019. Designing a New Puzzle App to Target Dyslexia Screening in Pre-readers. In Proceedings of the 5th EAI International Conference on Smart Objects and Technologies for Social Good. 155–159.
[25] Luz Rello, Ricardo Baeza-Yates, Abdullah Ali, Jeffrey P Bigham, and Miquel Serra. 2020. Predicting Risk of Dyslexia with an Online Gamified Test. PLOS ONE 15, 12 (2020), e0241687.
[26] Luz Rello, Miguel Ballesteros, Abdullah X Ali, Miquel Serra, Daniela Alarcón Sánchez, and Jeffrey P Bigham. 2016. Dytective: Diagnosing Risk of Dyslexia with a Game. In Pervasive Health. 89–96.
[27] Luz Rello, Clara Bayarri, and Azuki Gorriz. 2012. What is Wrong with this Word? Dyseggxia: a Game for Children with Dyslexia. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility. 219–220.
[28] Luz Rello, Clara Bayarri, Yolanda Otal, and Martin Pielot. 2014. A Computer-based Method to Improve the Spelling of Children with Dyslexia. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility. 153–160.
[29] Dyscreen Dyslexia Screener. [n.d.]. Reading Assessment & Progress Monitoring, Simplified. https://dystech.com.au/
[30] Hong Kong EP Services. [n.d.]. Specific Learning Difficulties (SLD). https://www.hkep.org/specific-learning-difficulties-sld/
[31] Abdul Samad Shibghatullah. 2017. Dleksia Game: A Mobile Dyslexia Screening Test Game to Screen Dyslexia using Malay Language Instruction. Asian Journal of Information Technology 16, 1 (2017), 1–6.
[32] Hong Kong Specific Learning Difficulties Research Team. 2015. The Hong Kong Reading Ability Teacher Observation Checklist for Preschool Children (TOC-K). (2015). https://hksld.eduhk.hk/
[33] Zikai Alex Wen, Erica Silverstein, Yuhang Zhao, Anjelika Lynne Amog, Katherine Garnett, and Shiri Azenkot. 2020. Teacher Views of Math E-learning Tools for Students with Specific Learning Disabilities. In Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility. 1–13.
[34] Zikai Alex Wen, Yuhang Zhao, Erica Silverstein, and Shiri Azenkot. 2021. An Intelligent Math E-Tutoring System for Students with Specific Learning Disabilities. In Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–4.
[35] Haipeng Zeng, Xingbo Wang, Yong Wang, Aoyu Wu, Ting-Chuen Pong, and Huamin Qu. 2022. GestureLens: Visual Analysis of Gestures in Presentation Videos. IEEE Transactions on Visualization and Computer Graphics (2022).
[36] 李瑩玓. 2004. 寫字困難學生寫字特徵之分析 (An Analysis of the Writing Characteristics of Students with Writing Difficulties). 師大學報: 教育類 (2004).
Digital Accessibility in Iran: An Investigation Focusing on Iran’s
National Policies on Accessibility and Disability Support
Laleh Nourian, ln2293@rit.edu, Department of Computing and Information Sciences, Rochester Institute of Technology, Rochester, New York, USA
Kristen Shinohara, kristen.shinohara@rit.edu, School of Information, Rochester Institute of Technology, Rochester, New York, USA
Garreth W. Tigwell, garreth.w.tigwell@rit.edu, School of Information, Rochester Institute of Technology, Rochester, New York, USA
ABSTRACT
Digital accessibility has become an important topic in the field of HCI, but when looking at accessibility on a global scale, we find that the representation of accessibility research is mostly centered in the Global North with countries that are WEIRD (Western, Educated, Industrialized, Rich, and Democratic). Our paper explores digital accessibility in Iran, focusing exclusively on its national policies on accessibility. Iran is a non-WEIRD country located in the Global South, with no reports on its digital accessibility status from the Global Initiative for Inclusive Information and Communication Technologies (G3ict). We found that there is not enough focus on accessibility in Iran’s regulations, and we conclude our paper by recommending directions for improving this situation, such as HCI and disability organizations in Iran cooperating with G3ict.

CCS CONCEPTS
• Human-centered computing → Accessibility.

ACM Reference Format:
Laleh Nourian, Kristen Shinohara, and Garreth W. Tigwell. 2022. Digital Accessibility in Iran: An Investigation Focusing on Iran’s National Policies on Accessibility and Disability Support. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550385

1 INTRODUCTION
The term postcolonial computing emerged to encourage reflection of cultural influence on the design and evaluation of technology due to increased globalization [21], and such reflection must also encompass accessibility [44]. Digital accessibility typically ensures that all disabled people¹ can use digital content and services (e.g., websites, mobile applications, etc.), while also providing wider benefits to anybody who might encounter contextual challenges.

However, prior studies demonstrate a bias in HCI and accessibility research publications, highlighting that the findings of the published papers are mostly from specific regions of the globe (e.g., the Global North² [7] and WEIRD countries (Western, Educated, Industrialized, Rich, and Democratic) [26]). The existing bias within our accessibility and HCI communities might have consequences for digital accessibility when we consider that research from published work often goes on to influence many sectors, government legislation, and industry practice [4, 12].

There is still a need to increase the representation of people from different parts of the world, and the effort from the Global Initiative for Inclusive Information and Communication Technologies (G3ict) partly fulfills this need by developing the Digital Accessibility Rights Evaluation Index (DARE Index). The DARE Index assesses how countries incorporate accessibility in digital content [14], and the assessment is completed with the cooperation of Disabled People’s International (DPI) Assemblies and the experts and national organizations of disabled people in the countries. However, not all countries are represented in the DARE Index.

We present our initial work on the status of digital accessibility in Iran to identify the extent to which digital accessibility is codified in Iran’s national policies. Our investigation could be valuable since there is no work on the status of Iran’s digital accessibility in the DARE Index, and the fact that it is a non-WEIRD country located in the Global South makes it generally under-represented in HCI and accessibility research [7, 26] (i.e., there are only a few HCI studies investigating Iranian website accessibility [18, 32], with others focusing mainly on usability [11, 31]). Moreover, prior studies have shown that the policies of a country can influence accessibility awareness in society [6, 28], and the results of this investigation ascertain an initial understanding of accessibility regulations and awareness in Iran. Thus, we seek to answer the following questions:

RQ1: What regulations, if any, does Iran set for accessibility and supporting disabled people in digital spaces?
RQ2: What accessibility guidelines, if any, do the national regulations require or suggest that designers use?

We provide the first review of Iran’s policies on accessibility and disability, showing that digital accessibility is not extensively codified in its national laws and there are no regulations about adhering to digital accessibility guidelines. We call on the HCI community and disability support organizations in Iran, and the Global Initiative for Inclusive ICTs (G3ict), to cooperate and conduct extensive investigations on digital accessibility in Iran. A cooperative effort will result in a deeper understanding of digital accessibility knowledge within Iran and identify knowledge gaps for future accessibility research to address.

¹ We will be writing with identity-first language in this paper [5].
² Global North countries are typically richer, economically developed, and located in the northern hemisphere (including Australia and New Zealand) [16].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550385
2 RELATED WORK

2.1 Accessible Digital Design
Despite the availability of accessibility guidelines, methods, and tools [23], many mobile and web services remain inaccessible [6, 17, 22, 35, 38]. One of the most widely used international accessibility resources is the W3C’s Web Content Accessibility Guidelines [9]. However, some designers are critical of the W3C’s guidelines because they are difficult to understand [40], not to mention that they are yet to be translated into all official languages (e.g., there is no Persian translation, which would provide better support for Iranian designers).

Although accessibility research has grown in the past 24 years [29], inaccessible design remains a concern, likely caused by numerous factors [37], such as inadequate education, inappropriately scoped and budgeted projects, lack of accessibility awareness and experience, and a need for improved design tools or guidelines that support accessible design (see: [2, 10, 24, 34, 39–43]). This lack of access is a serious issue because website inaccessibility is globally prevalent [1, 6, 19, 22, 35], with many studies focusing on government websites because they are usually the primary communication platforms to citizens and businesses [3] and typically are under the jurisdiction of accessibility laws.

2.2 Digital Accessibility in a Global Context
Prior work has discussed accessibility in the global context, showing the dominance of some countries in HCI publications. For example, recent work from Barbareschi et al. [7] argued that most of the findings at the intersection of disability and technology in HCI venues are from the Global North, while 80% of disabled people in the world live in the Global South, making it crucial that accessibility becomes an important issue to consider on a global scale. They also argued that there is a lack of literature on accessibility and disability studies in the Global South, which results in a lower level of awareness about accessible and inclusive technology for disabled people in this region [7]. Moreover, only around 8% of all countries are responsible for 80% of the papers published at CHI [8], and a majority of the findings at CHI belong to WEIRD countries (Western, Educated, Industrialized, Rich, and Democratic) [26]. One contributing factor to this issue is that the majority of HCI researchers are located in Western countries [25], which in general indicates serious inequalities in the structure of research communities [7]. One of the resources on digital accessibility in the global context is the DARE Index [14], which reports on the progress made by 137 countries that are parties to the Convention on the Rights of Persons with Disabilities (CRPD) [14] by measuring three main factors: 1) countries’ digital accessibility regulations and programs, 2) their capacity and resources to implement accessibility considerations, and 3) the effects of digital accessibility on disabled people.

In our study, we focused on the first factor from the DARE Index: the role that a country’s regulations play in digital accessibility, focusing specifically on Iran. Previous studies suggest that a factor contributing to inaccessible websites in a country is insufficient laws and policies [6, 28]. Thus, investigating accessibility regulations in Iran provides the opportunity to learn about the state of digital accessibility in this country and help to improve the existing gaps.

3 METHODS
The goal of our study was to find the extent to which digital accessibility was codified in Iran’s national policies and whether there were any requirements or suggestions about using accessibility guidelines in digital design. One important reason to explore accessibility-related laws in a country is that regulations to support disabled people can influence other aspects of society that intersect with the work of designers and can impact the output of their work [6, 28].

Since the lead author of this paper is fluent in the Persian language and literature, we used certain search terms and keywords in both Persian and English to investigate existing regulations. We utilized a combination of full-text search, short terms, and stemming in both languages (Table 1) and searched through the constitution of Iran and the Iranian national website for all regulations, called the National Laws Portal (qavanin.ir). We also used three search engines: Google (www.google.com), DuckDuckGo (duckduckgo.com), and Parsijoo (parsijoo.ir), an Iranian search engine, to obtain more information about accessibility/disability in Iran from the perspective of international organizations.

On the national laws portal (qavanin.ir), we first categorized the laws by manually searching their titles using certain keywords (English root words and full words and the corresponding Persian words for them) and found the ones that met our criteria. The keywords were the root words “Disable” (معلول، ناتوان) and “Access” (دسترس), and the full words “Disability” (معلولیت، ناتوانی) and “Accessibility” (دسترس‌پذیری، قابلیت دسترسی). Afterward, we searched the full text of the selected laws using the search terms in Table 1.

4 FINDINGS
After investigating Iran’s Constitution, we found that out of 177 Articles, only one (Article 29) slightly pointed out the government’s responsibility for providing social support for disabled people. However, no more details were provided on this matter [36]. On the Iranian national laws portal, we found 32 laws that spoke of disability, mostly about physical disability linked with martyrdom, and only one law that addressed digital accessibility. We identified the sections of the laws related to disability and accessibility using our search terms (Table 1) and read through them to find how, if at all, each law codified accessibility in digital spaces. The governmental “Comprehensive Law on Protection of the Rights of Persons with Disabilities” was ratified in 2004 and included 16 articles [20]. In 2018, the “Law to Protect the Rights of the Disabled” was published [13, 36] and replaced the previous law, aiming to support disabled people by ensuring their access to national services [13]. The new law [13] is a federal law for public, private, and governmental organizations. The “Law to Protect the Rights of the Disabled” [13] includes accessibility in the physical world and work environments, healthcare insurance for disabled people, education and employment, etc. Additionally, it mentions that accessibility should be included in technological spaces, but no more detail or articles were written on that topic, and it does not explicitly prohibit discrimination [46]. In 2009, Iran acceded to the Convention on the Rights of Persons with Disabilities (CRPD), but it is unclear to what extent Iran is willing to comply with its
Digital Accessibility in Iran ASSETS ’22, October 23–26, 2022, Athens, Greece
Table 1: Search Terms in both Persian and English

English search terms (Persian-script equivalents were also used):
Disability Law/Act/Policy
Disability, Disable, Disability Policy
Access, Accessible
Accessibility Law
Accessibility Regulation, Accessibility Policy
Web, Website, Digital
obligations under the convention, as it claimed that "it does not consider itself bound by any provisions of the Convention which may be incompatible with its applicable rules" [33].

The results of our investigation demonstrated that accessibility and disability support are not sufficiently codified in the national regulations of Iran, in either physical or digital spaces. Many disabled people in Iran face discrimination, humiliation, and inaccessibility in public spaces [46] and are unable to participate in social activities independently [46]. Unsurprisingly, there were also no specific regulations for accessibility in digital spaces or websites in any of the national policies. It is important to mention that there were some standards relating to accessibility for disabled people, but they are restricted to "movement disabilities" [45]. Accordingly, since insufficient laws and policies [6, 28] can be a factor contributing to the existence of inaccessible websites, we anticipate that Iranian developers and designers may not be well aware of the accessibility of digital content.

5 DISCUSSION AND FUTURE WORK
The main implication of our work is that it can inform Iranian policymakers to reflect more on the limited accessibility regulations, and motivate the HCI community and organizations of disabled people in Iran to help the government support disabled people, thus increasing the awareness of accessibility among Iranian designers.

Despite the Iranian government not having sufficient policies for accessibility, there are Iranian organizations such as the Iranian Society of People with Disabilities (www.iransdp.com) and the Iranian Disability Support Association (www.iraniandsa.org), which indicates an interest in and need for increased disability support. As such, since there is no indication of Iranian organizations cooperating with G3ict, we recommend that a first step would be for the HCI community and the organizations of people with disabilities in Iran to work with the government and cooperate with international organizations like G3ict to improve accessibility regulations. Our study is beneficial since it helps to provide an initial background on accessibility and disability support in Iran. Having a source of information about the status of accessibility in Iran can help identify opportunities for promoting and implementing digital accessibility in this country [15].

5.1 Future Work
Laws and policies are one of the assets to spread accessibility knowledge, and insufficient laws can create barriers to informing designers about accessibility in digital spaces [6, 28]. As one of our future plans, we aim to conduct interviews with Iranian designers to understand their accessibility awareness and any challenges that they face.

Moreover, another important aspect to take into account when discussing accessibility is the universities, which play an important role in serving governmental agencies [4], in training experts, and in bringing new ideas to a society [12]. Accessibility education and training are crucial if we want designers to fully understand and create accessible content [27, 30, 47]. Thus, we are motivated to focus on education settings in Iran to find out whether they impart knowledge on accessibility, by connecting with instructors and students in Iranian universities.

6 CONCLUSION
We investigated the state of digital accessibility in Iran by exploring its government policies, and we wanted to know to what extent accessibility was addressed in Iran's laws and regulations. We presented the first study on the above criteria in Iran, showing that accessibility is not considered an essential topic in this country. Our results suggest that there is a possible lack of accessibility awareness in Iran, and this work could be beneficial for informing Iranian policymakers, designers, and the HCI and disability communities of the importance of digital accessibility. Our study, together with our long-term plans, is a starting point for improving digital accessibility in Iran.

REFERENCES
[1] Abdulmohsen Abanumy, Ali Al-Badi, and Pam Mayhew. 2005. e-Government Website accessibility: in-depth evaluation of Saudi Arabia and Oman. The Electronic Journal of e-Government 3, 3 (2005), 99–106. https://ueaeprints.uea.ac.uk/id/eprint/22182
[2] Hayfa Y Abuaddous, Mohd Zalisham Jali, and Nurlida Basir. 2016. Web accessibility challenges. International Journal of Advanced Computer Science and Applications (IJACSA) (2016). https://doi.org/10.14569/IJACSA.2016.071023
[3] Basel Almourad and Faouzi Kamoun. 2013. Accessibility Evaluation of Dubai e-Government Websites: Findings and Implications. Journal of E-Government Studies and Best Practices (09 2013), 1–15. https://doi.org/10.5171/2013.978647
ASSETS '22, October 23–26, 2022, Athens, Greece Nourian et al.

[4] Nabil Amara, Mathieu Ouimet, and Réjean Landry. 2004. New evidence on instrumental, conceptual, and symbolic utilization of university research in government agencies. Science Communication 26, 1 (2004), 75–106. https://doi.org/10.1177/1075547004267491
[5] Erin E Andrews, Anjali J Forber-Pratt, Linda R Mona, Emily M Lund, Carrie R Pilarski, and Rochelle Balter. 2019. #SaytheWord: A disability culture commentary on the erasure of "disability". Rehabilitation Psychology (2019). https://doi.org/10.1037/rep0000258
[6] Muhammad Bakhsh and Amjad Mehmood. 2012. Web accessibility for disabled: a case study of government websites in Pakistan. In 2012 10th International Conference on Frontiers of Information Technology. IEEE, 342–347. https://doi.org/10.1109/FIT.2012.68
[7] Giulia Barbareschi, Manohar Swaminathan, Andre Pimenta Freire, and Catherine Holloway. 2021. Challenges and Strategies for Accessibility Research in the Global South: A Panel Discussion. In X Latin American Conference on Human Computer Interaction (Valparaiso, Chile) (CLIHC 2021). Association for Computing Machinery, New York, NY, USA, Article 20, 5 pages. https://doi.org/10.1145/3488392.3488412
[8] Christoph Bartneck and Jun Hu. 2009. Scientometric Analysis of the CHI Proceedings. (2009), 699–708. https://doi.org/10.1145/1518701.1518810
[9] Ben Caldwell, Michael Cooper, L Guarino Reid, and Gregg Vanderheiden. 2008. Web Content Accessibility Guidelines (WCAG) 2.0. WWW Consortium (W3C) (2008). http://www.w3.org/TR/2008/REC-WCAG20-20081211/
[10] Michael Crabb, Michael Heron, Rhianne Jones, Mike Armstrong, Hayley Reid, and Amy Wilson. 2019. Developing Accessible Services: Understanding Current Knowledge and Areas for Future Support. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300446
[11] Iman Dianat, Pari Adeli, Mohammad Asgari Jafarabadi, and Mohammad Ali Karimi. 2019. User-centred web design, usability and user satisfaction: The case of online banking websites in Iran. Applied Ergonomics 81 (2019), 102892. https://doi.org/10.1016/j.apergo.2019.102892
[12] Henry Etzkowitz. 2003. Innovation in innovation: The triple helix of university-industry-government relations. Social Science Information 42, 3 (2003), 293–337. https://doi.org/10.1177/05390184030423002
[13] Center for Human Rights in Iran. 2020. Law to Protect the Rights of the Disabled. Retrieved June 7, 2021 from https://iranhumanrights.org/2020/01/english-translation-irans-law-to-protect-the-rights-of-the-disabled/
[14] G3ict. 2020. Digital Accessibility Rights Evaluation Index (DARE Index). Retrieved January 3, 2022 from https://g3ict.org/upload/accessible_DARE-Index-2020-Global-Progress-by-CRPD-States-Parties-ENGLISH.pdf
[15] G3ict. 2022. Digital Accessibility Rights Evaluation Index (DARE Index). Retrieved June 16, 2022 from https://g3ict.org/digital-accessibility-rights-evaluation-index/
[16] Danny Haelewaters, Tina A Hofmann, and Adriana L Romero-Olivares. 2021. Ten simple rules for Global North researchers to stop perpetuating helicopter research in the Global South. PLoS Computational Biology 17, 8 (2021), e1009277. https://doi.org/10.1371/journal.pcbi.1009277
[17] Vicki L. Hanson and John T. Richards. 2013. Progress on Website Accessibility? ACM Trans. Web 7, 1, Article 2 (March 2013), 30 pages. https://doi.org/10.1145/2435215.2435217
[18] Mohammad Hassanzadeh and Fatemeh Navidi. 2010. Web site accessibility evaluation methods in action: A comparative approach for ministerial web sites in Iran. The Electronic Library (2010). https://doi.org/10.1108/02640471011093499
[19] Chaomeng James Huang. 2003. Usability of e-government web-sites for people with disabilities. In Proceedings of the 36th Annual Hawaii International Conference on System Sciences. IEEE. https://doi.org/10.1109/HICSS.2003.1174330
[20] Iran. 2004. Comprehensive Law on Protection of the Rights of Persons with Disabilities. Retrieved June 7, 2021 from http://www.ilo.org/dyn/natlex/natlex4.detail?p_lang=en&p_isn=91491&p_country=IRN&p_count=168
[21] Lilly Irani, Janet Vertesi, Paul Dourish, Kavita Philip, and Rebecca E. Grinter. 2010. Postcolonial Computing: A Lens on Design and Development. Association for Computing Machinery, New York, NY, USA, 1311–1320. https://doi.org/10.1145/1753326.1753522
[22] Joanne M Kuzma. 2010. Accessibility design issues with UK e-government sites. Government Information Quarterly 27, 2 (2010), 141–146. https://doi.org/10.1016/j.giq.2009.10.004
[23] Jonathan Lazar, Daniel F. Goldstein, and Anne Taylor. 2015. Ensuring Digital Accessibility through Process and Policy (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. https://dl.acm.org/doi/book/10.5555/2815674
[24] Junchen Li, Garreth W. Tigwell, and Kristen Shinohara. 2021. Accessibility of High-Fidelity Prototyping Tools. In CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445520
[25] Jonathan Ling and Paul Van Schaik. 2002. The effect of text and background colour on visual search of Web pages. Displays 23, 5 (2002), 223–230.
[26] Sebastian Linxen, Christian Sturm, Florian Brühlmann, Vincent Cassau, Klaus Opwis, and Katharina Reinecke. 2021. How WEIRD is CHI?. In CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445488
[27] Stephanie Ludi, Matt Huenerfauth, Vicki Hanson, Nidhi Rajendra Palan, and Paula Conn. 2018. Teaching Inclusive Thinking to Undergraduate Students in Computing Programs. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (Baltimore, Maryland, USA) (SIGCSE '18). Association for Computing Machinery, New York, NY, USA, 717–722. https://doi.org/10.1145/3159450.3159512
[28] Sergio Luján-Mora, Rosa Navarrete, and Myriam Peñafiel. 2014. Egovernment and web accessibility in South America. In 2014 First International Conference on eDemocracy & eGovernment (ICEDEG). IEEE, 77–82. https://doi.org/10.1109/ICEDEG.2014.6819953
[29] Kelly Mack, Emma McDonnell, Dhruv Jain, Lucy Lu Wang, Jon E. Froehlich, and Leah Findlater. 2021. What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445412
[30] Rachel Menzies, Garreth W. Tigwell, Mandar Tamhane, and Annalu Waller. 2019. Weaving Accessibility Through an Undergraduate Degree. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 526–529. https://doi.org/10.1145/3308561.3354611
[31] Sedigheh Mohamadesmaeil and Somaye Koohbanani. 2012. Web Usability Evaluation of Iran National Library Website. Collnet Journal of Scientometrics and Information Management 6 (06 2012), 161–174. https://doi.org/10.1080/09737766.2012.10700931
[32] Sedigheh Mohamadesmaeil and Mahrokh Nassehi Oskouei. 2015. Comparative evaluation of the hospital library websites accessibility in Iran. (2015), 83–96. https://qje.ntb.iau.ir/article_516853.html
[33] United Nations. 2008. Convention on the Rights of Persons with Disabilities. Retrieved January 8, 2022 from https://treaties.un.org/pages/ViewDetails.aspx?src=IND&mtdsg_no=IV-15&chapter=4&clang=_en#9
[34] Rohan Patel, Pedro Breton, Catherine M. Baker, Yasmine N. El-Glaly, and Kristen Shinohara. 2020. Why Software is Not Accessible: Technology Professionals' Perspectives and Challenges. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA '20). Association for Computing Machinery, New York, NY, USA, 1–9. https://doi.org/10.1145/3334480.3383103
[35] Manas Ranjan Patra, Amar Ranjan Dash, and Prasanna Kumar Mishra. 2014. A Quantitative Analysis of WCAG 2.0 Compliance for some Indian Web Portals. International Journal of Computer Science, Engineering and Applications 4, 1 (2014), 9. https://doi.org/10.48550/arXiv.1710.08788
[36] Iran's National Laws Portal. 2018. Law to Protect the Rights of the Disabled. Retrieved June 17, 2021 from https://qavanin.ir/Law/TreeText/261251
[37] Anne Spencer Ross, Xiaoyi Zhang, James Fogarty, and Jacob O. Wobbrock. 2017. Epidemiology as a Framework for Large-Scale Mobile Application Accessibility Assessment. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17). Association for Computing Machinery, New York, NY, USA, 2–11. https://doi.org/10.1145/3132525.3132547
[38] Anne Spencer Ross, Xiaoyi Zhang, James Fogarty, and Jacob O. Wobbrock. 2018. Examining Image-Based Button Labeling for Accessibility in Android Apps through Large-Scale Analysis. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18). Association for Computing Machinery, New York, NY, USA, 119–130. https://doi.org/10.1145/3234695.3236364
[39] Kristen Shinohara, Saba Kawas, Amy J. Ko, and Richard E. Ladner. 2018. Who Teaches Accessibility? A Survey of U.S. Computing Faculty. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE '18). Association for Computing Machinery, New York, NY, USA, 197–202. https://doi.org/10.1145/3159450.3159484
[40] David Swallow, Christopher Power, Helen Petrie, Anna Bramwell-Dicks, Lucy Buykx, Carlos A Velasco, Aidan Parr, and Joshue O Connor. 2014. Speaking the Language of Web Developers: Evaluation of a Web Accessibility Information Resource (WebAIR). In International Conference on Computers for Handicapped Persons. Springer, 348–355. https://doi.org/10.1007/978-3-319-08596-8_54
[41] Garreth W. Tigwell. 2021. Nuanced Perspectives Toward Disability Simulations from Digital Designers, Blind, Low Vision, and Color Blind People. In CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445620
[42] Garreth W. Tigwell, David R. Flatla, and Neil D. Archibald. 2017. ACE: A Colour Palette Design Tool for Balancing Aesthetics and Accessibility. ACM Trans. Access. Comput. 9, 2 (Jan. 2017). https://doi.org/10.1145/3014588
[43] Garreth W. Tigwell, Rachel Menzies, and David R. Flatla. 2018. Designing for Situational Visual Impairments: Supporting Early-Career Designers of Mobile Content. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS '18). Association for Computing Machinery, New York, NY, USA, 387–399. https://doi.org/10.1145/3196709.3196760
[44] Garreth W. Tigwell, Kristen Shinohara, and Laleh Nourian. 2021. Accessibility Across Borders. In CHI '21 Workshop: Decolonizing HCI Across Borders (CHI Workshop '21). 1–4. https://arxiv.org/abs/2105.01488
[45] Human Rights Watch. 2018. "I Am Equally Human": Discrimination and Lack of Accessibility for People with Disabilities in Iran. Retrieved June 19, 2021 from https://www.hrw.org/report/2018/06/27/i-am-equally-human/discrimination-and-lack-accessibility-people-disabilities-iran#_ftn168
[46] Human Rights Watch. 2018. Iran: People with Disabilities Face Discrimination and Abuse. Retrieved August 20, 2021 from https://www.hrw.org/news/2018/06/26/iran-people-disabilities-face-discrimination-and-abuse
[47] Qiwen Zhao, Vaishnavi Mande, Paula Conn, Sedeeq Al-khazraji, Kristen Shinohara, Stephanie Ludi, and Matt Huenerfauth. 2020. Comparison of Methods for Teaching Accessibility in University Computing Courses. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3373625.3417013
Exploring Accessibility Features and Plug-ins for Digital Prototyping Tools

Urvashi Kokate (uk4890@rit.edu), Kristen Shinohara (kristen.shinohara@rit.edu), and Garreth W. Tigwell (garreth.w.tigwell@rit.edu)
School of Information, Rochester Institute of Technology, Rochester, NY, USA
ABSTRACT
Many digital systems are found to be inaccessible, and a large part of the issue is that accessibility is not considered early enough in the design process. Digital prototyping tools are a powerful resource for designers to quickly explore both low- and high-fidelity design mockups during the initial stages of product design and development. We evaluated 10 popular prototyping tools to understand their built-in and third-party accessibility features. We found that accessible design support comes largely from third-party plug-ins rather than prototyping tools' built-in features, and the availability of accessibility support varies from tool to tool. There is potential to improve accessible design by increasing the potential for accessibility to be considered earlier in the design process.

CCS CONCEPTS
• Human-centered computing → Accessibility.

ACM Reference Format:
Urvashi Kokate, Kristen Shinohara, and Garreth W. Tigwell. 2022. Exploring Accessibility Features and Plug-ins for Digital Prototyping Tools. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550391

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550391

1 INTRODUCTION AND RELATED WORK
Accessible design should not be an afterthought in the design process [4]. When accessibility issues are checked later in the design process, it requires more time, money, and effort in redesigning and reprogramming the product [9, 13, 23, 24]. Therefore, it is beneficial to move accessible design as early as possible in the design process [16].

Accessibility tools are an essential part of the design process to support people in meeting accessible design standards set by the Web Content Accessibility Guidelines (WCAG) [12], and mobile app design guidelines also incorporate WCAG criteria [2, 7, 17] due to an increasing need for mobile accessibility [1]. Designers and developers have access to a multitude of accessibility tools [5, 10]. However, they generally fall into two categories: those to be used during design and development, and those that run evaluations after development. Automated accessibility checking generally results in missing some issues compared to manual assessment [14], and there is also a lack of consistency among accessibility evaluation tools [6, 18]. However, there is potential with accessibility tools that support people in meeting WCAG (e.g., color contrast checkers), especially if they are designed to be used earlier in the design process (e.g., the Accessible Colour Evaluator, which was developed through a User-Centered Design process [22]).

Prototyping is a design method used early in the design process to lay down ideas and also support quicker evaluations [11, 20]. There are many approaches to facilitating prototyping practice, ranging from working offline (e.g., pen and paper) to using low-cost, readily available software (e.g., Microsoft PowerPoint) to high-end professional software (e.g., Adobe XD). We anticipated that prototyping tools are likely not providing enough feature support for users, since prior work already found prototyping tools themselves are not made to be accessible [15]. Some prototyping software also supports the use of plug-ins, and there is potential with this approach to offer accessibility plug-ins (e.g., Adee [8]).

There are many scenarios where prototyping tools could support accessible design: for example, notifying the user when color pairs are inaccessible due to not meeting WCAG minimum contrast criteria, warning when button target sizes may be too small for certain devices, or recommending accessible font styles and font sizes.

We formally evaluated 10 popular design tools that can be used to support digital prototyping to understand what accessibility support they provide. We found that only a few tools have built-in features to support accessible design, and most of the tools rely on third-party plug-ins that provide features to check for accessibility. The majority focused on color contrast checking and offered problematic color blind simulations. We recommend that companies and the HCI community focus more on providing a range of accessibility check features to support accessible design during one of the earliest points of the design process.

2 EVALUATION OF DESIGN PROTOTYPING TOOLS
Selection of tools. We selected 10 UI design tools from the 2020 design tools survey conducted by UX Tools [19]. Our selection included: Adobe Illustrator, Adobe Photoshop, Adobe XD, Affinity Designer, Axure, Figma, Framer, InVision Studio, Sketch, and UXPin.

Evaluation method. Our evaluation process had three steps. Step 1: Find and document any accessibility-related support reported in the prototyping tool's documentation and/or website. We first browsed through the website and then did a keyword search to look up specific sections of the page. We selected keywords related to visual design, accessibility, and guidelines, since interface
prototyping does not often have the ability for audio (see the list of keywords in the supplementary materials). We first checked the top-level menus and landing page of each tool's website. Then we did a keyword search in the help/support/documentation and blog sections of each website. Step 2: Check for any built-in accessible design features in the tools. We checked the top menu, toolbar, and right-clicked on design elements to identify any accessibility features the prototyping tools offer. Step 3: Check for third-party plug-ins for each tool that support accessible design. We searched plug-in managers with Step 1's list of keywords. We read the description of each plug-in and noted down its features. We also checked whether it was free, free with paid features, or paid.

2.1 Findings
2.1.1 Searching through each prototyping tool's documentation and website. We found two out of 10 tools (Affinity Designer, Framer) did not have any content related to accessibility. Six prototyping tools (Adobe XD, Axure, Figma, InVision Studio, Sketch, UXPin) featured blogs on different topics of accessibility such as 'what is accessibility', 'design practices for accessibility', and 'why designers should think about accessibility'. Adobe Illustrator and Adobe Photoshop mention their built-in color blind simulation features in their documentation. UXPin highlights its built-in color contrast checker and color blind simulation in the 'features' section of its website. Adobe XD and Figma also highlight the Stark plug-in, which provides different accessibility check features to users, such as a color contrast checker, a color blind simulator, and focus order.

A majority of the tools provide information on why and how to include accessibility in design, but only half of the tools mention how the tool itself can support users in designing or checking for accessibility. Also, the information provided is limited and hard to find on the website. None of the tools have a section in the top-level menu for accessibility on their website. To look up content, a user needs to search using different keywords. 'Accessibility' and 'contrast' were the keywords that returned informative content about accessibility for seven of the prototyping tools. We based how easy it was to search for information about accessibility on the number of results that were returned and the time taken to find relevant results out of the total returned results. More time was taken to look up content that returned more results. Figma proved most difficult for finding relevant content, because searching for 'accessibility' in the help center gave over 200 results, of which only 3 included the word accessibility, since most of the results were for the word 'access'. It was easiest to look up content on UXPin because it returned fewer results and the built-in features were highlighted on the feature page of the website, which was easier to spot.

Table 2 in the Supplementary Materials provides a breakdown of accessibility content for each prototyping tool.

2.1.2 Evaluating tools' built-in accessibility features. Affinity Designer and Sketch did not have any built-in accessibility features. We were able to identify five categories that built-in accessibility features would fit. The categories were: (1) color contrast checker (found in UXPin); (2) adding audio feedback and speech output (found in Adobe XD); (3) color blind simulation (found in Adobe Illustrator, Adobe Photoshop, UXPin); (4) voice prototyping (found in Adobe XD); and (5) adding keyboard navigation (found in Adobe XD, Axure, Figma, Framer, InVision Studio, UXPin).

UXPin is the only tool that had a built-in color contrast checker that automatically identifies text colors with insufficient contrast with respect to the WCAG guidelines. It also had a color blind simulator that allowed users to view their designs in eight different types of color blindness, as well as a prototyping feature to add a keyboard shortcut as a trigger to transition across screens. Adobe Illustrator and Adobe Photoshop both provided color blind simulation features, but they only allowed users to view Protanopia or Deuteranopia. Adobe XD was the only prototyping tool that allows users to add audio feedback and speech output in the design to provide guidance for users who can access the prototype only with sound. Adobe XD also supports voice prototyping, which allows users to design transitions through voice commands. Axure, Figma, Framer, and InVision also had the prototyping feature to add keyboard shortcuts for transitions. Table 3 in the Supplementary Materials provides an overview.

2.1.3 Evaluating third-party accessibility plug-ins. Affinity Designer, Axure, and UXPin did not support any third-party plug-ins. We were able to identify three general categories that plug-in accessibility features would fit, and we provide examples for each. Table 4 in the Supplementary Materials provides an overview of the plug-in accessibility feature category types found for each prototyping tool, while Tables 5–11 provide details on each prototyping tool's plug-ins.

Plug-ins used for accessibility checking: Color contrast checker: checks contrast between two color layers (foreground text color and background color); Touch target size checker: checks touch target size with respect to devices and shows if it violates guidelines; and Epilepsy checker: checks if images and animated GIFs in designs are safe for people with photosensitive epilepsy to view.

Plug-ins used to enhance accessibility: Alt text generator: adds alt text for images to share with developers; Focus orderer: adds focus order for keyboard navigation; Screen reader support: adds ARIA roles, ARIA properties, and tab index in designs; and Text resizer: creates legible text with respect to screen size.

Plug-ins used to support understanding impairments: Color blind simulation: tests designs with different color blind simulations; and Visual impairment simulation: checks elements against different types of visual impairments such as central loss, blind spots, hemianopia, peripheral loss, retinal detachment, and ocular albinism.

We list all third-party accessibility plug-ins in Table 1 by prototyping tool. Figma offered the most plug-ins: 17 in total. Though Figma had relatively more plug-ins, 11 out of the 17 plug-ins actually only included features to check color accessibility. Table 4 in the Supplementary Materials shows the 17 plug-ins cover seven accessibility feature categories. Most of the plug-ins were free to use. The Stark plug-in is common across the top three most popular tools (Adobe XD, Figma, and Sketch). Stark provides color contrast checking, a color blindness simulator, and a focus orderer, but there are limited functions within the free version of Stark. The paid features provide smart suggestions of colors to use if the current colors in the design do not adhere to accessibility guidelines. None of the free plug-ins provide color suggestions to the users.
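The color contrast checks these built-in features and plug-ins perform follow the formula defined in WCAG 2.0: compute the relative luminance of each color, then take the ratio of the lighter to the darker (offset by 0.05). A minimal sketch in Python, not taken from any of the evaluated tools:

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.0 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """Contrast ratio between two colors, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

def passes_aa(fg: tuple, bg: tuple, large_text: bool = False) -> bool:
    """WCAG 2.0 SC 1.4.3 (Level AA): 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)

contrast_ratio((0, 0, 0), (255, 255, 255))  # 21.0, the maximum possible ratio
```

For instance, mid-gray text (119, 119, 119) on white comes out just below 4.5:1, so a checker built on this formula would flag it for normal text but accept it for large text.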
Table 1: List of all third party plug-ins. Details of each tool and plug-in are found in supplementary materials.

Figma: Able, Zebra, Contrast, A11y - Color Contrast Checker, Color blind, Epilepsy Blocker, Stark, Color Contrast Grid, Cards for Humanity, A11y - Focus Orderer, Adee Comprehensive Accessibility Tool, Contrast, Contrast Grid, HCL Easy, Color Contrast, Low Vision
Sketch: Stark, Adee Comprehensive Accessibility Tool, Cluse, Color Contrast Analyzer, Check Contrast, Color Blindless, Sketch WCAG
Adobe XD: Stark, Colorsinspo, Dopely colors
Adobe Illustrator: Pantone Connect
Adobe Photoshop: Pantone Connect, Check Contrast Ratio
InVision Studio: Contrast
Framer: Color Contrast checker, Color check, Accessibility Tool Kit
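Most of these plug-ins center on color contrast checking and color blindness simulation. As a rough, self-contained sketch of what such checkers compute (a minimal illustration based on the WCAG 2.1 definitions, not the implementation of Stark or any plug-in listed in Table 1), a contrast-ratio check and a crude total color blindness simulation look like this:

```python
def _linearize(channel: int) -> float:
    # sRGB 8-bit channel -> linear-light value, per the WCAG 2.1
    # relative luminance definition.
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb: tuple) -> float:
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    # WCAG contrast ratio: (L_lighter + 0.05) / (L_darker + 0.05),
    # ranging from 1:1 (identical colors) to 21:1 (black on white).
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

def simulate_achromatopsia(rgb: tuple) -> tuple:
    # Very crude simulation of *total* color blindness: collapse each
    # color to its luminance. Partial deficiencies (protanopia, etc.)
    # need cone-response models, which is one reason simple simulators
    # are limited in accuracy.
    y = relative_luminance(rgb)
    # Re-encode the linear luminance back to gamma-encoded sRGB.
    c = 12.92 * y if y <= 0.0031308 else 1.055 * y ** (1 / 2.4) - 0.055
    v = round(c * 255)
    return (v, v, v)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

WCAG 2.1 level AA requires a ratio of at least 4.5:1 for normal text and 3:1 for large text; a paid "smart color suggestion" feature would search for replacement colors that clear these thresholds.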
Epilepsy Blocker is a paid plug-in available only on Figma. It was the only plug-in that allows users to check whether images and animated GIFs in designs are safe for people with photosensitive epilepsy. The Adee Comprehensive Accessibility Tool plug-in provided the most features and is available to install in Figma and Sketch only. It was completely free and included features such as an alt text generator, a touch target size checker, a color blindness simulator, and a contrast checker. It also allows users to generate a report of any guideline violations found in a design with respect to touch target size and color contrast. The plug-ins Visual Impairment Simulation and Epilepsy Checker were only available for Figma. A screen reader support feature was provided by a plug-in available only on Framer. Color blindness simulators and color contrast checkers were the most popular features offered by plug-ins: six of the seven prototyping tools that support third party plug-ins had color contrast checkers and color blindness simulators.

3 DISCUSSION
We found that although prototyping tools do offer some level of support for accessible design, there is still room for improvement. We found evidence of prototyping tool websites providing information about accessible design practice and about the support their tools offered. However, the information was hard to find, since none of the websites had a homepage that linked to an accessibility features support page.
Furthermore, we found the built-in accessibility features in prototyping tools to be limited. The common features were the ability to add keyboard navigation, check color contrast, and run a color blindness simulator. In fact, the majority focused on color contrast checking and offered color blind simulations. We want to acknowledge that color blindness simulations are often limited in accuracy and disability simulations are problematic [3, 21]. It is concerning that so many of the tools and plug-ins focused on color blindness simulations without adequate information on the limitations of using such a feature [21]. Although use of color is a prominent aspect of design, we find that more effort needs to be directed toward creating accessibility features for other accessible design criteria. It would be useful to conduct follow-up work to understand user perspectives on which features to prioritize and how.
While we found fewer accessibility features provided within the tools themselves, we did notice a decent offering from third party plug-ins that either supported the creation or evaluation of accessible design. It is clear that Figma and Sketch have benefited from opening their platforms up to the design community to allow designers and developers to contribute new prototyping features. Figma had 17 accessibility plug-ins and Sketch had seven, both with a variety of features, whereas the other prototyping tools had only one to three plug-ins that provided basic color contrast checks and color blindness simulations.
However, if prototyping tools are relying on third party developers rather than adding built-in accessibility features, then we do want to reflect on the potential limitations of this. If a specific accessibility feature is only available as a third party plug-in, then the user has to actively seek it out by searching through the plug-in store. A new user of the prototyping tool may not know that plug-ins can be installed. Finally, there is a question of validity: there is no clear vetting process, and what determines a plug-in's success is likely community based (e.g., ratings, comments, feedback). The prototyping tool companies are potentially in a better position to develop accessibility features that accurately conform to current standards and best practice guidelines.
For future work, we will employ qualitative methods to interview different stakeholders (e.g., designers and prototyping tool companies) to understand their attitudes toward, and concerns about, built-in and plug-in accessibility features. We also plan to run additional evaluations of the accessibility features to understand how they are used within the design process and individual workflow styles, to identify whether there are opportunities to improve usability.

4 CONCLUSION
Designers should include accessibility practices in the early phases of design, such as while creating design prototypes. Prototyping tools can provide good assistance to designers in verifying design accessibility. We evaluated 10 design prototyping tools to investigate the current accessibility assistance they provide. Support is largely offered in the form of third-party plug-ins, with minimal assistance in the form of built-in features, and the availability of these features varies from tool to tool. We argue that there is potential to improve accessible design by enabling accessibility to be considered earlier in the design process, but the current approach needs more refinement.

REFERENCES
[1] Shadi Abou-Zahra, Judy Brewer, and Shawn Lawton Henry. 2013. Essential Components of Mobile Web Accessibility. In Proceedings of the 10th International
ASSETS ’22, October 23–26, 2022, Athens, Greece Kokate et al.
Cross-Disciplinary Conference on Web Accessibility (Rio de Janeiro, Brazil) (W4A '13). Association for Computing Machinery, New York, NY, USA, Article 5, 4 pages. https://doi.org/10.1145/2461121.2461138
[2] Apple. n.d. Human Interface Guidelines: Visual Design. https://developer.apple.com/design/human-interface-guidelines/ios/overview/themes/. Accessed: 2020-07-18.
[3] Cynthia L. Bennett and Daniela K. Rosner. 2019. The Promise of Empathy: Design, Disability, and Knowing the "Other". In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3290605.3300528
[4] Vicente Luque Centeno, Carlos Delgado Kloos, Martin Gaedke, and Martin Nussbaumer. 2005. Web Composition with WCAG in Mind. In Proceedings of the 2005 International Cross-Disciplinary Workshop on Web Accessibility (W4A) (Chiba, Japan) (W4A '05). Association for Computing Machinery, New York, NY, USA, 38–45. https://doi.org/10.1145/1061811.1061819
[5] Lisa Dziuba. 2019. Accessibility tools for designers and developers. https://uxdesign.cc/accessibility-tools-for-designers-and-developers-ea400a415c0a. Last Accessed: 2022-6-11.
[6] Tânia Frazão and Carlos Duarte. 2020. Comparing Accessibility Evaluation Plug-Ins. In Proceedings of the 17th International Web for All Conference (Taipei, Taiwan) (W4A '20). Association for Computing Machinery, New York, NY, USA, Article 20, 11 pages. https://doi.org/10.1145/3371300.3383346
[7] Google. n.d. Material Design. https://material.io. Accessed: 2020-07-18.
[8] Samine Hadadi. 2021. Adee: Bringing Accessibility Right Inside Design Tools. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 101, 4 pages. https://doi.org/10.1145/3441852.3476478
[9] Shawn Lawton Henry and Andrew Arch. 2012. Financial factors in developing a web accessibility business case for your organization. W3C. http://www.w3.org/WAI/bcase/fin#decreasing. Accessed: 2014-08-16.
[10] Shayna Hodkin. 2019. 8 tools that make accessible design easier. https://www.invisionapp.com/inside-design/accessibility-tools/. Last Accessed: 2022-6-11.
[11] Gopinaath Kannabiran and Susanne Bødker. 2020. Prototypes as Objects of Desire. Association for Computing Machinery, New York, NY, USA, 1619–1631. https://doi.org/10.1145/3357236.3395487
[12] Andrew Kirkpatrick, Joshue O'Connor, Alastair Campbell, and Michael Cooper. 2018. Web Content Accessibility Guidelines (WCAG) 2.1.
[13] Chris Law, Julie Jacko, and Paula Edwards. 2005. Programmer-Focused Website Accessibility Evaluations. In Proceedings of the 7th International ACM SIGACCESS Conference on Computers and Accessibility (Baltimore, MD, USA) (Assets '05). Association for Computing Machinery, New York, NY, USA, 20–27. https://doi.org/10.1145/1090785.1090792
[14] Jonathan Lazar, Patricia Beere, Kisha-Dawn Greenidge, and Yogesh Nagappa. 2003. Web accessibility in the Mid-Atlantic United States: a study of 50 homepages. Universal Access in the Information Society 2, 4 (2003), 331–341. https://doi.org/10.1007/s10209-003-0060-z
[15] Junchen Li, Garreth W. Tigwell, and Kristen Shinohara. 2021. Accessibility of High-Fidelity Prototyping Tools. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI '21). Association for Computing Machinery, New York, NY, USA, Article 493, 17 pages. https://doi.org/10.1145/3411764.3445520
[16] Adriana Martín, Alejandra Cechich, and Gustavo Rossi. 2011. Accessibility at Early Stages: Insights from the Designer Perspective. In Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (Hyderabad, Andhra Pradesh, India) (W4A '11). Association for Computing Machinery, New York, NY, USA, Article 9, 9 pages. https://doi.org/10.1145/1969289.1969302
[17] Microsoft. n.d. Universal Windows Platform documentation. https://docs.microsoft.com/en-gb/windows/uwp. Accessed: 2020-07-18.
[18] Marian Padure and Costin Pribeanu. 2020. Comparing six free accessibility evaluation tools. Informatica Economica 24, 1 (2020), 15–25. https://doi.org/10.24818/issn14531305/24.1.2020.02
[19] Taylor Palmer and Jordan Bowman. 2020. 2020 Design Tools Survey. https://uxtools.co/survey-2020#conclusion. Last Accessed: 2022-6-11.
[20] K. Schneider. 1996. Prototypes as assets, not toys. Why and how to extract knowledge from prototypes. (Experience report). In Proceedings of IEEE 18th International Conference on Software Engineering. 522–531. https://doi.org/10.1109/ICSE.1996.493446
[21] Garreth W. Tigwell. 2021. Nuanced Perspectives Toward Disability Simulations from Digital Designers, Blind, Low Vision, and Color Blind People. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Article 378, 15 pages. https://doi.org/10.1145/3411764.3445620
[22] Garreth W. Tigwell, David R. Flatla, and Neil D. Archibald. 2017. ACE: A Colour Palette Design Tool for Balancing Aesthetics and Accessibility. ACM Trans. Access. Comput. 9, 2, Article 5 (Jan. 2017), 32 pages. https://doi.org/10.1145/3014588
[23] Shari Trewin, Brian Cragun, Cal Swart, Jonathan Brezin, and John Richards. 2010. Accessibility Challenges and Tool Features: An IBM Web Developer Perspective. In Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A) (Raleigh, North Carolina) (W4A '10). Association for Computing Machinery, New York, NY, USA, Article 32, 10 pages. https://doi.org/10.1145/1805986.1806029
[24] Brian Wentz, Paul T Jaeger, and Jonathan Lazar. 2011. Retrofitting accessibility: The legal inequality of after-the-fact online access for persons with disabilities in the United States. First Monday 16, 11 (Nov. 2011). https://doi.org/10.5210/fm.v16i11.3666
Does XR introduce experience asymmetry in an intergenerational setting?

Vibhav Nanda, Independent Researcher, vibhavnanda1996@gmail.com
Hanuma Teja Maddali, Department of Computer Science, University of Maryland College Park, USA, hmaddali@umd.edu
Amanda Lazar, College of Information Studies, University of Maryland College Park, USA, lazar@umd.edu
ABSTRACT
Intergenerational social interactions are beneficial for bridging generational gaps, strengthening family bonds, and improving social cohesiveness. However, opportunities for in-person intergenerational social interactions are decreasing as families become increasingly geographically dispersed. Researchers are examining how technology might support these interactions. Extended Reality (XR) is an emerging technology that has shown potential for supporting immersive remote interactions but might cause an "experience asymmetry" in an intergenerational setting. In this poster we contrast the user experience of younger and older participants engaging in remote gardening sessions with our social XR prototypes. We present systemic influence factors that affected the user experience of participants from different age groups differently with our XR prototypes. We discuss potential approaches to mitigate their effects based on observational learning and becoming aware of designer biases.

CCS CONCEPTS
• Human-centered computing → Human computer interaction (HCI); Accessibility; Empirical studies in accessibility.

KEYWORDS
Extended Reality, Intergenerational Interaction, Gardening, Accessibility

ACM Reference Format:
Vibhav Nanda, Hanuma Teja Maddali, and Amanda Lazar. 2022. Does XR introduce experience asymmetry in an intergenerational setting?. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550373

1 INTRODUCTION AND RELATED WORK
Intergenerational social interactions are beneficial for bridging generational gaps [7], strengthening family bonds [7], improving social cohesiveness [18], and reducing social isolation [14]. A growing body of literature is studying how to support remote intergenerational interactions using conventional technology (e.g., laptops, tablets) [14] and remote socialization mechanisms such as online platforms [1]. However, these technologies and mechanisms (e.g., video chat, social media) may not preserve the dynamic of real-world social interactions, but extended reality (XR) can help maintain this dynamic, which can inspire joy [1] among older adults and positively influence their social well-being [12] during remote interactions. XR is an umbrella term for a set of emerging technologies (augmented, virtual, and mixed reality) that allow different compositions of real and virtual objects in the user's field of view [19]. Immersion, Imagination, and Multimodal Interaction (the 3i's) [3] are core features of XR that, in view of their potential benefits for older adult users, call for the examination of XR-mediated intergenerational social interactions.
The focus of designers trying to support remote intergenerational interactions also tends to be game-centric [7, 14, 18], the competitive nature of which may not appeal to older adults [14]. Here, we examine an approach that instead focuses on casual interactions that could be better suited to support daily social participation. The benefits of activities that are more representative of daily social participation, such as gardening, can be very impactful. We use gardening since it possesses health benefits [10], is a popular everyday activity among older adults [20, 21], and has been found to increase intergenerational social interactions [20].
This poster presents a secondary analysis of data from interviews with 13 gardener dyads about the usability of our social XR prototypes in supporting remote learning and social interaction in a garden setting. We contribute an understanding of systemic factors of influence that could negatively affect older adults' (65 years and above) user experience with social XR. We examined an "experience asymmetry" where some older participants' user experience was negatively affected, and younger participants' user experience was positive or not visibly affected.

2 DATA, METHODS, AND ANALYSIS
Our data was collected over the course of two studies to identify perceptions of XR and evaluate social XR prototypes for remote teaching, learning, and connecting in informal settings with friends, family, and other hobbyists. We analyzed 15 hours of transcripts from interviews with 13 gardener dyads. The 13 dyads (average age difference = 22 years) included 6 older adults (average age = 70.3 years; 5 female, 1 male) and 21 relatively younger adults and teens (average age = 30.7 years; 14 female, 7 male). The prototypes evaluated represented an XR system where a remote user wearing a VR headset can view and interact (e.g., annotate) with the garden of an on-site user who uses an AR device.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550373
Figure 1: From Study 1, first-person view of "remote" participant wearing Oculus Quest (left), and "remote" user wearing VR headset and onsite participant using tablet as AR device (right).

Figure 2: From Study 2, a physical garden (top-left), its 3D model from photogrammetry (top-right), and first-person views of a dyad walking through a study scenario using the 3D model (bottom-left and bottom-right).
In Study 1, 9 gardener dyads walked through three scenarios based on frequent activities when teaching/learning gardening (e.g., garden planning) in a lab-simulated remote setting and assessed the potential strengths and drawbacks of XR prototypes when facilitating key interactions for remote instruction as well as informal social interactions. Participants worked together in a 360-image representation of a familiar garden using an Oculus Quest (remote VR user) or tablet (onsite AR user) (Figure 1). Although only 1 dyad in this study had an older adult, information from this dyad helped with contrasting older adults' user experience with that of younger participants in other dyads. Study 2 involved 4 gardener dyads with at least one older adult (above 65 years) and a gardener who was at least one generation younger. The dyads walked through scenarios from Study 1 in a naturalistic setting with one participant in a physical garden. Participants were also able to use basic avatars and a photogrammetrically reconstructed digital twin (Figure 2) of a pre-selected garden site instead of a 360 image. This allowed for detailed feedback on practical challenges with using the system and on how the virtual representation of the activity space and participants might impact remote interactions in an intergenerational setting.
Constructivist grounded theory [4] was used to analyze the transcripts of the two studies. For Study 2, the first author engaged in an iterative process of open coding, memo writing, pattern identification, and axial coding to find emergent themes. Through team discussions, interesting themes were identified and associated codes were finalized, which were then used for focused coding of the transcripts from Study 1. After completing focused coding, we looked for influence factors by identifying and examining experience asymmetry: when systemic factors have opposite influences on the user experience of two different user groups. We decided to focus more on factors that negatively affected older adult participants' experience than younger participants' because commercially available XR may be designed around younger users' and designers' preferences. This analysis can also serve to raise awareness of design (or methodology) biases that can arise from study prototype designers being younger adults, rather than attributing difficulties to a deficit model of aging that tends to position older adults as technophobes.
3 FINDINGS
In this section, we identify three systemic influence factors and present example instances where they influenced older adults' user experience differently compared with younger participants using our social XR prototypes.

3.1 XR immersion felt different for some older and younger participants
We find that the immersive capability of XR negatively affected some older participants' user experience while simultaneously positively affecting younger participants' user experience. For example, S1P7 (older participant) chose not to stand up from her chair when she was immersed in the virtual environment created by the VR headset. When the second author explained that she could navigate the virtual environment by moving in the physical world, S1P7 said that even though "I understand that intellectually. That doesn't mean that I want to do it. (With) my luck, the thing would fail." A fear of "crashing" into the wall while immersed in the virtual environment negatively affected S1P7's user experience, as it prevented her from engaging with the technology's main purpose. On the other hand, S2P2 (younger participant) felt excited by the immersion and described a positive experience with it because, according to her, while immersed "you can actually feel like you're there [in the garden]."

3.2 Interactions that change the virtual environment can be differently interpreted
Our prototypes allowed users to interact with a virtual representation of the garden area in various ways, one of which was to manipulate virtual proxies for certain physical objects in the environment in a way that isn't possible in the real world. However, this capability was harder to interpret for some older participants, which negatively affected their user experience, but positively affected younger participants' user experience. For example, a prototype in Study 2 allowed participants to move the sun in the virtual environment using a slider, which changed the distribution of sunlight across the virtual garden. After attempting to move the sun, S2P1 (older participant) stated that "I'd just rather step out my back and see what the sun is doing." This suggestion to instead observe the effects of the sun in the physical world disregarded the utility of user interactions that manipulate the virtual environment and was indicative of the difficulty S2P1 had interpreting this capability. While talking about her experience trying to move the sun, S2P1 said that "I don't feel very comfortable being on this [VR headset] . . . with regard to gardening, when I can just easily step out my back door and see what I need to see [the sun]." This feedback from S2P1 linked the negative effect on her user experience, in the form of discomfort, to the difficulty she had interpreting this capability. However, S2P2 (younger participant) found the ability to change the amount of sunlight "interesting," as it enabled her to "know what it [the garden] looks like throughout the day" and to access sunny weather even if it was raining outside. The capability to change the virtual environment through interactions positively affected S2P2's user experience.

3.3 Different reactions to sensory limitations of XR
Participants had noted that all five senses are used for gardening in the physical world. However, smell, taste, and touch couldn't be recreated in the prototypes' virtual environments due to the limitations of current commercially available XR. This limitation negatively affected some older participants' user experience but did not seem to have a significant negative effect on younger participants' user experience, which remained positive overall. For instance, S1P7 (older participant) talked about how "I could do just as much if we had a computer screen and remote because I could draw on the computer screen. . . and I really wouldn't need the virtual reality". The missing experience of "smell" and "feel [touch]" in the virtual garden environment, and the elements of sight and sound shared with a conventional computer, made her question the advantage of remote sessions in XR and seemed to negatively impact her user experience. S1P5 (younger participant) also identified that the sense of touch was missing from his experience in the virtual environment. He explained that in the physical world he uses touch to assess soil quality, saying that he "feels" for "how well it [soil] sticks together, how it feels in my hand," and added that he couldn't use touch in the virtual environment as he would in the physical world. However, S1P5 still indicated a more positive attitude towards XR and mentioned "I think it's [prototype's] cool." The sensory limitations of XR did not have a significant negative impact on his user experience.

4 DISCUSSION AND FUTURE WORK
Our findings presented three systemic influence factors that created an experience asymmetry for intergenerational participants by negatively affecting some older adults' user experience with the prototypes and positively or not visibly affecting younger participants'. We see that for older adults, these factors were related to the sensory limitations of XR and to core features of an XR experience such as immersion and the interpretability of interactions with virtual objects.
While we only identified some factors causing experience asymmetry in this poster, further empirical data is required to understand the possible mechanisms underlying the causes and effects of these factors and to identify optimal user-centered approaches to mitigate their negative effects. For example, it is possible that these factors are linked to affinity for novel technologies or past experience with XR rather than age. Regardless, our data indicates that these potential barriers, particularly those related to immersion and multimodal interactions with the virtual environment, are areas of interest for future work and generalizable to more application domains and settings.
Future work could explore approaches that balance the experience asymmetry caused by these factors during the introduction to XR devices and technology. This could be done during the participatory design process or during a user's initial onboarding with the device. Unlike manuals or solo training, learning from a known partner while onboarding could provide opportunities for remote social interaction, as we found in our own gardener dyad sessions. Past work has also shown that observational learning [15, 16] during onboarding can help older adults learn new technology related
skills. Such interventions can also help older adults feel positively about their ability [5, 8, 9, 16] to engage with the 3i's of XR [3], because self-belief is essential to actualizing the benefits of technology [2, 8, 9]. Alternatively, we can lean into this asymmetrical experience aspect and present different interfaces for users with different attitudes towards XR. This approach, rather than seeing older participants' attitudes towards XR technology as a problem in need of fixing, would mean that developers slow down on pursuing novel experiences to focus on ensuring that all users can access, and are interested in, cutting-edge technologies.
In this poster, we chose gardening as an activity to provide a setting for casual social interaction and to represent a meaningful everyday activity that is outside of the game-focused approach of past remote technology-mediated intergenerational literature [7, 14, 18]. Future work can similarly look to understand other experience asymmetry factors that might be visible in less multi-sensory, or even learning-focused, intergenerational settings (e.g., remote storytelling [6, 11, 13, 17]).

ACKNOWLEDGMENTS
This work was supported in part by grant 90REGE0008, U.S. Administration for Community Living, National Institute on Disability, Independent Living and Rehabilitation Research, Department of Health and Human Services. Opinions expressed do not necessarily represent official policy of the Federal government.

REFERENCES
[1] Steven Baker, Jenny Waycott, Romina Carrasco, Thuong Hoang, and Frank Vetere. 2019. Exploring the Design of Social VR Experiences with Older Adults. In Proceedings of the 2019 on Designing Interactive Systems Conference (DIS '19), Association for Computing Machinery, New York, NY, USA, 303–315. DOI:https://doi.org/10.1145/3322276.3322361
[2] A. Bandura. 1977. Self-efficacy: toward a unifying theory of behavioral change. Psychol Rev 84, 2 (March 1977), 191–215. DOI:https://doi.org/10.1037//0033-295x.84.2.191
[3] Grigore C. Burdea and Philippe Coiffet. 2003. Virtual Reality Technology. John Wiley & Sons.
[4] Kathy Charmaz. 2006. Constructing Grounded Theory: A Practical Guide through Qualitative Analysis. SAGE.
[5] Adeline Chu and Beth Mastel-Smith. 2010. The outcomes of anxiety, confidence, and self-efficacy with Internet health information retrieval in older adults: a pilot study. Comput Inform Nurs 28, 4 (August 2010), 222–228. DOI:https://doi.org/10.1097/NCN.0b013e3181e1e271
[6] Jennifer Healey, Duotun Wang, Curtis Wigington, Tong Sun, and Huaishu Peng. 2021. A Mixed-Reality System to Promote Child Engagement in Remote Intergenerational Storytelling. In 2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 274–279. DOI:https://doi.org/10.1109/ISMAR-Adjunct54149.2021.00063
[7] Eng Tat Khoo, Tim Merritt, and Adrian David Cheok. 2009. Designing physical and social intergenerational family entertainment. Interacting with Computers 21, 1–2 (January 2009), 76–87. DOI:https://doi.org/10.1016/j.intcom.2008.10.009
[8] Luciana Laganà. 2008. Enhancing the Attitudes and Self-Efficacy of Older Adults Toward Computers and the Internet: Results of a Pilot Study. Educ Gerontol 34, 9 (September 2008), 831. DOI:https://doi.org/10.1080/03601270802243713
[9] Luciana Laganà, Taylor Oliver, Andrew Ainsworth, and Marc Edwards. 2011. Enhancing computer self-efficacy and attitudes in multi-ethnic older adults: a randomised controlled study. Ageing Soc 31, 6 (August 2011), 911–933. DOI:https://doi.org/10.1017/S0144686X10001340
[10] Chhian Hūi Lêng and Jung-Der Wang. 2016. Daily home gardening improved survival for older people with mobility limitations: an 11-year follow-up study in Taiwan. Clin Interv Aging 11 (July 2016), 947–959. DOI:https://doi.org/10.2147/CIA.S107197
[11] Cun Li, Jun Hu, Bart Hengeveld, and Caroline Hummels. 2019. Story-Me: Design of a System to Support Intergenerational Storytelling and Preservation for Older Adults. In Companion Publication of the 2019 on Designing Interactive Systems Conference 2019 Companion (DIS '19 Companion), Association for Computing Machinery, New York, NY, USA, 245–250. DOI:https://doi.org/10.1145/3301019.3323902
[12] Charles Xueyang Lin, Chaiwoo Lee, Dennis Lally, and Joseph F. Coughlin. 2018. Impact of Virtual Reality (VR) Experience on Older Adults' Well-Being. In Human Aspects of IT for the Aged Population. Applications in Health, Assistance, and Entertainment, Jia Zhou and Gavriel Salvendy (eds.). Springer International Publishing, Cham, 89–100. DOI:https://doi.org/10.1007/978-3-319-92037-5_8
[13] Hayes Raffle, Glenda Revelle, Koichi Mori, Rafael Ballagas, Kyle Buza, Hiroshi Horii, Joseph Kaye, Kristin Cook, Natalie Freed, Janet Go, and Mirjana Spasojevic. 2011. Hello, is grandma there? let's read! StoryVisit: family video chat and connected e-books. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '11), Association for Computing Machinery, New York, NY, USA, 1195–1204. DOI:https://doi.org/10.1145/1978942.1979121
[14] Logan Reis, Kathryn Mercer, and Jennifer Boger. Technologies for fostering intergenerational connectivity and relationships: Scoping review and emergent concepts. DOI:https://doi.org/10.1016/j.techsoc.2020.101494
[15] Colleen D. Story. 2010. Observational Learning among Older Adults Living in Nursing Homes. ProQuest LLC.
[16] Doreen Struve and Hartmut Wandke. 2009. Video Modeling for Training Older Adults to Use New Technologies. ACM Trans. Access. Comput. 2, 1 (May 2009), 4:1-4:24. DOI:https://doi.org/10.1145/1525840.1525844
[17] Duotun Wang, Jennifer Healey, Jing Qian, Curtis Wigington, Tong Sun, and Huaishu Peng. 2021. Lets Make A Story Measuring MR Child Engagement. Retrieved May 18, 2022 from http://arxiv.org/abs/2104.06536
[18] Z. Zhou, A.D. Cheok, S.P. Lee, L.N. Thang, C.K. Kok, W.Z. Ng, Y.K. Cher, M.L. Pung, and Y. Li. 2005. Age Invader: human media for natural social-physical inter-generational interaction with elderly and young. In Proceedings of the 2005 International Conference on Active Media Technology (AMT 2005), 203–204. DOI:https://doi.org/10.1109/AMT.2005.1505308
[19] XR Access Symposium Report. Google Docs. Retrieved June 8, 2022 from https://docs.google.com/document/d/131eLNGES3_2M5_roJacWlLhX-nHZqghNhwUgBF5lJaE/edit?usp=sharing&usp=embed_facebook
[20] 2014-NGA-Garden-to-Table.pdf. Retrieved June 8, 2022 from https://garden.org/special/pdf/2014-NGA-Garden-to-Table.pdf
[21] Gardening for older people - Better Health Channel. Retrieved May 19, 2022 from https://www.betterhealth.vic.gov.au/health/healthyliving/gardening-for-older-people
Flexible Activity Tracking for Older Adults Using Mobility Aids — An Exploratory Study on Automatically Identifying Movement Modality

Dimitri Vargemidis (dimitri.vargemidis@kuleuven.be), KU Leuven, Leuven, Belgium
Kathrin Gerling (kathrin.gerling@kuleuven.be), KU Leuven, Leuven, Belgium
Luc Geurts (luc.geurts@kuleuven.be), KU Leuven, Leuven, Belgium
Vero Vanden Abeele (vero.vandenabeele@kuleuven.be), KU Leuven, Leuven, Belgium
ABSTRACT
Wearable activity trackers are inaccessible to older adults who use mobility aids (e.g., walker, wheelchair), because the accuracy of trackers drops considerably for such movement modalities (MMs). As an initial step to address this problem, we implemented and tested a minimum distance classifier to automatically identify the used MM out of seven modalities, including movement with or without a mobility aid, and no movement. Depending on the test setup, our classifier achieves accuracies between 82 % and 100 %. These findings can be leveraged in future work to combine the classifier with algorithms tailored to each mobility aid to make activity trackers accessible to users with limited mobility.

CCS CONCEPTS
• Human-centered computing → Accessibility; • Computing methodologies → Machine learning.

KEYWORDS
wearables, activity tracking, mobility aids, accessibility, machine learning

ACM Reference Format:
Dimitri Vargemidis, Kathrin Gerling, Luc Geurts, and Vero Vanden Abeele. 2022. Flexible Activity Tracking for Older Adults Using Mobility Aids — An Exploratory Study on Automatically Identifying Movement Modality. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550371

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550371

1 INTRODUCTION
Wearables for activity tracking are routinely leveraged to measure physical activity (PA) in medical and research settings (e.g., [6, 7]), and can be used for self-reflection on activity routines [12]. Particularly in the context of older adults, activity tracking supports physicians and therapists to follow up and better understand the rehabilitation progress of their patients [3, 6], e.g., while recovering from surgery or injuries after a fall, or coping with back pain and other chronic conditions. In terms of personal exercise, activity trackers help older adults to keep an automated record of their levels of PA, which can make them more aware of their daily activity, fitness, and physical wellbeing in general [16].

However, currently available wearable tracking systems do not adapt well to this audience. We identified two main problems. (1) The accuracy of step counting for older adults is lower than for younger adults. A lower walking speed and divergent gait are the most important reasons for this problem [4, 14], and result, in most cases, in an underestimation of the number of steps [14]. (2) Existing activity trackers are not accessible to older adults who use mobility aids, as these movement modalities are not supported, resulting in even less accurate results [10]. Considering the likelihood of older adults needing such mobility aids in late life due to sudden (e.g., surgery, rehabilitation) or gradual changes (e.g., loss of muscle strength, aching joints), staying physically active and being aware of their own levels of PA is particularly important [5].

Several studies discuss and compare the accuracy of activity trackers in different settings. Activity trackers are designed to operate within a set parameter range. For example, walking slowly or running fast negatively impacts the accuracy of the computed step count [13]. For older adults, with a slower gait in general, consumer-grade activity trackers can have accuracies ranging from 18 % to 78 %, depending on the tracker’s brand and the user’s walking speed [14]. Younger adults’ step count accuracies are on average 90 % [1]. Mobility aids further impede the accuracy of existing activity trackers, rendering them unreliable for use in clinical contexts [8]. Because of these known accuracy issues, stand-alone algorithms have been designed to address this, focusing on one MM at a time. For example, with an improved algorithm tailored to rollator use, accuracies of about 84 % can be achieved, compared to 10 % with out-of-the-box wearable trackers [11]. Moreover, studies discuss the implementation of an activity tracking system that has separate algorithms for walking and running activities, using machine
learning to decide which algorithm should be applied to each situation. For example, Bagui et al. report an accuracy of 93 % with this approach [2].

With a similar strategy, we aspire to build a flexible activity tracking system for older adults to (1) recognise their MM, and (2) apply an algorithm for that specific modality to accurately compute the achieved levels of PA. In this study, we focus on the first objective, and consider six MMs: walking with a cane, crutches, walker, rollator, using a wheelchair, or walking without a mobility aid (Figure 1). Our goal is to investigate the possibility of identifying the user’s MM based on sensor data from a single inertial measurement unit (IMU). Since this is part of an exploratory study, we recorded a dataset with only three younger participants (two persons in the age range of 25-35, and one in 60-65). This allowed us to avoid exposing older adults to health risks in light of the ongoing COVID-19 pandemic. Next, we implemented a minimum distance classifier, and used the dataset to verify the accuracy of our classifier’s prediction.

2 DATASET
We gathered 3-axis accelerometer and gyroscope data from three participants engaging in the six MMs we considered in this study. In addition to these MMs, we wanted to distinguish between a person being in motion or not. Therefore, we also collected data while the participants were sitting down calmly. For each modality, we recorded sensor data in a combination of different setups: having the IMU attached to either the participant’s wrist or to the mobility aid being used, and recording the data either inside (smooth floor) or outside (uneven pavement). For each specific setup, we made two recordings (Table 1): the first one to train our classification model, the second one to verify the accuracy of its predictions.

Each recording is about 60 seconds long, and data is sampled at a rate of approximately 20 Hz. We used an Adafruit Flora microcontroller and a BNO055 IMU to register all activity and send the sensor data in real time to a laptop, where everything was stored in a CSV file. We recorded movement data for each setup of Table 1. The entire data collection process took about three to four hours per participant. None of the participants used mobility aids in their daily lives. However, for this exploratory study, having experience with all aids is less important, as we investigate whether each aid affords a sufficiently distinct pattern of movement to automatically recognise it using a machine learning algorithm.

Table 1: Overview of all data recording setups for each participant as a combination of MM, sensor position, and location. For each setting, we made two recordings.

Movement modality   Sensor position   Location
Cane                Mobility aid      Inside, Outside
Cane                Wrist             Inside, Outside
Crutches            Mobility aid      Inside, Outside
Crutches            Wrist             Inside, Outside
Walker              Mobility aid      Inside, Outside
Walker              Wrist             Inside, Outside
Rollator            Mobility aid      Inside, Outside
Rollator            Wrist             Inside, Outside
Wheelchair          Mobility aid      Inside, Outside
Wheelchair          Wrist             Inside, Outside
No mobility aid     Wrist             Inside, Outside
Sitting             Wrist             Inside

3 IMPLEMENTATION OF THE CLASSIFIER
We opted to use a minimum distance classifier because of its simplicity and low use of resources, e.g., memory and computing power, and trained a model based on a feature set that highlights the most pronounced differences between parameters of MMs. We derived three types of data samples from the IMU, i.e., angular velocity magnitude, acceleration magnitude, and magnitude of the acceleration component along the gravitational force’s direction (Table 2). For each of these data types, we considered 11 possible base features (Table 3). Thus, to train our classifier’s model, we can compose a feature set from a selection of 33 different features in total.

Table 2: Three derived types of IMU data

Type of data                                                        Symbol
Magnitude of angular velocity (ω)                                   √(ωx² + ωy² + ωz²) = ω
Magnitude of acceleration (a)                                       √(ax² + ay² + az²) = a
Magnitude of acceleration (a) along the gravitational               a · g = a_g
force’s direction (g)

Table 3: Overview of all considered, mostly statistical, base features

Base feature                                        Abbreviation
Frequency of the signal                             f
Minimum value of the signal                         min
Maximum value of the signal                         max
The difference between max and min                  range
Average value of the signal                         mean
Standard deviation of all values of the signal      sd
Median value of the signal                          median
First quartile, 25th percentile of all values       Q1
Third quartile, 75th percentile of all values       Q3
The difference between Q3 and Q1                    IQR
Total energy of the data signal                     e

To train our minimum distance classifier, we select a feature set, group all participants’ data per MM, and compute the mean value for each selected feature of all MMs (see Algorithms 1 and 2).
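The derived signal types of Table 2 and the 11 base features of Table 3 can be sketched in code. This is a minimal illustration rather than the authors' implementation: the paper does not specify how the signal frequency `f` or the total energy `e` are computed, so the FFT-peak frequency and sum-of-squared-samples energy below are our assumptions, as are all function and variable names.

```python
import numpy as np

def derived_signals(gyro, acc, g_dir):
    """Compute the three derived signals of Table 2.

    gyro, acc: (N, 3) arrays of angular velocity and acceleration samples.
    g_dir:     (3,) unit vector along gravity (assumed estimated elsewhere).
    Returns omega, a and a_g, each as an (N,) array.
    """
    omega = np.linalg.norm(gyro, axis=1)  # sqrt(wx^2 + wy^2 + wz^2)
    a = np.linalg.norm(acc, axis=1)       # sqrt(ax^2 + ay^2 + az^2)
    a_g = acc @ g_dir                     # acceleration component along gravity
    return omega, a, a_g

def dominant_frequency(x, fs=20.0):
    """Dominant frequency via the FFT peak (our reading of the 'f' feature)."""
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

def base_features(x, fs=20.0):
    """The 11 base features of Table 3 for one derived signal."""
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return {
        "f": dominant_frequency(x, fs),
        "min": x.min(), "max": x.max(), "range": x.max() - x.min(),
        "mean": x.mean(), "sd": x.std(), "median": median,
        "Q1": q1, "Q3": q3, "IQR": q3 - q1,
        "e": float(np.sum(x ** 2)),  # total signal energy (assumption)
    }
```

Applying `base_features` to each of the three derived signals yields the pool of 3 × 11 = 33 candidate features from which a feature set is composed.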
Once a model is trained, the algorithm predicts the user’s MM by comparing the feature values of a given data sample to the mean feature values of all MMs stored in the model. The smallest distance between these values results in the classifier’s MM prediction. To compensate for potential differences in range and variance among the selected features, we used the Mahalanobis distance instead of the Euclidean distance [9].

Algorithm 1: Pseudo-code of the train function
  for each movement modality do
    for each selected feature do
      compute mean feature value
    end for
  end for

Algorithm 2: Pseudo-code of the predict function
  compute feature values of given data sample
  for each movement modality do
    compute Mahalanobis distance
  end for
  return movement modality with smallest distance

Figure 1: Mobility aids supported by the flexible tracking system: (a) cane, (b) crutches, (c) walker, (d) rollator, (e) wheelchair. The sixth movement modality, i.e., walking without a mobility aid, is not depicted here.

4 RESULTS
Here, we consider four different subsets of our data to verify our classifier. For each of these cases, we obtained a feature set through forward selection, i.e., iteratively adding features that lead to the biggest increase in prediction accuracy until it no longer increases. Table 4 shows the resulting confusion matrix when we consider all data recordings where participants wore the IMU on their wrist, and train our classifier model with 16 selected features. The mean prediction accuracy for this setup is 88 %. When the IMU is attached to the mobility aid instead of the participant’s wrist, the prediction accuracy is 82 % with four features selected to train the model (Table 5). Note that no mobility aid is used for sitting and walking without an aid. Therefore, only for those MMs, we included the data recordings during which the IMU was attached to the participants’ wrist (see Table 1).

Table 4: Confusion matrix showing the classifier’s predictions (only wrist-worn IMU) as a relative ratio (in %). The mean accuracy is 88 %, and all results in this table are obtained using the following features: ω_Q3, ω_IQR, ω_range, ω_median, ω_f, a_Q1, a_IQR, a_sd, a_mean, a_median, a_f, a_g,Q3, a_g,IQR, a_g,mean, a_g,median, a_g,e

                            Actual
Predicted     Cane  Crutches  Rollator  Walker  Wheelchair  No aid  Sitting
Cane            67         0         8       8           0       0        0
Crutches        25        92         0       8           0       0        0
Rollator         0         0        92       0           0       8        0
Walker           0         0         0      84           8       0        0
Wheelchair       0         8         0       0          92       0        0
No aid           8         0         0       0           0      92        0
Sitting          0         0         0       0           0       0      100

In a setting where an older adult would use no mobility aid to walk short distances inside, and a rollator or a walker for longer distances outside, the classifier achieves a prediction accuracy of 96 % (Table 6). Similarly, if a person only walks either with or without a cane, the classifier always predicts the correct MM (Table 7).
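The train and predict procedures (Algorithms 1 and 2) and the forward selection described above can be sketched as follows. This is our reconstruction, not the authors' code: the paper does not state how the covariance for the Mahalanobis distance is estimated, so the sketch uses a diagonal-covariance simplification (scaling each feature by its pooled standard deviation), which matches the stated goal of compensating for differences in range and variance among features.

```python
import numpy as np

class MinimumDistanceClassifier:
    """Minimum distance classifier over per-modality mean feature vectors."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Algorithm 1: mean value of each selected feature per movement modality.
        self.means_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        # Diagonal-covariance Mahalanobis scaling (assumption): pooled per-feature sd.
        self.scale_ = X.std(axis=0) + 1e-9  # guard against zero variance
        return self

    def predict(self, X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        # Algorithm 2: distance to each modality's mean; return the closest one.
        d = np.linalg.norm((X[:, None, :] - self.means_) / self.scale_, axis=2)
        return self.classes_[np.argmin(d, axis=1)]

def forward_selection(X_train, y_train, X_test, y_test):
    """Greedy forward selection: keep adding the feature that most improves
    test accuracy until no remaining feature increases it."""
    X_train, X_test = np.asarray(X_train, float), np.asarray(X_test, float)
    selected, best_acc = [], 0.0
    remaining = list(range(X_train.shape[1]))
    while remaining:
        scores = []
        for j in remaining:
            cols = selected + [j]
            clf = MinimumDistanceClassifier().fit(X_train[:, cols], y_train)
            acc = float(np.mean(clf.predict(X_test[:, cols]) == y_test))
            scores.append((acc, j))
        acc, j = max(scores)
        if acc <= best_acc:
            break
        best_acc, selected = acc, selected + [j]
        remaining.remove(j)
    return selected, best_acc
```

On per-recording feature vectors, `forward_selection` returns the chosen feature indices and the resulting accuracy, mirroring how the 16- and 4-feature sets behind Tables 4 and 5 were obtained.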
Table 5: Confusion matrix showing the classifier’s predictions as a relative ratio (in %) (IMU attached to mobility aid, except for No aid and Sitting). The mean accuracy is 82 %, and all results in this table are obtained using the following features: ω_min, a_min, a_g,Q3, a_g,range

                            Actual
Predicted     Cane  Crutches  Rollator  Walker  Wheelchair  No aid  Sitting
Cane            84         8         0       0           0       0        0
Crutches         8        66         0       0           0       0        0
Rollator         0         0        50       0           0       0        0
Walker           8         8         0      92           0       0        0
Wheelchair       0         8        33       8          33       0        0
No aid           0         8         0       0           0     100        0
Sitting          0         0        17       0          17       0      100

Table 6: Confusion matrix showing the classifier’s predictions as a relative ratio (in %) for a subset of MMs (only wrist-worn IMU). The mean accuracy is 96 %, and all results in this table are obtained using the following features: ω_mean, a_f, a_g,Q3, a_g,sd

                     Actual
Predicted   Rollator  Walker  No aid  Sitting
Rollator          83       0       0        0
Walker             0     100       0        0
No aid            17       0     100        0
Sitting            0       0       0      100

Table 7: Confusion matrix showing the classifier’s predictions as a relative ratio (in %) for a subset of MMs where we consider only sitting, and walking with or without a cane (only wrist-worn IMU). The mean accuracy is 100 %, and all results in this table are obtained using the following features: a_g,Q3, a_g,sd

                 Actual
Predicted   Cane  No aid  Sitting
Cane         100       0        0
No aid         0     100        0
Sitting        0       0      100

5 DISCUSSION
The results suggest that predicting different MMs is possible, and accuracies above 82 % can be achieved with straightforward machine learning algorithms. While Bagui and colleagues report on distinguishing two different activities [2], we found that predictions are still accurate when we include up to seven MMs, including MMs using five mobility aids. Note that in a more realistic setting, one older adult is unlikely to use all considered types of mobility aids on a regular basis. We can leverage this knowledge, e.g., by letting users select which aids they regularly use through a one-time setup. This will further increase the classifier’s prediction accuracy, amounting to 96 % or higher.

Additionally, the results suggest that the IMU’s location also impacts the prediction accuracy. Here, we found more favourable results for a wrist-worn IMU. However, sensor placement is not trivial [15, 17], and needs to be the subject of follow-up studies.

6 LIMITATIONS AND FUTURE WORK
This initial study allowed us to investigate the feasibility and potential of a flexible tracking system. An important limitation is that we recruited three participants only, and no older adults were involved in recording our dataset. We took this decision because the work was carried out during the COVID-19 pandemic, and we did not want to put older adults at risk of exposure to collect preliminary data. Future studies need to verify whether the results found in this study still apply to similar setups with older adults, specifically those who are familiar with using mobility aids in their daily lives, and what adjustments would need to be made. Moreover, these results allow for continuing the flexible tracking approach, i.e., linking a MM prediction to a tailored algorithm for accurately computing the user’s level of PA, e.g., step count or distance travelled.

7 CONCLUSION
In this exploratory study, we examined flexible tracking as an approach to make activity tracking more accessible to older adults using mobility aids. As an initial step, we implemented and evaluated a minimum distance classifier, achieving prediction accuracies of more than 82 %. Future work should focus on evaluating this approach with movement data of older adults, and linking predictions to algorithms that can accurately track activity.

REFERENCES
[1] Hyun-Sung An, Gregory C. Jones, Seoung-Ki Kang, Gregory J. Welk, and Jung-Min Lee. 2017. How valid are wearable physical activity trackers for measuring steps? European Journal of Sport Science 17, 3 (March 2017), 360–368. https://doi.org/10.1080/17461391.2016.1255261
[2] Sikha Bagui, Xingang Fang, Subhash Bagui, Jeremy Wyatt, Patrick Houghton, Joe Nguyen, John Schneider, and Tyler Guthrie. 2022. An improved step counting algorithm using classification and double autocorrelation. International Journal of Computers and Applications 44, 3 (March 2022), 250–259. https://doi.org/10.1080/1206212X.2020.1726006
[3] Jens Barth, Jochen Klucken, Patrick Kugler, Thomas Kammerer, Ralph Steidl, Jürgen Winkler, Joachim Hornegger, and Björn Eskofier. 2011. Biometric and mobile gait analysis for early diagnosis and therapy monitoring in Parkinson’s disease. In 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 868–871. https://doi.org/10.1109/IEMBS.2011.6090226
[4] Lynne Clay, Megan Webb, Claire Hargest, and Divya Bharatkumar Adhia. 2019. Gait quality and velocity influences activity tracker accuracy in individuals post-stroke. Topics in Stroke Rehabilitation 26, 6 (Aug. 2019), 412–417. https://doi.org/10.1080/10749357.2019.1623474
[5] Conor Cunningham, Roger O’ Sullivan, Paolo Caserotti, and Mark A. Tully. 2020. Consequences of physical inactivity in older adults: A systematic review of reviews and meta-analyses. Scandinavian Journal of Medicine & Science in Sports 30, 5 (2020), 816–827. https://doi.org/10.1111/sms.13616
[6] Andreas Ejupi, Matthew Brodie, Yves J. Gschwind, Daniel Schoene, Stephen Lord, and Kim Delbaere. 2014. Choice stepping reaction time test using exergame technology for fall risk assessment in older people. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 6957–6960. https://doi.org/10.1109/EMBC.2014.6945228
[7] Orestis Giotakos, Katerina Tsirgogianni, and Ioannis Tarnanas. 2007. A virtual reality exposure therapy (VRET) scenario for the reduction of fear of falling and balance rehabilitation training of elder adults with hip fracture history. In 2007 Virtual Rehabilitation. 155–158. https://doi.org/10.1109/ICVR.2007.4362157
[8] Paul Kooner, Taran Schubert, James L Howard, Brent A Lanting, Matthew G Teeter, and Edward M Vasarhelyi. 2021. Evaluation of the Effect of Gait Aids, Such as Canes, Crutches, and Walkers, on the Accuracy of Step Counters in Healthy Individuals. Orthopedic Research and Reviews 13 (Jan. 2021), 1–8. https://doi.org/10.2147/ORR.S292255
[9] Konstantinos Koutroumbas and Sergios Theodoridis. 2008. Pattern Recognition. Academic Press.
[10] Jonas Lauritzen, Adolfo Muñoz, Jose Luis Sevillano, and Anton Civit. 2013. The usefulness of activity trackers in elderly with reduced mobility: a case study. Studies in Health Technology and Informatics 192 (2013), 759–762.
[11] Denys J.C. Matthies, Marian Haescher, Suranga Nanayakkara, and Gerald Bieber. 2018. Step Detection for Rollator Users with Smartwatches. In Proceedings of the Symposium on Spatial User Interaction (SUI ’18). Association for Computing Machinery, New York, NY, USA, 163–167. https://doi.org/10.1145/3267782.3267784
[12] Tara O’Brien, Meredith Troutman-Jordan, Donna Hathaway, Shannon Armstrong, and Michael Moore. 2015. Acceptability of wristband activity trackers among community dwelling older adults. Geriatric Nursing 36, 2 (March 2015), S21–S25. https://doi.org/10.1016/j.gerinurse.2015.02.019
[13] Tiffany Sears, Elmer Alvalos, Samantha Lawson, Ian McAlister, L. Eschbach, and Jennifer Bunn. 2017. Wrist-Worn Physical Activity Trackers Tend To Underestimate Steps During Walking. International Journal of Exercise Science 10, 5 (Aug. 2017), 764–773. https://digitalcommons.wku.edu/ijes/vol10/iss5/12
[14] Salvatore Tedesco, Marco Sica, Andrea Ancillao, Suzanne Timmons, John Barton, and Brendan O’Flynn. 2019. Accuracy of consumer-level and research-grade activity trackers in ambulatory settings in older adults. PLOS ONE 14, 5 (May 2019), e0216891. https://doi.org/10.1371/journal.pone.0216891
[15] Dimitri Vargemidis, Kathrin Gerling, Vero Vanden Abeele, Luc Geurts, and Katta Spiel. 2021. Irrelevant Gadgets or a Source of Worry: Exploring Wearable Activity Trackers with Older Adults. ACM Transactions on Accessible Computing 14, 3 (Aug. 2021), 28. https://doi.org/10.1145/3473463
[16] Rachel K. Walker, Amanda M. Hickey, and Patty S. Freedson. 2016. Advantages and Limitations of Wearable Activity Trackers: Considerations for Patients and Clinicians. Clinical Journal of Oncology Nursing 20, 6 (Dec. 2016), 606–610. https://doi.org/10.1188/16.CJON.606-610
[17] Clint Zeagler. 2017. Where to wear it: functional, technical, and social considerations in on-body location for wearable technology 20 years of designing for wearability. In Proceedings of the 2017 ACM International Symposium on Wearable Computers. ACM, Maui, Hawaii, 150–157. https://doi.org/10.1145/3123021.3123042
Improving Image Accessibility by Combining Haptic and Auditory Feedback

Mallak Alkhathlan (malkhathlan@wpi.edu), Worcester Polytechnic Institute, Worcester, United States
ML Tlachac (mltlachac@wpi.edu), Worcester Polytechnic Institute, Worcester, United States
Lane Harrison (ltharrison@wpi.edu), Worcester Polytechnic Institute, Worcester, United States
Elke Rundensteiner (rundenst@wpi.edu), Worcester Polytechnic Institute, Worcester, United States

ABSTRACT
Advancements in accessibility have led to mobile applications that help blind or low vision people (BLV) access surrounding information independently. Unfortunately, accessibility of visual information such as images remains limited. Previous research demonstrated that spatial interaction can help BLV users build a mental model of the relative locations of image objects. As haptics has recently become a core component of modern smartphones, we extend this prior research by designing three prototypes that use haptic feedback to reveal object location in images. We evaluate these techniques in terms of experience and ability of BLV users to locate multiple objects in images. Evaluation results in a preliminary study with seven BLV users suggest that the proposed haptic feedback prototype with auditory notifications to identify people and auditory captions can provide a more accessible and engaging image experience.

CCS CONCEPTS
• Human-centered computing → Accessibility systems and tools; Empirical studies in interaction design; haptic feedback.

KEYWORDS
Haptics, Touchscreens, Smartphones, Screen readers, Visual impairment, Accessibility

ACM Reference Format:
Mallak Alkhathlan, ML Tlachac, Lane Harrison, and Elke Rundensteiner. 2022. Improving Image Accessibility by Combining Haptic and Auditory Feedback. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3517428.3550362

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550362

1 INTRODUCTION AND RELATED WORK
Images are widely used in mobile applications. Given the proliferation of online images, it is increasingly critical for people with blindness or low vision (BLV) to be able to understand digital images and their properties. Unfortunately, most image content is largely inaccessible [1, 7, 24] to the more than 1.3 billion people worldwide who have a vision disability [26]. Previous work has started exploring strategies to capture more image details. For example, in a study [18] that used image tags to capture the relative spatial positions of objects within an image, 71.4% of BLV participants expressed enjoyment at being able to discover the relative locations of objects.

Recent technology advances in smartphones can offer more engaging experiences that simultaneously convey more relevant information to BLV users. Smartphones contain actuators which can operate with low voltage and provide high-speed touch sensations. The physical reaction created by a touchscreen interface is called haptic feedback, or “haptics”. Force feedback and tactile haptic feedback are two major components of haptic technology [9]. Force feedback relates to muscles and tendons, which can produce physical characteristics such as force and mechanical compliance [4, 16]. In contrast, tactile feedback can be described based on the human senses and is in response to manual input [19, 27]. Tactile feedback systems have been studied in many different contexts, and have been used to inform the design of techniques within the HCI community [2, 3, 5, 6, 8, 15–17, 20, 22, 23, 25, 28].

In this study, we thus aim to verify and extend previous studies in the current context of touchscreen mobile phones. Our research goal is to improve BLV understanding of images through tactile haptic feedback, as such feedback has the potential to solve certain accessibility issues. Our focus is to explore emerging haptic capabilities to communicate to BLV users where objects appear spatially within the image. We designed and tested various prototypes to explore the feasibility of providing haptic feedback to allow BLV users to form a mental model of the images, thereby gaining a more in-depth understanding of the images. As haptics alone can only convey spatial information, some of our prototypes paired the haptic notifications with auditory image descriptions to form a more complete user experience, geared towards improving image accessibility. Our study designs and evaluates three prototypes: Prototype (F1) has only haptic feedback to indicate object and no-object regions, prototype (F2) has haptic feedback to compare object and no-object regions as well as auditory notifications to identify people, and prototype (F3) has haptic feedback to compare object and no-object regions as well as auditory notifications to identify people as well as auditory captions describing the image scenario.
Our research is most similar to that of Morris et al. [18] in that we also explore how BLV users spatially explore images. Since the standard used for image descriptions since 1995 is outdated, Morris et al. [18] developed a prototype for Android phones that allows direct interactions with an image when touching a specified region. Yet even more engaging descriptions can be experienced with current technologies, such as haptics.

Microsoft’s Seeing AI app¹ attempts to create a more engaging experience for BLV users. Seeing AI primarily uses auditory feedback to describe an image object the user is touching on the screen. Recently, Seeing AI introduced a haptic notification when users found the object. However, it remains unclear how users interact with such haptics or whether haptics create a more engaging experience. How to address challenges involved in image navigation is also unclear, such as situations when images contain multiple objects.

Thus, our study explores possibilities beyond Morris et al. [18] and Seeing AI. We extend the spatial interaction style introduced by Morris et al. [18] by incorporating haptic feedback to assist image navigation. We design prototypes that leverage haptic notifications to distinguish between object and no-object image regions. Our prototypes extend the haptic ideas explored by Seeing AI, which uses one type of haptic notification. The expressiveness of haptics and increased capabilities in modern smartphones suggest that more information might be communicated with appropriate design. We also conduct a usability test with our prototypes to evaluate how users interact with the haptic notifications alone as well as when the haptic notifications are paired with complementary auditory image descriptions. Our research explores how combining haptics with auditory image descriptions can promote spatial awareness to make images more accessible for BLV users.

2 SYSTEM DESIGN
We prototype three novel interactions based on previous research [18] that illustrate the relative location of the components in an image. We implement our prototypes as an iOS application. The app works across devices with different screen sizes. Our prototypes consist of a feedback generator (the UIFeedbackGenerator superclass, available in UIKit from iOS 10 onwards) that can play the predefined haptic patterns [12]. The haptic notifications [13] facilitate

and an object region (object regions exist where there is a person in the image). Comparatively, for notifications of no-object, we use impact haptics to provide a heavy physical metaphor that indicates a collision between one finger and a no-object region. These tactile responses intend to reinforce understanding of spatial objects within the image [11]. We select the two types of impacts to indicate the effect of collision; however, we intend to experiment with more feedback patterns in the future as more expressive technology is developed.

While our prototypes can relate spatial knowledge regarding any objects, we focused on images that contain people because a prior study [21] found that questions about people were more frequent than other image questions in accessibility contexts. Therefore, unlike prototype F1, we incorporate facial recognition for prototypes F2 and F3 by syncing the audio indicators with the haptics. In prototype F3, a screen reader also reads the image caption aloud. For the captions, we add descriptions in the button’s content description field to label an image-based button to interact properly with the screen reader.

For each of the three prototypes, we presented users with four different scenarios, a, b, c, and d, which respectively contain 2, 3, 4, and 5 objects. These scenarios test the ability of participants to locate multiple objects within images. No participant experienced the same image more than once. Briefly, the prototypes are as follows:

(1) Prototype (F1) has only haptic feedback to compare object and no-object regions (Appendix Fig. 1).
(2) Prototype (F2) has haptic feedback to compare object and no-object regions as well as auditory notifications to identify people (Appendix Fig. 2).
(3) Prototype (F3) has haptic feedback to compare object and no-object regions, auditory notifications to identify people, and auditory captions to describe the image scenario (Appendix Fig. 3).

3 PRELIMINARY EVALUATION
As a proof-of-concept, we conduct an evaluation study of the prototypes (IRB-approved). Using Twitter advertisements, we recruited 7 legally blind adults with different levels of haptic experience (age range 21-51 years old; mean age = 32 years old; 4 reported male; 3 female). Participants used their personal smartphones. Each par-
image navigation by allowing users to distinguish between difer- ticipant completed a 15-minute tutorial, allowing them to practice
ent regions in the image. To trigger feedback, notifcations informs using the prototypes on sample images and ask questions to guar-
users when tasks have been successfully completed or failed. Only antee understanding. For these evaluations, prototype order and
a single fnger interacting with the image is required to trigger the scenario order randomized across participants using a latin square
object and no-object notifcations. Object and no-object regions are methodology to mitigate order efects. All sessions were recorded.
activated by the use of the touchend event which helps to prevent Evaluation measures include four components: memory recall,
unintentional actions through the UINotifcationFeedbackGener- time measurement, sensitivity level, and preference. 1) Memory
ator subclass [13]. Specifcally, the prototypes deliver a medium recall: A previous study [14] indicates that one of the difculties
haptic notifcation to indicate that the user has successfully found that BLV encountered when using touch screens is learning where
the object while the prototypes deliver a heavy haptic notifcation objects are located on the screen and remembering the location
to indicate that the user has failed to fnd the object [10]. In noti- of on-screen objects. We address this problem by testing memory
fcations of object, we use impact haptics to provides a medium recall. For each prototype, we asked “Do you remember the sequence
physical metaphor that indicates a collision between one fnger of the scenarios in same order given?” participants would then state
the sequence they remember, for example, “There were 4, 3, 2, and 5
1 https://www.microsoft.com/en-us/ai/seeing-ai objects.” to answer our questions. When comparing the accuracy of
Improving Image Accessibility by Combining Haptic and Auditory Feedback. ASSETS '22, October 23–26, 2022, Athens, Greece

When comparing the accuracy of participant recall, the prototypes showed statistically significant differences (χ²(2, N = 7) = 16.88, P = .0003) while scenarios did not differ. Thus, the quantity of feedback channels (haptics and audio) appears to impact recall, while the number of objects (between two and five) in the image does not appear to impact recall. Overall, the integration of the three channels may prove useful in understanding the image.

2) Time measurement: For each image, we recorded the time taken by participants to find each object in seconds. The time it took to find each individual object is the dependent variable for this experiment. The predictors of the model were categorical variables for scenario, prototype, and object number. Because the outcome variable is continuous and each participant contributed multiple observations in our repeated measures design, a linear mixed model was used to analyze the data. No statistically significant results were found for the model that contained an interaction between prototype and scenario. However, there was a difference in completion time between object 2 and object 3 (using the data from scenarios b, c, and d) for prototype F1 compared to the difference between object 2 and object 3 (using the data from scenarios b, c, and d) for prototype F2. The model that contained an interaction between prototype and object showed no significant difference (χ²(8, N = 7) = 14.48, P = .07). Thus, the completion time was similar for two, three, and four objects. However, the completion time appeared to increase when a fifth object was added to the image. We did not explore the completion time for images with more than five objects, though future studies might explore whether there are ceiling effects for the number of effective objects.

3) Sensitivity level: At the end of the session, participants were asked to rate the appropriateness of the strength of the haptic feedback experienced when exploring the images. The level of haptic notification was effectively used to measure comfort. Our prototypes contained two different haptic notifications: a heavy-strength notification was used to indicate failure at finding an object, and a medium-strength notification was used to indicate success at finding an object. Notably, all but one participant reported that the haptic feedback notification level was appropriate. The majority of participants thus approved of the strength of both types of notifications.

4) Preference: After having reviewed all three prototypes, participants were asked to rate their preferences regarding each prototype. Six participants reported lower preference scores for prototype F1 when compared to prototype F2. One participant had no change in preference between these two prototypes. A Wilcoxon signed-rank test indicated that this difference was statistically significant, z = -2.311, P = .02, with an exact test probability of P = .03. All seven participants reported lower preference scores for prototype F1 as compared to prototype F3. The Wilcoxon signed-rank test indicated that this difference was statistically significant, z = -2.53, P = .01, with an exact test probability of P = .02. Similarly, all seven participants reported lower preference scores for prototype F2 as compared to prototype F3. The Wilcoxon signed-rank test indicated that this difference was statistically significant, z = -2.41, P = .02. From these results, we deduce that participants preferred prototype F3, which conveyed information using haptic feedback with both auditory notifications to identify people and auditory captions describing the scene.

4 CONCLUSION
In this study, we explored how BLV users can benefit from haptic feedback when navigating images. We developed three prototypes that incorporate varying degrees of detail to enhance digital image understanding. With different haptic notifications for object and no-object regions, our prototypes are designed so that BLV users can navigate the spatial location of objects within images. To evaluate our three prototypes, we conducted a usability test with seven BLV participants. The participants preferred the haptic feedback prototype with auditory notifications to identify people and auditory captions. Overall, the results revealed that tactile haptic feedback has the potential to improve the image understanding and user experience of BLV smartphone users. Our findings can enable a better understanding of tactile feedback rendering strategies and support the adoption of haptic feedback as an accessibility feature in smartphones.

5 ACKNOWLEDGMENTS
We thank Imam Abdulrahman Bin Faisal University (IAU) and the Saudi Arabian Cultural Mission to the USA (SACM) for financial support.

REFERENCES
[1] Mallak Alkhathlan, ML Tlachac, Lane Harrison, and Elke Rundensteiner. 2021. "Honestly I Never Really Thought About Adding a Description": Why Highly Engaged Tweets Are Inaccessible. In IFIP Conference on Human-Computer Interaction. Springer, 373–395.
[2] Christopher S Campbell, Shumin Zhai, Kim W May, and Paul P Maglio. 1999. What you feel must be what you see: adding tactile feedback to the trackpoint. In Proc. of INTERACT'99: 7th IFIP Conference on Human Computer Interaction. Citeseer.
[3] Géry Casiez, Nicolas Roussel, Romuald Vanbelleghem, and Frédéric Giraud. 2011. Surfpad: riding towards targets on a squeeze film effect. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2491–2500.
[4] Heather Culbertson, Samuel B Schorr, and Allison M Okamura. 2018. Haptics: The present and future of artificial touch sensation. Annual Review of Control, Robotics, and Autonomous Systems 1 (2018), 385–409.
[5] Jack Tigh Dennerlein, David B Martin, and Christopher Hasser. 2000. Force-feedback improves performance for steering and combined steering-targeting tasks. In Proceedings of the SIGCHI conference on Human factors in computing systems. 423–429.
[6] Nicholas A Giudice, Hari Prasath Palani, Eric Brenner, and Kevin M Kramer. 2012. Learning non-visual graphical information using a touch-based vibro-audio interface. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility. 103–110.
[7] Cole Gleason, Amy Pavel, Emma McCamey, Christina Low, Patrick Carrington, Kris M Kitani, and Jeffrey P Bigham. 2020. Twitter A11y: A browser extension to make Twitter images accessible. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–12.
[8] Cagatay Goncu and Kim Marriott. 2011. GraVVITAS: generic multi-touch presentation of accessible graphics. In IFIP Conference on Human-Computer Interaction. Springer, 30–48.
[9] Mitchell L. Gordon and Shumin Zhai. 2019. Touchscreen Haptic Augmentation Effects on Tapping, Drag and Drop, and Path Following. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI '19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300603
[10] Apple Inc. 2022. Human Interface Guidelines. Retrieved Jun 20, 2022 from https://developer.apple.com/design/human-interface-guidelines/patterns/playing-haptics
[11] Apple Inc. 2022. Human Interface Guidelines. Retrieved Jun 20, 2022 from https://developer.apple.com/design/human-interface-guidelines/patterns/feedback/
[12] Apple Inc. 2022. UIKit: UIFeedbackGenerator. Retrieved Jun 20, 2022 from https://developer.apple.com/documentation/uikit/uifeedbackgenerator
[13] Apple Inc. 2022. UIKit: UINotificationFeedbackGenerator. Retrieved Jun 20, 2022 from https://developer.apple.com/documentation/uikit/uinotificationfeedbackgenerator
[14] Shaun K Kane, Jeffrey P Bigham, and Jacob O Wobbrock. 2008. Slide rule: making mobile touch screens accessible to blind people using multi-touch interaction techniques. In Proceedings of the 10th international ACM SIGACCESS conference on Computers and accessibility. 73–80.
[15] Ernst Kruijff, Saugata Biswas, Christina Trepkowski, Jens Maiero, George Ghinea, and Wolfgang Stuerzlinger. 2019. Multilayer haptic feedback for pen-based tablet interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[16] Karon E MacLean. 2008. Haptic interaction design for everyday interfaces. Reviews of Human Factors and Ergonomics 4, 1 (2008), 149–194.
[17] Giuseppe Melfi, Karin Müller, Thorsten Schwarz, Gerhard Jaworek, and Rainer Stiefelhagen. 2020. Understanding what you feel: A mobile audio-tactile system for graphics used at schools with students with visual impairment. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–12.
[18] Meredith Ringel Morris, Jazette Johnson, Cynthia L Bennett, and Edward Cutrell. 2018. Rich representations of visual content for screen reader users. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–11.
[19] Alvaro Pascual-Leone and Roy Hamilton. 2001. The metamodal organization of the brain. Progress in Brain Research 134 (2001), 427–445.
[20] Benjamin Poppinga, Charlotte Magnusson, Martin Pielot, and Kirsten Rassmus-Gröhn. 2011. TouchOver map: audio-tactile exploration of interactive maps. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services. 545–550.
[21] Elliot Salisbury, Ece Kamar, and Meredith Morris. 2017. Toward scalable social alt text: Conversational crowdsourcing as a tool for refining vision-to-language technology for the blind. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 5.
[22] Brandon T Shrewsbury. 2011. Providing haptic feedback using the kinect. In The proceedings of the 13th international ACM SIGACCESS conference on Computers and accessibility. 321–322.
[23] Francesca Sorgini, Renato Caliò, Maria Chiara Carrozza, and Calogero Maria Oddo. 2018. Haptic-assistive technologies for audition and vision sensory disabilities. Disability and Rehabilitation: Assistive Technology 13, 4 (2018), 394–421.
[24] Abigale Stangl, Meredith Ringel Morris, and Danna Gurari. 2020. "Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.
[25] Eric Vezzoli, Thomas Sednaoui, Michel Amberg, Frédéric Giraud, and Betty Lemaire-Semail. 2016. Texture rendering strategies with a high fidelity-capacitive visual-haptic friction control device. In International Conference on Human Haptic Sensing and Touch Enabled Computer Applications. Springer, 251–260.
[26] World Health Organization (WHO). 2020. Blindness and vision impairment. Retrieved October 8, 2019 from https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment
[27] Thomas Wolbers, Roberta L Klatzky, Jack M Loomis, Magdalena G Wutte, and Nicholas A Giudice. 2011. Modality-independent coding of spatial layout in the human brain. Current Biology 21, 11 (2011), 984–989.
[28] Yang Zhang and Chris Harrison. 2015. Quantifying the targeting performance benefit of electrostatic haptic feedback on touchscreens. In Proceedings of the 2015 International Conference on Interactive Tabletops & Surfaces. 43–46.
APPENDICES
PROTOTYPE SPECIFICATIONS

Table 1: Haptic feedback legend for prototypes F1, F2, and F3. No-object regions do not contain people; when touched with one finger, they trigger a no-object notification. Object regions contain people; when touched with one finger, they trigger an object notification. We use heavy and medium impact haptics for the no-object and object notifications, respectively.

No-object region | Notification of no-object | Object region | Notification of object

(a) Image with 2 objects. (b) Image with 3 objects. (c) Image with 4 objects. (d) Image with 5 objects.

Figure 1: Prototype (F1): has only haptic feedback. (a) shows scenario a: image with two objects, (b) shows scenario b: image with three objects, (c) shows scenario c: image with four objects, and (d) shows scenario d: image with five objects.

(a) Image with 2 objects. (b) Image with 3 objects. (c) Image with 4 objects. (d) Image with 5 objects.

Figure 2: Prototype (F2): has haptic feedback with auditory notifications to identify people. We use audio feedback for each object to indicate people. Note, the audio feedback is depicted as text and the images do not contain text. (a) shows scenario a: image with two objects; first object audio is "Here is Denzel Washington", and second object audio is "Here is Viola Davis". (b) shows scenario b: image with three objects; first object audio is "Here is Edward Carstens", second object audio is "Here is Virginia Madsen", and third object audio is "Here is Elaine Madsen". (c) shows scenario c: image with four objects; first object audio is "Here is Laura Harris", second object audio is "Here is Aubrey Dollar", third object audio is "Here is Angie Harmon", and fourth object audio is "Here is Paula Newsome". (d) shows scenario d: image with five objects; first object audio is "Here is Benjamin", second object audio is "Here is Olivi", third object audio is "Here is Isabella", fourth object audio is "Here is Emma", and fifth object audio is "Here is Sophia".
(a) Image with 2 objects. (b) Image with 3 objects. (c) Image with 4 objects. (d) Image with 5 objects.

Figure 3: Prototype (F3): has haptic feedback with auditory notifications to identify people and an auditory caption. We use audio feedback for each object to indicate people. Note, the audio feedback is depicted as text and the images do not contain text. (a) shows scenario a: image with two objects; first object audio is "Here is Denzel Washington", and second object audio is "Here is Viola Davis"; the caption is "Actors attend the opening night". (b) shows scenario b: image with three objects; first object audio is "Here is Edward Carstens", second object audio is "Here is Virginia Madsen", and third object audio is "Here is Elaine Madsen"; the caption is "Actor, dramatist and person arrive at the premiere". (c) shows scenario c: image with four objects; first object audio is "Here is Laura Harris", second object audio is "Here is Aubrey Dollar", third object audio is "Here is Angie Harmon", and fourth object audio is "Here is Paula Newsome"; the caption is "Actors attend the upfront presentation". (d) shows scenario d: image with five objects; first object audio is "Here is Benjamin", second object audio is "Here is Olivi", third object audio is "Here is Isabella", fourth object audio is "Here is Emma", and fifth object audio is "Here is Sophia"; the caption is "Happy children and adults with a shopping cart inside retail".
Inter-rater Reliability of Command-Line Web Accessibility Evaluation Tools

Eryn Rachael Kelsey-Adkins
Colorado School of Mines
Golden, Colorado, USA
ekelseyadkins@mines.edu

Robert Thompson
Colorado School of Mines
Golden, Colorado, USA
rthompson@mines.edu
ABSTRACT
This study compares four command-line interface (CLI) web accessibility tools, examining whether one CLI tool is sufficient for automated accessibility evaluation. The four tools were: Axe-core/cli, IBM Equal Access NPM Accessibility Checker (Accessibility Checker), Pa11y-ci, and the A11y Machine. Inter-rater reliability was calculated using Gwet's alpha coefficient 2 (AC2), and the results indicate very poor reliability between tools.

CCS CONCEPTS
• Human-centered computing → Accessibility design and evaluation methods.

KEYWORDS
accessibility, command-line interface, inter-rater reliability

ACM Reference Format:
Eryn Rachael Kelsey-Adkins and Robert Thompson. 2022. Inter-rater Reliability of Command-Line Web Accessibility Evaluation Tools. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550395

1 INTRODUCTION AND BACKGROUND
Tim Berners-Lee, inventor of the World Wide Web, said, "it is critical that the Web be usable by anyone, regardless of individual capabilities and disabilities" [1] at the launch of the Web Accessibility Initiative (WAI) in 1997. WAI promotes Web accessibility through the creation of accessibility standards such as the Web Content Accessibility Guidelines (WCAG). WCAG 2.0, published in 2008, outlines accessibility success criteria across four principles: that Web content must be Perceivable, Operable, Understandable, and Robust (POUR) [11]. These guidelines have been the basis of accessibility and non-discrimination legislation in nineteen countries and the European Union [12].

Web accessibility can be evaluated through manual testing or with the aid of automated tools. Automated testing cannot replace manual testing, but it is an important part of accessibility evaluation of websites, both after deployment and during development. The need for accelerated development and delivery of website features has led many organizations to incorporate Continuous Integration/Continuous Deployment (CI/CD) practices that can eliminate or reduce the need for many manual tasks [10]. Incorporating automated Web accessibility evaluation in CI/CD practices can alert developers to egregious accessibility failures. Command-line interface (CLI) automated Web accessibility evaluation tools are particularly suited to integration with CI/CD. WAI maintains a list of nineteen CLI evaluation tools, but makes no claims about the reliability of any of them [13].

A common reliability measurement is Inter-Rater Reliability (IRR), which quantifies the rate of agreement or disagreement between two or more raters [4]. Statistics to calculate IRR include Cohen's kappa, Fleiss's kappa, Krippendorff's alpha, and Gwet's alpha coefficient 2 (AC2). Both Krippendorff's alpha and Gwet's AC2 can be calculated with missing data and can use different weight systems for different types of ratings. Gwet's AC2 was developed to address paradoxes in kappa-like statistics: the prevalence problem, which can cause kappa estimates to be erroneously low, and the bias problem, which can cause kappa estimates to be erroneously high [5]. This makes Gwet's AC2 well suited to evaluating the IRR of Web accessibility evaluation tools, where not every tool rates every HTML element. Gwet's AC2 values range from -1 to 1. A result of -1 shows near-perfect disagreement, whereas a result of 1 signifies unanimous agreement. A result of 0 is equivalent to each rater randomly assigning ratings.

Previous studies have compared automated evaluation tools [2, 3, 6, 7, 9], but very few have evaluated IRR between tools [2, 7] and no studies have focused on IRR in CLI evaluation tools. This study uses Gwet's AC2 to evaluate the IRR of four automated evaluation tools: Axe-core/cli, IBM's Accessibility Checker, Pa11y-ci, and the A11y Machine.

2 METHODOLOGY
2.1 Inclusion and Exclusion Criteria for Evaluation Tools
The first step of the analysis was to identify the inclusion and exclusion criteria for the evaluation tools. For the desired use case, the tools should have a command-line interface, be open source, and be able to evaluate groups of websites for compliance with W3C Web Content Accessibility Guidelines 2.0 and Section 508, the US federal procurement standard. Any tool that is a more specific version of another tool should be excluded in favor of the more general version. The World Wide Web Consortium (W3C) maintains a list of Web Accessibility Evaluation Tools that can be filtered by type of tool, guidelines, license, and language, among others [13]. Using the inclusion criteria, five filters were identified: Command Line Interface, Open Source, WCAG 2.0 — W3C Web Content Accessibility Guidelines 2.0, Section 508, US federal procurement standard, and Automatically checks groups of web pages or web sites.
Table 1: Node.js packages of each evaluation tool

Package Name | Developer | Version | Release Date | Last Update
@axe-core/cli | Deque Systems, Inc. | 4.4.2 | 01/10/2015 | 06/01/2022
accessibility-checker | IBM Accessibility | 3.1.30 | 05/18/2020 | 05/10/2022
pa11y-ci | Nature Publishing Group | 3.0.1 | 04/08/2015 | 11/26/2021
the-a11y-machine | Liip | 0.9.3 | 01/29/2016 | 02/08/2017

Table 2: Rating classification by evaluation tool

Rating | A11y machine | Accessibility Checker | Axe-core/cli | Pa11y-ci
pass | N/A | N/A | pass | N/A
warning | notice, warning | potential violation | incomplete | notice, warning
violation | error | violation | violation | error
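The mapping in Table 2 can be expressed directly in code. The sketch below is illustrative only: the label strings are taken from Table 2, but the exact spellings in each tool's JSON reports may differ, and the function name is ours.

```python
# Unified rating scheme from Table 2: each tool's native result labels
# mapped onto pass / warning / violation.  Labels are as printed in
# Table 2; real report files may spell or case them differently.
RATING_MAP = {
    "a11y-machine":          {"notice": "warning", "warning": "warning",
                              "error": "violation"},
    "accessibility-checker": {"potential violation": "warning",
                              "violation": "violation"},
    "axe-core/cli":          {"pass": "pass", "incomplete": "warning",
                              "violation": "violation"},
    "pa11y-ci":              {"notice": "warning", "warning": "warning",
                              "error": "violation"},
}

def unify(tool, native_rating):
    """Collapse a tool-specific rating into pass/warning/violation."""
    return RATING_MAP[tool][native_rating.lower()]
```

With this mapping, every tool's output lands in the same three ordered categories, which is what makes the ordinal weighting in Section 2.2 applicable.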

Applying these filters resulted in five evaluation tools. After applying the exclusion criterion, four evaluation tools met all criteria: Axe-core/cli, IBM Equal Access NPM Accessibility Checker (Accessibility Checker), Pa11y-ci, and the A11y Machine. The most current version of each evaluation tool was installed with Node Package Manager (NPM) version 8.1.4. See Table 1 for the version and most recent update of each evaluation tool. All tools were run with Google Chrome version 102.0.5005.61 in headless mode.

2.2 Data Collection
In order to compare the effectiveness of each evaluation tool, the first 5 higher education institutions of the United States listed in the Webometrics ranking [14] were used as the sample: Harvard University, Stanford University, Massachusetts Institute of Technology, University of California Berkeley, and University of Michigan. The homepage of each institution was evaluated, as it is the most representative page and is the page most frequently analyzed in web accessibility studies [8]. Reports from each evaluation tool were saved as JSON files.

Each HTML element evaluated by one or more tools was entered into a table with all of its associated ratings from any of the evaluation tools. As each evaluation tool has its own rating system, these ratings were divided into pass, warning, and violation (see Table 2). Note that although both Pa11y-ci and Axe-core/cli use a pass rating, only Axe-core/cli explicitly rates specific HTML elements as "pass". A system of four ratings (not evaluated, pass, warning, and violation) was considered but rejected, as it is unclear from the reports whether any specific element has been evaluated by a specific tool. As Gwet's AC2 is based on the computation of pairwise values, any element that had been rated by only one tool was discarded. See Table 3 for a truncated copy of the rating table.

The rating table was converted into an agreement table that records how often each element received each rating. The cells in this table, r_ij, are the number of times the evaluation tools assigned element i rating category j. See Table 4 for a truncated copy of the agreement table.

Gwet's AC2 was calculated following the general form:

    α = (p_a − p_e) / (1 − p_e)    (1)

where p_a is the weighted percent agreement between evaluation tools and p_e the weighted percent chance agreement. As there is a ranked order to the ratings, such that pass < warning < violation, an ordinal weight function was calculated:

    w_kl = 1,                                                       if k = l
    w_kl = 1 − #{(i, j), min(k, l) ≤ i < j ≤ max(k, l)} / w_max,    if k ≠ l    (2)

where #{(i, j), min(k, l) ≤ i < j ≤ max(k, l)} is the number of pairs (i, j), i < j, that can be formed between min(k, l) and max(k, l), and w_max is the maximum of this count over all k and l [4].

3 RESULTS
Using the methods described in Section 2, the accessibility ratings for 81 HTML elements across five websites by the four evaluation tools were analyzed. Axe-core/cli rated the highest number of elements at 78, while the A11y Machine rated the least (see Figure 1).

Figure 1: HTML Element Rating by Tool
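The pipeline of Section 2.2 — collapsing per-element ratings into an agreement table and then applying Eqs. (1) and (2) — can be sketched in pure Python. This is a minimal illustration under the standard formulation of Gwet's weighted coefficient, not the authors' code; the function names are ours, and the small example rows mirror Tables 3 and 4.

```python
from math import comb

def agreement_table(ratings, categories=("pass", "warning", "violation")):
    """Convert per-element rating lists (Table 3) into agreement counts
    (Table 4).  Elements rated by fewer than two tools are discarded,
    since pairwise agreement is undefined for them."""
    table = []
    for elem_ratings in ratings:
        row = [sum(r == c for r in elem_ratings) for c in categories]
        if sum(row) >= 2:
            table.append(row)
    return table

def ordinal_weights(q):
    """Ordinal weights from Eq. (2): w_kl = 1 - C(|k-l|+1, 2) / C(q, 2)."""
    return [[1.0 - comb(abs(k - l) + 1, 2) / comb(q, 2) for l in range(q)]
            for k in range(q)]

def gwet_ac2(counts):
    """Weighted Gwet coefficient (Eq. 1) from an agreement table, where
    counts[i][k] is how many tools assigned element i to category k.
    Row totals may differ, so missing ratings are tolerated."""
    q = len(counts[0])
    w = ordinal_weights(q)
    # Observed weighted agreement p_a, averaged over elements.
    pa_terms = []
    for row in counts:
        r_i = sum(row)
        if r_i < 2:
            continue
        r_star = [sum(w[k][l] * row[l] for l in range(q)) for k in range(q)]
        pa_terms.append(sum(row[k] * (r_star[k] - 1) for k in range(q))
                        / (r_i * (r_i - 1)))
    p_a = sum(pa_terms) / len(pa_terms)
    # Chance agreement p_e = T_w / (q(q-1)) * sum_k pi_k (1 - pi_k),
    # with pi_k the average share of ratings in category k.
    n = len(counts)
    pi = [sum(row[k] / sum(row) for row in counts) / n for k in range(q)]
    t_w = sum(sum(r) for r in w)
    p_e = t_w / (q * (q - 1)) * sum(p * (1 - p) for p in pi)
    return (p_a - p_e) / (1 - p_e)
```

For three ordered categories the weights come out as 1 on the diagonal, 2/3 for adjacent categories (e.g., pass vs. warning), and 0 for pass vs. violation, so tools that disagree by only one step still receive partial credit.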
Table 3: Example of the ratings table (column order: A11y Machine, Accessibility Checker, Axe-core/cli, Pa11y-ci; a blank cell means the element was not rated by that tool)

<a class="home-hero-link" ... </a>      warning    violation    warning    warning
<a href="https://coronavirus ... </a>   violation  violation    violation  violation
<button type="submit" ...</button>      violation  pass         warning
<form id="navbar-search" ... </form>    violation  violation    violation  violation
<p>2022/23 season</p>                   violation  violation    violation

Table 4: Example of the agreement table

Element | pass | warning | violation
1 | 0 | 3 | 1
2 | 0 | 0 | 4
3 | 0 | 0 | 4
4 | 1 | 1 | 1

Calculating Gwet's AC2 gave an observed weighted percent agreement, p_a, of 0.87 and a chance weighted percent agreement, p_e, of 0.88. The weighted inter-rater reliability of all four evaluation tools was -0.05, indicating near randomness. Inter-rater reliability was then recalculated excluding the A11y Machine ratings; the A11y Machine met all inclusion and exclusion criteria, but has been archived with no updates since 2019. Calculating Gwet's AC2 on the remaining three evaluation tools resulted in a score of -0.17 (p_a = 0.82, p_e = 0.85).

4 DISCUSSION AND FUTURE WORK
Gwet's AC2 shows there is very little inter-rater reliability between these evaluation tools when analyzing ratings for individual HTML elements. The difference between two tools may be minor if one tool marks an element as a violation and another tool marks that element as a potential violation, provided the developers that rely on these tools in a CI/CD environment follow up on both ratings. However, a false pass rating could allow an accessibility standard violation to make it into production. This reinforces the idea that incorporating more than one evaluation tool or method in accessibility testing is necessary.

Usability can be highly subjective and dependent on individual use cases, so the four evaluation tools were only informally assessed for their ease of use during this study. The criteria were ease of use, configuration options, and the level of detail included in reports.

Of the four, the A11y Machine was the easiest to set up and run. This tool can crawl all Web pages within a website (up to 128 pages), which is useful for a comprehensive accessibility overview of a website when a site map is not available. The A11y Machine outputs the results to an easily read local Web interface that can be filtered by violation level; a summary of the evaluation is saved to a JSON file for integration with CI/CD environments. However, the A11y Machine has the fewest options for configuration according to specific use-case needs and is no longer maintained. Any issues with the package will likely not be resolved in the future.

The other three evaluation tools are highly configurable and easily connected to CI/CD environments. These configuration options include enabling or disabling specific rules or rule sets, specifying report formats, and enabling page-load delays and timeout limits. IBM's Accessibility Checker not only notes the specific HTML element flagged due to a violation or potential violation, but also reports that element's specific location in the DOM tree. Axe-core/cli reports include a list of standards each HTML element passed, which can reduce manual evaluation times, and links to more information for each standard. Both Accessibility Checker and Axe-core/cli seem to be better at flagging appropriate use of aria tags, such as focusable elements tagged aria-hidden=true, but also had a higher number of false positive contrast errors. Pa11y-ci runs on the HTML CodeSniffer accessibility engine by default, but can be configured to run on the axe-core engine as well, which increases its flexibility.

This study compared the ratings given to individual HTML elements. However, analyzing ratings for WCAG success criteria rather than individual elements may provide useful insight into the IRR of these tools. Further, these tools were not evaluated for intra-rater reliability or accuracy. An accurate Web accessibility evaluation tool must have few false positives and negatives, and must produce consistent results. Future work should analyze the correctness, completeness, and specificity of the results of each of these CLI evaluation tools. As Pa11y-ci can be configured to run on either the HTML CodeSniffer or the axe-core accessibility engine, future work could analyze the inter-rater reliability between Pa11y-ci configured for the axe-core rule set and Axe-core/cli.

5 CONCLUSION
Four command-line interface Web accessibility evaluation tools were analyzed for inter-rater reliability with Gwet's AC2. Each evaluation tool was used to assess the accessibility of five university website home pages. Each individual HTML element and the associated number of pass, warning, or violation ratings it received from these evaluations were collated into an agreement table. As pass < warning < violation, an ordinal weight function was used to calculate Gwet's AC2. An initial result of -0.05 shows very poor inter-rater reliability, nearly equivalent to random guessing. There were notable differences both in which HTML elements each evaluation tool flagged and in the ratings assigned to the same elements by different tools. Each evaluation tool prioritizes and tests for each accessibility standard through different criteria, and none of these tools should be assumed to provide complete coverage of the WCAG standards that can be tested through automation.

REFERENCES
[1] Tim Berners-Lee. 1997. Press Release: W3C Launches Web Accessibility Initiative. W3C. Retrieved June 5, 2022 from https://www.w3.org/Press/WAI-Launch.html
ASSETS ’22, October 23–26, 2022, Athens, Greece Kelsey-Adkins and Thompson

[2] Maria Björkman. 2022. Inter-tool Reliability of Three Automated Web Accessibility Evaluators. USCCS 15 (2022), 15–25.
[3] Giorgio Brajnik. 2004. Comparing accessibility evaluation tools: a method for tool effectiveness. Universal Access in the Information Society 3, 3 (2004), 252–263. https://doi.org/10.1007/s10209-004-0105-y
[4] Kilem L. Gwet. 2015. On the Krippendorff's alpha coefficient. (2015). https://www.researchgate.net/profile/Kilem-Gwet/publication/267823285_On_Krippendorf's_Alpha_Coefficient/links/60e3bf0892851ca944ae25d6/On-Krippendorfs-Alpha-Coefficient.pdf
[5] Kevin A. Hallgren. 2012. Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial. Tutorials in Quantitative Methods for Psychology 8, 1 (2012), 23–34. https://doi.org/10.20982/tqmp.08.1.p023
[6] Rita Ismailova and Yavuz Inal. 2022. Comparison of Online Accessibility Evaluation Tools: An Analysis of Tool Effectiveness. IEEE Access 10 (2022), 58233–58239. https://doi.org/10.1109/ACCESS.2022.3179375
[7] Ashli M. Molinero, Frederick G. Kohun, and R. Morris. 2006. Reliability in automated evaluation tools for web accessibility standards compliance. Issues in Information Systems 7, 2 (2006), 218–222.
[8] Carlos Máñez-Carvajal, Jose Francisco Cervera-Mérida, and Rocío Fernández-Piqueras. 2021. Web accessibility evaluation of top-ranking university Web sites in Spain, Chile and Mexico. Universal Access in the Information Society 20, 1 (March 2021), 179–184. https://doi.org/10.1007/s10209-019-00702-w
[9] Marian Pădure and Costin Pribeanu. 2019. Exploring the differences between five accessibility evaluation tools. In Proceedings of RoCHI 2019 - International Conference on Human-Computer Interaction. MATRIX ROM, Bucharest, Romania, 87–90. http://rochi.utcluj.ro/articole/7/RoCHI2019-Padure.pdf
[10] Mojtaba Shahin, Muhammad Ali Babar, and Liming Zhu. 2017. Continuous Integration, Delivery and Deployment: A Systematic Review on Approaches, Tools, Challenges and Practices. IEEE Access 5 (2017), 3909–3943. https://doi.org/10.1109/ACCESS.2017.2685629
[11] W3C Web Accessibility Initiative (WAI). 2008. Web Content Accessibility Guidelines (WCAG) 2.0. W3C. Retrieved June 5, 2022 from https://www.w3.org/TR/WCAG20/#guidelines
[12] W3C Web Accessibility Initiative (WAI). 2018. Web Accessibility Laws & Policies. W3C. Retrieved June 5, 2022 from https://www.w3.org/WAI/policies/
[13] W3C Web Accessibility Initiative (WAI). 2022. Web Accessibility Evaluation Tools List. W3C. Retrieved June 5, 2022 from https://www.w3.org/WAI/ER/tools/
[14] Webometrics. 2022. United States of America | Ranking Web of Universities: Webometrics ranks 30000 institutions. Webometrics. Retrieved June 5, 2022 from https://www.webometrics.info/en/North_america/United%20States%20of%20America
Investigating How People with Disabilities Disclose Difficulties on YouTube

Shuo Niu, Clark University, Worcester, Massachusetts, USA, shniu@clarku.edu
Jaime Garcia, Summayah Waseem, and Li Liu, California State University, Northridge, Northridge, California, USA, jaime.garciagarcia.455@my.csun.edu, summayah.waseem.616@my.csun.edu, lliu@csun.edu

ABSTRACT

Video-sharing platforms such as YouTube are increasingly used by people with disabilities (PWDs) to share their experiences and concerns in their lives. However, there is no systematic examination of how and why YouTubers disclose their challenges publicly on YouTube. This poster presents a preliminary analysis of 257 video clips made by YouTubers with disabilities. The most common disclosed difficulties are related to social support and societal attitudes. PWDs also use YouTube to share knowledge about accessibility and advocate for public changes.

CCS CONCEPTS

• Human-centered computing → Empirical studies in accessibility; Empirical studies in collaborative and social computing.

KEYWORDS

online videos, YouTube, video data, disability, barrier

ACM Reference Format:
Shuo Niu, Jaime Garcia, Summayah Waseem, and Li Liu. 2022. Investigating How People with Disabilities Disclose Difficulties on YouTube. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550383

1 INTRODUCTION

People with disabilities (PWDs) use video-sharing platforms such as YouTube to share everyday life stories, present creative work, and exchange information [2]. PWDs present their daily activities and opinions in videos, which contain the challenges and difficulties they have met with social circles, technologies, and the environment. Understanding PWDs' unique needs is central to accessibility research. Recent research has leveraged online videos to examine PWDs' interaction barriers and identify design opportunities. HCI researchers studied online videos to understand PWDs' particular needs and challenges when doing daily activities [1, 7, 11, 12], barriers with video interactions [3, 8], and ways to have playful experiences [4]. However, there is no systematic examination of how YouTubers with disabilities disclose challenges and difficult experiences. This work presents a preliminary analysis of 257 video clips that mention disability challenges to understand how and why PWDs use YouTube to discuss disabilities.

We collected numerous YouTube videos posted by PWDs and sampled 1,000 video clips containing difficulty words. Then the authors performed an analysis of disability difficulties under the environmental barrier framework [6]. Our results indicate that YouTube is primarily used to disclose social support pressures and societal attitudes. PWD YouTubers use the platform to share knowledge and experience with assistive technologies and the environment. YouTube is also used for advocating changes and calling for inclusion. This work builds a foundation for our future quantitative analysis of how PWDs discuss difficulties on video-sharing platforms.

2 DATA AND ENCODING

We took three steps to collect video data. In the first step, we identified disability keywords for five disability categories to search channels of YouTubers with disabilities (Table 1) with the YouTube Data API. Then the channels were selected by a mix of programmatic and manual filtering. The included channels must contain at least one disability keyword in the channel description and at least one video. We also excluded channels with "center," "organization," "association," "group," or "mission" in the channel description to remove channels of groups and organizations. Then we manually verified each channel to ensure the videos were about individual PWDs or their caregivers. In the second step, we collected all videos posted by these channels. We kept the videos with closed captions and at least one difficulty keyword in the closed caption (e.g., "difficult", "difficulty", "disappoint", "challenge", "hard", "impossible", "inconvenience"; 96 words in total). This step yielded 16,710 videos made by 431 creators. In the third step, we segmented videos into small video clips by the timestamps denoted in the closed caption file (SubRip file). Each SubRip timestamp has a start and ending time and the video's speech text for around a 5-second video segment. We merged 24 consecutive timestamps into one video clip. The start of each clip is the beginning of the first timestamp, and the end is the ending of the last timestamp. All speech texts for a clip are concatenated. Then we only keep video clips with at least one

Disability category: search keywords

Vision: blindness, low vision, blind, loss of vision, visual impairment
Speech: speech disorders, language disorders, tourette syndrome, aphasia, speech impairment, loss of speech, speech disability, communication disability
Mobility: amputation, arthritis, cerebral palsy, Charcot-Marie-Tooth disease, Huntington's disease, juvenile rheumatoid arthritis, multiple sclerosis, muscular dystrophy, scleroderma, scoliosis, spina bifida, spinal cord injury, mobility disability, physical disability, wheelchair user, physically disabled
Hearing: deafness, hard of hearing, deaf, hearing loss, hearing impairment
Cognitive & Neural: anxiety disorders, Asperger syndrome, attention deficit hyperactivity disorder, ADHD, autism, autism spectrum disorders, bipolar disorder, down syndrome, fetal alcohol syndrome, intellectual disabilities, mental retardation, learning disabilities, mental health, obsessive compulsive disorder, OCD, post-traumatic stress disorder, PTSD, Williams syndrome, mental health disability

Table 1: Keywords used to search YouTube channels of PWD

difficulty keyword in the speech text. For this poster, we sampled 1,000 video clips for qualitative coding with the disability barrier framework [6].

3 DATA ENCODING

We used the disability barrier framework [6] to guide the generation of barrier sub-categories. The framework suggests eight types of barriers of PWDs: IT access, social support and societal attitudes, systems and policies, economic, built environment, natural environment, assistive technology, and transportation. Four authors split the 1,000 video clips and annotated whether each clip mentions the eight barrier themes (multi-categorical annotation). After categorization, IT access and assistive technology clips were combined into one barrier theme, since many YouTubers mention assistive software and technologies. The built environment, natural environment, and transportation barriers are all related to environmental factors, and they have 22 clips in total. Therefore, we combined the clips of those three themes into one theme. Then the four authors used affinity diagramming to group the clip notes. The final sub-themes can be seen in Table 2.

4 RESULTS

257 clips from 223 videos contain at least one difficulty under the barrier framework [6]. The rest of the clips are either not a difficulty statement (e.g., "no problem" or "accept a challenge"), general difficulties unrelated to disabilities (e.g., "a hard math problem"), or general hardship with the disability (e.g., "my problems with bipolar disorder"). All speech in this section is paraphrased for privacy protection [5].

Social Support and Societal Attitudes. The largest sub-theme is disclosing or discussing the self-stigma associated with disabilities or the shame or negativity caused by the disability. For example, in one video, a YouTuber with a physical disability mentioned that "Physical disabilities are aggravating since they make you feel like you're bothering everyone else." Some other videos call for stopping self-stigma; one video mentions "We should probably stop apologizing because you aren't truly sorry. It's simply a means of defusing the issue and making it less awkward." Some YouTubers talk about the problems with other people's lack of awareness of their disabilities. For example, one YouTuber discloses that "It's quite aggravating for me because I'm dealing with a variety of issues relating to my ataxia that few people understand." Some difficulties are caused by others' stigma, making PWDs feel excluded or isolated. A YouTuber shared their experience: "I believe that new individuals in my life may feel unable to accept me due to my troubles, which may be rather scary to them." YouTubers with disabilities also mentioned the lack of support or communities. One video mentions "When no one is at home, this is how I travel around. Because of my Ehlers-Danlos Syndrome and back problems, it's a little challenging." Thirteen clips are about social problems caused by communication impairment. For example, "People who are hard of hearing have no idea how to speak in a public situation with someone who does not have a hearing disability." YouTubers also share their challenges with COVID. In one video about how the coronavirus affects blind people, the YouTuber says "Hand sanitizer was becoming increasingly difficult to hold. It was becoming a serious issue."

IT Access and Assistive Technologies. 49 video clips are about problems with IT or assistive technologies. The most mentioned technologies are assistive hardware (13 clips) and software (12 clips). For example, a YouTuber with a motor disability shows how they take a bath and mentions that the chair texture can hurt skin (Figure 2a). In another video, a YouTuber with a vision impairment demonstrates a problem with the magnifier feature (Figure 2b). Nine clips are about accessibility barriers in video games. For example, a game YouTuber with a vision impairment mentions that the fog effect makes the text hard to read (Figure 2c). Seven clips discuss problems with closed captioning. For example, one YouTuber argues that YouTube should not remove community captions because "You're just alienating a large population of people by saying things like, 'uh catch you later,' or 'uh farewell, we don't need you viewing the movies.'" Five video clips mention problems with communication tools and AAC such as hearing aids and sign languages. Three clips mention mobile or web software that contains barriers (Figure 2d).

Systems and Policies. System and policy barriers relate to the availability of and access to the systems that support participation. In our data, 14 videos mention barriers within the education system, 12 mention issues with the healthcare system, and nine complain about public agencies or policy problems. For example, one YouTuber with a vision impairment mentions their struggles in a new semester: "The first week could be difficult since you're getting used to learning a new way around somewhere, or making sure the new equipment works, and that everything is in working order." Another YouTuber with bipolar disorder comments about the healthcare that

Sub-theme: Definition (grouped by barrier theme)

IT & assistive technologies
  Video game: Accessibility issues with video games
  Closed captioning: Issues caused by closed captioning or the lack of captioning
  Comm. and AAC* tool: Hindrance of communication caused by technologies
  Assistive software: Problems with assistive features in software (e.g., screen reader and magnifier)
  Web & app: Accessibility issues with websites, mobile apps, or computer software
  Assistive device: Barriers with assistive hardware, devices, and tools (e.g., wheelchair and hearing aid)

Social support & societal attitudes
  Social stigma: Stigma of disability; PWDs being isolated or excluded due to discrimination
  Disability awareness: Social issues caused by neglecting, misconceptions of, and nonrecognition of disabilities
  Social support: Difficulties caused by lack of social support or communities
  Communication: Social barriers caused by communication difficulties or impairments
  Self-stigma: Difficulties caused by PWDs' self-stigma, shame, or negativity caused by the disability
  COVID: Difficulties caused by COVID-19, social-distancing, quarantine, and PPE

Systems & policies
  Healthcare: Lack of medical treatment or information, or the poor quality of treatment and healthcare
  Education: Problems, discrimination, or inaccessible learning in education and childcare
  Government: Problems with the authorities such as government, legislation, and public policies
  Employment: Barriers in the job market, employment, and workplace
  Business & service: Barriers in commercial services, business, and recreational services

Environment & transportation
  Residential: Problems with private and residential space, including the building environment and home appliances
  Public facility: Barriers found in a public environment or commercial facilities
  Transportation: Barriers with vehicles and transportation systems
  Natural environment: Problems caused by the natural environment, weather, or climate

Economic
  Economic: Problems caused by inadequate economic resources or allocation of financial resources

Table 2: The sub-themes of disability barriers in [6]. *AAC stands for Augmentative and Alternative Communication

Figure 1: The distribution of videos in each of the sub-themes.

they need "additional information on the therapy's disadvantages and criticisms so that we can see both sides." About public policies, one video mentions that "For her (the daughter's) back issues and muscle problems, I was really interested in getting some CBD oil cream (a compound found in marijuana). We live in a place where none of it is yet legal." Employment and business and services are each mentioned in four videos.

Figure 2: Top: Example videos showing difficulties with IT Access and Assistive Technologies. Bottom: Example videos showing environment and transportation difficulties.
(a) A YouTuber explains the texture of the bathing chair can hurt skin. (b) A video shows a problem with magnifiers when reading comments. (c) A YouTube gamer mentions the fog can make text hard to read. (d) A kid is struggling with using an iPad with the left hand.
(e) A YouTuber tells a story about hitting a stone and overturning the wheelchair. (f) A YouTuber shows it is difficult to use a regular ladder to fix the smoke detector. (g) An interviewee mentions they lost their vision and it is hard to take busses.

Environment and Transportation. 22 video clips mention problems with the environment. In this theme, barriers with public and commercial facilities are mentioned the most, with 12 clips (e.g., Figure 2e). Six clips show problems at home when using tools to complete daily tasks (e.g., Figure 2f). Three clips talk about challenges taking transportation or navigating the environment (e.g., Figure 2g). One video mentions the challenge of the cold weather.

Economic. Eleven videos mention economic difficulties. For example, one video mentioned a financial issue with social security: "money is very tight for a lot of people, especially those people living on a fixed income."

5 DISCUSSION AND FUTURE WORK

YouTube for Knowledge-Sharing. YouTube is commonly used for sharing personal experiences with technologies and with public and private environments. Disclosing difficulties may help other people avoid similar challenges. YouTubers share their experiences with assistive tools and software features. Gamers speak about accessibility challenges when playing games. PWDs also share their difficulties with public facilities and everyday tools. For other people with similar disability challenges, these videos are valuable experiences they can learn from. Therefore, video-sharing platforms could recommend such videos to help PWDs avoid problems or obtain solutions. For designers, YouTube can be a data source to discover accessibility issues with their products and designs. Design frameworks and theories are needed to guide the analysis of online videos for accessibility examination.

YouTube as a Place for Disclosing Social Pressure. Our preliminary analysis suggests that the most common difficulties disclosed by YouTubers are problems with social support and societal attitudes [10]. PWDs reveal the self-stigma and negativity caused by disabilities, indicating YouTube is a place for disclosing social pressure. Our work finds six challenges related to social support and societal attitudes: self-stigma, disability awareness, social stigma, lack of social support, communication challenges, and COVID-19 challenges. The prevalence of social difficulty disclosure indicates that future HCI and accessibility research needs to pay attention to PWDs' social and emotional needs. Accessibility technologies need to be designed to offer dignity and shield against stigma. YouTube videos can be used to examine the social situations and societal attitudes that hurt PWDs. Future research could leverage YouTube videos to investigate PWDs' need for social respect and mental support. It should also be noted that using YouTube data to infer the commonality of PWD difficulties and accessibility barriers may introduce statistical bias, since whether YouTube creators have the same disability distribution as the overall population needs future investigation.

Public Advocacy with Online Videos. Our results suggest that YouTube is also used to advocate for solving accessibility problems [9]. YouTubers with disabilities discuss their issues with education, healthcare, government, and employment systems to share their struggles and raise public awareness of accessibility challenges. The discussion of technological and environmental problems indicates that PWDs want their voices heard. YouTube has a culture of creating communities and opinion leaders. Disclosing challenges with systems and policies could promote public changes. Future research should examine the role of online videos as a new form of accessibility advocacy and their effects on public awareness. It will be interesting to study how YouTubers use video-sharing communities to call for action and solve common accessibility challenges. Meanwhile, researchers need to examine the challenges of accessibility advocacy. It will be interesting to investigate whether and how creators exaggerate video content and use other attention-grabbing techniques to engage viewers.
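The clip-segmentation step described in Section 2 (splitting a SubRip caption file into timestamped cues, merging 24 consecutive cues into one clip, and keeping clips whose concatenated speech contains a difficulty keyword) can be sketched as follows. This is an illustrative Python sketch, not the authors' pipeline; the keyword set is only a small subset of the paper's 96 words, and the parser assumes well-formed .srt blocks.

```python
import re

# Illustrative subset of the 96 difficulty keywords used in the paper.
DIFFICULTY_WORDS = {"difficult", "difficulty", "disappoint", "challenge",
                    "hard", "impossible", "inconvenience"}

def parse_srt(text):
    """Return (start, end, speech) tuples from the contents of a SubRip file."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        start, sep, end = lines[1].partition(" --> ")
        if not sep:
            continue  # second line is not a timestamp line
        cues.append((start.strip(), end.strip(), " ".join(lines[2:])))
    return cues

def segment_clips(cues, group_size=24, keywords=DIFFICULTY_WORDS):
    """Merge consecutive cues into clips; keep clips mentioning a keyword."""
    clips = []
    for i in range(0, len(cues), group_size):
        chunk = cues[i:i + group_size]
        speech = " ".join(c[2] for c in chunk)
        if any(word in speech.lower() for word in keywords):
            # A clip spans from the first cue's start to the last cue's end.
            clips.append({"start": chunk[0][0], "end": chunk[-1][1],
                          "speech": speech})
    return clips
```

Running `segment_clips(parse_srt(srt_text))` over each downloaded caption file would yield candidate clips analogous to those from which the 1,000-clip sample was drawn.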

REFERENCES
[1] Katya Borgos-Rodriguez, Kathryn E Ringland, and Anne Marie Piper. 2019. MyAutsomeFamilyLife: Analyzing Parents of Children with Developmental Disabilities on YouTube. Proc. ACM Hum.-Comput. Interact. 3, CSCW (11 2019). https://doi.org/10.1145/3359196
[2] Barbara E Bromley. 2008. Broadcasting Disability: An Exploration of the Educational Potential of a Video Sharing Web Site. Journal of Special Education Technology 23, 4 (12 2008), 1–13. https://doi.org/10.1177/016264340802300401
[3] Dasom Choi, Uichin Lee, and Hwajung Hong. 2022. "It's not wrong, but I'm quite disappointed": Toward an Inclusive Algorithmic Experience for Content Creators with Disabilities. In CHI Conference on Human Factors in Computing Systems. 1–19.
[4] Jared Duval, Ferran Altarriba Bertran, Siying Chen, Melissa Chu, Divya Subramonian, Austin Wang, Geoffrey Xiang, Sri Kurniawan, and Katherine Isbister. 2021. Chasing Play on TikTok from Populations with Disabilities to Inspire Playful and Inclusive Technology Design. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445303
[5] Casey Fiesler, Nathan Beard, and Brian C Keegan. 2020. No Robots, Spiders, or Scrapers: Legal and Ethical Regulation of Data Collection Methods in Social Media Terms of Service. Proceedings of the International AAAI Conference on Web and Social Media 14, 1 SE - Full Papers (5 2020), 187–196. https://ojs.aaai.org/index.php/ICWSM/article/view/7290
[6] Joy Hammel, Susan Magasi, Allen Heinemann, David B Gray, Susan Stark, Pamela Kisala, Noelle E Carlozzi, David Tulsky, Sofia F Garcia, and Elizabeth A Hahn. 2015. Environmental Barriers and Supports to Everyday Participation: A Qualitative Insider Perspective From People With Disabilities. Archives of Physical Medicine and Rehabilitation 96, 4 (2015), 578–588. https://doi.org/10.1016/j.apmr.2014.12.008
[7] Franklin Mingzhe Li, Franchesca Spektor, Meng Xia, Mina Huh, Peter Cederberg, Yuqi Gong, Kristen Shinohara, and Patrick Carrington. 2022. "It Feels Like Taking a Gamble": Exploring Perceptions, Practices, and Challenges of Using Makeup and Cosmetics for People with Visual Impairments. In CHI Conference on Human Factors in Computing Systems (CHI '22). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3491102.3517490
[8] Xingyu Liu, Patrick Carrington, Xiang 'Anthony' Chen, and Amy Pavel. 2021. What Makes Videos Accessible to Blind and Visually Impaired People?. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445233
[9] Shuo Niu, Cat Mai, Katherine G McKim, and D Scott McCrickard. 2021. #TeamTrees: Investigating How YouTubers Participate in a Social Media Campaign. Proc. ACM Hum.-Comput. Interact. 5, CSCW2 (10 2021). https://doi.org/10.1145/3479593
[10] Shuo Niu, Katherine G McKim, Hess Danielle, and Kathleen P Reed. 2022. Education, Personal Experiences, and Advocacy: Examining Drug-Addiction Videos on YouTube. Proc. ACM Hum.-Comput. Interact. 5, CSCW2 (11 2022).
[11] Woosuk Seo and Hyunggu Jung. 2021. Understanding the community of blind or visually impaired vloggers on YouTube. Universal Access in the Information Society 20, 1 (2021), 31–44. https://doi.org/10.1007/s10209-019-00706-6
[12] Johann Wentzel, Sasa Junuzovic, James Devine, John Porter, and Martez Mott. 2022. Understanding How People with Limited Mobility Use Multi-Modal Input. In CHI Conference on Human Factors in Computing Systems (CHI '22). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3491102.3517458
ProAesthetics: Changing How We View Prosthetic Function
Susanna Abler, University of Maryland, Baltimore County, USA, sabler1@umbc.edu
Foad Hamidi, University of Maryland, Baltimore County, USA, foadhamidi@umbc.edu
ABSTRACT

There are multiple perspectives that must be considered while designing a limb: the engineering requirements, the visual and aesthetic appeal, the needs and wants of the client, and how these designs impact users physically, socially, and mentally. Historically, the main focus of design has been on the engineering requirements while neglecting the individual's needs or desires for aesthetics, which are seen as secondary concerns. However, these aspects still hold important roles in how the user views and connects with their own limb. In order to better understand the impact that aesthetic design has on the individual, this exploratory case study aimed to create custom-designed prosthetic limb covers for lower limb amputees. This poster includes the findings of the research and design process, including related works, the interview, design, and prototyping processes, as well as issues that occurred during the project and how they were either overcome or could be addressed in the future.

CCS CONCEPTS

• Human-centered computing → Accessibility.

KEYWORDS

3D Printing, Assistive Technology, Co-Design, Aesthetics

ACM Reference Format:
Susanna Abler and Foad Hamidi. 2022. ProAesthetics: Changing How We View Prosthetic Function. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550386

1 INTRODUCTION AND BACKGROUND

In Design Meets Disability, Graham Pullin argues strongly for the need to explore how design and disability can inform and inspire each other, and in particular why it is important to explore the incorporation of aesthetic design into the construction of varying types of assistive technology [11]. When it comes to the design of prosthetic limbs, the main area of focus has often been on improving the mechanical capabilities of the device to replicate biological capabilities as closely as possible. While this is not inherently a bad thing, it can lead to some level of neglect towards the visual and aesthetic aspects of the prosthesis, often viewing the appearance as secondary to restoration of movement. Furthermore, the creation of the prosthesis often involves the end-user minimally; outside of measurements for size and tests for comfort, much of the prosthetic limb is already prefabricated. There is great potential in exploring how users view the appearance of a prosthetic limb and how involvement in the design process of a cosmetic limb or limb cover can lead to designs that are personalized and reflect aspects of a user's identity. In this project, we adopted a participatory interdisciplinary approach to study how input from prosthetic users can be combined with technical knowledge of the prosthetics manufacturing process as well as an eye for visual composition to create aesthetically pleasing prosthetic designs. We engaged in the prototyping and fabrication of three customized lower-limb prosthetics that focused on aesthetic appeal and personalization, and found that users had innovative ideas for how to customize their devices.

This project is informed and inspired by previous research that has studied the impact of aesthetic appearance on individuals with limb loss as well as effective methods for pursuing the design process. Customizing prosthetics and assistive technologies (ATs), more broadly, has been studied for the past few decades in the context of the work of Do-It-Yourself (DIY) AT communities [4, 5, 8, 9]. Of particular relevance to this project is the work of Profita et al. that showed customization practices within the deaf community that went beyond functionality and used aesthetic aspects of hearing aids to represent aspects of identity [10]. Other projects have shown how online communities create and share customized ATs [4, 9] and also identified the challenges of balancing practical, aesthetic, and political considerations when creating customized ATs [7]. Other efforts have focused on developing co-design methods where people with disabilities, AT experts, and designers come together to create customized and personalized designs (e.g., [1, 6]). Finally, Bekrater-Bodmann et al. developed a Prosthesis Embodiment Scale [2, 3] used to describe how prosthesis-users connect with their prosthetic limb and how to evaluate that connection. These projects motivated the current exploratory case study that focuses on inquiring into the aesthetic preferences and considerations of lower-limb prosthetic users to augment existing devices to make them more appealing. Our findings show that prosthetic users may have personally meaningful preferences for what their devices look and feel like that, if incorporated into prosthetic design, may make them more appealing and aligned with one's identity.

2 METHODS

Five individuals with lower-limb amputations participated in semi-structured interviews regarding their interest in cosmetic covers as well as how they felt about their current limb appearance. Each interview was approximately an hour, occurring during a regularly scheduled appointment at the clinic where they receive care for their prosthetic limb. Three women and two men were interviewed; all were white and above the age of thirty. Notes were taken during
Figure 1: "Jersey Devil" limb cover (digital 3D model, top Left; 3D printed model, bottom Left), and 3D printed "Flame On" cover design (Right).
the interview process as recording devices were prohibited. From these notes, key words and quotes were taken out to provide context for their relationship to their current prosthetic device as well as to provide insight into their interests regarding cosmetic covers. The data from these interviews was then used to design aesthetically focused limb covers, beginning with digital sketches and 3D models of the limbs using Autodesk's Fusion 360. The participants had each given different criteria for their desired covers, whether it be certain colors to use or avoid, sports teams they liked, or activities that they enjoyed. Due to time constraints amplified by COVID-19 restrictions, we were unable to get feedback from the participants regarding the initial sketches. After the initial designs were finalized, the models were printed utilizing additive 3D printing technology. Due to the varying aesthetic desires of each participant, each limb cover had its own unique challenges and prototyping needs, which we describe in the next section. Our university's Institutional Review Board (IRB) office approved the study before data collection began.

3 FINDINGS
Of the five interviews conducted, two of the participants stated that they currently had minimal to no interest in cosmetic covers, while three stated that they had previously considered altering the appearance of their prosthetic. The three who had some previous interest in altering the appearance of their limb had varying levels of current interest: one was no longer interested, another was still interested but considered it less of a priority, and the third was receptive to the concept as it was something they had thought of but had not really pursued. These answers were consistent with data from the literature review; some individuals were very narrowly focused on realism, others had an interest in more robotic or abstract appearances, and some were happy with their limbs the way that they were [5, 6]. From there, they were further interviewed about what they might like to see in a limb cover for themselves, with varying answers given by each individual. Some were more interested in a realistic-looking cover, while others had varying interest in less lifelike designs. Answers as to why they felt the way they did about their preferences varied. One participant responded "everyone wants to look normal" when asked about her preference for a realistic limb cover. Another participant mentioned she is around children often and is concerned about her prosthetic scaring them; she wants to make sure her limb is something that the children can connect with, and a realistic limb cover would raise fewer concerns with these younger children. Two participants with bionic knees mentioned the importance of a cover acting as protection for the mechanical components inside their prosthetic limb, and that such a focus was often considered more important than aesthetics because insurance covers protective covers rather than aesthetic covers.

An overarching issue regarding prosthetic limbs is the expense of the limbs themselves, with the additional cost of the amputation procedure if applicable. Many people feel they cannot afford the additional cost of cosmetic changes, especially if they are paying out of pocket. Financial stressors can impact well-being and mental health, and further medical costs of occupational therapy and limb maintenance can impact continued use of a limb [7]. While affordability and access to limbs is an important factor in both designing a
Figure 2: "Peace and Love" prosthetic limb cover, front (Left) and back (Center) of the model; detail of DIY glitter snow globe (Right).
prosthetic limb and continued acceptance of the limb into the body identity of the individual, this research is focused more narrowly on the aesthetics of the limb and how they impact the user; areas relating to public policy and insurance are outside the scope of this project.

From the five designs described by participants, we selected a total of three design concepts, based on their practicality and possible aesthetic appeal, to proceed to the design and prototyping stage. The designs ranged from simple colors with minor additions to those requiring different types of materials and separate prototyping artifacts. The simplest design was the one titled "Jersey Devil" (Figure 1, Left), as all that was requested was something in red and black that contained the logo of the participant's favorite hockey team. This design required little additional prototyping and was fairly simple to model and print; the most difficult part was modeling the logo to be visible but also solidly connected to the body of the cover.

The second most complicated cover was the "Flame On" design (Figure 1, Right). The cover itself was simple to model as it was solid, but because it was designed for an above-the-knee amputee it was much larger than the other two designs and had to be printed in two parts. Several prints failed in the middle of printing, taking extra time and troubleshooting to complete. For the flame design, a few different avenues were explored. Vinyl stickers were explored as an option; however, it was discovered they do not stick well to the PLA material and would eventually peel off. Further research was done into how flame decals are applied to cars and motorcycles, and some attempts were made with painting on the design. From this research, the vinyl flame stickers were repurposed to act as a stencil for the flames, using tape to further define the edges of the shape. Glossy black spray paint was used over the remainder of the cover.

The most time-intensive cover was the "Peace and Love" design (Figure 2). The creation of the cutouts for this model was fairly time intensive due to limited features in the Fusion 360 program. Fusion 360 does not yet have support for wrapping a sketch around a model, so shapes had to be created on individual sketch planes and cut out to prevent stretching. Nylon was used for this cover due to its flexibility and because it is a dyeable material. Nylon test dye sticks were printed to evaluate the dyeing time needed for the material to achieve the most vibrant colors in the requested pink and purple. These sticks were then dipped in the dye for fifteen seconds to two minutes to see what time would produce the most ideal color for the covers.

This cover came with an additional area of prototyping due to the inclusion of a "glitter snow globe heart" in the cover. This snow globe was made using acrylic sheets, water-resistant adhesive, glitter, and vegetable glycerin. A heart-shaped metal cookie cutter and a heat gun were used to cut the sheets of clear acrylic; those that the cutter could not cut through fully were cut with scissors. These shapes created the top and bottom of the snow globe. Long thin strips of acrylic were cut out and heated to mold along the inside of the cookie cutter, creating the middle portion of the snow globe. These were then adhered to the heart cutout shapes and filled with vegetable glycerin and glitter. Varying concentrations of the glycerin were evaluated for flow and sustained movement of the glitter.

4 DISCUSSION
The results from the interviews show that some participants had an interest in cosmetic limb covers and creative and personal ideas for what they could look like. Each participant had their own reactions to the idea of custom cosmetic limb covers, whether it was immediate interest or indifference. And each came in with their own reasons for that reaction, be it financial, cultural, or personal. Additionally, we found that it is feasible to create custom limb
cover prototypes for individuals utilizing commercially available materials, making this a more accessible avenue for the user.

Involving the end user in the process of design, and understanding the impact of their involvement, is important for future development of assistive technology. There is great value in the end user not only being involved in the design of their limb but learning how to design for themselves, having more agency in the appearance of their limb [8]. The interviews showed that individuals do have an interest in custom cosmetic covers, and while follow-up on the covers was not acquired, the criteria given by the participants were fulfilled through the project.

Another reflection that our research points to, but that needs further exploration, is questioning the assumption that a prosthetic limb is necessary for all individuals with limb differences. Much of the research shows that individuals with a congenital limb difference are less likely to want or need a prosthetic limb to function in daily life [5]. The same can go for amputees, as over time they may find they are more comfortable, happier, and more mobile without a prosthetic limb. The purpose of this research is not to imply that this method is the only way for prosthesis users to experience the world or find comfort. Rather, the research is aimed at those who do want to use a prosthetic limb and find they are not happy with their device, specifically if that dissatisfaction stems from the appearance of the prosthesis. The goal of this research is to help individuals feel happier with themselves and how they present themselves to the world, and those who feel happier without a prosthesis have already reached that point.

This research also highlighted the need for an interdisciplinary lens when approaching prosthetic design. The task of designing these prosthetic limb covers required disciplines like mechanical engineering, visual arts, psychology, and disability studies to both design aesthetically pleasing and functional covers and provide contextual understanding of the micro and macro impacts that a prosthetic limb and disability have on the individual. These disciplines each worked to create a broader understanding of that context, from the design process and needs of the limb itself to the societal structures that influence the design of the device. Having a broad understanding of the factors that can impact both the user and their prosthetic device is important in the design of future assistive technology, as the many physical, mental, and social factors that impact the individual's life will dictate the type of assistive technology they need. While this research took on an interdisciplinary approach, it was limited by the knowledge base of our team, a limitation that can be remedied by working with a broader team of experts in the future.

5 CONCLUSION AND FUTURE WORK
In this exploratory research, we studied how to co-design and prototype aesthetic elements for prosthetics using DIY methods and consumer-grade prototyping materials, such that the designs reflect users' personal preferences and desires and go beyond functionality. We designed and prototyped three designs based on participant specifications.

There are several directions for future research. First, we can build on the current work to develop improved co-design models that bring together multiple perspectives into the design process. Further research into this topic can also examine the role of public policy and regulations in how aesthetic aspects of assistive technologies are viewed, and how this interacts with devices' affordability and functionality. We focused only on lower-body prosthetics in this study; future work can explore devices designed for other parts of the body and with other fabrication methods. Finally, and most importantly, future research can seek feedback from individuals with disabilities on co-designed prosthetics in an iterative process.

ACKNOWLEDGMENTS
We would like to thank our participants and the clinic staff for their time and input. We would also like to thank Steven McAlpine, Jamie Gurganus, Symmes Gardner, and Drew Holladay. This project is partially supported by the National Science Foundation (NSF) under Grants DRL-2005502 and DRL-2005484.

REFERENCES
[1] Leila Aflatoony and Su Jin (Susan) Lee. 2020. AT makers: A multidisciplinary approach to co-designing assistive technologies by co-optimizing expert knowledge. In Proceedings of the 16th Participatory Design Conference 2020 - Participation(s) Otherwise - Volume 2. ACM, New York, NY, USA. https://doi.org/10.1145/3384772.3385158
[2] Robin Bekrater-Bodmann. 2020. Factors Associated With Prosthesis Embodiment and Its Importance for Prosthetic Satisfaction in Lower Limb Amputees. Front. Neurorobot. 14 (2020), 604376.
[3] Robin Bekrater-Bodmann. 2020. Perceptual correlates of successful body–prosthesis interaction in lower limb amputees: psychometric characterisation and development of the Prosthesis Embodiment Scale. Sci. Rep. 10, 1 (August 2020), 1–13.
[4] Erin Buehler, Stacy Branham, Abdullah Ali, Jeremy J. Chang, Megan Kelly Hofmann, Amy Hurst, and Shaun K. Kane. 2015. Sharing is Caring. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). ACM Press, New York, NY, USA. https://doi.org/10.1145/2702123.2702525
[5] Erin Buehler, Amy Hurst, and Megan Hofmann. 2014. Coming to grips: 3D printing for accessibility. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '14). Association for Computing Machinery, New York, NY, USA, 291–292.
[6] Megan Hofmann, Jeffrey Harris, Scott E. Hudson, and Jennifer Mankoff. 2016. Helping Hands: Requirements for a Prototyping Methodology for Upper-limb Prosthetics Users. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA. https://doi.org/10.1145/2858036.2858340
[7] Megan Hofmann, Kristin Williams, Toni Kaplan, Stephanie Valencia, Gabriella Hann, Scott E. Hudson, Jennifer Mankoff, and Patrick Carrington. 2019. Occupational Therapy is Making: Clinical Rapid Prototyping and Digital Fabrication. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA. https://doi.org/10.1145/3290605.3300544
[8] Jennifer Mankoff, Megan Hofmann, Xiang "Anthony" Chen, Scott E. Hudson, Amy Hurst, and Jeeeun Kim. 2019. Consumer-grade fabrication and its potential to revolutionize accessibility. Commun. ACM 62, 10 (September 2019), 64–75.
[9] Jeremiah Parry-Hill, Patrick C. Shih, Jennifer Mankoff, and Daniel Ashbrook. 2017. Understanding Volunteer AT Fabricators: Opportunities and Challenges in DIY-AT for Others in e-NABLE. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI '17). Association for Computing Machinery, New York, NY, USA, 6184–6194.
[10] Halley P. Profita, Abigale Stangl, Laura Matuszewska, Sigrunn Sky, and Shaun K. Kane. 2016. Nothing to Hide: Aesthetic Customization of Hearing Aids and Cochlear Implants in an Online Community. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '16). Association for Computing Machinery, New York, NY, USA, 219–227.
[11] Graham Pullin. 2011. Design Meets Disability. MIT Press, London, England.
Towards Visualization of Time–Series Ecological Momentary Assessment (EMA) Data on Standalone Voice–First Virtual Assistants

Yichen Han 1, Christopher Bo Han 3, Chen Chen 1, Peng Wei Lee 2, Michael Hogarth 5, Alison A. Moore 5, Nadir Weibel 1, Emilia Farcas 4
1 Computer Science and Engineering, 2 Electrical and Computer Engineering, 3 Department of Mathematics, 4 Qualcomm Institute, 5 School of Medicine
University of California San Diego, La Jolla, California, United States
{y4han,cbhan,chenchen,pwlee,mihogarth,alm123,efarcas,weibel}@ucsd.edu
Figure 1: With a touchscreen-based standalone voice–first virtual assistant, older adults are able to query and visualize time–series Ecological Momentary Assessment (EMA) data (e.g., the quality and time of their sleep).
ABSTRACT
Population aging is an increasingly important consideration for health care in the 21st century, and continuing to access and interact with digital health information is a key challenge for aging populations. Voice-based Intelligent Virtual Assistants (IVAs) are promising to improve the Quality of Life (QoL) of older adults, and coupled with Ecological Momentary Assessments (EMA) they can be effective for collecting important health information from older adults, especially when it comes to repeated time-based events. However, this same EMA data is hard to access for the older adult: although the newest IVAs are equipped with a display, the effectiveness of visualizing time–series based EMA data on standalone IVAs has not been explored. To investigate the potential opportunities for visualizing time–series based EMA data on standalone IVAs, we designed a prototype system, where older adults are able to query and examine the time–series EMA data on Amazon Echo Show — a widely used commercially available standalone screen–based IVA. We conducted a preliminary semi–structured interview with a geriatrician and an older adult, and identified three findings that should be carefully considered when designing such visualizations.

ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550398

CCS CONCEPTS
• Human-centered computing → Empirical studies in accessibility.

KEYWORDS
Gerontechnology, Accessibility, Health – Well-being, User Experience Design, Older Adults, Voice User Interfaces, EMA

ACM Reference Format:
Yichen Han, Christopher Bo Han, Chen Chen, Peng Wei Lee, Michael Hogarth, Alison A. Moore, Nadir Weibel, and Emilia Farcas. 2022. Towards Visualization of Time–Series Ecological Momentary Assessment (EMA) Data on Standalone Voice–First Virtual Assistants. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550398

1 INTRODUCTION
Population aging is a global health consideration in the 21st century [8], especially when it comes to keeping track of older adults' daily health data. To more effectively support older adults and their clinicians in collecting daily health check-ins, studies have shown the effectiveness of using Ecological Momentary Assessments (EMA), a well–known method to more conveniently assess specific recurring behaviors and track specific health states. An effective EMA strategy could allow clinicians to constantly monitor older adults and ensure their physical and mental well-being, hopefully leading to an increase in their Quality of Life (QoL) [9, 18].

However, such strategies often necessitate tremendous amounts of time and effort, and further issues, such as communication challenges, could arise [20]. While a wide variety of technologies have
Figure 2: Two sample demonstrations of visualizing time-series data on Amazon Echo Show with voice input (a–b), and a semi-structured interview exploring their utility (c). Faces are pixelized to protect participants' personally identifiable information.
been used to keep track of older adults' conditions [16, 19], many older adults still face obstacles while using smart devices, most notably those with complicated Graphical User Interfaces (GUIs) [13, 14].

Voice-based conversational user interfaces offer a promising alternative interaction channel for older adults to interact with their digital health information. This modality allows older adults to naturally and easily use voice-based, conversation-like commands to delegate tasks to, or query information from, their healthcare providers. While Trajkova et al. [21] demonstrate the limitations of today's voice assistants among aging users, others (e.g., [3–5, 15, 17]) show the feasibility and potential usefulness of using standalone voice–based Intelligent Virtual Assistants (IVAs) for conducting EMA among older adult patients.

Despite the potential of deploying EMAs through IVAs, due to the inherent ambiguity of conversational voice user interfaces, Chen et al. [5, 6] pointed out the necessity of also integrating simple visual components to facilitate better interaction with the EMAs deployed on the device. The same group of researchers [6] also investigated how to visualize simple interaction elements, such as the texts of EMA questions and input buttons, on Amazon Echo Show [10], a widely used commercially available standalone IVA. However, while data collection through EMA is important, previous work did not yet explore how to design and visualize the data collected through EMAs on the IVA device itself, especially when it comes to displaying time–series visualizations.

In this paper we present the first preliminary exploratory effort to design time–series data visualizations on screen-based voice–first IVAs. We chose Amazon Echo Show [10] as the standalone voice–based IVA device with a built–in touchscreen due to its dominant market share [1]; however, many of our findings could transfer to other screen–based voice–first IVAs. We prototyped a system that enables older adults to examine their past EMA data on the built–in touchscreen of the Echo Show. By observing interaction with the IVA visualization, and especially older adults' past data and corresponding trends, we found that older adults might be able to make in-situ decisions to address possible unhealthy lifestyles. We conducted a remote co–design workshop with two representative stakeholders – an older adult and a geriatrician. After thematic analysis of the recordings, we present findings and potential future development.

2 PRELIMINARY DESIGN
Our preliminary designs consist of two example scenarios demonstrating how time–series EMA data can be visualized and be beneficial to older adults (see Figs. 2a and b). In both cases, we assume that the EMA data has already been collected and stored in a secure Electronic Health Record (EHR). We considered scenarios of one–measure visualization, where we only visualize one measure (e.g., the hours of sleep as shown in Fig. 2a) over the time domain, and two–measure visualization, which places two one–measure graphs with the same dimension vertically and might indicate possible correlations between two dependent variables. Visualizations with more than two measures are out of our scope due to considerations of screen size and participants' cognitive load.

Scenario 1 – One–Measure Visualization: Fig. 2a demonstrates a one–measure visualization for hours of sleep, an important time-series EMA measure used in geriatrics to help track older adults' QoL [7]. A voice skill could be initiated by a simple query: "Hey Alexa, what's my average hours of sleep every day in the week?" The mean hours of sleep by day of the week will then be computed, yielding a line plot shown on the IVA's screen. The trend of the graph could help older adults recall the common activities on particular days of the week that possibly led to an irregularity in sleeping hours.

Scenario 2 – Two–Measure Visualization: Older adults should also be able to visualize two–measure plots with the same time span to examine any potential correlations. As an example, Fig. 2b demonstrates how older adults could query their sleep hours and alcohol consumption in the past 10 days. This plot can visually guide the user to find the correlation between daily alcohol consumption and sleep, so that the user can properly adjust, in situ, to a healthier lifestyle.

Prototype: We prototyped the system based on Amazon Alexa and Amazon Web Services (AWS). The overall system architecture is shown in Fig. 3 and consists of three major components: (1) a front–end converting users' utterances to queries, (2) a middle–tier that handles the queries and requests the graphs, and (3) a back–end to generate the graphs. In the front–end, speech recognition technology in the IVAs, such as the Alexa Voice Service (AVS) in an Echo Show, is used to recognize the raw audio input. An AWS
Figure 3: When older adults request a graph from the voice assistant Alexa, AWS Lambda handles the request by asking EC2 to generate a graph and send it back to the Echo Show paired with Alexa.
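As a rough illustration of the pipeline in Fig. 3, the middle tier can be sketched as a Lambda-style handler that forwards the transcribed query to a graph-rendering back end and wraps the returned plot image in an APL RenderDocument directive. This is a minimal sketch only, not the authors' implementation: the `GRAPH_SERVER` URL, query parameters, and the exact APL layout (a full-screen Image rather than a true background property) are assumptions for illustration.

```python
import json
import urllib.request

# Hypothetical back-end endpoint standing in for the paper's EC2 graph server.
GRAPH_SERVER = "https://example.com/plot"


def build_apl_directive(image_url: str, title: str) -> dict:
    """Wrap a rendered plot image in an Alexa.Presentation.APL.RenderDocument
    directive so the Echo Show displays it full-screen."""
    return {
        "type": "Alexa.Presentation.APL.RenderDocument",
        "token": "emaPlotToken",
        "document": {
            "type": "APL",
            "version": "1.8",
            "mainTemplate": {
                "parameters": ["payload"],
                "items": [{
                    "type": "Image",       # show the plot as the whole screen
                    "source": image_url,
                    "width": "100%",
                    "height": "100%",
                    "scale": "best-fill",
                }],
            },
        },
        "datasources": {"payload": {"title": title}},
    }


def handle_query(measure: str, days: int) -> dict:
    """Middle tier: ask the back end to render `measure` over the last
    `days` days, then wrap the returned image URL in an APL directive."""
    req = urllib.request.Request(f"{GRAPH_SERVER}?measure={measure}&days={days}")
    with urllib.request.urlopen(req) as resp:  # back end returns {"url": ...}
        image_url = json.loads(resp.read())["url"]
    return build_apl_directive(image_url, f"{measure} over {days} days")
```

In this sketch, only the directive construction is device-facing; swapping the Image component for an APL background image or adding text overlays would not change the overall request flow.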
Lambda Function is then triggered and passes the transcribed text to the back-end server via a RESTful API, which then generates a time-series graph. Through the Amazon Presentation Language (APL) [12], the plot is then rendered on the touchscreen as a background image. Through this process, older adults are able to view their time–series healthcare information as a graph on their standalone devices.

3 DESIGN AND USER EVALUATION
To understand how data visualization can help the aging population, we conducted a remote 30–minute co–design workshop through Zoom [22], and interviewed an older adult (P1) from the retirement community "Vi" in La Jolla, California, and a geriatric provider (P2) from the University of California San Diego Health System. Our study has been approved by the Institutional Review Board (IRB). We showed the interviewees our two initial visualization prototypes (see Fig. 2a and b) and solicited feedback, encouraging them to propose alternatives. Specifically, the participants were asked to express their opinions regarding (a) the potential functionality and effects of the proposed EMA visualizations, as well as (b) how useful these visualizations would be to their daily routine. Fig. 2c shows a screenshot of our Zoom session with the two stakeholders.

4 PRELIMINARY FINDINGS
We recorded the complete Zoom session and analyzed the discussion using thematic analysis [11]. In this section we summarize our findings in terms of three aspects.

(1) Selections of Measures to be Visualized Should be Carefully Considered.
Participants agreed that mean values, as visualized in Fig. 2a, could provide clear insights while retrospecting the older adults' activities over the previous weeks. However, participants also expressed a preference for seeing additional measures to help evaluate the spread of the measured data. For example, P2 mentioned that if "error bars" existed in Fig. 2a, "the preciseness of the number of sleeping hours" could be shown. This revision would help visualize the consistency of sleeping time among aging populations.

(2) The Spans of the Horizontal and Vertical Axes Should be Carefully Designed.
Participants suggested having more flexibility in the range of the vertical axis. As shown in Fig. 2a, the range of the vertical axis is a four–hour difference. Making the range of the vertical axis more flexible would allow users to display a more accurate variability of the number of sleeping hours. P2 mentioned that having a set vertical span has the limitation of being "bounded by the responses they [the users] have" and emphasized that a flexible vertical range could be implemented "so people don't potentially freak out".

Participants also emphasized the importance of carefully designing the time span (i.e., the horizontal axis). In Figs. 2a and b, the graphs show data within only a short time span (e.g., the time span of Fig. 2a is a week and the time span of Fig. 2b is 10 days). However, healthcare providers could request data from a longer time span, such as the past several months or years. P2 in fact questioned the effectiveness of graphing only tens of days of data in Fig. 2b and suggested to "make the axis longer so you [the visualization] show more" health data. Future designs should adapt the time span according to the user's demand.

(3) The Types of Data to be Visualized Should be Carefully Chosen.
Currently, the prototypes only display quantitative data, but there are numerous essential qualitative data points and types that could be included in the graphs, as suggested by the participants. The quality of sleep, an example of qualitative data, can provide a clearer picture of how alcohol is affecting the user's sleep. This data type, however, is difficult to implement in the visualizations. One method is to "build it in as a color ... to overlay the number of [sleeping] hours with quality", as suggested by P2. The addition of "setting goals", perhaps overlaid on top of the quantitative data, would also make the visualizations more personal to the user, since meeting their goal encourages the user to maintain a better lifestyle. P1 mentioned that "you might set your goal fairly low... so if you're consistently meeting your lower goal, then that gives you a chance to consider whether you want to increase the goal or not". In the graphs showing the correlation between the amount of consumed alcohol and the sleeping time, the time difference between when the user last consumed alcohol and the time they fell asleep the same day is also a factor that affects sleep. Therefore, this variable could also be considered for addition to the visualizations.

5 CONCLUSION AND FUTURE WORK
To help older adults visualize their time-series EMA data through voice input, we designed and implemented a prototype that can generate one–measure and two–measure graphs on standalone voice–first virtual assistants. We explored opportunities of the system and summarized needs on three major components of this visualization: the measures, types, and time span of the data. Although some of these opportunities can be easily implemented, others, like providing a better picture of the user's conditions and goals, can be
ASSETS ’22, October 23–26, 2022, Athens, Greece Han et al.

difcult to integrate into the visualizations and interpreted correctly (online), Spain) (CUI ’21). Association for Computing Machinery, New York, NY,
by the user and the healthcare providers. USA, Article 31, 6 pages. https://doi.org/10.1145/3469595.3469626
[7] Kate Crowley. 2011. Sleep and sleep disorders in older adults. Neuropsychology
This work is only scratching the surface in terms of how to design review 21, 1 (2011), 41–53.
graphical visualization of EMAs on IVAs, and outline a number of [8] Carl-Johan Dalgaard, Casper Worm Hansen, and Holger Strulik. 2022. Physio-
logical aging around the World. PloS one 17, 6 (2022), e0268276.
challenges that still need to be resolved. In future work, we plan [9] Lorraine S Evangelista, Jung-Ah Lee, Alison A Moore, Marjan Motie, Hassan
to conduct user studies with older participants to fully explore Ghasemzadeh, Majid Sarrafzadeh, and Carol M Mangione. 2015. Examining the
opportunities and challenges faced by this a system like the one we efects of remote monitoring systems on activation, self-care, and quality of life
in older patients with chronic heart failure. The Journal of cardiovascular nursing
have prototyped. Meanwhile, integrating some straight-forward 30, 1 (2015), 51.
suggested refnements as part of our existing prototype will already [10] Amazon Inc. 2020. Echo Show 8 – HD smart display with Alexa – stay connected
make visualizations more adaptive to context and representation with video calling. https://www.amazon.com/Echo-Show-Pantalla-inteligente-
Alexa/dp/B07PF1Y28C/ref=sr_1_1?dchild=1&keywords=echo+show&qid=
of qualitative data. 1598674780&sr=8-1
[11] Helene Jofe. 2012. Thematic analysis. Qualitative research methods in mental
ACKNOWLEDGMENTS health and psychotherapy 1 (2012), 210–223.
[12] Alexa Presentation Language. 2022. Echo Show 8 – HD smart display with Alexa
This work is part of project VOLI [2] and was supported by NIH/NIA – stay connected with video calling. https://developer.amazon.com/en-US/docs/
alexa/alexa-design/apl.html
under grant R56AG067393. Co-author Michael Hogarth has an [13] Rock Leung, Charlotte Tang, Shathel Haddad, Joanna Mcgrenere, Peter Graf, and
equity interest in LifeLink Inc. and also serves on the company’s Vilia Ingriany. 2012. How older adults learn to use mobile devices: Survey and
Scientifc Advisory Board. The terms of this arrangement have been feld investigations. ACM Transactions on Accessible Computing (TACCESS) 4, 3
(2012), 1–33.
reviewed and approved by the UC San Diego in accordance with [14] Qingchuan Li and Yan Luximon. 2020. Older adults’ use of mobile device: us-
its confict of interest policies. We appreciate insightful feedback ability challenges while navigating various interfaces. Behaviour & Information
Technology 39, 8 (2020), 837–861.
from the anonymous reviewers, Manas Satish Bedmutha, Mary [15] Ella T. Lifset, Kemeberley Charles, Emilia Farcas, Nadir Weibel, Michael Hogarth,
Draper and fellow colleagues from the Design Lab and Computer Chen Chen, Janet G. Johnson, and Alison A. Moore. 2020. Can an Intelligent
Science and Engineering at UC San Diego, as well as residents from Virtual Assistant (IVA) Meet Older Adult Health-Related Needs in the Context of
a Geriatric 5Ms Framework?. In Journal of the American Geriatrics Society, Vol. 70.
the Vi in La Jolla. Wiley 111 River St, Hoboken 07030-5774, NJ USA, S245–S246.
[16] W Ben Mortenson, Andrew Sixsmith, and Ryan Woolrych. 2015. The power (s)
REFERENCES of observation: Theoretical perspectives on surveillance technologies and older
people. Ageing & Society 35, 3 (2015), 512–530.
[1] 2019. Alexa Devices Maintain 70% Market Share in U.S. according to sur- [17] Khalil Mrini, Chen Chen, Ndapa Nakashole, Nadir Weibel, and Emilia Farcas.
vey. https://marketingland.com/alexa-devices-maintain-70-market-share-in-u- 2021. Medical Question Understanding and Answering for Older Adults. Southern
s-according-to-survey-265180 California Machine Learning and Natural Language Processing Symposium (2021).
[2] 2022. Project VOLI at UC San Diego. http://voli.ucsd.edu [18] Martin Salzmann-Erikson and Henrik Eriksson. 2012. Panoptic power and mental
[3] Rebecca Adaimi, Ka Tai Ho, and Edison Thomaz. 2020. Usability of a Hands- health Nursing—Space and surveillance in relation to staf, patients, and neutral
Free Voice Input Interface for Ecological Momentary Assessment. In 2020 IEEE places. Issues in Mental Health Nursing 33, 8 (2012), 500–504.
International Conference on Pervasive Computing and Communications Workshops [19] Andrew Sixsmith, Gloria Gutman, et al. 2013. Technologies for active aging. Vol. 9.
(PerCom Workshops). IEEE, 1–5. Springer.
[4] Kemeberley Charles, Chen Chen, Janet G. Johnson, Alice Lee, Ella T. Lifset, [20] Annelie J. Sundler, Hilde Eide, Sandra van Dulmen, and Inger K. Holmström. 2016.
Michael Hogarth, Nadir Weibel, Emilia Farcas, and Alison A. Moore. 2021. How Communicative challenges in the home care of older persons – a qualitative explo-
might an intelligent voice assistant address older adults’ health-related needs?. In ration. Journal of Advanced Nursing 72, 10 (2016), 2435–2444. https://doi.org/10.
Journal of the American Geriatrics Society, Vol. 69. Wiley 111 River St, Hoboken 1111/jan.12996 arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/jan.12996
07030-5774, NJ USA, S243–S244. [21] Milka Trajkova and Aqueasha Martin-Hammond. 2020. “Alexa is a Toy”: Ex-
[5] Chen Chen, Janet G Johnson, Kemeberly Charles, Alice Lee, Ella T. Lifset, Michael ploring Older Adults’ Reasons for Using, Limiting, and Abandoning Echo. In
Hogarth, Alison A. Moore, Emilia Farcas, and Nadir Weibel. 2021. Understanding Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
Barriers and Design Opportunities to Improve Healthcare and QOL for Older (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York,
Adults through Voice Assistants. In The 23rd International ACM SIGACCESS NY, USA, 1–13. https://doi.org/10.1145/3313831.3376760
Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS ’21). [22] Matin Yarmand, Chen Chen, Danilo Gasques, James D. Murphy, and Nadir Weibel.
Association for Computing Machinery, New York, NY, USA, Article 9, 16 pages. 2021. Facilitating Remote Design Thinking Workshops in Healthcare: The Case
https://doi.org/10.1145/3441852.3471218 of Contouring in Radiation Oncology. In Extended Abstracts of the 2021 CHI
[6] Chen Chen, Khalil Mrini, Kemeberly Charles, Ella T. Lifset, Michael Hogarth, Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI
Alison A. Moore, Nadir Weibel, and Emilia Farcas. 2021. Toward a Unifed EA ’21). Association for Computing Machinery, New York, NY, USA, Article 40,
Metadata Schema for Ecological Momentary Assessment with Voice-First Virtual 5 pages. https://doi.org/10.1145/3411763.3443445
Assistants. In CUI 2021 - 3rd Conference on Conversational User Interfaces (Bilbao
Understanding and Improving Information Extraction From
Online Geospatial Data Visualizations for Screen-Reader Users

Ather Sharif (asharif@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering | DUB Group, University of Washington, Seattle, Washington, USA
Andrew M. Zhang (azhang26@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, USA
Anna Shih (annas55@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, USA
Jacob O. Wobbrock (wobbrock@uw.edu), The Information School | DUB Group, University of Washington, Seattle, Washington, USA
Katharina Reinecke (reinecke@cs.washington.edu), Paul G. Allen School of Computer Science & Engineering | DUB Group, University of Washington, Seattle, Washington, USA

Figure 1: Interactions with a geospatial data visualization showing COVID-19 cases per US state, using our enhancements to VoxLens. "Q" represents questions that screen-reader users can verbally ask using our enhancement, and "A" represents the answers they would hear via their screen readers.

ABSTRACT
Prior work has studied the interaction experiences of screen-reader users with simple online data visualizations (e.g., bar charts, line graphs, scatter plots), highlighting the disenfranchisement of screen-reader users in accessing information from these visualizations. However, the interactions of screen-reader users with online geospatial data visualizations, commonly used by visualization creators to represent geospatial data (e.g., COVID-19 cases per US state), remain unexplored. In this work, we study the interactions of and information extraction by screen-reader users from online geospatial data visualizations. Specifically, we conducted a user study with 12 screen-reader users to understand the information they seek from online geospatial data visualizations and the questions they ask to extract that information. We utilized our findings to generate a taxonomy of information sought from our participants' interactions. Additionally, we extended the functionalities of VoxLens—an open-source multi-modal solution that improves data visualization accessibility—to enable screen-reader users to extract information from online geospatial data visualizations.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550363

CCS CONCEPTS
• Human-centered computing → Information visualization; Accessibility systems and tools.
ASSETS ’22, October 23–26, 2022, Athens, Greece Ather Sharif, Andrew M. Zhang, Anna Shih, Jacob O. Wobbrock, and Katharina Reinecke

KEYWORDS
geospatial, visualization, accessibility, screen reader, blind, voice assistant, map

ACM Reference Format:
Ather Sharif, Andrew M. Zhang, Anna Shih, Jacob O. Wobbrock, and Katharina Reinecke. 2022. Understanding and Improving Information Extraction From Online Geospatial Data Visualizations for Screen-Reader Users. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550363

1 INTRODUCTION
Online data visualizations effectively communicate large volumes of data to their audience [8] and enable users to identify outliers, recognize patterns, and explore oddities in the data that may be challenging to determine from a simple table [25]. The benefits of data visualizations are especially applicable to map-based visualizations that assist users in exploring, summarizing, and analyzing geospatial data [2, 26, 27]. Indeed, geospatial data visualizations displaying information on current events, including COVID-19 cases and vaccination rates per US state and country, were amongst the top 10 most popular interactive data visualizations in 2021 [21, 24].

However, screen-reader users—who may not be able to interact with online visualizations fully using sight—are inherently disenfranchised from extracting information from online visualizations due to the inaccessibility of the visualizations [4, 11, 15, 22]. Several prior works have identified the need for accessible online data visualizations, shedding light on the disenfranchisement caused by inaccessible visualizations for screen-reader users [10, 13, 15, 22, 28]. Most recently, Sharif et al. [22] reported that even when alternative text ("alt-text") exists, screen-reader users spend 211% more time and are 61% less accurate in extracting information from online data visualizations compared to non-screen-reader users. However, their work only explored simple online data visualizations, such as bar charts, line graphs, and scatter plots. We build on their work by understanding interaction experiences and information extraction by screen-reader users from online geospatial data visualizations.

To understand the information screen-reader users seek from online geospatial data visualizations and the questions they ask to extract that information, we conducted a Wizard-of-Oz [3, 6] user study with 12 screen-reader users. We found that our participants grouped and filtered geospatial data through categorization and ranking in addition to extracting and comparing individual data points. We utilized these findings to develop a taxonomy of information sought by our screen-reader users during their explorations. Finally, using the taxonomy, we extended the functionalities of VoxLens [23]—an open-source JavaScript plug-in that improves the accessibility of online data visualizations using a multi-modal approach—by supporting information extraction from online geospatial data visualizations.

In this work, we contribute the: (1) taxonomy of information sought by our screen-reader users in their explorations of online geospatial data visualizations; and (2) enhancement of VoxLens [23]—an open-source JavaScript plug-in to make online data visualizations accessible—to support information extraction from online geospatial data visualizations.

2 USER STUDY
We conducted a Wizard-of-Oz [3, 6] user study with 12 screen-reader users to understand their information extraction experiences with online geospatial data visualizations, subsequently generating a taxonomy of their interactions. We acted as the "wizards" and simulated responses from a hypothetical screen reader, following recommendations from prior work [9, 12, 16]. We present our methodology, results, and the taxonomy development process.

2.1 Participants, Materials, & Procedure
Our participants were 12 screen-reader users (M=50.3 years, SD=13.6; see Appendix A, Table 2). Seven participants self-identified as women and five as men. We compensated participants with a $20 Amazon gift card for one hour of their time. Our data set included three geospatial data visualizations (curated based on the search results for "most popular map visualizations 2021" on Google): (1) US traffic congestion in 2021; (2) latest COVID-19 vaccination percentages per country; and (3) percentage of US workers at or below minimum wage in 2021 per state.

We conducted our studies via Zoom and used its built-in features for recording and transcribing sessions. First, we presented participants with a holistic overview of the visualization generated using VoxLens' Summary mode. Then, we asked our participants to explore the data in the visualization by verbally asking questions, replicating the behavior of the Q-&-A mode of VoxLens. Each participant interacted with all three visualizations. We randomized the order of the visualizations across participants.

2.2 Analysis & Results
We used semantic thematic analysis [14, 19] and employed Braun and Clarke's "essentialist" method [1], focusing on the "surface meanings of the data." At least two researchers independently coded the transcripts and identified 18 initial thematic codes, resolving our disagreements through mutual discussions. We combined our 18 initial codes into eight axial codes and classified the axial codes into two broader categories. Our inter-rater reliability (IRR), expressed as percentage agreement [7], was 89.4%, demonstrating a high level of agreement between raters [5, 7].

We found that besides extracting and comparing individual data points, our participants performed additional actions in their interactions with online geospatial data visualizations, which we classified into two high-level categories: (1) Categorization; and (2) Ranking. Specifically, we found that screen-reader users categorize data by (in the order of frequency): regional, political, climate-related, population-related, and spoken-language-related. For example, P4 inquired about alcohol consumption differences between different regions of Asia: "Are there geographic differences, like, between eastern Asia versus western Asia or southern Asia?" Our participants also ranked data based on their values. Specifically, they sought (in the order of frequency): top and bottom X data points and X data points surrounding the average, where X is an arbitrary number that varied across participants. For example, P3 wanted to find the top three countries in 2021 for COVID-19 vaccination percentages: "So, Hong Kong was the highest, but who is in the second and third position?"
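The inter-rater reliability metric used above, percentage agreement, is simply the share of coded segments to which both raters assigned identical codes. A minimal sketch (the rater codes below are hypothetical, not the study's data):

```javascript
// Percentage agreement between two raters over the same coded segments:
// (segments with identical codes) / (total segments) * 100.
function percentageAgreement(codesA, codesB) {
  if (codesA.length !== codesB.length) {
    throw new Error("Both raters must code the same segments");
  }
  const matches = codesA.filter((code, i) => code === codesB[i]).length;
  return (matches / codesA.length) * 100;
}

// Hypothetical example: two raters, ten transcript segments.
const rater1 = ["regional", "top", "regional", "bottom", "political",
                "regional", "top", "average", "regional", "top"];
const rater2 = ["regional", "top", "climate",  "bottom", "political",
                "regional", "top", "average", "regional", "top"];
percentageAgreement(rater1, rater2); // 90 (9 of 10 segments agree)
```

Note that percentage agreement, unlike chance-corrected measures such as Cohen's kappa, does not account for agreement expected by chance.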
Information Extraction From Online Geospatial Data Visualizations ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 1: Taxonomy of information screen-reader users seek when exploring online geospatial data visualizations to extract and compare data points. Information types within each category are in descending order based on their sought frequency. For each information type, the "Query" column shows some of the questions that our participants asked to extract that information.

Category: Categorization
  Regional:
    "Is there a difference between the east and west, or the north and south?"
    "Tell me the values of the Southern states as opposed to the Northwest."
  Political:
    "How do the trends during the Democratic presidential campaigns compare to the trends during Republican presidential campaigns?"
    "Do socialist countries have higher rates?"
  Climate-Related:
    "Do warmer climate states have higher values?"
    "How are colder places compared to warmer places?"
  Population-Related:
    "Can you compare these two states by population?"
    "Do the states with larger population have higher traffic rates?"
  Spoken-Language-Related:
    "Can we compare Spanish speaking countries to the English ones?"

Category: Ranking
  Top:
    "What are the five top countries in Western Europe?"
    "I'd like to see them all in order from the highest to the lowest."
  Bottom:
    "What are the bottom 10 countries in the graph?"
    "What about the second and third lowest?"
  Surrounding Average:
    "Which ones are in the middle?"
    "What are the three countries that are closest to the average?"
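The three tiers of Table 1 can be encoded directly as a nested data structure (Category, then Information Type, then sample queries), e.g., to route recognized voice queries to handlers. This encoding is our illustration, not part of VoxLens:

```javascript
// Taxonomy from Table 1: Category -> Information Type -> sample queries.
// Information types are listed in descending order of sought frequency.
const taxonomy = {
  Categorization: {
    "Regional": [
      "Is there a difference between the east and west, or the north and south?",
      "Tell me the values of the Southern states as opposed to the Northwest.",
    ],
    "Political": [
      "How do the trends during the Democratic presidential campaigns compare to the trends during Republican presidential campaigns?",
      "Do socialist countries have higher rates?",
    ],
    "Climate-Related": [
      "Do warmer climate states have higher values?",
      "How are colder places compared to warmer places?",
    ],
    "Population-Related": [
      "Can you compare these two states by population?",
      "Do the states with larger population have higher traffic rates?",
    ],
    "Spoken-Language-Related": [
      "Can we compare Spanish speaking countries to the English ones?",
    ],
  },
  Ranking: {
    "Top": [
      "What are the five top countries in Western Europe?",
      "I'd like to see them all in order from the highest to the lowest.",
    ],
    "Bottom": [
      "What are the bottom 10 countries in the graph?",
      "What about the second and third lowest?",
    ],
    "Surrounding Average": [
      "Which ones are in the middle?",
      "What are the three countries that are closest to the average?",
    ],
  },
};
```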

2.3 Taxonomy Development
Using these findings, we developed a taxonomy of information sought by screen-reader users containing three tiers: (1) Category, the broader categories; (2) Information Type, the axial codes; and (3) Query, questions that our participants asked to extract a given information type. We show the taxonomy in Table 1, organizing the categories and the information types in the order of their frequency.

3 ENHANCEMENTS TO VOXLENS
Drawing on the taxonomy resulting from our user study with 12 screen-reader users, we extended the capabilities of VoxLens by supporting information extraction from geospatial data visualizations.

3.1 Brief Overview of VoxLens
VoxLens is an open-source JavaScript plug-in that improves the accessibility of online data visualizations for screen-reader users using a multi-modal approach [23], requiring only a single line of code for integration from developers. VoxLens supports three modes: (1) Q-&-A (verbal interaction); (2) Summary (holistic overview of the data); and (3) Sonification (sonified version of the data). However, VoxLens is currently limited to simple visualizations, such as bar charts, line graphs, and scatter plots.

3.2 Our Additions to VoxLens
Our objective was to extend the functionality of VoxLens' Q-&-A mode by enabling screen-reader users to extract information from online geospatial data visualizations. To this end, we selected the most frequently sought information types from our taxonomy. Specifically, we implemented regional categorization and ranking of the data by the top- and bottom-most values. To identify the chart type and the regional classification needed to accurately process users' queries, we extended the existing configuration options for developers to include two more parameters: "chartType" and "dataModule." Four values for "chartType" are possible: (1) bar; (2) line; (3) scatter; and (4) map. For "dataModule," two values are possible: (1) state; and (2) country. (No additional data or configuration was required from the developers.)

3.2.1 Regional Categorization. For data involving the states in the US, our participants categorized the data by US regions (e.g., east coast); for countries of the world, they grouped the data by continents (e.g., Asia). Therefore, we implemented two data modules: state and country. State allows the data to be grouped and filtered by US region, whereas country does so by continent. Developers can enable regional categorization by using map as the "chartType" and specifying the appropriate "dataModule." For example, in a graph representing COVID-19 cases per US state, the user can ask region-related questions, such as: "how is the east coast vs. the west coast?" (Figure 1). We chose the US regions using the National Geographic Society's [18] classification of US regions. Additionally, we made our modules scalable, enabling straightforward additions and modifications to the list of our regions.

3.2.2 Support for Ranking. Our enhancements enable users to obtain the top X and bottom X data points, where X represents any number of data points. For example, users can ask for the top-seven or the bottom-five data points. Our algorithm currently only recognizes specific keywords to rank the data (e.g., "top" or "bottom"). We plan on extending this vocabulary in a future iteration.
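The two enhancements above reduce to small transformations over the visualization's (label, value) pairs. The sketch below is illustrative only: the region lists are truncated stand-ins for the National Geographic Society classification, and the function names and keyword-matching regex are ours, not VoxLens' actual API:

```javascript
// Illustrative "state" data module: US states grouped by region
// (truncated; a real module would cover all states).
const US_REGIONS = {
  West: ["California", "Oregon", "Washington"],
  Southeast: ["Florida", "Georgia", "Alabama"],
  Northeast: ["New York", "Massachusetts", "Maine"],
};

// Regional categorization (Sec. 3.2.1): keep only data points in a region.
function filterByRegion(data, region) {
  const members = new Set(US_REGIONS[region] || []);
  return data.filter((d) => members.has(d.label));
}

// Ranking (Sec. 3.2.2): top or bottom X data points by value.
function rank(data, direction, x) {
  const sorted = [...data].sort((a, b) => b.value - a.value);
  return direction === "top" ? sorted.slice(0, x) : sorted.slice(-x).reverse();
}

// Keyword-based query parsing: recognizes "top"/"bottom" plus an
// optional digit count (word numbers like "five" are not handled).
function parseRankingQuery(query) {
  const match = query.toLowerCase().match(/\b(top|bottom)\b\D*(\d+)?/);
  if (!match) return null;
  return { direction: match[1], count: match[2] ? Number(match[2]) : 1 };
}

// Hypothetical data: COVID-19 cases per US state (arbitrary values).
const cases = [
  { label: "California", value: 91 },
  { label: "Florida", value: 72 },
  { label: "New York", value: 66 },
  { label: "Oregon", value: 21 },
];

const q = parseRankingQuery("what are the top 2 states?");
rank(cases, q.direction, q.count).map((d) => d.label); // ["California", "Florida"]
filterByRegion(cases, "West").map((d) => d.label);     // ["California", "Oregon"]
```

Keeping the region lists as plain data is what makes the modules "scalable" in the sense described above: adding or renaming a region is a one-line data change rather than a code change.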

4 DISCUSSION & CONCLUSION
In this work, we presented a taxonomy of information sought by screen-reader users in their interactions with online geospatial data visualizations, generated using the findings from our user study with 12 screen-reader users. Our work is the first to understand and improve the information extraction of screen-reader users from online geospatial data visualizations. We found that screen-reader users perform regional categorization to extract and compare data points. Additionally, screen-reader users rank the data based on the values of the dependent variable, arranging it by the highest, lowest, and nearest-to-the-average. Utilizing our findings, we extended the capabilities of VoxLens [23], enabling screen-reader users to extract information from online geospatial data visualizations.

In our study, a recurring yet unsurprising observation was that each participant exhibited a distinct way of interacting with online geospatial data visualizations. Although we found high-level similarities in their interactions, their word choices and verbosity levels for the questions they asked to extract information were unique. Therefore, we recommend using personalized designs [17, 20] that cater to the individualized preferences of users by identifying usage patterns (e.g., input queries issued to extract information) to improve the interaction experiences of screen-reader users.

We plan on conducting task-based user studies with screen- and non-screen-reader users to assess the performance of our enhancement to VoxLens using a mixed-methods approach. Additionally, we intend to extend the functionality of VoxLens to include more complex data visualizations, such as multi-line graphs. Future work can employ our methodology to build systems that use voice assistants for screen-reader users to improve their information extraction. We hope that by providing insights into the screen-reader users' interactions with geospatial data visualizations and open-sourcing our code, this work will inspire researchers and developers to make online data visualizations more accessible to screen-reader users and reduce the information access disparity between screen- and non-screen-reader users caused by inaccessible visualizations.

ACKNOWLEDGMENTS
This work was supported by the University of Washington Center for Research and Education on Accessible Technology and Experiences (CREATE). Finally, we thank and remember our recently-departed team member Zoey for her feline support, without which the purrusal of this work would not have been as effective. May she cross the rainbow bridge in peace and find her way to cat heaven.

REFERENCES
[1] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[2] Martin Breunig, Patrick Erik Bradley, Markus Jahn, Paul Kuper, Nima Mazroob, Norbert Rösch, Mulhim Al-Doori, Emmanuel Stefanakis, and Mojgan Jadidi. 2020. Geospatial data management research: Progress and future directions. ISPRS International Journal of Geo-Information 9, 2 (2020), 95.
[3] Nils Dahlbäck, Arne Jönsson, and Lars Ahrenberg. 1993. Wizard of Oz studies—why and how. Knowledge-Based Systems 6, 4 (1993), 258–266.
[4] Joel J. Davis. 2002. Disenfranchising the Disabled: The Inaccessibility of Internet-Based Health Information. Journal of Health Communication 7, 4 (2002), 355–367. https://doi.org/10.1080/10810730290001701
[5] Matthew Graham, Anthony Milanowski, and Jackson Miller. 2012. Measuring and Promoting Inter-Rater Agreement of Teacher and Principal Performance Ratings.
[6] Melita Hajdinjak and France Mihelic. 2004. Conducting the Wizard-of-Oz Experiment. Informatica (Slovenia) 28, 4 (2004), 425–429.
[7] Donald P. Hartmann. 1977. Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis 10, 1 (1977), 103–116.
[8] Jake Holland. 2017. New York Times' Upshot editor discusses data visualization, storytelling. https://dailynorthwestern.com/2017/05/03/campus/new-york-times-upshot-editor-discusses-data-visualization-storytelling/. (Accessed on 03/05/2022).
[9] Todd Hunt. 1982. Raising the Issue of Ethics through Use of Scenarios. The Journalism Educator 37, 1 (1982), 55–58.
[10] Mario Konecki, Charles LaPierre, and Keith Jervis. 2018. Accessible data visualization in higher education. In 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO). IEEE, New York, NY, USA, 0733–0737.
[11] Bongshin Lee, Arjun Srinivasan, Petra Isenberg, John Stasko, et al. 2021. Post-WIMP Interaction for Information Visualization. Foundations and Trends® in Human-Computer Interaction 14, 1 (2021), 1–95.
[12] Peng Liang and Onno De Graaf. 2010. Experiences of using role playing and wiki in requirements engineering course projects. In 2010 5th International Workshop on Requirements Engineering Education and Training. IEEE, New York, NY, USA, 1–6.
[13] Alan Lundgard, Crystal Lee, and Arvind Satyanarayan. 2019. Sociotechnical considerations for accessible visualization design. In 2019 IEEE Visualization Conference (VIS). IEEE, New York, NY, USA, 16–20.
[14] Moira Maguire and Brid Delahunt. 2017. Doing a thematic analysis: A practical, step-by-step guide for learning and teaching scholars. All Ireland Journal of Higher Education 9, 3 (2017), 3351–3364.
[15] Kim Marriott, Bongshin Lee, Matthew Butler, Ed Cutrell, Kirsten Ellis, Cagatay Goncu, Marti Hearst, Kathleen McCoy, and Danielle Albers Szafir. 2021. Inclusive data visualization for people with disabilities: a call to action. Interactions 28, 3 (2021), 47–51.
[16] Randall B. Martin. 1991. The assessment of involvement in role playing. Journal of Clinical Psychology 47, 4 (1991), 587–596.
[17] Esther Nathanson. 2017. Native voice, self-concept and the moral case for personalized voice technology. Disability and Rehabilitation 39, 1 (2017), 73–81.
[18] National Geographic Society. n.d. United States Regions | National Geographic Society. https://www.nationalgeographic.org/maps/united-states-regions/. (Accessed on 03/29/2022).
[19] Michael Quinn Patton. 1990. Qualitative Evaluation and Research Methods. SAGE Publications, Inc., Thousand Oaks, CA, USA.
[20] Silvia Quarteroni and Suresh Manandhar. 2007. User modelling for personalized question answering. In Congress of the Italian Association for Artificial Intelligence. Springer, Berlin, Heidelberg, 386–397.
[21] Nick Routley. 2021. Our Top 21 Visualizations of 2021 - Visual Capitalist. https://www.visualcapitalist.com/our-top-21-visualizations-of-2021/. (Accessed on 06/18/2022).
[22] Ather Sharif, Sanjana Shivani Chintalapati, Jacob O. Wobbrock, and Katharina Reinecke. 2021. Understanding Screen-Reader Users' Experiences with Online Data Visualizations. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (Virtual Event, USA) (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 14, 16 pages. https://doi.org/10.1145/3441852.3471202
[23] Ather Sharif, Olivia H. Wang, Alida T. Muongchan, Katharina Reinecke, and Jacob O. Wobbrock. 2022. VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-In. In CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI '22). Association for Computing Machinery, New York, NY, USA, Article 478, 19 pages. https://doi.org/10.1145/3491102.3517431
[24] Terence Shin. 2021. The 10 Best Data Visualizations of 2021. Towards Data Science. https://towardsdatascience.com/the-10-best-data-visualizations-of-2021-fec4c5cf6cdb?gi=13652b78af45. (Accessed on 06/18/2022).
[25] Antony Unwin. 2020. Why is Data Visualization Important? What is Important in Data Visualization? Harvard Data Science Review, Issue 2.1, Winter 2020. https://hdsr.mitpress.mit.edu/pub/zok97i7p/release/3. (Accessed on 03/05/2022).
[26] Jia Yu, Zongsi Zhang, and Mohamed Sarwat. 2018. GeoSparkViz: a scalable geospatial data visualization framework in the Apache Spark ecosystem. In Proceedings of the 30th International Conference on Scientific and Statistical Database Management. 1–12.
[27] Xianfeng Zhang and Micha Pazner. 2004. The icon imagemap technique for multivariate geospatial data visualization: approach and software system. Cartography and Geographic Information Science 31, 1 (2004), 29–41.
[28] Jonathan Zong, Crystal Lee, Alan Lundgard, JiWoong Jang, Daniel Hajas, and Arvind Satyanarayan. 2022. Rich Screen Reader Experiences for Accessible Data Visualization. In The 24th EG/VGTC Conference on Visualization (EuroVis '22), Rome, Italy, 13–17 June 2022. Eurographics – European Association for Computer Graphics.

A PARTICIPANT DEMOGRAPHICS

Table 2: Screen-reader participants, their gender identification, age, screen reader, vision level, and diagnosis. Under the "G" (Gender) column, M = Male, F = Female, and NB = Non-binary.

      G   Age   Screen Reader   Vision-Loss Level                          Diagnosis
P1    M   36    JAWS            Blind since birth, Complete blindness      Leber Congenital Amaurosis
P2    M   58    JAWS            Complete blindness, Lost vision gradually  Cataracts and Glaucoma
P3    M   49    JAWS            Complete blindness, Lost vision gradually  Leber Congenital Amaurosis
P4    M   32    NVDA            Blind since birth, Complete blindness      Peters Anomaly
P5    F   32    NVDA            Blind since birth, Complete blindness      Retinopathy of Prematurity
P6    F   65    JAWS            Complete blindness, Lost vision gradually  Retinitis Pigmentosa
P7    F   68    Fusion          Lost vision gradually, Partial blindness   Stargardt's Maculopathy
P8    F   69    JAWS            Blind since birth, Complete blindness      Retinopathy of Prematurity
P9    F   52    JAWS            Blind since birth, Complete blindness      Retinopathy of Prematurity
P10   F   38    JAWS            Blind since birth, Complete blindness      Leber Congenital Amaurosis
P11   F   47    JAWS            Complete blindness, Lost vision gradually  Meningitis, Optic Neuropathy
P12   M   57    JAWS            Complete blindness, Lost vision gradually  Retinitis Pigmentosa
Understanding Design Preferences for Sensory-Sensitive Earcons with Neurodivergent Individuals

Lauren Race, Twitter Inc., San Francisco, CA, U.S.
Kia El-Amin, Twitter Inc., San Francisco, CA, U.S.
Sarah Anoke, Twitter Inc., San Francisco, CA, U.S.
Andrew Hayward, Twitter UK Ltd., London, U.K.
Amber James, Twitter Inc., San Francisco, CA, U.S.
Amy Hurst, New York University, New York, NY, U.S.
Audrey Davis, Twitter Inc., San Francisco, CA, U.S.
Theresa Mershon, Twitter Inc., San Francisco, CA, U.S.
ABSTRACT
Earcons are a critical auditory modality for those who perceive information best through sound. Yet earcons can trigger sensory sensitivities with neurodivergent individuals, causing pain or discomfort and creating barriers to information access. They must be carefully designed with neurodivergent representation in the design process to minimize the harm they impose. To address these challenges, we conduct a study on Twitter, a social media platform with frequent earcons, to understand how to design sensory-sensitive earcons for neurodivergent individuals. We present the results of our qualitative interviews with nine neurodivergent Twitter users, uncovering six key themes for designing sensory-sensitive earcons. Based on our findings, we offer a set of novel guidelines for practitioners to design sensory-sensitive earcons for accessibility.

CCS CONCEPTS
• Human-centered computing; • Human computer interaction (HCI);

KEYWORDS
Accessibility, Design, Neurodiversity, Sensory Sensitivities, Sound, Earcons, Social Media

ACM Reference Format:
Lauren Race, Kia El-Amin, Sarah Anoke, Andrew Hayward, Amber James, Amy Hurst, Audrey Davis, and Theresa Mershon. 2022. Understanding Design Preferences for Sensory-Sensitive Earcons with Neurodivergent Individuals. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550365

1 INTRODUCTION
An estimated range of 7% [16] to 20% [11] of adults identify as neurodivergent—a term describing individuals who have learning, cognitive, and psychological disabilities. Neurodivergence can take many forms, including Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorder (ASD), or Generalized Anxiety Disorder (GAD), and can result in experiencing sensory sensitivities [34]. Sensory sensitivities present challenges with sound, which can cause distraction [33], pain, or discomfort [10]. Yet the Web Content Accessibility Guidelines (WCAG) lack specific guidance on supporting sensory sensitivities and neurodivergence [28], while cognitive and learning disability guidance is framed as supplemental [27] and, therefore, not required to meet compliance minimums. For sound design, WCAG recommends 1) the ability to turn off sound longer than three seconds [30]; 2) ensuring sound is not the only way to perceive information [29]; and 3) making sure users can separate speech from background sounds [31]. Earcons—or audio notifications—are commonly used on the web and 1) are often shorter than three seconds [30]; 2) must work for those who require sound to perceive information and have sensory sensitivities, such as individuals with multiple disabilities [29]; and 3) are primary sounds, not background sounds [31]. Earcons must be accessible for those who prefer to have them turned on but are sensitive to output modalities. Thus, we propose sensory-sensitive earcons (Figure 1) that minimize the pain and discomfort triggered by sensory sensitivities. We crafted them with neurodivergent representation in the design process [26] to reduce neurotypical assumptions embedded in the final artifact and improve overall design [8].

The goal of this research is to better understand how to design sensory-sensitive earcons for neurodivergent users. We explore this research goal on Twitter, a social media platform that is a source of frequent notifications. We describe our qualitative interviews with nine neurodivergent Twitter users, present the findings from our study to understand design preferences for sensory-sensitive earcons, and contribute a set of six novel, concrete design guidelines.

2 RELATED WORK
Past HCI literature studies the design of technologies to support neurodivergent individuals who have sensory sensitivities. Researchers have probed support methods for neurodivergent users managing distracting and uncomfortable sensory experiences by using audio [22], proprioceptive [24], haptic [25], and visual [18, 19, 32] modalities. Zolyomi et al. found that neurodivergent adults relied on coping strategies to manage distracting noises, which imposed cognitive load [33]. Damiani studied how sound intensity and pitch can cause pain and discomfort [10]. To support sensory needs, scholars have advocated for including a range of sensory preferences
to accommodate varying individuals' thresholds for sensory input [4]. Though some design processes aim to incorporate a spectrum of sensory preferences [21], Spiel and Gerling found that neurodivergent individuals are often excluded from the design process [26]. Dalton argued for the involvement of neurodiverse designers who can create from lived experience [9]. Thus, researchers have identified frameworks for practitioners to include neurodiverse perspectives throughout the design process [2, 16, 23]. With various technologies competing for our attention [20], past research has investigated how to improve earcon design to increase comprehension and decrease the cognitive load imposed. Scholars have researched how customizing notifications for sensory preferences benefits those who prefer auditory output [3] and those who prefer non-auditory notifications [12]. Researchers have looked at how earcons can assist in situations where visual interfaces cannot [5], such as leveraging unique sounds users could learn for commonly used words such as "thanks" [14]. The design of earcons is complex since—though pitch and timbre can be leveraged to capture attention [1]—some end users have described earcons as "discordant" and "too obtrusive", advocating for more "aesthetically pleasant" sounds [13].

Though prior HCI literature examines both neurodiversity and earcons, there is a lack of research investigating how earcons affect neurodivergent users or identifying the variables in earcons that trigger sensory sensitivities with that user population. Further, there are currently no design guidelines for earcons that minimize the pain and discomfort triggered by sensory sensitivities. Our work addresses this gap in the accessibility and HCI literature by expanding existing earcon design guidelines [6, 7, 15] to include the needs of neurodivergent users.

3 UNDERSTANDING SENSORY-SENSITIVE EARCON DESIGN
Our research team included both neurodivergent and neurotypical individuals to incorporate lived experience, while initially exploring which types of earcons trigger sensory sensitivities and should be avoided. We recruited participants with a screener survey sent to 22,439 individuals to find Twitter users between the ages of 18-65, asking them their age, gender, whether they self-identified as neurodivergent, and if they had used Twitter in the past month. We received 201 qualified responses before identifying nine Twitter users (1 nonbinary, 1 trans male, 3 female, 4 male; ages 18-35) who self-identified as neurodivergent and had experience using audio notifications on their mobile devices. With informed consent, we conducted qualitative, semi-structured interviews via teleconferencing software. During the interviews, we asked key questions such as: 1) Do you keep sound notifications turned on or off? Why? 2) What do you think makes a sound annoying? 3) What are some types of sounds you would like to see less of in apps? 4) Tell me about a time when you had a bad experience with sound notifications. 5) What sound notifications do you like, or even love? 6) Was there ever a sound notification that was fine at first, but increasingly bothered you over time? Can you describe that sound? We recorded video, transcribed the audio, and stored the data on a secure cloud server. The first and second authors reviewed the data and performed a thematic analysis. Our findings yielded a set of six sensory-sensitive earcon themes, which we used to develop a concrete set of sensory-sensitive earcon design guidelines. We used the guidelines to create an earcon for the send tweet interaction on Twitter (Figure 1).

Figure 1: A sensory-sensitive earcon for send tweet. The sound wave of a sensory-sensitive earcon is one second or less, minimizes wavelength space between repetitive notes, has an ∼40-50 dB intensity, ∼1 kHz frequency, and uses a slow attack and decay with no sustain and release for a smooth sound envelope. Image credit: Listen.

4 DESIGN GUIDELINES FINDINGS
Our interviews uncovered six key themes for designing sensory-sensitive earcons: 1) duration; 2) repetition; 3) sound envelope; 4) intensity and frequency; 5) familiarity; and 6) purpose. Many (5 participants) described challenges with duration. They told us that "ding" sounds were bothersome because they would linger, while shorter sounds were more pleasant. When asked to provide specific duration guidance, they said that shorter than one second was ideal. P6 told us: "Half a second to a second at most because, other than that, they feel obtrusive." Additionally, 6 participants reported sensory sensitivities with repetition. They told us that hearing a sound multiple times annoyed them and was unnecessary for understanding an earcon. P5 said, "I don't need to hear it three times to know that it happened." Almost all (8 participants) reported issues with the shape of the sound envelope. They told us that sudden earcons annoyed them, while gentler earcons were appreciated. P9 said, "Sounds are more pleasant when they're a little softer or smoother, whether it's a gradual increase in volume or notes that flow together." Almost all (8 participants) highlighted issues with intensity and frequency. High-pitched or loud notes caused participants discomfort, even if they had their computing device volume settings turned down. P1 said: "I don't like high-pitched noises at all. I find them extremely uncomfortable and jarring." They told us that more balanced and soothing tones were better tolerated. Most (6 participants) detailed their views on familiarity. They explained that recognizable sounds could be more clearly mapped to an action. P1 said: "I like something that I can recognize right off the bat. That's another reason why I like the skeuomorphic sounds because I can identify what it is trying to tell me immediately." Most (6 participants) reported challenges with purpose. They said that excessive sound that did not identify anything specific could feel overwhelming. P9 said they did not like "Generic sounds that don't relate to anything or aren't identifying anything specific—they're there to fill the space." They told us that they would like more careful and practical earcon placement. P1 said: "If there has to be a sound to something, it doesn't necessarily have to be attached to every button that you press, but it should be functional in some way."
Table 1: Sensory-Sensitive Earcon Design Guidelines

1. Keep duration to one second or less.
2. Minimize wavelength space between repetitive notes to avoid percussive sounds.
3. Use a smooth sound envelope, with a slow attack and decay, and no sustain or release.
4. Keep intensity mid-range (∼40-50 dB) and the frequency close to the human voice (∼1 kHz).
5. Use recognizable, skeuomorphic sounds.
6. Only use earcons for confirming something has happened or indicating an error.
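The acoustic guidelines above can be illustrated with a short synthesis sketch. This is not an artifact from the paper: the 0.6-second duration, the output file name, and the Hann-window envelope are our own illustrative choices, and the perceived loudness (the ∼40-50 dB target) ultimately depends on the playback device rather than on the code.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100   # samples per second, mono
DURATION = 0.6        # guideline 1: one second or less
FREQUENCY = 1000.0    # guideline 4: ~1 kHz, near the human-voice range
PEAK = 0.3            # moderate digital level; actual dB SPL depends on playback

def earcon_samples():
    """Yield one short sine tone shaped by a smooth raised-cosine envelope."""
    n = int(SAMPLE_RATE * DURATION)
    for i in range(n):
        t = i / SAMPLE_RATE
        # Guideline 3: slow attack and decay with no sustained plateau --
        # a Hann (raised-cosine) window rises and falls gradually.
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
        yield PEAK * envelope * math.sin(2.0 * math.pi * FREQUENCY * t)

def write_earcon(path="send_tweet_earcon.wav"):
    """Write the earcon as a 16-bit mono WAV file and return its path."""
    frames = b"".join(
        struct.pack("<h", int(s * 32767)) for s in earcon_samples()
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit PCM
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)
    return path

print(write_earcon())
```

The single smoothly windowed tone also satisfies guideline 2 by avoiding repeated, percussive notes altogether; adding a second note would require keeping the gap between notes minimal.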

5 DISCUSSION & RECOMMENDATIONS
While prior earcon themes focus on increasing comprehension [6, 7, 15], the themes we uncovered focus on increasing accessibility for neurodivergent individuals with sensory sensitivities. These themes expand upon WCAG [28] sound design requirements for turning off sound longer than three seconds [30], making sure sound is not the only output modality [29], and sound clarity [31] by minimizing pain and discomfort with shorter sounds the user has enabled. This study underscores that WCAG compliance is the minimum, and unique use cases—such as earcons—need additional considerations for accessibility. Based on the six key themes for designing sensory-sensitive earcons, we provide recommendations with our guidelines (Table 1).

6 CONCLUSION & FUTURE WORK
Earcons can trigger sensory sensitivities with neurodivergent individuals, yet must be accessible for those who prefer to use them. To make earcons more accessible, we wanted to learn how to design earcons for neurodivergent individuals. We described our qualitative interviews with nine neurodivergent Twitter users to inform the earcon designs. Our study identified six key themes for designing sensory-sensitive earcons: duration, repetition, sound envelope, intensity and frequency, familiarity, and purpose. Based on our findings, we contributed six novel, concrete guidelines for practitioners to design sensory-sensitive earcons. We see opportunities to evaluate future earcon concepts generated from the design guidelines by surveying a larger sample size of both neurodivergent and neurotypical users, explore sensory-sensitive earcons on other social media platforms, and research how sensory sensitivities impact other output modalities in HCI.

ACKNOWLEDGMENTS
We thank our participants, Emmanuelle Bastien, Anita Butler, Rosie Cabreros, Gerard Cohen, Kristine Ibale, Brett Lewis, Listen, Katrina Lofaro, Nada Rastad, Tess Sitzmann, Jason Sofonia, and Johanna Stein.

REFERENCES
[1] David Beattie, Lynne Baillie, and Martin Halvey. 2017. Exploring How Drivers Perceive Spatial Earcons in Automated Vehicles. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3, Article 36 (September 2017), 24 pages. https://doi.org/10.1145/3130901
[2] Laura Benton, Asimina Vasalou, Rilla Khaled, Hilary Johnson, and Daniel Gooch. 2014. Diversity for design: a framework for involving neurodiverse children in the technology design process. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '14). Association for Computing Machinery, New York, NY, USA, 3747–3756. https://doi.org/10.1145/2556288.2557244
[3] Athina Bikaki and Andreas Floros. 2011. An RSS-feed auditory aggregator using earcons. In Proceedings of the 6th Audio Mostly Conference: A Conference on Interaction with Sound (AM '11). Association for Computing Machinery, New York, NY, USA, 95–100. https://doi.org/10.1145/2095667.2095681
[4] LouAnne Boyd, Kendra Day, Ben Wasserman, Kaitlyn Abdo, Gillian Hayes, and Erik Linstead. 2019. Paper Prototyping Comfortable VR Play for Diverse Sensory Needs. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA '19). Association for Computing Machinery, New York, NY, USA, Paper LBW1714, 1–6. https://doi.org/10.1145/3290607.3313080
[5] Stephen A. Brewster. 1998. Using earcons to improve the usability of tool palettes. In CHI 98 Conference Summary on Human Factors in Computing Systems (CHI '98). Association for Computing Machinery, New York, NY, USA, 297–298. https://doi.org/10.1145/286498.286775
[6] Stephen Brewster, Peter Wright, and Alistair Edwards. 1995. Experimentally derived guidelines for the creation of earcons.
[7] Stephen A. Brewster, Peter C. Wright, and Alistair D. N. Edwards. 1993. An evaluation of earcons for use in auditory human-computer interfaces. In Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems (CHI '93). Association for Computing Machinery, New York, NY, USA, 222–227. https://doi.org/10.1145/169059.169179
[8] Jim A. Carter and David W. Fourney. 2007. Techniques to assist in developing accessibility engineers. In Proceedings of the 9th international ACM SIGACCESS conference on Computers and accessibility (Assets '07). Association for Computing Machinery, New York, NY, USA, 123–130. https://doi.org/10.1145/1296843.1296865
[9] Nicholas Sheep Dalton. 2013. Neurodiversity & HCI. In CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI EA '13). Association for Computing Machinery, New York, NY, USA, 2295–2304. https://doi.org/10.1145/2468356.2468752
[10] Luca M. Damiani. 2019. Hyper Sensorial – Human Computed Neurodivergent Poem. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA '19). Association for Computing Machinery, New York, NY, USA, Paper VS04, 1–2. https://doi.org/10.1145/3290607.3311779
[11] Nancy Doyle. 2020. Neurodiversity at work: a biopsychosocial model and the impact on working adults. British Medical Bulletin 135, 1 (September 2020), 108–125. https://doi.org/10.1093/bmb/ldaa021
[12] Jose A. Gallud and Ricardo Tesoriero. 2015. Smartphone Notifications: A Study on the Sound to Soundless Tendency. In Proceedings of the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services Adjunct (MobileHCI '15). Association for Computing Machinery, New York, NY, USA, 819–824. https://doi.org/10.1145/2786567.2793706
[13] Stavros Garzonis, Simon Jones, Tim Jay, and Eamonn O'Neill. 2009. Auditory icon and earcon mobile service notifications: intuitiveness, learnability, memorability and preference. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '09). Association for Computing Machinery, New York, NY, USA, 1513–1522. https://doi.org/10.1145/1518701.1518932
[14] Ellen Isaacs, Alan Walendowski, and Dipti Ranganathan. 2001. Hubbub: a wireless instant messenger that uses earcons for awareness and for "sound instant messages". In CHI '01 Extended Abstracts on Human Factors in Computing Systems (CHI EA '01). Association for Computing Machinery, New York, NY, USA, 3–4. https://doi.org/10.1145/634067.634070
[15] David K. McGookin and Stephen A. Brewster. 2004. Understanding concurrent earcons: Applying auditory scene analysis principles to concurrent earcon recognition. ACM Trans. Appl. Percept. 1, 2 (October 2004), 130–155. https://doi.org/10.1145/1024083.1024087
[16] Vivian Genaro Motti. 2019. Designing emerging technologies for and with neurodiverse users. In Proceedings of the 37th ACM International Conference on the Design of Communication (SIGDOC '19). Association for Computing Machinery, New York, NY, USA, Article 11, 1–10. https://doi.org/10.1145/3328020.3353946
[17] Listen. Listen Homepage. Retrieved May 21, 2022 from https://wearelisten.com/
[18] Jessica Navedo, Amelia Espiritu-Santo, and Shameem Ahmed. 2019. Strength-Based ICT Design Supporting Individuals with Autism. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 560–562. https://doi.org/10.1145/3308561.3354637
[19] A. Ould Mohamed, V. Courboulay, K. Sehaba, and M. Menard. 2006. Attention analysis in interactive software for children with autism. In Proceedings of the 8th international ACM SIGACCESS conference on Computers and accessibility (Assets '06). Association for Computing Machinery, New York, NY, USA, 133–140. https://doi.org/10.1145/1168987.1169011
[20] Eleftherios Papachristos, Timothy Robert Merritt, Tobias Jacobsen, and Jimmi Bagger. 2020. Designing Ambient Multisensory Notification Devices: Managing Disruptions in the Home. In 19th International Conference on Mobile and Ubiquitous Multimedia (MUM 2020). Association for Computing Machinery, New York, NY, USA, 59–70. https://doi.org/10.1145/3428361.3428400
[21] Angela M. Puccini, Marisa Puccini, and Angela Chang. 2013. Acquiring educational access for neurodiverse learners through multisensory design principles. In Proceedings of the 12th International Conference on Interaction Design and Children (IDC '13). Association for Computing Machinery, New York, NY, USA, 455–458. https://doi.org/10.1145/2485760.2485848
[22] Lauren Race, Amber James, Andrew Hayward, Kia El-Amin, Maya Gold Patterson, and Theresa Mershon. 2021. Designing Sensory and Social Tools for Neurodivergent Individuals in Social Media Environments. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21). Association for Computing Machinery, New York, NY, USA, Article 61, 1–5. https://doi.org/10.1145/3441852.3476546
[23] Amon Rapp, Federica Cena, Christopher Frauenberger, Niels Hendriks, and Karin Slegers. 2019. Designing Mobile Technologies for Neurodiversity: Challenges and Opportunities. In Proceedings of the 21st International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI '19). Association for Computing Machinery, New York, NY, USA, Article 75, 1–5. https://doi.org/10.1145/3338286.3344427
[24] Kathryn E. Ringland, Christine T. Wolf, LouAnne Boyd, Jamie K. Brown, Andrew Palermo, Kimberley Lakes, and Gillian R. Hayes. 2019. DanceCraft: A Whole-body Interactive System for Children with Autism. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '19). Association for Computing Machinery, New York, NY, USA, 572–574. https://doi.org/10.1145/3308561.3354604
[25] Will Simm, Maria Angela Ferrario, Adrian Gradinar, Marcia Tavares Smith, Stephen Forshaw, Ian Smith, and Jon Whittle. 2016. Anxiety and Autism: Towards Personalized Digital Health. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16). Association for Computing Machinery, New York, NY, USA, 1270–1281. https://doi.org/10.1145/2858036.2858259
[26] Katta Spiel and Kathrin Gerling. 2021. The Purpose of Play: How HCI Games Research Fails Neurodivergent Populations. ACM Trans. Comput.-Hum. Interact. 28, 2, Article 11 (April 2021), 40 pages. https://doi.org/10.1145/3432245
[27] W3C. Making Content Usable for People with Cognitive and Learning Disabilities. Retrieved April 27, 2022 from https://www.w3.org/TR/coga-usable/
[28] W3C. Web Content Accessibility Guidelines (WCAG) 2.1. Retrieved April 27, 2022 from https://www.w3.org/TR/WCAG21/
[29] W3C. Web Content Accessibility Guidelines (WCAG) 2.1. Success Criterion 1.3.3 Sensory Characteristics. Retrieved April 27, 2022 from https://www.w3.org/TR/WCAG21/#sensory-characteristics
[30] W3C. Web Content Accessibility Guidelines (WCAG) 2.1. Success Criterion 1.4.2 Audio Control. Retrieved April 27, 2022 from https://www.w3.org/TR/WCAG21/#audio-control
[31] W3C. Web Content Accessibility Guidelines (WCAG) 2.1. Success Criterion 1.4.7 Low or No Background Audio. Retrieved April 27, 2022 from https://www.w3.org/TR/WCAG21/#low-or-no-background-audio
[32] Victoria Yaneva, Irina Temnikova, and Ruslan Mitkov. 2015. Accessible Texts for Autism: An Eye-Tracking Study. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '15). Association for Computing Machinery, New York, NY, USA, 49–57. https://doi.org/10.1145/2700648.2809852
[33] Annuska Zolyomi, Andrew Begel, Jennifer Frances Waldern, John Tang, Michael Barnett, Edward Cutrell, Daniel McDuff, Sean Andrist, and Meredith Ringel Morris. 2019. Managing Stress: The Needs of Autistic Adults in Video Calling. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 134 (November 2019), 29 pages. https://doi.org/10.1145/3359236
[34] Annuska Zolyomi and Jaime Snyder. 2021. Social-Emotional-Sensory Design Map for Affective Computing Informed by Neurodivergent Experiences. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 77 (April 2021), 37 pages. https://doi.org/10.1145/3449151
Understanding How People with Visual Impairments Take Selfies: Experiences and Challenges

Ricardo E. Gonzalez Penuela, Information Science, Cornell Tech, Cornell University, New York, New York, USA. reg258@cornell.edu
Paul Vermette, Cornell University, Ithaca, New York, USA. prv25@cornell.edu
Zihan Yan, Information Science, Cornell University, Ithaca, New York, USA. zihanyan@zju.edu.cn
Cheng Zhang, Information Science, Cornell University, Ithaca, New York, USA. chengzhang@cornell.edu
Keith Vertanen, Computer Science, Michigan Technological University, Houghton, Michigan, USA. vertanen@mtu.edu
Shiri Azenkot, Jacobs Technion-Cornell Institute, Cornell Tech, Cornell University, New York, New York, USA. shiri.azenkot@cornell.edu
Figure 1: Participants demonstrating how they take a selfie with their smartphone (P3, P6, P5, and P9).

ABSTRACT
Selfies are a pervasive form of communication in social media. While there has been some work on systems that guide people with visual impairments (PVI) in taking photos, nearly all has focused on using the camera on the back of the device. We do not know whether and how PVI take selfies. The aim of our work is to understand (1) PVI's selfie-taking experiences and challenges, (2) what information PVI need when taking selfies, and (3) what modalities PVI prefer (e.g., tactile, verbal, or non-verbal audio) to support selfie-taking. To address this gap, we conducted interviews with 10 PVI. Our findings show that current selfie-taking applications do not provide enough assistance to meet the needs of PVI. We contribute design guidelines that researchers and designers can implement for creating accessible selfie-taking applications.

CCS CONCEPTS
• Human-centered computing → Accessibility; Empirical studies in accessibility; Human computer interaction (HCI); Empirical studies in HCI.

KEYWORDS
Selfies, Accessibility, Challenges Photo-Taking, Design Guidelines, Photography, PVI

ACM Reference Format:
Ricardo E. Gonzalez Penuela, Paul Vermette, Zihan Yan, Cheng Zhang, Keith Vertanen, and Shiri Azenkot. 2022. Understanding How People with Visual Impairments Take Selfies: Experiences and Challenges. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550372
1 INTRODUCTION & RELATED WORK
Like their sighted counterparts, people with visual impairments (PVI) take photos, share, and recall memories [2, 12, 19, 23, 26]. Moreover, for PVI, taking photos can be an important way to get information about their environment, whether it is from remote friends, the crowd [14-16, 20, 22] or computer vision services [18]. However, since they cannot see the photos, it is difficult for them to capture the target and ensure a photo is aesthetic and high quality (e.g., well-lit and properly framed).

Researchers developed methods that provide audio and tactile guidance to assist PVI in taking photos [1-11]. For example, Jayant et al. developed EasySnap [2], a mobile device application that helped PVI capture other people in a photo by providing verbal instructions, vibrations, and audio cues. Similarly, Cutter et al. [9] developed a method to help PVI capture a document. Their method provided the user precise verbal instructions specifying the direction and distance to move the device. This body of work focused on taking photos with the back camera (the camera on the back of the device) and ensuring that a certain target was captured. Furthermore, these systems have mainly focused on whether a PVI can take a "compliant" photo with the provided guidance (e.g., whether they can capture a document, people, or a scene in an urban area). Only a small number of researchers have considered the task of taking selfies, which is typically more open ended and requires more detailed information to meet the user's needs.

Today, people frequently take selfies (i.e., photos of themselves). They share selfies to present themselves in a desired light (particularly in social media platforms). PVI want to participate in social media just like their sighted counterparts, taking and posting selfies as well as other types of photos [11, 13, 19, 21, 26], so it is important to ensure that selfie-taking is accessible.

Only one project has proposed an accessible selfie-taking application for people with visual impairments [24]. Yungjung et al. [24] designed SelfieHelper, an application that provided verbal feedback about the number of faces, their sizes, and their locations in the screen. Although this application addressed some challenges in the selfie-taking process (e.g., framing the user's head in the photo), participants mentioned that they desired more information (i.e., their physical appearance, or the background). Fang et al. [25] also designed an application that provides feedback throughout the selfie-taking process, but it was not intended for PVI. These applications did not provide holistic support for accessible selfie-taking: they did not provide enough information to meet PVI needs, or the feedback was not designed to be understood by PVI. Thus, at present, we still do not know what specific challenges PVI experience when taking selfies, what type of support can best assist them, and what information they need to take a selfie.

To address this gap, we investigated the selfie-taking experiences of PVI. Our research questions were: (1) What are the current

takes one more step towards making photography fully accessible to PVI.

2 METHOD
We recruited 10 participants, all of whom had a visual impairment. Their ages ranged from 22 to 68 (mean=38, SD=14). All participants were based in the United States and were recruited through the National Federation of the Blind mailing list. Four identified as totally blind, five identified as legally blind, and one participant identified as low vision. Eight were female and two were male. All participants were iPhone users.

We conducted remote interviews that were approximately one hour long via videoconferencing software. We first asked participants to recall instances where they needed to use their phone's front-facing camera. Next, we asked them about their selfie-taking behavior, e.g., "which applications do you use to take selfies?". Then, we asked participants to demonstrate how they take a selfie with their phone while they "thought out loud" [17]. Finally, we asked participants to provide feedback about a set of potential features and modes of interaction for an accessible selfie-taking application.

During the interviews, one of the researchers took notes of significant moments and relevant quotes. In addition, the interviews were videorecorded and transcribed. Two researchers coded two of the transcripts individually through an inductive coding process. Then, the two researchers went through the generated codes together until there was an agreement on a single set of codes. The resulting codebook was utilized by one of the researchers to code the rest of the transcripts.

3 FINDINGS
3.1 Selfie-Taking Experiences and Challenges
Participants described their weekly selfie-taking behavior. Five participants took selfies infrequently (1-4 every week), three frequently (5-9), and two very frequently (10+). Participants took selfies for various reasons: to communicate with friends (P3, P8), to take photos with their newborn (P6), for online dating (P1, P8, P9), to remember an occasion with their peers (P1, P5, P10), and to share how they looked (P4, P7).

All participants except P4 sent selfies to a sighted person for validation before sharing them. P4 had extensive experience taking selfies, and she felt that if "you do not care too much about how [the photo] looks" then she would send the photo without validation.

Many participants expressed desire, but also frustration, over taking selfies and using front-facing camera features to take augmented selfies, like face filters (P1, P4-10). Two participants mentioned using inaccessible camera applications in social media (e.g., Snapchat
and Facebook Messenger) solely because they are forced by social
challenges that PVI face when taking selfes? (2) What information
circumstances (e.g., friends only use these platforms).
do PVI want when taking selfes? (3) What modalities do PVI pre-
P5, who had a successful business with her online blog, de-
fer (e.g., tactile, verbal, audio non-verbal cues) in a hypothetical
scribed how the inability to take good selfes independently could
accessible selfe-taking application?
afect her business:
In this paper, we describe an interview study to address the re-
search questions and contribute design guidelines for accessible “I see when people post those types of things (selfes),
selfe-taking applications. Building on prior work, our research you really get a feel for who they are and you get a
feel for who they are as a person, in addition to who
Understanding How People with Visual Impairments Take Selfies: Experiences and Challenges ASSETS ’22, October 23–26, 2022, Athens, Greece

Table 1: Participant demographics.

Participant ID   Gender   Age   Visual Impairment   Onset
1                Male     29    Totally blind       Birth
2                Female   44    Totally blind       1
3                Female   22    Legally blind       Birth
4                Female   24    Legally blind       Birth
5                Female   32    Legally blind       Birth
6                Male     31    Legally blind       11
7                Female   52    Totally blind       Birth
8                Female   38    Totally blind       Birth
9                Female   41    Low vision          37
10               Female   68    Legally blind       Birth
they are as a business owner. And I feel like I'm kind of missing out on that because I can't express myself in that way".

P9 explained how she felt when she was creating a dating profile:

"[Taking a selfie for the dating app] was stressful. I kept thinking it is probably going to be a really weird shot where I'm either cut off, where it is going to look like I have a physical disability."

3.2 Information Needs and Modality

Many participants pointed out how the lack of information in the guidance provided by the camera they typically use to take selfies (e.g., the iPhone camera) demotivated them entirely from taking a selfie (P1, P2, P5, P7, P8, P10).

P8 described how the feedback given by Apple's guidance system was not helpful: "It will say if a photo is bright or dark, but even photos that people send me when it's a good photo or they say it's a good photo, it will describe it as dark. So, I don't really have a good sense of what that means".

Participants desired more detailed and descriptive information about their physical appearance to share selfies confidently (P1, P2, P6, P7, P8). P1 provided an example of the kind of information he would like to receive: "Bob is sitting at his desk in his new vinyl desk chair, and he is wearing a brown sweater with three buttons, blue jeans, and black socks."

Six participants had a strong preference for human-like conversational guidance (P1, P4, P5, P7, P9, P10). Six participants cautioned that verbal information should strive to be natural and intuitive (P1, P2, P4, P5, P7, P9). They were especially concerned with verbal cues that may lead to cognitive overload (e.g., "turn the phone 45 degrees to the right") because (1) they would need to process that information and (2) such instructions would be harder to reliably follow.

Half of the participants also liked the idea of having auditory non-verbal guidance cues (P1, P4, P7, P8, P9). Many participants recalled some form of "hot-cold" tone guidance, in which the guidance tone changes its rhythm and volume to guide their movements. P8 provided an example: "This is [in an application] for taking pictures of documents with the back camera to read them. It uses tonal guidance, it is louder and steadier when the image is clear, and softer when you are too far away or wavery when you are not steady. It is one of my favorite ways to use a camera."

None of the participants were excited about using tactile cues to convey information. On the contrary, some participants discouraged the use of vibrations because they thought that it would be confusing or uncomfortable to have the phone vibrating while taking a photo (P1, P4). P4 elaborated on their reasoning: "The only thing I can think of that your phone can do [with tactile cues] is buzz, and that would shake the camera."

4 DESIGN GUIDELINES & FUTURE WORK

In prior studies involving the back camera, researchers considered only one criterion for taking successful photos: the target object was captured in the photo [1–3, 9, 11]. In contrast, our findings showed that taking selfies was more complex and involved three success criteria: PVI had to capture themselves, make sure they looked as desired, and not capture anything unwanted in the background. Based on this observation and our findings, we present design guidelines for accessible selfie-taking applications:

• Tailor the system guidance to the goal of the user and support all success criteria. For example, P4 mentioned that sometimes she wanted to take a quick photo and did not care how she looked, but other times, when she shared the photo on social media, she cared about how she looked and the photo quality. This is consistent with prior findings [24]. Most participants wanted to know how they looked (e.g., their facial expressions and clothing). In the first scenario, the iOS camera guidance provides enough information, since it helps PVI center their face and snap a quick photo; but in the second scenario, PVI need more information about their physical appearance (e.g., their facial expression, whether their eyes are closed or open, clothing detected), about background information (e.g., disorganized table, floor detected), and about the quality of the photo (e.g., whether the photo is blurry, whether the faces in the photo can be seen clearly).

• Guide the user through human-like conversational prompts and descriptions. As many participants mentioned, PVI are accustomed to being assisted by sighted peers when they take photos. Thus, conversational prompts and descriptions of photo content and quality would be natural and easy to follow.

• Communicate low-level guidance with real-time feedback using non-verbal audio cues: As mentioned by participants, PVI are familiar with systems that leverage the pitch, tone, or rhythm of sounds to convey a change of state in the system or progress of the task. This, combined with human-like
ASSETS ’22, October 23–26, 2022, Athens, Greece Ricardo Gonzalez Penuela et al.

conversational prompts, would provide PVI with the necessary information to make small adjustments to the camera position.

• Include settings to filter information: Consistent with the literature in other domains about accessible technology for PVI [27], we recommend providing an easy way to turn on or off information provided by the guidance system (e.g., turn on/off guidance about the photo's background or people's facial expressions). More generally, information should be modifiable with varying degrees of specificity to adapt to the needs and preferences of the user.

We hope that these guidelines will be used by researchers and designers to create accessible selfie-taking applications. In a future study, we will develop our own accessible selfie-taking application that PVI could use to share photos about their daily lives.

ACKNOWLEDGMENTS

This research was supported in part by the National Science Foundation under awards IIS-1909248 and IIS-1909930. We thank all participants for their time and feedback.

REFERENCES
[1] Samuel White, Hanjie Ji, and Jeffrey P. Bigham. 2010. EasySnap: real-time audio feedback for blind photography. In Adjunct proceedings of the 23rd annual ACM symposium on User interface software and technology (UIST ’10). Association for Computing Machinery, New York, NY, USA, 409–410. https://doi.org/10.1145/1866218.1866244
[2] Chandrika Jayant, Hanjie Ji, Samuel White, and Jeffrey P. Bigham. 2011. Supporting blind photography. In Proceedings of the 13th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’11). Association for Computing Machinery, New York, NY, USA, 203–210. https://doi.org/10.1145/2049536.2049573
[3] Marynel Vázquez and Aaron Steinfeld. 2012. Helping visually impaired users properly aim a camera. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’12). Association for Computing Machinery, New York, NY, USA, 95–102. https://doi.org/10.1145/2384916.2384934
[4] Dustin Adams, Tory Gallagher, Alexander Ambard, and Sri Kurniawan. 2013. Interviewing blind photographers: design insights for a smartphone application. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’13). Association for Computing Machinery, New York, NY, USA, Article 54, 1–2. https://doi.org/10.1145/2513383.2513418
[5] Marynel Vázquez and Aaron Steinfeld. 2014. An Assisted Photography Framework to Help Visually Impaired Users Properly Aim a Camera. ACM Trans. Comput.-Hum. Interact. 21, 5, Article 25 (November 2014), 29 pages. https://doi.org/10.1145/2651380
[6] Roberto Manduchi and James M. Coughlan. 2014. The last meter: blind visual guidance to a target. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Association for Computing Machinery, New York, NY, USA, 3113–3122. https://doi.org/10.1145/2556288.2557328
[7] Jan Balata, Zdenek Mikovec, and Lukas Neoproud. 2015. BlindCamera: Central and Golden-ratio Composition for Blind Photographers. In Proceedings of the Multimedia, Interaction, Design and Innovation (MIDI ’15). Association for Computing Machinery, New York, NY, USA, Article 8, 1–8. https://doi.org/10.1145/2814464.2814472
[8] Minju Kim and Jungjin Lee. 2019. PicMe: Interactive Visual Guidance for Taking Requested Photo Composition. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, New York, NY, USA, Paper 395, 1–12. https://doi.org/10.1145/3290605.3300625
[9] Michael P. Cutter and Roberto Manduchi. 2015. Towards Mobile OCR: How to Take a Good Picture of a Document Without Sight. In Proceedings of the 2015 ACM Symposium on Document Engineering (DocEng ’15). Association for Computing Machinery, New York, NY, USA, 75–84. https://doi.org/10.1145/2682571.2797066
[10] Jongho Lim, Yongjae Yoo, Hanseul Cho, and Seungmoon Choi. 2019. TouchPhoto: Enabling Independent Picture Taking and Understanding for Visually-Impaired Users. In 2019 International Conference on Multimodal Interaction (ICMI ’19). Association for Computing Machinery, New York, NY, USA, 124–134. https://doi.org/10.1145/3340555.3353728
[11] Dragan Ahmetovic, Daisuke Sato, Uran Oh, Tatsuya Ishihara, Kris Kitani, and Chieko Asakawa. 2020. ReCog: Supporting Blind People in Recognizing Personal Objects. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3313831.3376143
[12] Cynthia L. Bennett, Jane E, Martez E. Mott, Edward Cutrell, and Meredith Ringel Morris. 2018. How Teens with Visual Impairments Take, Edit, and Share Photos on Social Media. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18). Association for Computing Machinery, New York, NY, USA, Paper 76, 1–12. https://doi.org/10.1145/3173574.3173650
[13] Dustin Adams, Sri Kurniawan, Cynthia Herrera, Veronica Kang, and Natalie Friedman. 2016. Blind Photographers and VizSnap: A Long-Term Study. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’16). Association for Computing Machinery, New York, NY, USA, 201–208. https://doi.org/10.1145/2982142.2982169
[14] Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. 2010. VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd annual ACM symposium on User interface software and technology (UIST ’10). Association for Computing Machinery, New York, NY, USA, 333–342. https://doi.org/10.1145/1866029.1866080
[15] J. P. Bigham, C. Jayant, A. Miller, B. White, and T. Yeh. 2010. VizWiz::LocateIt - enabling blind people to locate objects in their environment. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, 65–72. https://doi.org/10.1109/CVPRW.2010.5543821
[16] Michele A. Burton, Erin Brady, Robin Brewer, Callie Neylan, Jeffrey P. Bigham, and Amy Hurst. 2012. Crowdsourcing subjective fashion advice using VizWiz: challenges and opportunities. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility (ASSETS ’12). Association for Computing Machinery, New York, NY, USA, 135–142. https://doi.org/10.1145/2384916.2384941
[17] K. Anders Ericsson and Herbert A. Simon. 1984. Protocol Analysis: Verbal Reports as Data. The MIT Press.
[18] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2018. A Face Recognition Application for People with Visual Impairments: Understanding Use Beyond the Lab. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18). Association for Computing Machinery, New York, NY, USA, Paper 215, 1–14. https://doi.org/10.1145/3173574.3173789
[19] Violeta Voykinska, Shiri Azenkot, Shaomei Wu, and Gilly Leshed. 2016. How Blind People Interact with Visual Content on Social Networking Services. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing (CSCW ’16). Association for Computing Machinery, New York, NY, USA, 1584–1595. https://doi.org/10.1145/2818048.2820013
[20] Erin L. Brady, Yu Zhong, Meredith Ringel Morris, and Jeffrey P. Bigham. 2013. Investigating the appropriateness of social network question asking as a resource for blind users. In Proceedings of the 2013 conference on Computer supported cooperative work (CSCW ’13). Association for Computing Machinery, New York, NY, USA, 1225–1236. https://doi.org/10.1145/2441776.2441915
[21] Shaomei Wu and Lada A. Adamic. 2014. Visually impaired users on an online social network. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Association for Computing Machinery, New York, NY, USA, 3133–3142. https://doi.org/10.1145/2556288.2557415
[22] D. Gurari et al. 2018. VizWiz Grand Challenge: Answering Visual Questions from Blind People. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3608–3617. https://doi.org/10.1109/CVPR.2018.00380
[23] Susumu Harada, Daisuke Sato, Dustin W. Adams, Sri Kurniawan, Hironobu Takagi, and Chieko Asakawa. 2013. Accessible photo album: enhancing the photo sharing experience for people with visual impairment. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’13). Association for Computing Machinery, New York, NY, USA, 2127–2136. https://doi.org/10.1145/2470654.2481292
[24] Y. Lee and U. Oh. 2020. SelfieHelper: Improving Selfie Experiences for People with Visual Impairments. Journal of the HCI Society of Korea 15(3), 23–30.
[25] N. Fang, H. Xie, and T. Igarashi. 2018. Selfie Guidance System in Good Head Postures. In IUI Workshops.
[26] Yuhang Zhao, Shaomei Wu, Lindsay Reynolds, and Shiri Azenkot. 2017. The Effect of Computer-Generated Descriptions on Photo-Sharing Experiences of People with Visual Impairments. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 121 (November 2017), 22 pages. https://doi.org/10.1145/3134756
[27] Hugo Nicolau, André Rodrigues, André Santos, Tiago Guerreiro, Kyle Montague, and João Guerreiro. 2019. The Design Space of Nonvisual Word Completion. In The 21st International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’19). Association for Computing Machinery, New York, NY, USA, 249–261. https://doi.org/10.1145/3308561.3353786
[28] A. Pradhan and G. Daniels. 2021. Inclusive beauty: how buying and using cosmetics can be made more accessible for the visually impaired (VI) and blind consumer. Cosmetics and Toiletries 136(4), DM4–DM15.
Voice-Enabled Blockly: Usability Impressions of a Speech-driven Block-based Programming System

Obianuju Okafor (obianujuokafor@my.unt.edu), University of North Texas, Denton, Texas, USA
Stephanie Ludi (stephanieludi@unt.edu), University of North Texas, Denton, Texas, USA
ABSTRACT
Block-based programming environments pose a challenge for people with upper-limb motor impairments because they depend heavily on the physical manipulation of a mouse or keyboard to drag and drop elements on the screen. Our research aims to make the block-based programming environment Blockly accessible to users with upper-limb motor impairments by adding voice as an alternative input modality. This voice-enabled version of Blockly reduces the need for a pointing device, thus increasing access for people with limited dexterity. The Voice-enabled Blockly system consists of the Blockly application, a speech recognition API, predefined voice commands, and a custom function. A usability study was conducted using a prototype of Voice-enabled Blockly. The results revealed that people with upper-limb motor impairments can use the system, but they also exposed some shortcomings of the tool and yielded suggestions on how to fix them. Based on the findings, changes will be made to the system, and it will then be evaluated in another user study in the near future.

CCS CONCEPTS
• Human-centered computing → Accessibility design and evaluation methods; Empirical studies in accessibility.

KEYWORDS
Motor Impairments, Block-based Programming, Accessibility, Speech Recognition, Usability Testing

ACM Reference Format:
Obianuju Okafor and Stephanie Ludi. 2022. Voice-Enabled Blockly: Usability Impressions of a Speech-driven Block-based Programming System. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3517428.3550382

1 INTRODUCTION
Block-based programming environments (BBPEs) such as Scratch (https://scratch.mit.edu/), Blockly (https://developers.google.com/blockly/), and Pencil Code (https://pencilcode.net/) are visual programming interfaces where programs are constructed by dragging colorful blocks to a workspace and snapping them together. These blocks represent programming constructs such as conditional statements, loops, functions, variables, etc. BBPEs were created to reduce the steep learning curve associated with learning to program using traditional text-based programming languages [6], [2], [25]. In text-based programming languages, to create programs, users have to memorize the syntax of that language. Additionally, programming with text-based languages requires a lot of typing. All of this can be challenging for novices trying to get into programming and may discourage them.

BBPEs reduce the need to memorize complex syntactical statements, which encourages learners to focus more on fundamental and general programming concepts [2], [25]. Hence, BBPEs are often used to introduce programming concepts and computational thinking to novices [5]. Although most block-based programming environments were built for a younger population, studies revealed that they are beneficial to novice programmers of all ages, ranging from elementary school students to high school students [22], [21], to college students [23], and even professionals [16], [3]. BBPEs also do not require much typing; hence, they could be a suitable platform for novices with limited dexterity.

Although block-based programming environments have indeed proven advantageous for teaching programming to novices, they also have some limitations. One drawback is their dependence on the use of a mouse or keyboard to drag and drop elements on the screen [18]. This has made them inaccessible to people with upper-limb motor impairments (ULMI) such as cerebral palsy, muscular dystrophy, multiple sclerosis, etc. People with ULMI often suffer from paralysis, muscle weakness, and poor coordination [13]. This makes performing actions like dragging and dropping objects using a pointing device quite hard [24], an action often performed in BBPEs. This limitation excludes a significant portion of people from reaping the benefits associated with the use of BBPEs.

Our research addresses the obstacles that people with ULMI face in the BBPE Blockly and presents a possible solution. In our approach, we add speech as an alternative form of input in Blockly to make it accessible to people with ULMI who would like to use Blockly. Similar work exists: Myna, a standalone tool, attempted to make the BBPE Scratch accessible to people with ULMI by allowing them to perform actions in Scratch using their voice [20], [18], [19]. In contrast to Myna, to make Blockly accessible to people with ULMI, we added speech as an alternative input modality within the system, not as a separate entity. Therefore, no download or installation is needed.
ASSETS ’22, October 23–26, 2022, Athens, Greece Obianuju Okafor and Stephanie Ludi

Figure 1: System Overview

The rationale behind doing this is that the user experience will be more seamless and there will be room for collaboration among children with and without dexterity skills. Additionally, Myna was not evaluated by enough users in its target population: only 2 people with a ULMI participated in that study. In the user study that we conducted with our prototype, we had 9 participants with ULMI.

Before implementing the prototype of our system, we conducted a preliminary study [10], [11]. The study aimed to confirm the feasibility of using voice as an input modality, particularly for people with cerebral palsy, who were our target audience at the time, as they are known to sometimes have speech impediments [4], [9]. The study also helped us test the appropriateness of our voice commands. The results of the preliminary study revealed the need to broaden the target population to include anyone with a ULMI. Additionally, the study showed the need to modify some voice commands.

2 VOICE-ENABLED BLOCKLY
Our system is a voice-enabled version of the Blockly application, hence the name. Blockly's primary input method is a mouse or keyboard. We added speech as an alternative input modality for people with ULMI. Our goal is not to replace the mouse or keyboard as a form of input in Blockly, but to provide an alternative for those who cannot use them. In this section, we present the components of Voice-enabled Blockly and how they work together. The system comprises four main components:

(1) Blockly Application: Blockly is a browser-based block-based programming environment. Originally, in Blockly, actions are performed using a mouse or keyboard. We made some modifications to Blockly's source code so the same actions can also be performed using speech. Blockly's source code is available online via GitHub (https://github.com/google/blockly), and the code used in this research project was downloaded from Blockly's GitHub repository.

(2) Speech Recognition API: A speech recognition API is a robust pre-built library that records speech in real-time, converts it to text, and returns the text. As with every API, it can be added to a website or application with a few lines of code. The speech recognition API that we chose was the Web Speech API (https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API). This API was used in several research studies [12], [7].

(3) Voice Commands: The voice commands are the words uttered by users when performing actions in the system. They are limited and predefined, which helps prevent the ambiguity associated with more verbose speech recognition systems. A voice command consists of one or two words, e.g., "delete", "edit field", etc. Each voice command has an action that it performs, e.g., selecting, deleting, etc. For example, to delete a block from the workspace, the user will say "delete".

The design of the voice commands was influenced by four things: the first was the design of keyboard shortcuts for Blockly, which was done in prior work [8]; the second was the findings from the preliminary study that we conducted [10], [11]; and the third and fourth were empathy and domain knowledge.

Voice commands can be broken down into 5 categories:
• Navigation Commands: These are the commands used to navigate through menus, drop-down menus, and between stacks of blocks in the workspace. Using these commands, a user can control and move the cursor from one point to another, e.g., "up", "down".
• Placement Commands: These commands are used to select blocks in the menu and place them in the workspace. They can also be used to remove blocks from the workspace. These commands are synonymous with dragging and dropping blocks using a mouse or keyboard, e.g., "select", "delete".
• Control Commands: These commands are responsible for controlling elements in the interface, such as opening and closing menus, e.g., "menu", "close", etc.
• Edit Commands: These commands are used to edit a block's text value or to add a comment to a block. They can also be used to change the option selected from a drop-down menu, e.g., "edit field", "save".
• Mode Commands: These commands are used to switch between the 3 modes in our system: navigation mode, connect mode, and edit mode, e.g., "edit", "connect", etc.

Participant ID Age Gender Condition Overall Experience (1-5) SUS Score (0-100)
P1 40 Male Cerebral Palsy 2 70
P2 24 Male Muscular Dystrophy 4 85
P3 46 Female Muscular Dystrophy 4 55
P4 28 Female Muscular Dystrophy 4 52.5
P5 63 Female Amyotrophic Lateral Sclerosis 2 50
P6 30 Female Cerebral Palsy 4 62.5
P7 20 Female Cerebral Palsy 3 45
P8 21 Female Cerebral Palsy 4 80
P9 36 Male Cerebral Palsy 5 95
Table 1: Participant information and their respective overall experience rating, and SUS score

(4) Custom Function: To perform actions in Blockly using speech, we created a function that translates voice commands to actions. This function entails a switch statement [17] in which each case is a voice command. Each voice command is paired with a corresponding action function to be executed. For instance, the voice command "delete" is paired with the block deletion function.
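The command-to-action pattern described above can be sketched as follows. This is a minimal illustration, not the authors' published source: the handler names (deleteSelectedBlock, openMenu, moveCursor) and the reduced command set are hypothetical stand-ins, while the overall flow (recognized transcript → switch case → action function, with unknown commands reported as "command doesn't exist") follows the description in the text. The Web Speech API wiring is real browser API usage but is shown here only as an un-invoked helper.

```javascript
// Translate a recognized transcript into an action on the workspace.
// Each switch case is one predefined voice command; unknown input
// falls through to the default case and performs no action.
function dispatchCommand(transcript, workspace) {
  const command = transcript.trim().toLowerCase();
  switch (command) {
    case "delete": // placement command: remove the selected block
      workspace.deleteSelectedBlock();
      return `executed: ${command}`;
    case "menu": // control command: open the block menu
      workspace.openMenu();
      return `executed: ${command}`;
    case "up": // navigation commands: move the cursor
    case "down":
      workspace.moveCursor(command);
      return `executed: ${command}`;
    default:
      console.log("command doesn't exist");
      return "command doesn't exist";
  }
}

// Browser-only wiring to the Web Speech API (not executed here):
// each final recognition result is fed straight into dispatchCommand.
function startListening(workspace) {
  const Recognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognizer = new Recognition();
  recognizer.continuous = true; // keep listening across commands
  recognizer.onresult = (event) => {
    const result = event.results[event.results.length - 1];
    dispatchCommand(result[0].transcript, workspace);
  };
  recognizer.start();
}
```

Keeping the command vocabulary closed, as the paper does, is what makes a plain switch statement sufficient: the dispatcher never has to parse free-form speech, only match one of a handful of known words.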
Fig. 1 gives an overview of the system and its components. Looking at Figure 1, we can see that these components interact in the following way. The speech recognition API receives speech input through the device's microphone. The audio input is processed and converted to text. The generated text is sent to the custom function. The custom function passes the text to the switch statement to see if there is a case that matches the text. If there is a match, the action associated with that case is executed, which triggers a change to Blockly's user interface. If there is no match in the switch statement, the message "command doesn't exist" is displayed in the console, and no action is performed.

3 USABILITY STUDY
We created a prototype of our application and used it to conduct a usability study. The study aimed to investigate the following: (i) how easy the application is to use, (ii) the flaws in the system, and (iii) what improvements can be made.

3.1 Participants
We recruited 9 participants with ULMI to use the prototype and give feedback. The participants comprised 6 females and 3 males. The age range was from 20 to 63 years old (M=34.2, SD=13.087). Over half of the participants had cerebral palsy, 3 had muscular dystrophy, and 1 had amyotrophic lateral sclerosis. To recruit participants, we posted our study flyer on several Facebook and Discord groups catering to people with motor impairments. Table 1 shows information about each participant, such as age, sex, condition, and two quantitative metrics of the study, which are the overall experience rating and the SUS score.

3.2 Procedure
The study was conducted remotely over Zoom. Before the study, we deployed the prototype online using Heroku (https://voice-enabled-blockly.herokuapp.com/blockly/accessibleblockly/interface.html). Therefore, the participants were able to access it through their browser. When using our prototype, they shared their screen using the Zoom screen share feature.

During the session, participants were given a demo, and then six tasks were assigned to them. The first task was a training task to familiarize themselves with the system. The second task was a navigation task; the participants had to navigate through the user interface. The third task was a program creation task; they were shown a program and asked to recreate it. Fig. 2 shows the program that was created. The fourth task was another navigation task; they were to navigate from the last block to the first block of the program that was created in task 3. In the fifth task, participants were asked to add a comment to the first block in the workspace. In the final task, they were asked to delete the first block from the workspace and to undo the delete action. They had to complete all tasks with their voice. While they performed each task, they were observed. Each session lasted an average of 60 minutes.

Figure 2: Blockly program created in task 3

At the end of the study, participants had to complete a survey. The survey had a total of 23 questions, 10 of which were the System Usability Scale (SUS) questions. SUS questions are a quick and effective way to measure the usability of a system [14]. After completing the survey, participants were sent a $25 Amazon gift card.

3.3 Result
We present the findings of the study in this section. We break the results down into 3 categories: the system's usability, its drawbacks, and some suggestions on how to improve it.

3.3.1 System Usability. We evaluated the usability of the system based on two quantitative metrics, the overall experience rating given by the participants and their SUS score. We also considered
ASSETS ’22, October 23–26, 2022, Athens, Greece Obianuju Okafor and Stephanie Ludi

open-ended responses given in the survey. The SUS score was derived from the responses to the 10 SUS questions in the survey. The score was graded on a scale of 0 to 100. Similarly, the overall experience rating was taken from the response to the survey question asking participants to rate their overall experience from 1 to 5. Table 1 shows the overall experience rating given by each participant and their SUS score.

The mean SUS score was 66.1 (SD=17.4). The maximum SUS score was 95, while the minimum was 45. 44.4% of participants had an SUS score greater than 68, which is the 50th percentile. Furthermore, almost 80% of the participants rated their overall experience above average. Most of the participants rated their experience as 4 and above. The average experience rating was 3.555. The highest overall experience rating was 5, while the lowest was 2. Additionally, when asked what they liked most about the system, 6 out of 9 participants stated that it was easy to use. Some of their responses include the following:

"It was easy to use and understand" (P2)

"Easy to use, mostly responsive, very intuitive" (P9)

However, 2 out of the 6 gave the caveat that the system was easy to use once they understood the commands. They stated:

"After getting used to the commands, I could see knowing how to use this with more ease. The commands are a little difficult to learn at first" (P3)

"Easy to use once you know the commands" (P4)

3.3.2 Drawbacks. One of the questions asked in the survey was "In your own words, what did you dislike most about the prototype?". The reported drawbacks can be grouped into five main themes. The first and most common problem was that speech recognition did not work properly. Almost 70% of the participants reported this. In particular, there were certain words that the system found hard to detect, e.g., "in" and "up". The second most reported issue was that voice commands were difficult to learn, especially in the given time frame. 5 out of 9 people reported this. A related issue was that some commands were confusing and could be mixed up, e.g., "edit" and "edit field". One third of the participants reported this. Another command-related issue was that the commands were not intuitive enough. Only P7 reported this. The last category of problems mentioned involves the response time of the system. Two people stated that the system took too long to respond. Due to this, they had to wait before performing the next action.

We discovered some problems with the system while observing the participants as they performed the tasks. One was that the speech recognition system was affected by poor internet connectivity. Another issue we observed was that the colors of the blocks in different menu categories were similar. Due to this, participants were sometimes confused about which menu category to go into when trying to locate a block.

3.3.3 Suggestions. When asked what they would like to change, the participants gave several suggestions. Most of the suggestions were given as solutions to challenges faced during the study. As a solution to the voice recognition system's difficulty in recognizing certain commands, P2 and P7 suggested that commands that are difficult to detect should have alternative words. An additional suggestion, an extension of this strategy, was to add a feature that allows users to modify or add the commands they want to use. Three suggestions were given about improving the learnability and comprehensibility of commands. One of them was to use commands that are easier to understand. Another was to give participants a list of commands before the study to allow them time to prepare. A third was that commands that were not distinct enough should be changed. Regarding improving the intuitiveness of the commands, P7 gave some examples of commands that need to be changed to be more intuitive. For example, they stated that "in" should be replaced by "inside". To address the system's delay in response, P1 stated that the time lag should be reduced to allow him to perform actions faster. Although not mentioned as a challenge, some participants suggested ways to improve navigation in the system. One was to add number labels to the menu categories. Another suggestion was to allow for the skipping of disabled blocks when navigating up or down a menu category.

On the basis of observation, we also decided to make some changes. We will make the color of each menu category and its blocks very distinct. In addition, we will add a section in the user interface that displays the voice commands and their respective functions. This section will help reduce the cognitive load users face when using voice to program in Voice-enabled Blockly. Finally, we will provide a highly detailed manual that can be used to teach people how to use the system.

3.4 Discussion and Future Work
The results are promising. The overall experience rating, the SUS score, and the open-ended responses show that, on average, people with ULMI found the system easy to use. However, the results also reveal that there is room for improvement. According to [15], [14], [1], the average SUS score is 68, and anything less than that is poor. In our results, only 44.4% of participants had an SUS score above 68. We aim to make changes so that most, if not all, of our users have an SUS score of 68 and above.

A huge part of why the system did not have higher SUS scores could be attributed to the steep learning curve of the commands. As mentioned above, 2 participants, P3 and P4, reported that after understanding the commands, they found the tool easier to use. Another issue that may have worsened a participant's experience was the dysfunctional voice recognition system. There may be a need to change to a more robust speech recognition system that is better at detecting speech. Although these two things affect the application, they are things that can be improved.

In future work, we will make the necessary modifications to address all of these issues. The changes we plan to make include adding alternative words for certain commands, making the commands more intuitive and less ambiguous, increasing the system's response speed, adding number labels to menu categories, changing the color of some menu categories, and adding a new section on the user interface for voice commands. We are confident that with these changes, the overall usability of this application, as well as the user experience, will improve greatly. After making these changes, we will conduct another user study. We will compare the results of the two studies to see if there is any improvement.
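The SUS scores discussed in this section follow the standard SUS scoring procedure [14]: each of the ten 1–5 Likert responses is converted to a 0–4 contribution (response minus 1 for odd-numbered items, 5 minus response for even-numbered items), and the summed total is scaled by 2.5 to the 0–100 range. A minimal sketch in Python (the participant responses shown are hypothetical, not data from this study):

```python
def sus_score(responses):
    """Standard SUS scoring: ten Likert responses (1-5) -> a 0-100 score.

    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 5 - response).
    The 0-40 total is multiplied by 2.5.
    """
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses in the range 1-5")
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

# Hypothetical participant: 4 on every positive item, 2 on every negative one.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

Under this scoring, a result of 68 corresponds to the commonly cited average score referenced in the discussion above.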
Voice-Enabled Blockly: Usability Impressions of a Speech-driven Block-based Programming System ASSETS ’22, October 23–26, 2022, Athens, Greece

REFERENCES
[1] Hadi Alathas. 2018. How to Measure Product Usability with the System Usability Scale (SUS) Score. Retrieved Apr 01, 2022 from https://uxplanet.org/how-to-measure-product-usability-with-the-system-usability-scale-sus-score-69f3875b858f
[2] David Bau, Jeff Gray, Caitlin Kelleher, Josh Sheldon, and Franklyn Turbak. 2017. Learnable Programming: Blocks and Beyond. Commun. ACM 60, 6 (May 2017), 72–80. https://doi.org/10.1145/3015455
[3] Georgios Fesakis and Kiriaki Serafeim. 2009. Influence of the Familiarization with "Scratch" on Future Teachers' Opinions and Attitudes about Programming and ICT in Education. SIGCSE Bull. 41, 3 (Jul 2009), 258–262. https://doi.org/10.1145/1595496.1562957
[4] Centers for Disease Control and Prevention. 2021. What is Cerebral Palsy? Retrieved Jan 13, 2022 from https://www.cdc.gov/ncbddd/cp/facts.html
[5] N. Humble. 2019. Developing Computational Thinking Skills in K-12 Education Through Block Programming Tools. In ICERI2019 Proceedings (Seville, Spain) (12th Annual International Conference of Education, Research and Innovation). IATED, 4865–4873. https://doi.org/10.21125/iceri.2019.1190
[6] Surendheran Kaliyaperumal. 2019. Novice programmers' attitude towards the introduction of block-based coding in Virtual Reality programming. (Jun 2019).
[7] Phoebe Lin, Jessica Van Brummelen, Galit Lukin, Randi Williams, and Cynthia Breazeal. 2020. Zhorai: Designing a Conversational Agent for Children to Explore Machine Learning Concepts. Proceedings of the AAAI Conference on Artificial Intelligence 34 (Apr 2020), 13381–13388. https://doi.org/10.1609/aaai.v34i09.7061
[8] Stephanie Ludi and Mary Spencer. 2017. Design Considerations to Increase Block-based Language Accessibility for Blind Programmers Via Blockly. Journal of Visual Languages and Sentient Systems 3 (Jul 2017), 119–124. https://doi.org/10.18293/VLSS2017-013
[9] National Institute of Neurological Disorders and Stroke. 2013. Cerebral Palsy: Hope Through Research. Retrieved Jan 13, 2022 from https://www.ninds.nih.gov/Disorders/Patient-Caregiver-Education/Hope-Through-Research/Cerebral-Palsy-Hope-Through-Research
[10] Obianuju Okafor. 2022. Helping Students with Cerebral Palsy Program via Voice-Enabled Block-Based Programming. SIGACCESS Access. Comput. 132, Article 2 (Mar 2022), 1 page. https://doi.org/10.1145/3523265.3523267
[11] Obianuju Okafor and Stephanie Ludi. 2022. Helping Students with Upper-body Motor Disabilities Program via Voice-Enabled Block-based Programming (Lecture Notes in Computer Science (LNCS)). Springer, New York, NY, USA.
[12] Lucas Rosenblatt, Patrick Carrington, Kotaro Hara, and Jeffrey Bigham. 2018. Vocal Programming for People with Upper-Body Motor Impairments. In W4A '18: Proceedings of the Internet of Accessible Things. 1–10. https://doi.org/10.1145/3192714.3192821
[13] International Neuromodulation Society. 2012. Motor Impairment. https://www.neuromodulation.com/motor-impairment Accessed on 2022-01-12.
[14] Will T. 2017. Measuring and Interpreting System Usability Scale (SUS) - UIUX Trend. Retrieved Mar 28, 2022 from https://uiuxtrend.com/measuring-system-usability-scale-sus/
[15] Usability.gov. 2013. System Usability Scale (SUS). https://www.usability.gov/how-to-and-tools/methods/system-usability-scale.html [Online; accessed 28-March-2022].
[16] Leticia Azucena Vaca-Cárdenas, Francesca Bertacchini, Assunta Tavernise, Lorella Gabriele, Antonella Valenti, Diana Elizabeth Olmedo, Pietro Pantano, and Eleonora Bilotta. 2015. Coding with Scratch: The design of an educational setting for Elementary pre-service teachers. In 2015 International Conference on Interactive Collaborative Learning (ICL). 1171–1177. https://doi.org/10.1109/ICL.2015.7318200
[17] W3Schools. 2022. JavaScript Switch Statement. https://www.w3schools.com/js/js_switch.asp [Online; accessed 14-April-2022].
[18] Amber Wagner and Jeff Gray. 2015. An Empirical Evaluation of a Vocal User Interface for Programming by Voice. Vol. 8. 47–63. https://doi.org/10.4018/IJITSA.2015070104
[19] Amber Wagner and Jeff Gray. 2017. An Empirical Evaluation of a Vocal User Interface for Programming by Voice. https://doi.org/10.4018/978-1-5225-1759-7.ch012
[20] Amber Wagner, Ramaraju Rudraraju, Srinivasa Datla, Avishek Banerjee, Mandar Sudame, and Jeff Gray. 2012. Programming by voice: a hands-free approach for motorically challenged children. (May 2012). https://doi.org/10.1145/2212776.2223757
[21] David Weintrop. 2019. Block-Based Programming in Computer Science Education. Commun. ACM 62, 8 (July 2019), 22–25. https://doi.org/10.1145/3341221
[22] David Weintrop and Uri Wilensky. 2015. To Block or Not to Block, That is the Question: Students' Perceptions of Blocks-Based Programming. In Proceedings of the 14th International Conference on Interaction Design and Children (Boston, Massachusetts) (IDC '15). Association for Computing Machinery, New York, NY, USA, 199–208. https://doi.org/10.1145/2771839.2771860
[23] Joseph B. Wiggins, Fahmid M. Fahid, Andrew Emerson, Madeline Hinckle, Andy Smith, Kristy Elizabeth Boyer, Bradford Mott, Eric Wiebe, and James Lester. 2021. Exploring Novice Programmers' Hint Requests in an Intelligent Block-Based Coding Environment. In Proceedings of the 52nd ACM Technical Symposium on Computer Science Education (Virtual Event, USA) (SIGCSE '21). Association for Computing Machinery, New York, NY, USA, 52–58. https://doi.org/10.1145/3408877.3432538
[24] James Williamson. 2017. What I've learned about motor impairment. http://simpleprimate.com/blog/motor Accessed on 2022-01-12.
[25] Zhen Xu, Albert Ritzhaupt, Fengchun Tian, and Karthikeyan Umapathy. 2019. Block-based versus text-based programming environments on novice student learning outcomes: a meta-analysis study. Computer Science Education 29 (Jan 2019), 1–28. https://doi.org/10.1080/08993408.2019.1565233
The Landscape of Accessibility Skill Set in the Software Industry Positions

Lilu Martin, marti400@wwu.edu, Western Washington University, Bellingham, Washington, USA
Catherine M. Baker, catherinebaker@creighton.edu, Creighton University, Omaha, Nebraska, USA
Kristen Shinohara, kristen.shinohara@rit.edu, Rochester Institute of Technology, Rochester, New York, USA
Yasmine N. Elglaly, elglaly@wwu.edu, Western Washington University, Bellingham, Washington, USA

ABSTRACT
In the software industry, good design often translates into good user experiences. As accessibility is a key component of usable software, there are opportunities for software professionals to include accessibility in their skill set. However, despite a push to motivate more companies to include accessibility as a desired knowledge and skill set, little is known about how much companies are seeking to recruit employees with accessibility proficiency. In this paper, we investigated the extent to which companies seek software designer and developer skills in accessibility by analyzing software job posts on LinkedIn. Our results showed that the majority of job posts did not require any accessibility skills, and that educating developers and designers about accessibility was a required qualification for many of the accessibility-focused software jobs.

CCS CONCEPTS
• Human-centered computing → Accessibility technologies.

KEYWORDS
accessibility skills, job posts, software developer, designer, tester

ACM Reference Format:
Lilu Martin, Catherine M. Baker, Kristen Shinohara, and Yasmine N. Elglaly. 2022. The Landscape of Accessibility Skill Set in the Software Industry Positions. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550389

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS ’22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550389

1 INTRODUCTION
Over the years, software companies have been blamed for producing inaccessible technology, including websites and mobile applications [17, 28]. Several researchers studied the barriers to creating accessible software [20]. Some researchers found that empathy is necessary for creating accessible software, and discussions and solutions were proposed to foster empathy [8, 27]. Tools and frameworks used in design and development were assessed to uncover whether they support accessibility [1, 15]. Practices of accessibility requirements in software development were also investigated [16]. Additionally, researchers identified how accessibility can be better taught in undergraduate education [7]. Many companies are working on improving the accessibility of their products [21]. Large software companies with enough resources have created positions for accessibility experts to facilitate the production of accessible products [6]. However, the landscape of accessibility jobs in software companies and the desired accessibility qualifications in software jobs are unknown. To cover this gap in the literature, we studied the accessibility skill set required by software companies, based on their job posts, in various software roles, e.g., developer, designer, and tester. Our goal was to determine what the software industry is seeking in new hires in terms of accessibility skills, shedding some light for tech professionals on what they need to learn with respect to accessibility.

By analyzing a little more than 5900 software job posts retrieved from LinkedIn, we found that the accessibility skill set is not in demand in any of the general software engineering roles. Accessibility best practices, without further explanation or specifics, were required in a relatively small number of the general developer, designer, and tester roles. Job posts with "accessibility" in their title required a large and diverse set of accessibility experiences that go beyond technical experience to teaching and advocacy experience. In essence, convincing and coaching the software development team to implement accessibility will fall on the shoulders of those hired in accessibility roles.

2 METHOD
We chose LinkedIn from which to collect publicly available job posts as it has the best availability of Application Programming Interfaces (APIs) for scraping. Also, when we compared the data on multiple job websites, e.g., Google Jobs, Indeed, and LinkedIn, we found that most jobs were duplicated across these sites. We searched for job types that are important in the development of accessible technology. The search terms we used were: Web Developer, UX Designer, UI Designer, Software Engineer, Software Developer, Front End Developer, UIUX Accessibility, Software Accessibility, and Accessibility Tester. The location was set to the USA. We collected the data in May-June 2021. These search terms were plugged into the API called Phantombuster, first with the
LinkedIn Search Export to get the results of the different searches, then with the LinkedIn Job Scraper to get the actual data for each of the scrapes. In total, there were 5,920 posts, after removing duplicates, collected from LinkedIn using these APIs. We considered job posts with the Job Title field containing the word "Accessibility" to be accessibility-specific jobs. Two researchers manually analyzed the Job Description field associated with the accessibility-specific jobs to identify the required accessibility qualifications. We then searched automatically (via ctrl+F) for the accessibility qualifications in the generic software job posts to check whether generic development and design roles require accessibility knowledge. We created Python scripts to calculate the frequency and percentage of these accessibility keywords.

3 RESULTS AND DISCUSSION
Our dataset comprises 5920 posts from 2061 different companies. The main industry of 1551 companies is computer software or information technology. The other companies' main industries include financial services, marketing, and media, among others. While we did not filter our search by company size, and there is a range of small companies in our dataset, most of the companies are large ones, such as Amazon, Apple, Microsoft, GitHub, Esri, Oracle, Samsung, Google, Twitter, and IBM, in no particular order. 1886 posts (31.85%) were accessibility-focused, and 4034 posts (68.14%) were not. Table 1 shows a comparison of the frequency and percentage of a sample of accessibility-related keywords in both accessibility and non-accessibility job posts. The keyword "accessibility" occurred in 66.9% of the accessibility job posts, but in only 5.68% of the non-accessibility posts. The keywords "WCAG" [26], "Accessibility Guidelines", and "Section 508" [22] occurred in 23.6% of the accessibility job posts, but in only 0.97% of the non-accessibility posts. All keywords related to assistive technology, such as "Assistive Technologies", "Screen Readers", "JAWS" [14], "NVDA" [19], "VoiceOver", "Talkback", "Zoomtext", "Dragon Naturally Speaking" [12], and "Voice Recognition", appeared in 19.7% of the accessibility posts and in 0.17% of the non-accessibility posts.

This data about the job qualifications for various software professional roles portrays the image that accessibility knowledge is generally not expected from software engineers, developers, designers, or testers.

3.1 Non Accessibility-Specific Job Posts
With the exception of the Engineer role, there were few job posts that required accessibility skills in roles that are not accessibility-specific. For the Developer role, a few posts listed understanding web accessibility and familiarity with ARIA [4] and WCAG 2.1 as part of the job qualifications. Usually, accessibility was referenced in these posts as one of the software quality criteria. For example:

"Understanding of front-end development best practices (accessibility, 508/WCAG compliance, performance optimization, SEO)."

Similarly, a few job posts for the Designer role included accessibility skills in the qualifications, such as familiarity with WCAG. Other posts mentioned accessibility in generic and broad terms. For example:

"A solid grasp of user-centered design, usability testing methodologies and accessibility best practices."

The Tester role was the one with the most frequent occurrence of accessibility qualifications in non-accessibility-specific job posts. The accessibility statements were also generic in most cases, such as knowledge of web standards and accessibility standards, and performing accessibility testing. In one post by a federal contractor, accessibility testing for Section 508 compliance was specified.

3.2 Accessibility-Specific Job Posts
The accessibility-specific roles in job posts had titles such as Accessibility Support Engineer for the engineer role, and Accessibility Designer and Product Designer - Accessibility for the designer role. The accessibility-specific tester role had a wide range of job titles such as Accessibility Tester, Accessibility/A11y/ADA/508 Trusted Tester, Quality Assurance (QA) Tester with Accessibility Testing experience, Accessibility Scrum Tester, Automation Accessibility Tester, and WCAG Accessibility Tester. We did not find any accessibility-specific job posts for the developer role. We grouped the skills and qualifications required by the accessibility-specific job posts into the following categories:

3.2.1 Accessibility Guidelines and Implementation. A solid understanding of WCAG (2.0 and 2.1 AA) was one of the most prevalent required qualifications across all accessibility-specific roles. It was rare to see posts requiring experience in WCAG Level AAA compliance. Other guidelines mentioned in the job posts were the Apple accessibility guidelines, Microsoft Accessibility Standards, Android accessibility guidelines, and inclusive design guidelines. Several job posts required familiarity with accessibility implementation techniques like proper use of Accessible Rich Internet Applications (ARIA) [4] attributes, logical order, and correct semantics. Another common qualification was localization and knowledge of multiple platform and browser accessibility practices, including iOS and Android mobile devices, and desktop operating systems: Windows, MacOS, and ChromeOS.

3.2.2 Accessibility Regulations. The majority of job posts with reference to regulations were for the Tester role, and required knowledge of Section 508 and the ADA. This is expected, as our data was limited to positions in the US. In a few instances, knowledge about more specific accessibility regulations was required, such as Section 255 of the Telecommunications Act, the Communications and Video Accessibility Act (CVAA) [11], and the European accessibility standard EN 301 549 [13]. Most of the job posts discussed knowledge of accessibility regulations in terms of compliance.

3.2.3 Assistive Technology. In Design role job posts, experience using assistive technology and designing for assistive technologies were common qualifications. In Tester role job posts, solid experience testing with assistive technology and configuring test tools and assistive devices for testing was a highly in-demand skill. Assistive technologies often cited were Zoomtext, Dragon Naturally Speaking [12], screen readers (JAWS [14], NVDA [19], Mac/iOS VoiceOver, TalkBack, Narrator), keyboard-only interaction (navigation, focus, input), magnifiers, and voice recognition products. Several job posts mentioned in the job description that being a native assistive technology user is preferred. A blind person, for example, that uses a

                              General Software Roles        Accessibility-Specific Roles
                              (4034 posts)                  (1886 posts)
Keyword                       Frequency   Percentage        Frequency   Percentage
Accessibility                 229         5.68%             1262        66.9%
WCAG                          30          0.74%             208         11%
Accessibility Guidelines      4           0.09%             126         6.7%
Section 508                   5           0.12%             111         5.9%
Localization                  14          0.3%              75          3.9%
JAWS                          1           0.02%             87          4.6%
NVDA                          1           0.02%             71          3.7%
Assistive Technologies        2           0.05%             66          3.5%
VoiceOver                     1           0.02%             42          2.2%
Talkback                      1           0.02%             31          1.6%
Zoomtext                      0           0%                30          1.6%
Screen Readers                1           0.02%             26          1.37%
Voice Recognition             0           0%                10          0.5%
Dragon Naturally Speaking     0           0%                9           0.47%
Magnifiers                    0           0%                4           0.2%
CVAA                          0           0%                1           0.05%
Table 1: The frequency and percentage of the occurrence of a sample of accessibility keywords in general and accessibility-specific job posts.
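The counts and percentages in Table 1 reduce to case-insensitive substring matching over the post descriptions, mirroring the ctrl+F-style search described in the Method section. A minimal Python sketch, not the authors' actual scripts, using hypothetical job descriptions in place of the scraped LinkedIn data:

```python
def keyword_stats(posts, keywords):
    """For each keyword, count how many posts mention it (case-insensitive
    substring match) and report that count and its percentage of all posts."""
    stats = {}
    for kw in keywords:
        hits = sum(1 for text in posts if kw.lower() in text.lower())
        stats[kw] = (hits, round(100.0 * hits / len(posts), 2))
    return stats

# Hypothetical job descriptions standing in for the scraped dataset.
posts = [
    "Front End Developer; WCAG 2.1 and ARIA experience a plus.",
    "Software Engineer, cloud services team.",
    "Accessibility Tester: JAWS, NVDA, Section 508 compliance.",
    "UX Designer familiar with accessibility guidelines.",
]
print(keyword_stats(posts, ["WCAG", "JAWS", "Accessibility"]))
# {'WCAG': (1, 25.0), 'JAWS': (1, 25.0), 'Accessibility': (2, 50.0)}
```

Substring matching keeps the procedure simple, at the cost of counting a keyword that appears only inside a longer word; a word-boundary match would be the stricter alternative.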

screen reader on a daily basis is considered a native screen reader user [18]. So, this job qualification implies that people with disabilities who use assistive technology on a daily basis will be highly considered for the job.

3.2.4 Testing Strategies and Tools. Various testing strategies were listed, mostly in the Tester role job posts, such as accessibility audits, third-party audits, testing and validating accessibility acceptance criteria, creating automation scripts for WCAG compliance, testing using assistive technology, evaluating and formulating a Voluntary Product Accessibility Template (VPAT) [23], and performing usability testing with reference to accessibility. Experience with specific testing tools was often cited in the required qualifications, such as Wave [25], Axe [5], CCA [9], Accessibility Management Platforms (AMP) [2], Accessibility Inspector, ANDI [3], Contrast Analyzer, and Color Blindness Simulator. In at least one job post, experience with document accessibility testing, e.g., PDF and Excel, was required.

3.2.5 Accessibility Certificates. Certification in accessibility was preferred in a handful of job posts in the Engineer and Tester roles. Examples are Trusted Tester, Certified Professional in Accessibility Core Competencies (CPACC) [10], 508 Certified Accessibility Tester, and Certified IAAP Web Accessibility Specialist (WAS) [24].

3.2.6 Coaching and Educating. Many Designer and Tester job posts added to the qualifications list the ability to train and coach others on various accessibility topics. These topics ranged from accessibility acceptance criteria to inclusive design principles and practices, and WCAG. The new hires are expected to provide consultative support to others currently working in the company, e.g., designers, product managers, developers, and engineers.

3.2.7 Advocacy. It was interesting to see that advocacy was a frequent job qualification in many Designer and Tester job posts. The required qualifications included the ability to promote the importance of accessibility, experience driving organizational change through advocacy, advocating for inclusive design methodologies, and helping developers think with an accessible mindset.

"Talent for building relationships and trust within teams who have been resistant to accessibility in the past."

3.2.8 Disability Knowledge and the Disability Communities. A few posts specified disability knowledge as a required qualification. This requirement was holistic and included understanding of all aspects of disability, including visual, mobility, hearing, cognitive, and speech. Several job posts required active participation in the accessibility community, empathy for people with disabilities, or experience working directly with people and communities of consumers with disabilities.

To sum up, job posts for developers, designers, and testers rarely mention accessibility in the job qualifications. When accessibility is mentioned, it is usually listed as one of the software quality criteria, e.g., performance, or in generic terms, e.g., experience in accessibility best practices. On the other hand, accessibility-specific jobs require applicants to have extensive technical accessibility knowledge, and to function as accessibility educators and advocates within the company.

Another insight from our data is that the majority of accessibility qualifications were related to accessibility testing, and little attention was given to accessibility in the earlier stages of software development, e.g., requirements and design. This points to a broader structural issue with tech job posts, as accessibility is seen as an afterthought rather than a way to restructure the system to benefit a wider variety of people.

4 CONCLUSION
We analyzed position announcements and job descriptions for accessibility skills to determine the extent to which industry companies are seeking such skills. We then distinguished general software positions that may require skills and knowledge in accessibility from accessibility specialist positions. Our results indicate that accessibility skills are rarely sought in general software roles, while professionals in accessibility positions were tasked with educating and mentoring software professionals in understanding and implementing accessibility. The expectations of the required accessibility knowledge in general roles and accessibility roles are unbalanced. We hope that this snapshot of accessibility demand in software jobs motivates companies to seek employees with accessibility skills in general roles, and motivates software professionals to

[5] Axe 2022. Axe The Standard in Accessibility Testing. Retrieved June 21, 2022 from https://www.deque.com/axe/
[6] Shiri Azenkot, Margot J Hanley, and Catherine M Baker. 2021. How Accessibility Practitioners Promote the Creation of Accessible Products in Large Companies. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1 (2021), 1–27.
[7] Catherine M Baker, Yasmine N Elglaly, and Kristen Shinohara. 2020. A systematic analysis of accessibility in computing education research. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education. 107–113.
[8] Cynthia L Bennett and Daniela K Rosner. 2019. The Promise of Empathy: Design, Disability, and Knowing the "Other". In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–13.
[9] CCA 2022. Colour Contrast Analyser. Retrieved June 21, 2022 from https://www.accessibility-developer-guide.com/setup/helper-tools/colour-contrast-analyser
[10] CPACC 2022. Certified Professional in Accessibility Core Competencies. Retrieved June 21, 2022 from https://www.accessibilityassociation.org/s/certified-professional
[11] CVAA 2022. The 21st Century Communication and Video Accessibility Act (CVAA). Retrieved June 21, 2022 from https://www.fcc.gov/consumers/guides/21st-century-communications-and-video-accessibility-act-cvaa
[12] Dragon Naturally Speaking 2022. Dragon Naturally Speaking. Retrieved June 21, 2022 from https://www.nuance.com/dragon.html
[13] EAS 2022. European Accessibility Standard EN 301 549. Retrieved June 21, 2022 from https://ec.europa.eu
[14] JAWS 2022. Job Access With Speech (JAWS). Retrieved June 21, 2022 from https://support.freedomscientific.com/products/blindness/jawsdocumentation
[15] Michael Longley and Yasmine N Elglaly. 2021. Accessibility Support in Web Frameworks. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility. 1–4.
[16] Darliane Miranda and João Araujo. 2022. Studying industry practices of accessibility requirements in agile development. In Proceedings of the 37th ACM/SIGAPP
acquire accessibility knowledge, e.g., through training. Finally, we Symposium on Applied Computing. 1309–1317.
hope that accessibility as a skill and a mindset will be covered in [17] M Nagaraju, Priyanka Chawla, and Ajay Rana. 2019. A practitioner’s approach to
assess the wcag 2.0 website accessibility challenges. In 2019 Amity International
college education, so software professionals become well-equipped Conference on Artifcial Intelligence (AICAI). IEEE, 958–966.
to create accessible software. We recognize that the Job Description [18] Native AT User 2022. Accessibility Testing by People with Disabilities. Re-
on LinkedIn job posts may not consistently refect the accurate trieved June 21, 2022 from https://www.24a11y.com/2019/accessibility-testing-
by-people-with-disabilities/
and complete information about the advertised role. In the future, [19] NVDA 2022. NonVisual Desktop Access (NVDA) Screen Reader. Retrieved June 21,
we will further investigate the process of recruiting candidates 2022 from https://www.nvaccess.org/
with accessibility skills, and whether there is specialization within [20] Rohan Patel, Pedro Breton, Catherine M Baker, Yasmine N Elglaly, and Kristen
Shinohara. 2020. Why software is not accessible: Technology professionals’
accessibility, e.g., accessibility specialists in particular disabilities. perspectives and challenges. In Extended abstracts of the 2020 CHI conference on
human factors in computing systems. 1–9.
ACKNOWLEDGMENTS [21] PEAT Report 2018. The Accessible Technology Skills Gap. The Partnership on
Employment & Accessible Technology (PEAT), report. Retrieved June 21, 2022 from
This material is based upon work supported by the United States https://www.peatworks.org/infographic-the-accessible-technology-skills-gap/
[22] Section 508 2022. Section 508 of the Rehabilitation Act. Retrieved June 21, 2022
National Science Foundation under grants #2121606, #2121428, from https://www.section508.gov/
#2121549, and Henry Luce Foundation - Clare Boothe Luce Fund. [23] VPAT 2022. Voluntary Product Accessibility Template (VPAT). Retrieved June 21,
2022 from https://www.section508.gov/sell/vpat/
[24] WAS 2022. Web Accessibility Specialist. Retrieved June 21, 2022 from https:
REFERENCES //www.accessibilityassociation.org/s/wascertifcation
[1] Abdullah Alsaeedi. 2020. Comparing web accessibility evaluation tools and [25] Wave 2022. WAVE Web Accessibility Evaluation Tool. Retrieved June 21, 2022
evaluating the accessibility of webpages: proposed frameworks. Information 11, from https://wave.webaim.org/
1 (2020), 40. [26] WCAG2.1 2018. Web Content Accessibility Guidelines (WCAG) 2.1. Retrieved
[2] AMP 2022. The Accessibility Management Platform. Retrieved June 21, 2022 from June 21, 2022 from https://www.w3.org/TR/WCAG21/
https://www.levelaccess.com/solutions/software/amp/ [27] Peter Wright and John McCarthy. 2008. Empathy and experience in HCI. In
[3] ANDI 2022. Accessibility Testing Tool. Retrieved June 21, 2022 from https: Proceedings of the SIGCHI conference on human factors in computing systems.
//www.ssa.gov/accessibility/andi 637–646.
[4] ARIA 2022. Accessible Rich Internet Applications (ARIA). Retrieved June 21, 2022 [28] Shunguo Yan and PG Ramachandran. 2019. The current status of accessibility in
from https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA mobile apps. ACM Transactions on Accessible Computing (TACCESS) 12, 1 (2019),
1–31.
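The accessibility-testing qualifications discussed above center on automated WCAG checks. As a minimal illustration of one such check (this sketch is ours, not from the paper or any cited tool; all function names are hypothetical), the WCAG 2.1 contrast-ratio test can be computed as follows:

```python
# Illustrative sketch of the WCAG 2.1 contrast-ratio check.
# Function names are our own, not from any cited testing tool.

def srgb_to_linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per WCAG 2.1."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb: tuple) -> float:
    """Relative luminance L = 0.2126 R + 0.7152 G + 0.0722 B."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a: tuple, color_b: tuple) -> float:
    """WCAG contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), in [1, 21]."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background reaches the maximum ratio of 21:1;
# WCAG 2.1 level AA requires at least 4.5:1 for normal-size text.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # prints 21.0
```

Tools such as axe and WAVE mentioned in the references automate checks of this kind, among many others, across a whole page.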
How people who are deaf, Deaf, and hard of hearing use technology in creative sound activities

Keita Ohshiro, ko89@njit.edu, New Jersey Institute of Technology, Newark, New Jersey, USA
Mark Cartwright, mark.cartwright@njit.edu, New Jersey Institute of Technology, Newark, New Jersey, USA
ABSTRACT

Creative sound activities, such as music playing and audio engineering, are said to have been democratized with the development of technology. Yet, the use of technology in creative sound activities by people who are deaf, Deaf, and hard of hearing (DHH) has been underexplored by the research community. To address this gap, we conducted an online survey with 50 DHH participants to understand their use of technology and the barriers they face in their creative sound activities. We find DHH people use four types of technology — hearing devices, sound manipulation, sound visualization, and speech-to-text — for three purposes — to improve sound perception via auditory and visual means, to avoid hearing fatigue, and to better communicate with hearing people. We also find barriers to technology: unknown availability, limited options, and limits to what technology can solve. We discuss opportunities for more inclusive design specific to DHH people's creative sound activities, as well as for facilitating access to information about technology.

CCS CONCEPTS

• Human-centered computing → Empirical studies in accessibility.

KEYWORDS

accessibility, deaf, Deaf, hard of hearing, audio engineering, music

ACM Reference Format:
Keita Ohshiro and Mark Cartwright. 2022. How people who are deaf, Deaf, and hard of hearing use technology in creative sound activities. In The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23–26, 2022, Athens, Greece. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3517428.3550396

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ASSETS '22, October 23–26, 2022, Athens, Greece
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9258-7/22/10.
https://doi.org/10.1145/3517428.3550396

1 INTRODUCTION

It is said that creative sound activities, such as music playing and audio engineering, have been democratized with the development of technology [13, 25]. Nowadays, these activities are not only limited to professionals working with high-end studio equipment but also opened to amateurs and novices using personal computers and audio production software called Digital Audio Workstations (DAWs) [29]. Those who engage in creative sound activities enjoy performing, producing, and distributing their works in audio-related fields including music, film, TV, radio, and podcasting [23].

However, these activities have not been democratized for all in the context of accessibility and disability. With a growing interest in the accessibility of audio and sound technologies [17], we have begun to see research on creative sound activities by people with disabilities, for example, by people who are blind or have low vision [42, 44]. Yet, the research community has underexplored how people who are deaf, Deaf, and hard of hearing (DHH) [8, 26] use technology in creative sound activities. In 2021, the World Health Organization reported that more than 1.5 billion people (1 in 5 people) worldwide are affected by some degree of DHH, and it estimated that this number will increase to nearly 2.5 billion people (1 in 4 people) by 2050 [40]. Given this increasing DHH population and that creative sound activities are often primarily auditory, we feel it is important to develop an understanding of the current state of accessibility by DHH people in creative sound activities.

In this paper, we aim to understand how DHH people use technology in their creative sound activities and what barriers they may face. We present our findings from an online survey with 50 responses by DHH people who engage in creative sound activities. We conclude by discussing future research directions to make technology more available and inclusive of DHH people.

2 RELATED WORK

DHH people often have difficulty perceiving sound characteristics such as pitch, loudness, timbre, and spatial information [15, 16, 48]. Some use medical hearing devices such as hearing aids (HA) or cochlear implants (CI) to improve auditory perception, yet these devices can cause additional challenges such as worsening pitch perception or increasing noise [9, 14, 27, 30–32, 43]. Issues such as tinnitus, hearing fatigue, and hearing fluctuation can also affect their quality of hearing [24, 47]. To overcome these limitations of their auditory system, DHH people often rely more on other senses such as vision and touch to experience sound [46].

Prior work has offered understanding of DHH people's experiences in creative sound activities. DHH musicians utilize visual and physical cues as well as music theory to develop musical self-efficiency [18]. Social and cultural factors affect how DHH people approach their creative sound activities as being DHH [11]. Stories by DHH individuals show their unique experiences in creative sound activities as being DHH [2, 10, 15, 19, 22, 48]. For example, Richard Einhorn shared how he was able to continue his professional audio engineering career after becoming deaf later in his life [15]. There are also organizations that support DHH people's creative sound activities [1, 34–37, 39]. For example, the Frequalise Project by Music
and the Deaf [34] demonstrated the effectiveness of technology in providing positive music learning experiences to 63 young DHH people through 26 sessions of workshops [33]. In addition, research on Accessible Digital Music Instruments (ADMI) has been exploring the design and evaluation of how technology can be used for DHH people to play music and collaborate with other DHH and hearing people [6, 17], such as Music Aid [45] and Felt Sound [7].

While this related work gives an idea of how DHH people are using technology in creative sound activities, it remains partial and sparse because it consists mainly of individual stories, the use of technology is not the primary focus, and its scope is limited to music-specific activities. Focusing on technology with a broader scope, our work extends past work to understand the current state of accessibility in creative sound activities by DHH people.

3 METHODS

We conducted an online survey targeting DHH people who engage in creative sound activities. To understand their experience with technology, we asked about their use of technology to make their activities accessible and the challenges they face in using and choosing that technology. Using snowball sampling, we recruited survey participants via individuals and online communities related to DHH, accessibility, music, and audio engineering. All participants were fluent in English and over 18 years old. The survey took approximately 20 minutes. Participation was voluntary and not compensated. The study received New Jersey Institute of Technology IRB approval.

50 participants completed the survey (Table 1). The degree of hearing loss [12] of the participants varies from mild to profound, and the time of becoming DHH varies from birth to adulthood. Their creative sound activities largely consist of music playing and audio engineering (Table 2). Music playing includes singing and playing musical instruments. Audio engineering includes recording, editing, mixing, and mastering for music, radio, film, and podcasts, as well as transcribing and captioning. We analyzed the survey data using a thematic analysis approach [4, 5, 28, 38].

4 FINDINGS

4.1 Use of Technology

We found participants use four types of technology — hearing devices, sound manipulation, sound visualization, and speech-to-text — for three purposes — to better perceive sound through auditory and visual means, to avoid hearing fatigue, and to communicate with others and understand speech.

The most commonly reported technologies are hearing devices. Many use one or a combination of medical and non-medical hearing devices such as HA, Bone-Anchored HA [3], CI, headphones, earbuds, and speakers. They use hearing devices primarily to improve auditory perception. To hear more clearly, some use them together with Bluetooth, noise-cancelling features, and neck loops (a loop of wire worn around the neck that transmits audio signals to hearing devices). However, the use of hearing devices can also induce hearing fatigue and physical and mental overload. Thus, some people choose not to wear HA and CI all the time because they become overwhelmed by unwanted effects such as distortion and muffling, especially when listening to music rather than speech. For example, P12 sometimes switches from HA to bone conduction headphones to rest his ears.

Participants also commonly use sound manipulation technology to improve auditory perception. Some use equalization to boost or cut certain frequencies. P27, an advanced amateur musician, octave-shifts the pitch with an octave pedal when writing songs with his electric guitar and bass. To hear sound from their non-DHH-side ear, many of the single-sided DHH participants use technology such as signal routing and mono-to-stereo conversion. Single-sided DHH audio engineers also use stereo flipping (swapping the left and right channels) to check stereo-field panning and phase cancellation.

Participants use visualizations both to better perceive sound and to avoid hearing fatigue. For example, participants who engage in audio engineering commonly reported using waveform (i.e., time-amplitude) displays, and a few also use spectrum (i.e., frequency-amplitude) and spectrogram (i.e., time-frequency) displays. Such visualizations allow them not only to supplement their auditory perception, but also to edit sound with their eyes so that they can rest their ears. P29, a professional audio engineer for podcasts, said "I can sometimes do edits with no headphones and a script, because I can edit sound by sight." For music playing, some use a tuner and metronome to visually ensure that pitch and tempo are correct. Also, P34 uses a piano roll to visually see the notes, and P12 uses sound waves as a metronome to see the click sound when it is difficult to hear.

With one exception, all participants who practice audio engineering reported the use of DAWs. We assume participants who reported the use of DAWs are also familiar with some of the technology mentioned above, such as equalization and waveform/spectrum/spectrogram displays, as these are basic features in DAWs. In contrast, the use of DAWs was typically not reported by participants who only play musical instruments or sing but do not practice audio engineering.

Some participants use speech-to-text technology for communication with hearing people during their activities, such as Google's Live Transcribe [21] and Otter.ai [41]. At online meetings, they also use the auto-caption feature in video conference software such as Zoom [49] and Google Hangouts [20].

No participants use technology specialized for tactile feedback. However, P42, an electric bass player who is Deaf, reported the usefulness of tactile feedback with his instrument and amplifier. He said "a five strings bass helps me feel the pulsating bass line to the backbeat... Musical amplifiers are the best instrument to feel the beat for me."

4.2 Barriers

While some participants reported their use of technology, others described barriers to technology such as unknown availability, limited options, and the limited solutions it can provide. In fact, when we asked about the technologies that make their activities more accessible, more than one-third of participants (n=19) did not specify any technologies.

Some participants reported not knowing whether technology that would make their activity accessible is available in the first place. P8, an intermediate amateur musician, expressed his frustration, saying "Don't even know what's available. Ignorance." They have difficulty finding the right technology for their hearing and activities. P4,
Table 1: Participant Demographics. (Numbers in parentheses are the number of participants)

Gender Male (26), female (19), non-binary/non-conforming (3), transgender male (2)
Age 18-24 years old (6), 25-34 (18), 35-44 (8), 45-54 (6), 55-64 (6), 65-74 (4), 75-84 (1), 85+ (1)
DHH identity Hard of hearing (25 including three single-sided), deaf (18 including six single-sided), Deaf (4), others (3)

Table 2: Participants’ Activity and Level of Experience. (beg = beginner; int = intermediate; adv = advanced)

Activity Amateur beg Amateur int Amateur adv Professional N/A Total
Music playing 4 12 8 14 2 40
Audio engineering 1 4 7 12 1 25

a professional musician and audio engineer, said "I am not aware of any technology/tools developed specifically for my problem." Meanwhile, P16 expressed reluctance to try using technology: "Since I can't hear well, I haven't even tried using most technology from a musical perspective."

Even when the challenge seems solvable or improvable with technology, the limited variety of products does not solve the challenges unique to each individual. P22, an advanced amateur musician and audio engineer, expressed a sense of resignation, saying "the idea... is so niche that it's not something product designers tend to think about or cater for." Some wish to have more inclusive products. A few use DIY solutions specific to their hearing and activity. However, that is not a feasible solution for others who are unfamiliar with technology: "Software is still a mystery to me." (P6).

Participants reported limited solutions that current technology may provide for certain tasks. P9, who became deaf after establishing his career as a professional audio engineer, said "I can't master as well as I used to and no tool will fix that." P30, who started audio engineering for podcasts after becoming deaf, said "There are definitely times when I just cannot do a specific task — creating certain kinds of sound effects or soundscapes, or I don't always catch if speech isn't as clear as it should be or if it fully 'matches' the audio from other actors, etc. It's simply not a thing I can always do."

With these barriers, technology still cannot ease participants' lack of confidence. P2 said "Most of my challenges come from lack of confidence that what I am hearing is accurate." It also creates a concern for their professional careers. P48, who works as a Digital Signal Processing (DSP) engineer, said "I feel I am held back when learning advanced DSP and filter design since I can't properly hear... (I) don't feel like I can advance super far in that field." To deal with the lack of confidence, many mentioned seeking hearing people's feedback as a workaround. P27 said "I must get a second opinion. Must." However, the availability of hearing people is a challenge. They need to be familiar with creative sound activities to provide valid feedback, and scheduling them is also not easy. Otherwise, participants had to rely on guessing. P48 said "I'm always second guessing my ears". P16 said "(I'm) trusting it sounds ok... I have no idea how I actually sound."

5 DISCUSSION

Our study found participants' current use of technology and the barriers they face in creative sound activities. Aside from HA/CI, most of the technology was not specifically designed for their specific DHH situations and activities. For future research, we recommend that researchers work closely with DHH users to identify specific challenges in their activities and design inclusive technologies. We also see an opportunity for research that facilitates a community where both hearing and DHH people who engage in creative sound activities can exchange knowledge and information about technology. That would help raise awareness of available technologies and how to use them effectively.

6 CONCLUSION

In this work, we provide early insight into DHH people's use of technology and its barriers in creative sound activities. To support these activities, we found DHH people use four types of technology — hearing devices, sound manipulation, sound visualization, and speech-to-text — for three purposes — to better perceive sound through auditory and visual means, to avoid hearing fatigue, and to communicate with others and understand speech. We found DHH people also face barriers to technology in terms of its availability, limited options, and limits to what technology can solve. In future work, we plan on conducting interviews that focus on more specific tasks of creative sound activities that DHH people find challenging.

ACKNOWLEDGMENTS

We thank all participants, organizations, and communities that contributed to this study.

REFERENCES
[1] Audiovisability. 2022. Audiovisability. Retrieved June 23, 2022 from https://www.audiovisability.com/
[2] Swann Barrat. 2020. I'm a sound technician. Losing my hearing was devastating. CBC News (24 Nov. 2020). Retrieved June 23, 2022 from https://www.cbc.ca/news/canada/british-columbia/i-m-a-sound-technician-losing-my-hearing-was-devastating-1.5813327
[3] Ricardo Ferreira Bento, Alessandra Kiesewetter, Liliane Satomi Ikari, and Rubens Brito. 2012. Bone-anchored hearing aid (BAHA): indications, functional results, and comparison with reconstructive surgery of the ear. International Archives of Otorhinolaryngology 16, 03 (2012), 400–405.
[4] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 2 (2006), 77–101.
[5] Virginia Braun and Victoria Clarke. 2013. Successful qualitative research: A practical guide for beginners. Sage.
[6] Doga Cavdir. 2022. Touch, Listen, (Re)Act: Co-designing Vibrotactile Wearable Instruments for Deaf and Hard of Hearing. In NIME 2022. PubPub.
[7] Doga Cavdir and Ge Wang. 2020. Felt sound: A shared musical experience for the deaf and hard of hearing. In Proceedings of the 20th International Conference on New Interfaces for Musical Expression (NIME-20).
[8] Anna Cavender and Richard E Ladner. 2008. Hearing impairments. In Web Accessibility. Springer, 25–35.
[9] Marshall Chasin and Frank A Russo. 2004. Hearing aids and music. Trends in Amplification 8, 2 (2004), 35–47.
[10] Wendy Cheng and Au.D. Willia Horowitz. 2021. Making Music with a Hearing Loss: Strategies and Stories (2nd ed.). AAMHL Publications, Gaithersburg, MD, USA.
[11] Warren N Churchill. 2016. Claiming musical spaces: Stories of deaf and hard-of-hearing musicians. Ph.D. Dissertation. Teachers College, Columbia University.
[12] John G Clark. 1981. Uses and abuses of hearing loss classification. Asha 23, 7 (1981), 493–500.
[13] Bill Crow. 2006. Musical creativity and the new technology. Music Education Research 8, 1 (2006), 121–130.
[14] Harvey Dillon. 2008. Hearing aids. Hodder Arnold.
[15] Richard Einhorn. 2012. Observations from a musician with hearing loss. Trends in Amplification 16, 3 (2012), 179–182.
[16] Leah Findlater, Bonnie Chinh, Dhruv Jain, Jon Froehlich, Raja Kushalnagar, and Angela Carey Lin. 2019. Deaf and hard-of-hearing individuals' preferences for wearable and mobile sound awareness technologies. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–13.
[17] Emma Frid. 2019. Accessible digital musical instruments—a review of musical interfaces in inclusive music practice. Multimodal Technologies and Interaction 3, 3 (2019), 57.
[18] Robert Fulford, Jane Ginsborg, and Juliet Goldbart. 2011. Learning not to listen: the experiences of musicians with hearing impairments. Music Education Research 13, 4 (2011), 447–464.
[30] Valerie Looi, Hugh McDermott, Colette McKay, and Louise Hickson. 2008. The effect of cochlear implantation on music perception by adults with usable pre-operative acoustic hearing. International Journal of Audiology 47, 5 (2008), 257–268.
[31] Sara MK Madsen and Brian CJ Moore. 2014. Music and hearing aids. Trends in Hearing 18 (2014), 2331216514558271.
[32] Hugh J McDermott. 2004. Music perception with cochlear implants: a review. Trends in Amplification 8, 2 (2004), 49–82.
[33] Music and the Deaf. 2016. The Frequalise Report: A project by Music and the Deaf. Retrieved June 23, 2022 from https://network.youthmusic.org.uk/file/27500/download?token=4_rxCK37
[34] Music and the Deaf. 2022. Music and the Deaf | West Yorkshire | MatD. Retrieved June 23, 2022 from https://www.matd.org.uk/
[35] Drake Music. 2022. Drake Music | Leaders in Music, Disability & Technology. Retrieved June 23, 2022 from https://www.drakemusic.org/
[36] Youth Music. 2022. Youth Music Home Page. Retrieved June 23, 2022 from https://youthmusic.org.uk/
[37] Deaf Professional Arts Network. 2022. D-PAN: Deaf Professional Artist Network. Retrieved June 23, 2022 from https://d-pan.org/
[38] Kimberly A Neuendorf. 2018. Content analysis and thematic analysis. In Advanced Research Methods for Applied Psychology. Routledge, 211–223.
[39] Association of Adult Musicians with Hearing Loss. 2022. Welcome! - Association of Adult Musicians with Hearing Loss. Retrieved June 23, 2022 from https://www.
