
Blind Leading the Sighted: Drawing Design Insights from Blind Users towards More Productivity-oriented Voice Interfaces

ALI ABDOLRAHMANI, Department of Information Systems, UMBC, 1000 Hilltop Circle, Baltimore MD
21250, USA
KEVIN M. STORER, Department of Informatics, UC Irvine, 5019 Donald Bren Hall, Irvine CA 92697, USA
ANTONY RISHIN MUKKATH ROY and RAVI KUBER, Department of Information Systems,
UMBC, 1000 Hilltop Circle, Baltimore MD 21250, USA
STACY M. BRANHAM, Department of Informatics, UC Irvine, 5019 Donald Bren Hall,
Irvine CA 92697, USA

Voice-activated personal assistants (VAPAs) are becoming smaller, cheaper, and more accurate, such that they
are now prevalent in homes (e.g., Amazon Echo, Sonos One) and on mobile devices (e.g., Google Assistant,
Apple Siri) around the world. VAPAs offer considerable potential to individuals who are blind, providing
efficiencies over gesture-based input on touchscreen devices. However, research is just beginning to reveal the
ways in which these technologies are used by people who are blind. In the first of two studies, we inter-
viewed 14 blind adults with experience of home and/or mobile-based VAPAs, surfacing myriad accessibility,
usability, and privacy issues for this community. A second study analyzing podcast content from 28 episodes
relating to blind interactions with VAPAs was then undertaken to validate and extend findings from the
first study. In addition to verifying prior findings, we learned that blind users wanted to leverage VAPAs
for more productivity-oriented tasks and increased efficiency over other interaction modalities. We conclude
that (1) VAPAs need to support a greater variety of AI personas, each specializing in a specific type of task;
(2) VAPAs need to maintain continuity of voice interaction for both usability and accessibility; and (3) blind
VAPA users, and especially blind technology podcasters, are expert voice interface users who should be in-
corporated into design processes from the beginning. We argue that when the blind lead the sighted through
voice interface design, both blind and sighted users can benefit.
CCS Concepts: • Human-centered computing → Empirical studies in accessibility; Accessibility
technologies;
Additional Key Words and Phrases: Voice-activated personal assistant, voice user interface, blind, visual im-
pairment, usability, accessibility
ACM Reference format:
Ali Abdolrahmani, Kevin M. Storer, Antony Rishin Mukkath Roy, Ravi Kuber, and Stacy M. Branham. 2019.
Blind Leading the Sighted: Drawing Design Insights from Blind Users towards More Productivity-oriented
Voice Interfaces. ACM Trans. Access. Comput. 12, 4, Article 18 (December 2019), 35 pages.
https://doi.org/10.1145/3368426

Authors’ addresses: A. Abdolrahmani, A. R. Mukkath Roy, and R. Kuber, Department of Information Systems, UMBC, 1000
Hilltop Circle, Baltimore MD 21250, USA; emails: {aliab1, antonyr1, rkuber}@umbc.edu; K. M. Storer and S. M. Branham,
Department of Informatics, UC Irvine, 5019 Donald Bren Hall, Irvine CA 92697, USA; emails: {storerk, sbranham}@uci.edu.


1 INTRODUCTION
Though once the subject of science-fiction, voice-activated personal assistants (VAPAs), like
Apple’s Siri1 and Amazon’s Alexa,2 are now used by millions of consumers at home and on the go.
VAPAs (also termed “intelligent voice agents” or “intelligent personal assistants”) are charac-
terized by their ability to respond to vocalized user requests and reply in a manner similar to a
human assistant. VAPAs’ unique voice- and audio-based interactions provide a convenient alter-
native to traditional visual and manual interactions, especially in situations where a user’s hands
or eyes are otherwise occupied, such as cooking, exercising, or even driving. In addition to their
situational convenience, VAPAs’ non-visual interaction paradigm could be particularly valuable
to blind users. As VAPAs’ functionalities continue to develop—from setting recurring reminders
and browsing the web to playing music and shopping online—their voice interactions provide an accessi-
ble modality for engaging with an increasing number of digital domains.
A recent study exploring product reviews written by Amazon Echo owners found that a sig-
nificant portion referenced VAPAs’ utility for users with visual impairments [Pradhan, Mehta, and
Findlater 2018]—suggesting that the blind community is aware of the potential of these devices
to provide an accessible mode of interaction with the often-inaccessible digital domain. However,
while researchers have explored the use and usability of VAPAs in a variety of contexts [Cowan
et al. 2017; Easwara Moorthy and Vu 2015; Efthymiou and Halvey 2016; Guy 2016; Lopatovska
et al. 2018; Luger and Sellen 2016], there remains a significant lack of research exploring the way
that blind users engage with VAPAs and the accessibility and usability issues they may face.
In this article, we describe a series of two studies aimed at understanding (1) how blind users
engage with VAPAs and (2) the usability and accessibility issues encountered by blind VAPA users.
In our first study, originally appearing at ACM ASSETS 2018 [Abdolrahmani et al. 2018], we con-
ducted semi-structured interviews with 14 legally blind VAPA users. In our second study, reported
for the first time here, we triangulated and extended results of Study 1 by performing a thematic
analysis on a corpus of 28 podcast episodes. Podcast episodes featured blind podcasters who dis-
cussed the use and functionality of VAPAs, specifically pertaining to blind users. We found that
blind users frequently used VAPAs as a tool for accomplishing productivity tasks, and they viewed
VAPAs as a valuable mode of interaction, despite several accessibility issues. Additionally, we found
that blind technology podcasters are expert VAPA users who willingly share this knowledge with
their audience—which they perceive to be primarily other blind people, but also sighted VAPA
users. We share new insights from Study 2 and revisit the results of our earlier work in light of
these findings, as they relate to the accessibility features appreciated by blind users and the us-
ability issues they identify as particularly problematic. We conclude by discussing these findings
in relation to future designs of VAPAs, and we reflect upon the appropriateness of podcasts as
a source of user-generated content for triangulating, or even perhaps priming, research studies
concerning the blind community.

2 RELATED WORK
2.1 Interaction with VAPAs
With the increasing range of voice-activated smart speakers, voice-based interfaces on mobile de-
vices, and speech recognition software, such as Vlingo,3 Maluuba,4 and Evi,5 available to consumers and

1 https://www.apple.com/siri/.
2 https://developer.amazon.com/alexa.
3 https://vlingo.en.softonic.com.
4 http://www.maluuba.com.
5 http://www.evi.com.


developers alike, the variety of activities performed using VAPAs is larger than ever [Easwara
Moorthy and Vu 2015]. Prior work suggests that VAPAs are most commonly used for simple in-
formation retrieval tasks, like determining the weather [Luger and Sellen 2016], and leisure and
entertainment purposes, such as listening to the news, playing music, and controlling external
devices [Bentley et al. 2018; Lopatovska et al. 2018]. Less commonly, and primarily within the aca-
demic domain, the use of VAPAs has been explored for supporting speech language pathologists
[Yu et al. 2018], and personal fitness tracking [Chung et al. 2018].
Despite the wide range of applications available on VAPAs, previous research has found that
their adoption is impacted by many factors, including usability [Chen and Wang 2018; Coskun-Setirek
and Mardikyan 2017; Han and Yang 2018] and privacy concerns [Chhetri and Motti
2019; Liao et al. 2019]. For instance, the accuracy of VAPAs’ speech recognition is a significant fac-
tor in their overall acceptance, and while their speech recognition features have improved, there is
high cognitive effort involved in determining how best to phrase commands [Cowan et al. 2017],
especially when a language other than English is used [Bogers et al. 2019]. Similarly, previous
work found that the “low transparency” of VAPAs’ inner workings increases the effort required
to properly phrase interactions [Chen and Wang 2018]. The location and presence of third parties
impact users’ decisions about whether and how to use VAPAs; use in public areas was
associated with feelings of embarrassment [Cowan et al. 2017] and increased caution, due to the
potential of disclosing private information [Easwara Moorthy and Vu 2015; Efthymiou and Halvey
2016]. Users view some VAPA platforms as more appropriate than others for certain types of tasks
[Luger and Sellen 2016] and express satisfaction with home-based VAPAs even when they do not
produce the information desired [Lopatovska et al. 2018]. Despite this diverse body of work ex-
ploring the use scenarios, contexts, and factors contributing to adoption and usability, there has
been a lack of work exploring VAPA usage by blind and visually impaired users.

2.2 Voice Interaction for Diverse Groups


Recent research has begun to explore the use of VAPAs by increasingly diverse users, including
people with disabilities and people whose age may present difficulties in digital interactions, such
as older adults and children [Baldauf et al. 2018; Carroll et al. 2017; Wulf et al. 2014]. For example,
Baldauf et al. [2018] explored potential VAPA use among users with mild to moderate cognitive
impairments and found that they desired more integrations of VAPAs into everyday tools that
are inaccessible. Similarly, Carroll et al. [2017] presented a conceptual design of a context-aware
VAPA, called Robin, to support independent living for users with cognitive impairments through
simple voice commands and prompts for routine tasks. Wulf et al. [2014] explored the use of Siri
by older adults; while the simplicity of voice-based interaction was appreciated, participants had
difficulty initiating speech exchanges and often unintentionally interrupted the interaction. In
a similar study of eight older adults and their caregivers, Portet et al. [2013] found that, while
VAPAs were viewed as relatively unobtrusive to caregiving, only half of their participants indicated
that interacting through voice inputs felt unnatural. Finally, Lovato and Piper [2015] investigated
children’s use of voice agents, revealing that children use voice input systems for exploration
(e.g., “Hey Siri, what’s your daddy’s name?”), information seeking (e.g., “Hey Siri, how did they
invent Angry Birds?”) and functional tasks (e.g., to send text messages). A follow-up study found
that VAPAs were an accessible tool for finding answers to (often silly) questions for pre-literate
children, who may have difficulty finding information through a screen and keyboard [Lovato,
Piper, and Wartella 2019].
While the above studies demonstrate that VAPAs are an accessible digital tool for a wide vari-
ety of users, they do not directly investigate VAPA use by people with visual impairments. There
has been relatively little academic research into the use and usability of VAPAs for blind users.


But, scholarly work in HCI and Assistive Technology (AT) has more widely explored related do-
mains, including (1) challenges faced by individuals who are blind interacting with audio-output
technologies, like screen readers, and (2) the usability of voice-input systems for blind users. We
describe findings from these domains below.
Examples of studies focusing on voice-output systems for people who are blind include work by
Lazar et al. [2007] and Murphy et al. [2008]. In a diary study of 100 blind users, Lazar et al. [2007]
investigated the usability issues of navigating the web with screen readers. They found that the
primary frustrations for blind users were caused by a lack of consideration for how users might
traverse websites with screen readers. For instance, HTML layouts produced unusual outputs on
screen readers, which read content in tag order rather than by visual arrangement, and a pervasive
lack of alt text on images. Similarly, Murphy et al. [2008] surveyed 30 screen reader users to in-
vestigate navigation strategies when browsing the web. They found that screen readers often did
not convey adequate spatial information about web page layout for users with visual impairments
to effectively browse webpages. From these findings, the authors designed a browser plug-in that
conveyed spatial information using auditory and haptic feedback.
Voice-input systems for people with vision impairments have been developed for a variety of
application areas. For example, Ran et al. [2004] used voice input to control their Drishti system
for eyes- and hands-free navigation for users with visual impairments. Bouck et al. [2011] found
that voice input helped students with visual impairments use calculators in mathematics education
settings. After voice input and talk-to-text became common on mobile phones, Azenkot and Lee
[2013] conducted a survey with blind and sighted participants to compare their use of these tech-
nologies. Overall, blind users reported higher levels of satisfaction with voice-input technologies
than their sighted peers. Voice inputs were often more efficient for blind users than touch-screen
keyboards. However, participants spent significant time (about 80% of time-on-task) editing text
that had been misunderstood by speech recognition software. To provide additional control with
voice inputs, Zhong et al. [2014] designed JustSpeak, a voice-activated mobile application that al-
lowed users to navigate Android devices non-visually. JustSpeak improved upon previous work
in this space by dynamically identifying valid voice input commands based on the current appli-
cation context, supporting sequential voice commands, and integrating with other mobile screen
readers for increased interaction flexibility. Brewer and Piper [2017] designed a voice-based blog
system for older adults who had lost their vision later in life and found that voice inputs were an
accessible way to engage meaningfully in online interactions.
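For illustration, the first two of JustSpeak's capabilities described above can be sketched in a few lines. This is a hypothetical toy model, not JustSpeak's actual design; the command sets, names, and the "then" separator are our own inventions:

```python
# Illustrative only: a toy model of context-aware voice command matching.
# Each application context exposes its own set of currently valid commands,
# and one utterance may contain several sequential commands.
CONTEXT_COMMANDS = {
    "home_screen": {"open email", "open music", "notifications"},
    "email_app": {"compose", "reply", "archive", "next message"},
}

def match_commands(utterance: str, context: str) -> list:
    """Split an utterance into sequential commands, keeping only those
    valid in the current application context."""
    valid = CONTEXT_COMMANDS.get(context, set())
    parts = [p.strip() for p in utterance.lower().split(" then ")]
    return [p for p in parts if p in valid]

# Two sequential commands, both valid in the email context:
print(match_commands("Reply then archive", "email_app"))  # ['reply', 'archive']
```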
Because VAPAs employ both voice inputs and outputs, their similarities to existing technologies
for blind users that use one or both of these interaction modalities, like screen readers, are a topic
of growing interest in HCI [Abdolrahmani et al. 2018; Branham and Mukkath Roy 2019; Vtyu-
rina et al. 2019a; Vtyurina et al. 2019b]. Screen readers, which translate visual content into voice
outputs, have long been a primary way for blind users to interact with graphical user interfaces.
Screen readers are available on most major operating systems, either natively (like VoiceOver on
MacOS and iOS), or through third parties (such as NVDA,6 JAWS,7 and Narrator8 on Windows,
or TalkBack9 on Android). In contrast to VAPAs, screen readers usually accept input through key-
boards or, in some cases, specialized gestural inputs. Despite similarities in their audio outputs,
VAPAs and screen readers are designed for different user bases. While screen readers are designed
with users with visual impairments in mind, VAPAs are marketed to a general consumer base and,
consequently, may not closely consider the needs of users with visual impairments [Branham and

6 https://www.nvaccess.org/.
7 https://www.freedomscientific.com/Products/Blindness/JAWS.
8 https://www.microsoft.com/en-us/accessibility/windows.
9 https://support.google.com/accessibility/android/answer/6283677.


Mukkath Roy 2019]. So, while VAPAs bear some superficial similarities to other assistive devices
in their interaction paradigm, they may not be fully accessible.
Recent research has found that screen readers and VAPAs have different strengths and weak-
nesses, which could be leveraged to increase the value of VAPAs for blind users. Our first study
[Abdolrahmani et al. 2018], which we revisit below, identifies limitations of VAPAs in relation to
screen readers. In particular, we found that the utility of VAPAs is hindered by an inability to ad-
just the speed of the voice and to handle complex, work-productive tasks (such as information
seeking or text editing). Branham and Mukkath Roy [2019] similarly found that many commercial
VAPA interaction guidelines mandate durations of pauses, length of utterances, speed and pitch of
voice, and degree of command complexity, which may limit utility for blind VAPA users. Recent
work by Vtyurina et al. [2019a, 2019b] explored distinctions between screen readers and VAPAs
in the context of web browsing through an online survey of 53 blind screen reader users with
experience using VAPAs. They found that VAPAs are convenient, accessible, portable, and sup-
portive of hands- and eyes-free interaction. But, they lack fine-grained control and do not provide
high-level overviews of digital content. By contrast, screen readers excel in both of these areas
by allowing customization of audio output styles (such as voice, speed, pitch, and verbosity) and
allowing quick navigation of content, by giving direct access to headings, links, lists, and tables.
They note, however, that this added flexibility increases screen readers’ complexity and learning curve.
The authors proposed a system called VERSE, which capitalized on these relative strengths of
VAPAs and screen readers. The present investigation extends this work by documenting the types
of use cases and needs that people who are blind have when interacting with VAPAs.

2.3 Extant vs. Elicited Methods in the Study of Marginalized Users


Methods that scientists use to study phenomena in the world have deep implications for what can
be known and who can know it, with special considerations needed when working with and on
behalf of marginalized users. In the fields of HCI and AT, it is common to find studies that use direct
observations [Shinohara and Tenenberg 2007], interviews or surveys [Branham et al. 2017], diary
studies [Brewer et al. 2016], field studies [Nicolau et al. 2017], and participatory design approaches
[Albouys-Perrois et al. 2018], to name a few. These are examples of “elicited” or “produced” texts,10
which entail participants “producing written data in response to a researcher’s request and thus
offer a means of generating data” [Charmaz 2006, p. 35]. There is increasing interest in studies
that rely on data such as YouTube videos [Anthony, Kim, and Findlater 2013; Lovato and Piper
2015], Amazon reviews [Pradhan, Mehta, and Findlater 2018], social media forum posts [Storer and
Branham 2019], or design guidelines [Branham and Mukkath Roy 2019]. These are examples of
“extant” or “found” texts, which entail researchers “studying ‘things’ or ‘cultural products’” that
already exist [Reinharz 1992, p. 145]. In this article, we begin with a study of elicited interviews
and end with a study of extant podcast media. We believe ours is the first study to use podcasts
as a data source for understanding technical needs of people who are blind. Like all methods, this
approach comes with advantages and disadvantages. To lay a foundation for exploring the poten-
tial of podcasts in this and future studies, we briefly overview the dimensions of elicited/produced
and extant/found texts, as presented by Charmaz [2006] and Reinharz [1992].

10 Charmaz [2006] uses terms “elicited” and “extant,” whereas Reinharz [1992] uses terms “produced” and “found” to refer
to comparable concepts. We therefore use these interchangeably in this article. One difference is that Charmaz makes a
distinction between interviews and observations on the one hand and elicited texts on the other, because the former are
interactive and not written in print by the participants themselves. Reinharz does not make such a distinction, because
interviews and observations, like surveys and other prompted writings that fit squarely into Charmaz’s notion of elicited
texts, require direct contact with the participant. We find Reinharz’s coarser-grained organization to be more intuitive and
sufficient, so in this article, we include interviews and observations in the elicited/produced category.


Perhaps the most notable difference between approaches is the “naturalistic, ‘found’ quality”
[Reinharz 1992, p. 147] of extant media. Movies, music, website ads, one’s personal journal, public
arrest records, newspaper articles, and podcasts are all examples of artifacts that are independently
produced outside a research context. While this can lend persuasive credibility to findings [Rein-
harz 1992, p. 146], Charmaz warns against misconstruing these artifacts as somehow objective or
“uncontaminated” [Charmaz 2006, p. 38]. All texts are products, with their own situated producers,
implied goals, methods of production, media format, and intended and real audiences. Therefore,
extant texts cannot be studied without being placed in context [Charmaz 2006, p. 39; Reinharz
1992, p. 145], though scholars acknowledge the context of production can sometimes be hidden or
perhaps even “unknowable” [Charmaz 2006, p. 39].
Relatedly, and another important distinction, extant texts are relatively “unobtrusive” in nature
[Reinharz 1992, p. 147]. That is, they do not require any contact with the artifact’s producers, and ar-
tifacts are not affected by the observer in the ways we expect participants might be [Reinharz 1992,
p. 147]. We believe that this quality may be particularly well suited to examining the needs of
populations with disabilities, as it is often difficult to recruit participants with disabilities to gen-
erate elicited text [Sears and Hanson 2011]. Extant texts can additionally raise the visibility of
“historically ignored” populations like women [Reinharz 1992, p. 156] and, we posit, people with
disabilities.
HCI researchers have long recognized the value of triangulation of data sources to increase
confidence [Mackay and Fayard 1997]. Similarly, Charmaz and Reinharz note that extant texts
can complement elicited texts and are therefore often used in combination [Charmaz 2006, p. 37;
Reinharz 1992, p. 148] as demonstrated by the pair of studies we present below. We will revisit
this topic in the Discussion (Section 5) to explore the affordances and tradeoffs of each approach,
particularly as regards podcasts as a unique genre of extant text when conducting research on the
blind population.

3 STUDY 1: INTERVIEW STUDY


3.1 Study Objectives
To understand the diverse views and practices of blind VAPA users, we conducted semi-structured
interviews, in which participants were asked about their interactions with VAPAs, challenges
faced, and the impact of social context on their experiences. The details of this study are described
in our previous work appearing at ACM ASSETS 2018 [Abdolrahmani et al. 2018].

3.2 Study Participants


We recruited and interviewed 14 participants (Table 1) through a combination of mailed advertise-
ments and snowball sampling. Our participants ranged from 21 to 66 years of age (mean of 31 years
of age). Nine participants identified as men, and 5 participants identified as women. Eleven partic-
ipants had visual impairments at birth, while 3 experienced vision loss later in life. At the time of
our study, all participants identified as legally blind. All participants reported using screen readers,
including VoiceOver,11 TalkBack,12 JAWS,13 and NVDA,14 as their primary tool for accessing digi-
tal technologies. Additionally, P3 occasionally used the ZoomText15 magnifier to read digital texts.

11 https://www.apple.com/accessibility/mac/vision/.
12 https://support.google.com/accessibility/android/answer/6283677?hl=en.
13 https://www.freedomscientific.com/products/software/jaws/.
14 https://www.nvaccess.org.
15 https://www.zoomtext.com/products/zoomtext-magnifierreader/.


Table 1. Participant Demographics

ID | Age | Visual Impairment Description | Mobile | Home
P1 | 27 | From birth, partial vision left eye, white cane, screen readers | Si (SD) | Ec* (FW)
P2 | 36 | Later in life, blindness at age 18, white cane, screen readers | Si (SD) | Ec** (SD)
P3 | 24 | From birth, acuity approx. 200/400, guide dog, some magnified text | Si (OD) | N/A
P4 | 37 | Later in life, blindness at age 30, white cane, screen readers | Si (SD) | Ec* (SD)
P5 | 33 | From birth, light perception in one eye, white cane, screen readers | Si (SD), GA (SD) | GH** (SD)
P6 | 25 | From birth, white cane, screen readers | Si (SD) | N/A
P7 | 35 | From birth, white cane, screen readers | Si (OD) | Ec** (SD)
P8 | 25 | From birth, both eyes removed, guide dog, screen readers | Si (FW) | Ec** (SD)
P9 | 24 | From birth, approx. 5% residual sight, guide dog, screen readers | Si (SD), GA (FW) | Ec** (SD)
P10 | 22 | Later in life, blindness at age 11, white cane, screen readers | Si (SD), GA (OD) | Ec* (SD)
P11 | 38 | From birth, white cane, screen readers | GA (SD) | Ec** (SD), GH** (SD)
P12 | 66 | From birth, white cane, screen readers | Si (FW) | Ec** (SD), GH** (SD)
P13 | 21 | From birth, blindness at age 4, white cane, screen readers | Si (SD) | Ec** (SD)
P14 | 27 | From birth, white cane, screen readers | Si (OD) | Ec* (FW)
Platform: Si=Siri, Ec=Echo, GA=Google Assistant, GH=Google Home; Use: SD=several times a day, OD=once a day, FW=a
few times a week; Ownership: *=used, **=owned.

Many participants used applications such as SeeingAI,16 TapTapSee,17 BlindSquare,18 and Nearby
Explorer19 to enhance their day-to-day activities. Eleven participants used white canes and 3 used
guide dogs for their mobility.
Our participants had a range of conditions leading to blindness. Eleven participants acquired
total or near-total blindness from birth. Other participants lost their vision progressively—like P4,
who was diagnosed with retinitis pigmentosa at seven years of age and lost their remaining sight
around age 30. Still others lost their sight instantaneously later in life—like P10, who lost their
sight in a car accident at 11 years of age. While none of the participants relied solely on functional
vision, some had light perception (e.g., P1, P3, P5), could identify colors, or read large print from a
very close distance (e.g., P1).
Because of the focus of our study, we recruited only people with visual impairments who regu-
larly used mobile or home-based VAPAs. All our participants reported using mobile VAPAs, such
as Siri or Google Assistant,20 at least once per week. Eight participants owned a home-based VAPA,
such as Amazon Echo21 or Google Home.22 Four participants had significant experience interacting
with home-based VAPAs owned by a family member or significant other. These participants each
reported using home-based VAPAs multiple times per week. Two participants reported having no
prior experience with home-based VAPAs.

3.3 Interviews and Data Analysis


Our semi-structured interview protocol was divided into three primary topical areas: (1) Interac-
tion experiences with VAPAs, (2) Accessibility of VAPAs, and (3) Privacy and security concerns.
Across each of these areas, participants were asked about how situational and social contexts

16 https://www.microsoft.com/en-us/ai/seeing-ai.
17 https://taptapseeapp.com.
18 https://www.blindsquare.com.
19 https://www.aph.org/product/nearby-explorer-full-version-on-app-store/.
20 https://assistant.google.com.
21 http://www.amazon.com/echo.
22 http://www.google.com/home.


(e.g., using VAPAs in public or private, alone or with others) influenced their views. We conducted
these interviews remotely by teleconference in 2018 (February 17–March 16). Interviews lasted
between 40 and 80 minutes (mean of 55 minutes). Each interview was audio recorded and later
transcribed verbatim by two researchers independently. Participants were compensated for their
time. The first author conducted a thematic analysis of the data, using an iterative, inductive ap-
proach to develop and identify themes and sub-themes discussed by the participants. A subset of
codes that emerged is presented below as section headings.

3.4 Interview Study Findings


3.4.1 Scenarios of Usage. Participants identified a wide range of tasks suited to VAPAs. Broadly,
these included managing time, seeking information, accessing other apps, services, and devices,
consuming media, and using VAPAs as a math and language aid. Participants often used VAPAs
in circumstances where a timely response was required (e.g., asking Siri “Where am I?” to receive
just-in-time navigational assistance), to aid in multitasking (e.g., asking Siri for a word’s synonyms
while writing in another application), and in situations where a user’s hands are occupied (e.g.,
asking Google Home to dictate a recipe while cooking). Voice interactions were frequently de-
scribed as a “time saver,” as compared to interacting with a touchscreen or keyboard, and especially
in cases that might otherwise require support from a sighted companion.

3.4.2 The Usability and Accessibility of VAPAs.

3.4.2.1 Misinterpreting Input Commands. Ten participants indicated that mobile VAPAs were
prone to misunderstanding or misinterpreting spoken commands, especially in high-noise, pub-
lic environments. This prompted many participants to lose confidence in VAPAs’ accuracy and
utility. P1, for instance, was hesitant to use Siri in public settings at all. P8 noted Siri frequently
misinterpreted him when setting reminders and editing calendar events; this led him to primar-
ily use his touchscreen for such tasks, which takes significantly longer than VAPA interactions.
Colloquial phrases and proper names were identified as particularly difficult for Siri to interpret.
The frequency of VAPAs misinterpreting commands led many participants back to less accessible
devices to verify that their inputs were correctly understood.

3.4.2.2 Inappropriate Feedback Styles. Participants noted that when VAPAs respond to com-
mands, the feedback can be too revealing or irrelevant. P1, for example, recounted an instance
where she whispered her significant other’s name to Siri, who loudly repeated it back, revealing
to her coworkers that she was making personal calls on the job. Several other participants noted
other cases, like checking the hours of operation of local businesses, where Siri provided only a
partial response to an input query. In each of these cases the VAPA’s feedback was at an inappro-
priate granularity—either too verbose or too minimal—for users’ contexts and needs.

3.4.2.3 Supporting a Variety of Apps. For blind users, VAPAs were particularly useful for pro-
viding access to third-party applications or services, many of which are not properly accessible
to screen readers. However, the range of applications currently integrated with mobile VAPAs is
limited. Additionally, VAPAs have a finite time window within which they listen for user input,
which does not scale to account for the nature of the application being used. Often this caused con-
flicts and undue time pressure, when applications expected long or sequential commands. P6, for
example, highlighted the pressure of composing emails by voice, racing to complete her thoughts
“before Siri times out.” This timeout is not readily perceptible, so VAPA users have only rough
estimates of the time available to complete a task.


3.4.2.4 Recovering from Speech Recognition Errors. Even with workarounds to verify their com-
mands were understood, recovering from speech recognition errors is difficult. For example, with
the iOS native VoiceOver screen reader enabled, Siri will read back the dictated text. But, to fix an
identified error, users have to dictate the entire message again. Two participants noted the diffi-
culty this poses, particularly in instances when recovering from errors related to more complex
voice commands, such as setting a repeating reminder or calendar appointment.
3.4.2.5 Desire to Adjust Voice Settings. Many participants expressed a desire to customize voice
output settings, based on general preferences or the needs of a specific task. Experienced screen
reader users are accustomed to interpreting synthesized speech at a fast rate. The naturalistic
speed of VAPAs frequently caused frustration for these users due to inefficiency. Additionally, five
participants thought the ability to customize voice output might increase their privacy in public, as
rapid, synthesized speech is likely less intelligible to sighted passersby than current VAPA voices.
3.4.2.6 Challenges with Voice Activation. Eight participants shared frustrating situations in
which VAPAs did not respond when the wakeword (“Hey, Siri,” “Ok, Google,” etc.) was spoken, or
they were undesirably activated when the wakeword was not spoken. Several participants opted
to disable wakewords entirely, either through settings, muting smart speaker microphones, or de-
liberately keeping their phone muffled. One participant even shared the story of a smart speaker
VAPA that had added items to their online shopping cart autonomously. While VAPAs were com-
mended for their accessible interactions, this imprecision in wakeword recognition prevented
many users from making full use of VAPAs’ potential.
3.4.2.7 Inaccessibility Due to Visual Content. While VAPAs are primarily voice-based, they of-
ten use visual cues to convey information to their users. For example, P8 encountered difficulties
determining which buttons correlated to which functions on his Amazon Echo, because they are
tactilely indistinguishable, and their functions are indicated only by a visual icon. Similarly, Echo
uses a red LED to indicate that the microphone is off, making it difficult to determine the micro-
phone’s status without sight.23 P8 tested his microphone’s status by speaking the wakeword and
awaiting a response, assuming the microphone was switched off if no response was heard—hardly
an ideal strategy when considering the wakeword imprecisions identified above. Some blind VAPA
users even believed these inequalities might lead to increased reliance on sighted assistance.
3.4.3 Comparative Advantages of Different Platforms. Participants who were familiar with mul-
tiple VAPA platforms were asked to compare and contrast their experiences with each one, espe-
cially in terms of (1) search engine performance, (2) speech recognition, and (3) text-to-speech out-
put quality. Eight participants suggested that Amazon Echo and Google Assistant outperformed
Siri in processing speech input, understanding complex commands, remembering the context of
a conversation, and generating appropriate responses. Other VAPA users considered the repu-
tation of the parent company when evaluating a VAPA’s performance. For example, P8 supposed
that, because Google Search provides extensive and relevant search results, Google Assistant likely
provides more appropriate responses to users’ queries. Similarly, P5 chose to purchase a Google
Home because of his previous experiences with Google products and general esteem for Google as
an organization. Still other VAPA users considered whether a VAPA was integrated with external
applications. P2 noted the flexibility of controlling external devices, such as smart lights, through

23 As of the submission of this article in August 2019, this feature has changed. Now, when the microphone button is pressed
to turn Echo’s mic on and off, two respective low-tone audio alerts are played. However, the difference between the tones
is not distinctive enough for the first author (who is a blind screen reader user) to remember which sound means the mic
is on and which sound means the mic is off.


either Siri or her Amazon Echo. Some users valued VAPAs that were native to the device, including
P9 and P10, who both use Siri as their primary VAPA platform because she is native to iOS, despite
noting that Google Assistant (a third-party application) offers a superior experience.
3.4.4 Using Mobile VAPAs in Public. Seven participants indicated a preference not to interact
with mobile-based VAPAs in certain public settings, because of concerns about inadvertently dis-
closing private information, drawing undue attention, or discomfort that strangers may overhear
their phone activities. In public settings, many participants used external headsets to interact with
mobile-based VAPAs, which also increased speech-recognition accuracy in high-noise environ-
ments. Perhaps more importantly, several participants noted using an external headset allowed
them to keep their mobile phone inside their pocket or handbag, which (1) keeps the device out of
view from passersby who may be shoulder-surfing, (2) reduces the risk of the device being stolen,
and (3) frees their hands to use other assistive devices.
3.4.4.1 Social Awkwardness and Disruption to Communication. Nine participants identified the
use of VAPAs in public settings as leading to awkward social encounters or disrupting other com-
munications; in situations where silence is customary, interacting with a voice agent may be inap-
propriate. Some participants noted concerns that, if passersby could not see the mobile device, it
may appear as though they are talking to themselves. Using VAPAs, too, was perceived as disrup-
tive to in-person conversations. Though phone usage is increasingly common for sighted people
during conversations, P6 noted “I wouldn’t like to continue extended text conversations with Siri
[in front of my friends]. It’s awkward.”
3.4.5 Using VAPAs at Home. Participants generally viewed their homes as a safe space for in-
teracting with VAPAs freely, without headsets. P14 even noted that she increases the volume of
Siri’s output at home without worrying about disturbing others or experiencing social discomfort.
Participants used their home-based VAPAs for a wide range of tasks, including light and climate
control (termed: “automation tasks” [Bentley et al. 2018]). For many participants, using VAPAs to
control other devices in the home increased their sense of independence. P10 even suggested that
voice-based interactions represent an important equalization in abilities of blind and sighted indi-
viduals. When using his VAPA to play games, he explained, “as there’s no screen, we [blind and sighted gamers]
are all able to interact on the same level when playing games.”
3.4.6 Privacy Concerns. When using VAPAs at home, participants did not experience any of
the same privacy concerns present when using VAPAs in public. In fact, several participants high-
lighted the utility of VAPAs for sharing information, like listening to personal reminders or no-
tifications with close friends. While participants expressed awareness of news stories detailing
privacy concerns involved with VAPAs’ always-on listening in home settings, they were generally
willing to accept the risk of data misuse in exchange for the ability to perform otherwise inaccessible tasks.
Several participants pointed to the reality that their personal data is already vulnerable—on other
common platforms like Gmail and iCloud—as justification to use VAPAs, despite their privacy risks.

3.5 Study 1 Summary


Through Study 1, we found that blind users appreciated the accessible interaction mechanisms of
VAPAs; they found it was a great “time saver” to use a singular, voice-only interface to access aids
for everything from setting timers, to making calculations, to creating calendar events, to sending
and receiving emails and text messages. However, we also identified opportunities for these devices
to be made more accessible and usable for blind users; they should especially improve their ability
to personalize the conversational interactions and to recover from speech recognition errors—all
without having to change interaction modalities to a screen reader.


At the conclusion of our study, we were satisfied that we had documented many legitimate ac-
cessibility and usability insights of blind users. However, we were not sure of the extent to which
the concerns of our 14 participants would be shared by the blind community more generally.
Rather than continuing to run studies in the time-consuming interview format or pursuing a sur-
vey study that would certainly reach more participants but may stifle open-ended sharing, we
decided to try something a bit different. We chose to run a content analysis on extant podcast
data—a format that the authors’ personal experiences with podcasts suggested would be abundant
and inclusive of blind, technology-oriented podcast hosts. For reasons described in our literature
review as well as our Discussion, we anticipated such data would be thematically complementary
to our interview data, representative of concerns of the larger blind community, and much easier
to collect (though not to analyze). In the following sections, we describe the design and findings
of our second study.

4 STUDY 2: PODCAST STUDY


4.1 Study Objectives
To triangulate and extend the findings of our initial interview study, we analyzed extant data col-
lected from podcasts, in which blind users discuss and review VAPAs. Using freely available data,
like podcasts, allowed us to identify usability issues and concerns with VAPAs, as described by
blind users for an audience of potential blind VAPA users. In doing so, we were able to understand
whether the themes identified in our interview study were consistent with themes identified by
VAPA users outside of a research setting. We found podcasts to be a more appropriate medium
for obtaining user-produced data than other media used in similar studies, like YouTube videos
[Anthony, Kim, and Findlater 2013; Lovato and Piper 2015], because of their accessibility for peo-
ple with visual impairments. Ultimately, because of the unique experience and audience of blind
tech podcasters, the podcast study revealed a significant number of unanticipated new insights, a
phenomenon that we describe in more detail in the Discussion.

4.2 Methodology
4.2.1 Podcast Collection Methods. To identify relevant podcasts, we adopted a systematic ap-
proach similar to Anthony, Kim, and Findlater [2013], in which two researchers categorized rele-
vant search keywords into two enumerated lists: (1) A list of terms describing vision loss (“blind,”
“visually impaired,” “low vision”) and (2) A list of terms pertaining to specific VAPA platforms
and voice interaction (“Siri,” “HomePod,”24 “Google Home,” “Google Assistant,” “Amazon Alexa,”
“Amazon Echo,” “Cortana”25 ). Then, we developed a list of search keywords, using all permuta-
tions of words from both lists (“blind Siri,” “visually impaired Siri,” “blind Google Home,” etc.),
resulting in 21 unique queries. Using these 21 queries, we searched the popular podcast hosting
platform Stitcher26 for all related podcasts during the first week of August 2018. It is common for
podcasters to release podcast episodes on multiple platforms, so it is reasonable to expect that
the results returned by querying Stitcher are representative of performing these same searches on
other podcast hosting sites (e.g., Apple Podcast,27 Downcast,28 Spotify29 ).
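As a concrete sketch of this query-generation step (the variable names are ours, and for simplicity every term is quoted, whereas the actual queries quoted only multi-word terms):

```python
from itertools import product

# The two term lists described above.
vision_terms = ["blind", "visually impaired", "low vision"]
vapa_terms = ["Siri", "HomePod", "Google Home", "Google Assistant",
              "Amazon Alexa", "Amazon Echo", "Cortana"]

# All pairwise combinations (3 x 7 = 21 unique queries), each scoped to
# Stitcher via Google's site: operator.
queries = [f'site:Stitcher.com "{v}" "{p}"'
           for v, p in product(vision_terms, vapa_terms)]

assert len(queries) == 21
print(queries[0])  # site:Stitcher.com "blind" "Siri"
```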
We performed three rounds of filtering to determine relevant podcast episodes for this study.
First, we searched the Stitcher website for all 21 unique combinations of terms from both lists

24 https://www.apple.com/homepod/.
25 https://www.microsoft.com/en-us/cortana.
26 https://www.stitcher.com.
27 https://www.apple.com/itunes/podcasts/.
28 http://www.downcastapp.com.
29 https://www.spotify.com/us/.


using the Google search engine. Examples of formulated search queries include: (1) site:Stitcher.com
Blind “Amazon Echo”, (2) site:Stitcher.com “Visually Impaired” “Google Home”, and
(3) site:Stitcher.com “Low Vision” Siri. Broader terms, such as “Google Home,” were specifically
chosen to render results that would also include other devices from the same family (e.g., Google
Home Mini). We collected 74 results. In addition, six podcast episodes that did not appear in the
search were identified by the last author from among her personal podcast subscriptions. This
brought the total to 80 episodes. We reviewed each episode title and its description to check that
all the corresponding search terms were present, and the terms appeared to have the correct con-
textual meaning (e.g., “blind” is used to indicate a visual impairment as opposed to a window
furnishing). This process eliminated 13 episodes, reducing the pool to 67. We then cataloged the
following: (1) Search term(s) used to retrieve the episode, (2) URL, (3) Podcast title, (4) Episode title,
(5) Description of the episode (if available), (6) Duration of the episode, and (7) Date the episode aired.
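These seven fields map naturally onto a simple record; the following is a minimal sketch, with class and field names of our own choosing:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class EpisodeRecord:
    """One cataloged podcast episode from the first round of filtering."""
    search_terms: List[str]     # search term(s) used to retrieve the episode
    url: str
    podcast_title: str
    episode_title: str
    description: Optional[str]  # episode description, if available
    duration_minutes: int
    air_date: str               # date the episode aired
```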
In the second round of filtering, we examined all 67 episodes to identify duplicates and those that
did not have any associated audio file. We found that some podcast episodes were retrieved as the
result of multiple queries. This was typically the case when a single episode discussed more than
one VAPA. Thirty-four total episodes were unique results for which we found associated audio
files.
Last, the first author listened to each episode in full and assessed whether the audio content
was relevant to the intersection of vision loss and VAPA interaction: accessibility and usability
challenges, set-up experiences, and usage scenarios for day-to-day activities. Out of 34 podcast
episodes, 28 were found to be relevant. The final
corpus of episodes is listed in Table 2. The corpus can be accessed at
https://www.ics.uci.edu/~sbranham/data/podcasts.zip.
4.2.2 Description of Podcasts and Podcasters. Using the filtering process described above, we
compiled a corpus of 28 podcast episodes from nine unique podcast series. Podcasts were pro-
duced in a range of westernized countries: United States (four), Canada (four), and New Zealand
(one). Of these nine series, eight indicated that they are produced for an intended audience of blind
and visually impaired listeners. The one podcast series targeting a more general audience (Alexa
in Canada30 ) focused on topics directly related to the use of Amazon Alexa. From this series, we
found and analyzed one relevant podcast episode, which featured an interview with a blind Ama-
zon Echo user about use and usability of that platform. The guest (who was also interviewed in
E27) was a blind AT instructor for the Royal National Institute for the Blind31 and a founding
member of AbilityNet,32 a nonprofit group specializing in user testing with people with disabili-
ties. Seven of the podcast series primarily covered AT and accessibility news for people who are
blind and reviews of digital tools in these spaces. One of these seven series (Blind Bargains Au-
dio33 ) is associated with the AT Guys website,34 which offers accessible products for blind and
visually impaired users. Two of these seven podcast series additionally included content related to
life as a blind or visually impaired person (strategies for finding employment, navigating college
life, and success stories, for example). Finally, the eighth podcast was oriented toward exploring
blind lifestyles and imparting general blindness skills, whether or not they involve technology.
Each of the podcast series in our final corpus, except for one, included biographical information
about the podcasts’ host(s) in either their description or as part of the podcast series’ content. We

30 https://alexaincanada.ca.
31 https://www.rnib.org.uk.
32 https://abilitynet.org.uk.
33 https://www.blindbargains.com/audio/.
34 https://www.atguys.com/store/.


Table 2. Podcast Episodes Analyzed for This Study, Dates They Aired, and Total Duration in Minutes of
the Segment(s) of Episodes that Were Deemed Relevant and Transcribed for Analysis

ID | Podcast Title | Episode Title | Air Date | Min.
E01 | Alexa in Canada | Accessibility and Alexa with Robin Christopherson | 4/3/18 | 31
E02 | All Cool Blind Tech Shows35 | A demonstration of the Google Home. . . | 4/7/17 | 10
E03 | All Cool Blind Tech Shows | Enhanced Siri on iOS 11 Beta | 6/9/17 | 7
E04 | All Cool Blind Tech Shows | Keeping it clean | 8/4/17 | 21
E05 | All Cool Blind Tech Shows | We are taking you to the movies | 9/9/17 | 10
E06 | All Cool Blind Tech Shows | How to unbutton the Google Assistant button | 11/7/17 | 7
E07 | All Cool Blind Tech Shows | Google Home Mini for the first time | 12/30/17 | 27
E08 | AT Banter Podcast | The Marrakesh Treaty | 6/15/18 | 5
E09 | Blind Abilities Podcast36 | Echo, Dot, Siri, Home Pod and possibly, the Bryan? | 9/4/17 | 19
E10 | Blind Abilities Podcast | And you thought just apples were coming out! | 9/28/17 | 15
E11 | Blind Abilities Podcast | SeeingAI update, Echo wakes up singing. . . | 12/23/17 | 21
E12 | Blind Bargains Audio | Alexa and the muffin man | 3/4/16 | 41
E13 | Blind Bargains Audio | Amazon reaffirms its commitment to accessibility | 3/25/16 | 11
E14 | Blind Bargains Audio | Echoes of herbs and spices | 8/29/16 | 11
E15 | Blind Bargains Audio | Merry Christmas Bill Cullen | 12/22/17 | 5
E16 | Blind Bargains Audio | Oreo praline puff pastry | 5/12/18 | 11
E17 | Blind Bargains Audio | Unlimited stick | 6/22/18 | 13
E18 | Blind Bargains Audio | AT guys convention showcase | 6/30/18 | 32
E19 | Blind Bargains Audio | What’s new for Google Home and Google Assistant | 7/3/18 | 16
E20 | Double Tap Canada37 | Aira comes to Canada & Acer teams up w/ Alexa | 5/24/18 | 10
E21 | Double Tap Canada | Apple HomePod | 2/22/18 | 12
E22 | iAccessVO38 | You now have a voice on iAccessVO... | 5/20/16 | 3
E23 | iAccessVO | Meet Randy Meyer/is Siri putting you off. . . | 1/27/17 | 3
E24 | iAccessVO | Reverb app, allows you to interact with Alexa. . . | 2/24/17 | 3
E25 | iAccessVO | Visually impaired attorney, Will Schell from. . . | 5/26/17 | 7
E26 | Life After Sight Loss Radio39 | 20 things that impact my life as a VI person | 5/23/18 | 7
E27 | The Blind Side Podcast40 | Robin Christopherson from the Dot to Dot. . . | 1/31/18 | 12
E28 | The Blind Side Podcast | Beginning a smart home project,. . . | 6/19/18 | 11

also gathered biographical information from other internet pages such as personal websites and
interviews with other news outlets. Fifteen hosts identified as men, and one series was produced
and hosted by a team of an unreported size and gender distribution. Twelve podcasters identified
themselves as blind, and three did not describe their visual acuity. The podcast produced by a
team indicated that each of their team members identified as visually impaired. We expected that,
by virtue of the podcasts’ topics, podcast hosts in our sample would be highly tech-savvy rela-
tive to the general population. Indeed, beyond producing accessibility- and technology-oriented
media, nine hosts worked professionally in the tech sector, and nine were involved in disability

35 https://www.stitcher.com/podcast/cool-blind-tech.
36 https://www.stitcher.com/podcast/blind-abilities.
37 https://www.stitcher.com/podcast/amiaudio/double-tap-canada.
38 https://www.stitcher.com/podcast/accessquest/iaccessvo-making-it-happen-with-voiceover-accessibility.
39 https://www.stitcher.com/podcast/derek-daniel/life-after-sight-loss-radio.
40 https://www.stitcher.com/podcast/jonathan-mosen/the-blind-side-podcast/e/50826319.


advocacy either personally or professionally. Additionally, the team of unknown size and gender
identified each of their members as technology experts. However, it is worth noting that while
podcast hosts in our sample were highly tech-savvy, their insights were most often directed at a
listening audience of blind users with unknown technical self-efficacy.
Of the 28 episodes in our dataset, only 7 episodes were hosted by an individual and the other
21 had multiple hosts. Sixteen episodes contained interviews or conversations with guests. Guests
in 10 episodes identified as blind, and 13 guests indicated some technology expertise (e.g., profes-
sional affiliation with technology companies). Twenty-five of the episodes were recorded entirely
in a studio, 2 episodes were recorded in public events such as commercial electronic shows or
gatherings of blind-advocacy groups, and 1 episode took place in both settings. Nineteen episodes
covered multiple topics (e.g., reviewing a device and interviewing a tech expert), and 9 focused
on only one topic. Twenty-five episodes discussed more than one technology or device. Eleven
episodes primarily provided tutorial information about one or more VAPAs (e.g., setting up Google
Home or installing Amazon Echo).
4.2.3 Podcast Analysis Methods. During the close listening of each podcast described in Sec-
tion 4.2.1, the first author determined which segments of the episodes needed to be transcribed
verbatim for analysis. In total, over six hours of audio were transcribed. All segments were then
thematically coded by the first author. To understand the extent to which
the themes of these podcasts corroborated and extended the findings of our interview study, we
initially coded Study 2 transcripts using the same codes described in Study 1. Where our prelim-
inary codes did not apply, new codes were developed. After writing a first draft of this article,
the first, second, and fifth authors iterated on the coding schema for themes that were unique
to Study 2. This involved gathering all instances of each theme and writing qualitative memos
[Charmaz 2006]. Some themes were deemed too thin (i.e., represented by too few instances or
lacking sufficient clarity for nuanced interpretation) to warrant inclusion in the final manuscript,
one axial code was merged into the others, and code labels were sometimes adjusted to better fit
the narratives emerging from memos.
We share findings of how this study of extant podcast data supports and builds upon Study 1
in the following section. Because the features of commercial VAPAs are in a nearly constant state
of change, when excerpts from podcasts are referenced, we report both the episode ID and the
year in which the episode aired. We also indicate instances where a problematic feature has been
rectified since the episode’s air date both here and in the Discussion.

4.3 Podcast Study Findings


Here, we share the results of our analysis of 28 podcasts, hosted by blind podcasters, discussing and
reviewing VAPAs. First, we share our findings, as they corroborate the findings of the interview
study described in Study 1. Second, we discuss new findings from this study, as they extend the
findings discussed in Study 1.
4.3.1 Replicating Interview Study Findings. The majority of the open codes derived in the inter-
view study described in Study 1 were identified in our additional podcast study (Table 3). Of the 23
open codes from Study 1, only 7 were not identified in the podcast data: input command timeout,
challenges in recovering from speech recognition errors, social awkwardness, communication dis-
ruption, when others are around in public, when others are around at home, and accessing personal
data by VAPA. This suggests that the usability concerns discussed in Study 1 are also identified by
blind VAPA users outside of a research setting, strengthening and supporting our original findings.
In addition to reinforcing results from Study 1, several new themes emerged that were unique
to our podcast study. Podcasters revealed new previously undocumented accessibility challenges.


Table 3. Open Codes from Study 1 and Example Quotes from Podcasts
(each open code is followed by its number of coded instances in parentheses)

Misinterpreting input commands (13). E27, 2018: ". . . people in Scotland here in the UK have got a very strong accent in some places and they're finding it difficult. . . to be understood [by Alexa]."

Inappropriate feedback styles (17). E01, 2018: "I've done a web search [with Siri] for that and [results are displayed] on screen. . . That's really not serving the user well."

Supporting a variety of apps (6). E21, 2018: ". . . [Apple HomePod has] no support for anything else, such as Spotify, natively. . . I wouldn't buy one. It's just too locked in for me."

Desire to adjust voice settings (3). E12, 2016: "I don't think you can enjoy the speed [while reading Kindle books on Amazon Alexa]. Can you control the speed? . . . No."

Challenges with voice activation (29). E04, 2017: "[To control the volume on HomePod] you're probably going to say . . . 'Hey Siri volume down.' You're probably not. . . going to use [the touchscreen]. (Siri accidentally activated in the background says: 'You don't seem to be playing anything right now') Oh, shut up!"

Inaccessibility due to visual content (13). E12, 2016: "However, there is a light ring [on Echo] that. . . there's lights that light up indicating different things, which—it may freak you out."

Comparison of intelligence (5). E09, 2017: "When [Siri] first came out, it was so great. . . The quality has gone downhill. I don't really feel that [Apple has] advanced it as much as they should have."

Comparison of platforms (37). E06, 2017: "I've used the Amazon Echo, and it's about 75% of the market, but Google's coming on strong with their devices. . . ."

Comparison of command complexity (13). E16, 2018: "[Unlike Google, which lets you ask two-part questions], Amazon seems to be very one-on-one. You know, the question leads to an answer and then you have a follow-up. But, you're literally starting another question."

Comparison of multi-platform service accessibility (28). E27, 2018: "Alexa's now working with Sonos [One],41 which is a very big deal for a lot of blind people. . . they're also going to be supporting Google Assistant a bit later this year. So with that one speaker you will be able to talk to both of these virtual assistants."

Convenience of service access, e.g., placing orders (24). E01, 2018: ". . . to open up the phone to go into the calendar app and to see what appointments I've got that day would take several seconds, but to just ask [Siri] 'what are my appointments today, what's on tomorrow, what am I doing on Sunday afternoon?' That's. . . really, really easy."

Smart speakers and home control (16). E17, 2018: "If you have. . . a Google Home–enabled washer and dryer or oven, you can check the time on that. Or, the Nest thermostat, set the temperature. . . . So that's really cool."

Access to otherwise inaccessible interfaces (14). E19, 2018: "I also have this thing called Nest Hello doorbell. You actually can assign names to faces that it recognizes. . . . it will automatically say 'Chris is at the door,' and it answers it on your Google Home."

Multiple users, customized access to the shared VAPA (2). E26, 2018: "The nice thing about the Google Home is you can have multi users, so it will recognize my voice and play music from my stuff. Then it will recognize my wife's voice and play music from her stuff or add to her shopping list... So it's been really cool."

Privacy-convenience tradeoff is acceptable (6). E01, 2018: "Rightfully so, some people are concerned. I, personally, I just think the benefits of [VAPA] technology—and it's going to become more and more—so far outweighs any of the risk."

Auto listening mode and privacy (6). E27, 2018: "The value of [auto listening mode] is that they're always listening, but also for some people the scariness of them is that they're always listening."

And, because blind podcasters often use podcasts as a public forum to speak more directly with
VAPA designers, our analysis revealed features blind users expect in future VAPAs, which cur-
rent features are particularly appreciated, and what they wish VAPA designers knew about blind
users more generally. Here, we begin by sharing our new findings regarding the usability and

41 https://www.sonos.com.


Table 4. Recommendations for Rectifying Accessibility Issues

Misinterpreting input commands:
• Allow users to train VAPAs how to interpret problematic words.
• Identify repeated unsuccessful commands and seek clarification.

Inappropriate feedback styles:
• Separate volume settings for mobile devices and mobile VAPAs to increase output privacy in public settings.
• Mirror the user's input voice level in responses (e.g., whispering).
• Allow users to customize the verbosity of responses.

Supporting a variety of apps:
• Increase standards for independent mobile application developers to provide maximal integration with VAPAs.

Input command timeout:
• Allow users to extend the VAPA's time-out window for complex commands.

Recovering from errors:
• Support text navigation and editing commands (e.g., "Spell that word for me," "Change the last word to X") and gestures (e.g., swipe to navigate the "cursor" back, double tap to edit).

Desire to adjust voice settings:
• Allow adjustments to output style, like speed and pitch, for different tasks (reading texts vs. information retrieval) or contexts.
• Provide a voice- and gesture-based personalization interface.

Challenges with voice activation:
• Provide configurable wake-sensitivity levels or allow users to set custom wakewords, which may be easier for them to articulate.

Inaccessibility due to visual content:
• Provide auditory options for all outputs and indicators that are displayed visually.

Recalling previously installed skills:
• Provide a mechanism for non-visually browsing enabled skills by type or category (e.g., "What trivia games do I have?"), with optional browsing using swipe gestures.

Learning physical layouts:
• Provide auditory walkthroughs describing the device components.

Companion apps may be inaccessible:
• Ensure that companion apps are compliant with screen reader navigability standards.

VAPAs may conflict with assistive aids:
• Consider that users may be acting in contexts with assistive aids and ensure VAPAs' features are accessible in their presence.

Updates break accessibility:
• Integrate accessibility checks into automated testing and explicitly report results to users in release notes.

accessibility of VAPAs for blind users. Then, we highlight the insights blind podcasters wish to
share with technologists to improve VAPAs’ utility for, and designers’ understanding of, blind
users.
4.3.2 Accessibility Insights from Blind Podcasters. Understanding usability and accessibility
challenges was a primary line of inquiry in our first study. The thematic analysis performed in
Study 2 revealed a significant number (16 of 23) of the themes identified in Study 1. In addition,
podcasts revealed usability and accessibility issues that did not emerge in our interviews. We dis-
cuss these here and summarize them in Table 4.
4.3.2.1 Recalling Previously Installed Skills Is Challenging. The ability to recall VAPA commands
that are enabled through “skills,” or extended voice command sets and capabilities offered through
third-party apps, was identified as a significant challenge for blind users in two podcast episodes.
One podcaster noted that, for many skills, they “loved them that first time, and then [they] can

never remember what the key words are to launch [the feature again]!”—(E09, 2017). But, the
problem is not just a matter of recalling command wording; it can be challenging to remember
that the skill itself is installed and at the ready. Another podcaster suggested this was an issue
once more than 10 skills are installed:
“You know, I like my Amazon Echo, don’t get me wrong. But, half the time I forget
which skills I have enabled . . . I look at my Amazon Echo app and go, ‘Oh, right,
I enabled that one . . . I can listen to my old-time radio.’ You know, [once I have]
more than 10 skills installed on there, yeah, I tend to forget which ones I have
set off . . . You have to remember the skill and how to phrase the question [to use
it].”—(E08, 2018)
Of course, “recall” issues are not unique to VAPAs, nor are they unique to blind users. However,
for blind VAPA users, it is significant that recognizing enabled functionalities requires changing
interaction modalities. In the previous example from (E08, 2018), for instance, the podcaster notes
the need to shift from speaking to the non-visual voice assistant, to searching through a visual list
of enabled features in the companion app with their mobile device’s screen reader. This was also
a concern for the podcaster from E09, who offered a voice-based solution:
“They need to have [a feature] where you can ask it, “What’s my skills?” or, [search
by topic area, like] “What’s my trivia skills?” and it can tell you. So, it’s just quick
and easy, because I can’t stand having to go into the app [when I could use my
voice].”—(E09, 2017)
While individual features can be used in a way that is accessible to blind VAPA users, discovering
those features may require turning back to other devices or menus, which may or may not
be accessible. At present, using voice inputs to inquire about a device's features often yields
only one or two example commands, intended to acquaint new users with a VAPA's functionality.
Similarly, using voice commands to inquire about enabled skills prompts many VAPAs to read the
names and descriptions of skills in a linear fashion, which is time-consuming. But, currently there
is no method for quickly browsing enabled features and their corresponding commands directly
from the VAPA itself.
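To make this design opportunity concrete, we offer a brief illustrative sketch in Python. The SkillRegistry class, its category tags, and its answer phrasing are hypothetical constructs of our own, not an existing VAPA API; the sketch simply shows how a category index over enabled skills could answer queries like "What's my trivia skills?" entirely by voice:

# Hypothetical sketch of a voice-browsable skill registry (not an existing VAPA API).
class SkillRegistry:
    """Tracks enabled skills and answers spoken category queries."""

    def __init__(self):
        # Maps skill name -> set of category tags supplied when the skill is enabled.
        self._skills = {}

    def enable(self, name, categories):
        self._skills[name] = set(categories)

    def matching(self, category=None):
        """Return enabled skill names, optionally filtered by category."""
        if category is None:
            return sorted(self._skills)
        return sorted(s for s, tags in self._skills.items() if category in tags)

    def answer(self, category=None):
        """Compose a short spoken response instead of deferring to the companion app."""
        matches = self.matching(category)
        if not matches:
            return "You have no matching skills enabled."
        label = f"{category} skills" if category else "skills"
        return f"You have {len(matches)} {label}: {', '.join(matches)}."

registry = SkillRegistry()
registry.enable("Question of the Day", ["trivia"])
registry.enable("Song Quiz", ["trivia", "music"])
registry.enable("Old-Time Radio", ["entertainment"])

# "What's my trivia skills?" resolves to a category query, answered aloud:
print(registry.answer("trivia"))
# -> You have 2 trivia skills: Question of the Day, Song Quiz.

Because such a registry composes a short spoken summary rather than deferring to the companion app, the user would never have to leave the voice modality to recall what is installed.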
4.3.2.2 There Is No Way to Learn the Physical Layout. In four podcasts, blind VAPA users indi-
cated difficulty understanding the physical layout of VAPAs’ hardware components. Many VAPAs
in the form of smart speakers, like Amazon Echo, have hardware components with a low profile,
sitting nearly flat to the device’s surface. So, there are limited tactile cues that could scaffold blind
users’ initial understanding of these devices. One podcast guest had no idea the Echo device had
any buttons on it. Upon learning from the host that Alexa can be awoken by a button rather than
a spoken command, they interjected, “What are the buttons? What do you mean you pressed the
button?!”—(E11, 2017). This same issue was identified in determining the spatial layout of buttons
on peripheral devices, like Amazon’s Alexa Voice Remote, which can be used to interact with Echo
devices at a distance. One podcaster included a description of the remote’s physical layout as part
of a tutorial for their listeners:
“It’s probably about five and a half inches long by, maybe, an inch and a half wide,
and an inch thick. The buttons are on the surface of the device, and then there’s
a little light at the top. . . You have one round button—that round button is what
you press and hold in order to talk to Alexa, and then below that you have a round
circle which does four things: pressing it up will turn the volume up, pressing it
down will turn the volume down. . . .”—(E14, 2016)


In fact, the difficulty of determining the physical layout of VAPA devices was so prevalent that
including descriptions of device components was a relatively common practice among blind pod-
casters. Comparably to the podcaster from (E14, 2016) in the prior example, another podcaster
instructed their listeners:
“Take your Dot, [find] where the power cord goes in—let’s call that 6 o’clock. Right
above there, you’ll find the ‘plus volume’ button. So, at 12 o’clock you have the
‘minus volume’ button, and at nine o’clock you’re going to have what they call
the action button. . . If you hit [the microphone button at 3 o’clock] once and then
try saying the word, and it doesn’t work, that means your microphone is off.”
—(E11, 2017)
Here, the podcaster used several best practices for providing non-visual descriptions of physical
spaces. They began by identifying a tactilely distinguishable feature (the power cord insertion
point) and orienting their listeners using clock language—known and used by many blind users.
Additionally, this podcaster identified which hardware components would not be usable for blind
users (the microphone button, which illuminates to provide a visual cue when the microphone
is off) and provided a non-visual workaround for determining the device’s status (attempting the
wakeword and waiting for a non-response).
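Such walkthroughs need not be authored by podcasters alone; a device could generate them from a simple component map. The following Python sketch is purely illustrative: the LAYOUT table paraphrases the E11 tutorial for an Echo Dot, and the describe_layout helper is our own hypothetical construct, not a shipped VAPA feature.

# Hypothetical auditory walkthrough generator using the clock-face convention
# (illustrative only; positions paraphrase the E11 Echo Dot tutorial).
LAYOUT = [
    ("6 o'clock", "power cord insertion point, the tactile anchor"),
    ("just above 6 o'clock", "volume-up button"),
    ("12 o'clock", "volume-down button"),
    ("9 o'clock", "action button"),
    ("3 o'clock", "microphone on/off button, whose status is shown only by a light"),
]

def describe_layout(layout):
    """Compose a spoken, step-by-step description of device components."""
    steps = ["Orient the device so the power cord faces you; call that 6 o'clock."]
    for position, component in layout:
        steps.append(f"At {position}, you will find the {component}.")
    # Non-visual workaround for the light-only microphone indicator.
    steps.append("To check the microphone, say the wakeword; "
                 "no response means the microphone is off.")
    return " ".join(steps)

print(describe_layout(LAYOUT))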
4.3.2.3 Companion Apps May Be Inaccessible. As described in 4.3.2.1 above, VAPAs often require
users to engage with a secondary application—either on their phone or on the web—to complete
their setup, to integrate with third-party services, or to enable new skills. Beyond being disruptive
to the user experience, five podcast episodes identified accessibility issues in these companion ap-
plications as significantly degrading the overall experiences of the VAPAs themselves. For example,
one podcaster criticized Amazon Echo's companion application, stating:
“[Amazon Echo’s companion app is] not particularly accessible—it’s quite
clunky. . . It’s the only app I’ve ever come across on iOS where, sometimes, when
you tap on something, [the input] isn’t accepted. You know, it’s, umm, it’s ig-
nored. So, it just feels a bit horrible. Having said that, it’s. . . doable, just clunky.”
—(E27, 2018)
Another podcaster identified this same accessibility issue in the Echo companion app. Noting that
it is unusual for iOS applications to be inaccessible to screen readers, they attempted to infer the
underlying technical flaws, suggesting:
“Accessibility is something that people with disabilities really rely on and apps,
particularly on iOS, are really quite accessible. But, as I’m sure people are
aware, or suspect, the ‘A. Lady’ [Alexa] app is basically just a website—it’s
alexa.amazon.co.uk, here—a website wrapped in, kind of, an app wrapper and
it’s not very accessible. If they built a native app, it would be so much better.”
—(E01, 2018)
And, while many podcasters, like those above, resigned themselves to making do with a “clunky”
app that could be “so much better,” others called for collective action to fix this issue. For example,
one podcaster urged their listeners, “I do strongly suggest if you own an Echo write to Amazon—let
them know that their app needs some work!”—(E12, 2016).
4.3.2.4 VAPAs May Conflict with Assistive Aids. In one episode, podcasters identified sev-
eral situations in which VAPAs interact negatively with other assistive aids, or their use as an
accessibility tool conflicts with their other features. For example, one podcaster noted that the

mutual exclusion of Siri’s voice and touch inputs creates issues for users who employ Siri for
accessibility:
“Have you guys played around [with] ‘Type to Siri’ at all? I haven’t, mainly be-
cause, if you type to Siri, then you lose the ability to talk to it. Yeah, somebody
pointed out that you could invoke the ‘Hey Siri’ option on your phone. . . That
way if you want to use it, you know, you have [the option to say], ‘Hey Siri.’ But
you lose the [ability to trigger Siri with the] button and everything. Yeah, I’m just
so used to hitting the button. . . I don’t understand why it’s not [an option] where,
if you want to type it, you just tap on the screen [and] start typing.”—(E10, 2017)
This podcaster suggests that the Type to Siri feature ought to be accessible by simply tapping on the
screen after triggering Siri with a voice command. Notably, switching between these interaction
modes can only be done through the main iPhone settings menu—which may be a few clicks for a
sighted user and significantly more swipes when using a screen reader. So, it is difficult for blind
users to quickly transition between these two methods of entry.
In another example, a podcaster shared the struggle of mediating the relationship between their
Amazon Echo and their service dog. When the dog was displeased, it made a sound that uninten-
tionally awoke Alexa, who then replied, “I’m sorry, I didn’t understand that”—only adding another
layer of noise and confusion to the dog’s howling. Their example highlights that, while VAPAs are
largely accessible for blind users, they may not be designed to accommodate the other assistive
devices (or animals) that will be in the environments in which they are deployed.

4.3.2.5 Updates Break Accessibility. In four episodes, podcasters identified the potential for soft-
ware updates to degrade or remove accessibility features that were present in earlier versions as a
significant source of frustration, while a fifth episode noted the importance of software updates
that extend newer-generation features, and thus accessibility, to older-generation devices.
The perennial issue of updates breaking accessibility features is evidenced by one podcaster’s
comment that they have “spoken before about blind people complaining about updates and stuff
and breaking stuff” (E11, 2017). Another podcaster explains:
“That’s the one problem that [blind people have] always had. It’s like, well the
app will be accessible for a little bit and then one day the developers decide to
update—and guess what happens. It’s no longer accessible.”—(E06, 2017)
For this podcaster, update-related accessibility issues varied by company, and they specifically
commended LG for ensuring accessibility in each update: “The great thing about the LG is that
you don’t [need] to worry about the app ever breaking.”
Not all updates work against accessibility; some do in fact improve it, as we see from another
podcaster who similarly commended Amazon. The company chose to make voice-calling features
available not only on the new second generation Echo, but also on the old first-generation Echo
and across the range of Echo devices:
“A couple of things I was very impressed with. First off, the fact that [voice-calling
features] came out with . . . the 2.0 Echo, I guess, [and] the [Echo] Show, with the
video screen, and the fact that the voice calling works on the first generation is
great, because that was my one fear—that this voice calling would only work on
the new device. That’s how they get you to buy the new device. So, kudos to Ama-
zon for making this work on all Echo devices, the Dot, everything you got there.”
—(E25, 2017, emphasis added)


While the podcaster gives Amazon kudos, the general takeaway is that profit-driven motivations of
VAPA companies often prevent owners of earlier models from using new features, including those
that can improve system accessibility. A final example of this theme reveals some additional
complexities of updates on VAPA platforms:
Podcaster A: “My first call to TiVo was to make sure that new software actually
worked with the screen reader, because it was much more important to make sure
the screen reader worked—”
Podcaster B: “Yeah.”
Podcaster A: “Sometimes when you deal with mainstream companies, the guy in
India swears that the screen reader works with the new software—”
Podcaster B: “OK then.”
Podcaster A: “So I’ll find things out.”—(E11, 2017)
In this excerpt, both podcasters agree that updates can break features, but more importantly, there
are accessibility tradeoffs to consider when upgrading. Some accessibility features, like the screen
reader, are non-negotiable, so any updates that may conflict with the screen reader—even if they
can improve access—are not viable. The onus, of course, is on the blind user to not only contact
the company, which may not even be aware of the state of screen reader compatibility with new
features, but also to update the system to check for herself (“So, I’ll find things out”). This is yet
another example of our theme “Conflicts between VAPAs and Assistive Aids.” Moreover, the valued
integration of VAPAs into myriad appliances (described in Section 4.3.4.1) may actually lead to
more challenges regarding software updates.
Attending to accessibility and backward compatibility in all phases of design—including
updates—was greatly appreciated by blind podcasters in our sample, who were not hesitant to
give explicit commendations to companies who did so.
4.3.3 Voice Interaction Is Used for Productive Tasks. Blind podcasters in our sample frequently
mentioned their desire to use VAPAs for productive tasks—in which efficient interaction was im-
portant. Examples of this axial code included situations in which the VAPA would be used in
workplace settings, in situations where it would be faster than alternative options, and when the
VAPA could respond to more complex command formulations.
4.3.3.1 VAPAs Are Used for Work, Not Just Home/Play. In five episodes, podcasters indicated the
potential for VAPAs to be used for tasks related to work. Many podcasters noticed that “some-
times people don’t use [a VAPA] that much, they might just use it for music, setting timers, that
sort of thing” (E01, 2018), and they likewise enjoyed using VAPAs for these more lightweight,
playful tasks. “There’s so many games, so many fun things you can do with it,” said one podcaster
from (E01, 2018), "as well as the really useful things, too." But podcasters also realized that VAPAs
have the potential to support more utility- and work-focused tasks. One podcaster even said:
“I’ve often thought of ways to get [Alexa] into my workplace, because I find her really useful.”
—(E20, 2018).
The types of productivity applications that podcasters were interested to use included function-
ality for calendars, meeting schedules, reminders, and emails. One podcaster shared his favorite
email skill, “Mastermind, which does require logging into a separate account; I’m waiting for
[Alexa] to actually build that in—the ability to read emails and text and that sort of thing.”—(E27,
2018). In this case, accessing preferred messaging capabilities required use of a less convenient
third-party app, and the podcaster desired to have such productivity tools natively available on
the VAPA. Yet another podcaster shared that accessing one's work schedule through Alexa is more
convenient than reaching for a laptop:
“Now, with Alexa you can access all of the productivity-related features of Cortana,
which [people] don’t have as much access with Alexa. Most people who are em-
ployed are using the Microsoft Office features, especially the ones on Outlook.com
and Cortana can access their calendars, their meetings, all of their business re-
minders, things like that. And, it just makes it so much easier if that’s tied in with
Alexa. So, if you’re, you know, you’re in your kitchen making lunches for your
kids, you can have Alexa find out what your work schedule is without having to
go grab your PC laptop and try to figure it out that way.”—(E05, 2017)
The recent integration of Microsoft Cortana capabilities into Amazon’s Alexa in this example and
the comment above regarding native access to email management both suggest that there is value
for the blind community in making more productivity-focused features available out-of-the-box
on VAPA platforms.
4.3.3.2 VAPAs Are Faster than Alternatives. Across different types of tasks, blind podcasters
indicated VAPAs were valuable, because they could support more efficient interactions than alter-
native modes, like a mobile or desktop interface with a screen reader. For example, one podcaster
suggests it is much more “simple” and “quick” to navigate their work calendar using Alexa as
opposed to their desktop calendar:
“I find [Alexa] really useful. And for the most simplest of tasks, like, for example
[asking] ‘What date will it be, a week [from] Tuesday?’ You know, you think to
yourself, trying to navigate a calendar on a computer, I just haven’t mastered yet—
just to be able to ask that question, and just get the answer quickly when you’re
doing something, instead of taking half an hour trying to work out [a solution with
your screen reader]. . . Yes, I’m excited by this.”—(E20, 2018)
The visually impaired guest on (E19, 2018) provides a more detailed comparison of why a voice
query is desirable over mobile interactions through a screen reader. Again, they find the interaction
much less complex, because it doesn’t require unlocking one’s phone and opening a second app
with “a bunch” of additional taps and gestures:
“The primary advantage with something like the Google Assistant/Google Home
is the conversational nature of the interface, so you can do things without actually
having to open up an app. Let’s take a simple thing of actually being able to read an
incoming message. Without something like the Google Assistant on your phone,
you would actually have to unlock your phone, use the screen reader to. . . open
up the messaging app, and tap, and do a bunch of gestures. Whereas, you can
just ask your assistant to read my latest message or unread messages. The same
goes for smart home controls, so you take out all the complexity of working with
third-party apps you have on your phone, which may or may not be accessible.”
—(E19, 2018)
Moreover, the voice assistant provides a singular consistent interface. In addition to cutting out
third-party apps, it supports interaction with smart home controls. In the words of another
podcaster, VAPAs help you acquire information more quickly, “and sometimes, quickly is the
key.”—(E16, 2018).

4.3.3.3 Support for Complex Commands Increases Efficiency. At present, many VAPAs support
only one request at a time and require users to repeat the wakeword between interactions. In one

podcast episode, a visually impaired Google Accessibility Program Manager introduced two new
features that address these shortcomings. The first is “continued conversation,” a mode that allows
users to input commands in sequence without repeating the “Ok, Google” command. The second
feature is called "multiple queries," which supports compound commands, such that "you can just
say, 'What is the population of California and Texas.' And it'll give you two separate answers."—
(E19, 2018). In this episode, the blind podcaster introduced the features by noting how “useful” and
“efficient” it can be to give “back-to-back commands without a need to say [the wake]word all the
time.”—(E19, 2018).
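To clarify what "multiple queries" involves computationally, the following Python sketch decomposes a compound utterance into back-to-back sub-queries. The splitting heuristic and the stub knowledge base are our own simplifications for illustration and bear no relation to Google's actual implementation:

# Simplified sketch of "multiple queries" handling: one compound utterance is
# split into sub-queries answered in sequence (not Google's actual code).
def split_compound(utterance):
    """Naively distribute a question stem over 'and'-joined objects."""
    head, sep, objects = utterance.rpartition(" of ")
    if not sep or " and " not in objects:
        return [utterance]  # not compound; answer the utterance as-is
    return [f"{head} of {obj.strip()}" for obj in objects.split(" and ")]

def answer(query):
    # Stub knowledge base standing in for the assistant's backend.
    facts = {
        "What is the population of California": "about 39 million",
        "What is the population of Texas": "about 29 million",
    }
    return facts.get(query, "Sorry, I don't know that one.")

# Two separate answers from one utterance, with no wakeword repetition.
for q in split_compound("What is the population of California and Texas"):
    print(f"{q}? {answer(q)}")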
The other episode, produced by the same podcaster, provided a step-by-step tutorial to enable
these features: “You have to go in and enable this. This is not on by default. So, what you have to do,
my man, is grab your Google Home app and then go over to the main menu...”—(E18, 2018). Pod-
casters described what they referred to as the “accessibility chime” option, an accessible equivalent
to the light ring, which indicates when Google Assistant is listening for a potential next command.
While they were “thrilled” about this feature, because it was brand new, they questioned how of-
ten and in which situations they might use the technology. They ultimately decided that most
commands are just “one-off,” but this compound structure could be beneficial:
Podcaster A: “If you were looking up flights. . . or if you were doing the weather
and you wanted to know the weather tomorrow or the weather next day; or if you
were doing items; or even recipes then yeah—”
Podcaster B: “That’s where some of that could make sense.”—(E18, 2018)
While the utility of these features in actual use is unclear from these examples, the fact that they
were discussed by blind podcasters on two different occasions suggests that they may hold poten-
tial for improving efficiency for blind listeners.

4.3.4 Why VAPAs Appeal to Blind Users. Podcast hosts and their guests offered several insights
into why VAPAs are particularly appealing to blind users. Specifically, they shared that VAPAs
create a sense of empowerment and independence, support accessible interactions on a mainstream
technology, and are customizable, affordable tools that support multiple users of
varying abilities.

4.3.4.1 They Support Platform Integration for Flexible Access. Podcasters in four episodes praised
the integration of individual VAPA platforms, in particular Alexa, with a growing number of de-
vices that may otherwise be inaccessible to blind users. We refer to this as a “one-VAPA-to-many-
appliances” model. For example, one podcaster and self-reported audiophile noted the prevalence
of Alexa on devices at the Consumer Electronics show, stating:
“I mean, just look at all the places that [Alexa] is now—in the Consumer Electronics
Show, she was everywhere. She’s now working with Sonos [smart speakers], which
is a very big deal for a lot of blind people who care about their audio [quality].”
—(E27, 2018, emphasis added)
The podcaster suggests that integration with high-quality audio devices is “a very big deal” for
blind users, generally. Others noticed, similarly, that Alexa can now be used on a wide range of
kitchen appliances. One such podcaster even argued that these integrations are the most valuable
feature of Alexa, saying:
“And Amazon’s integrating into every single appliance. So, the actual physi-
cal speaker really does not mean as much as it used to. It’s just upgrading [sic]

that technology into current devices. And that's what Amazon is an expert at.”
—(E05, 2018, emphasis added)
Still other podcasters saw the increasing number of integrations of Alexa into other devices as a
sign of the potential for future integrations of Alexa into all appliances. One podcaster explained:
“. . . you [could integrate Alexa] on any platform, essentially. You can even build
[it] into fridges and washing machines, and all that stuff. I think this is a really
clever move by Amazon!”—(E20, 2018)
Podcasters were generally enthusiastic about current and potential integrations. Notably, the re-
peated identification of Alexa across these examples suggests it is beneficial for a single VAPA
platform to be used as an accessible way to interact with a range of devices.
While the podcasters above expected to see the integration of one VAPA platform with a wide
variety of devices, in four episodes, podcasters also expressed anticipation of the integration of
multiple VAPA platforms into single devices. We refer to this as a “one-appliance-to-many-VAPAs”
model. One podcaster identified a then-upcoming integration of this type, on the Sonos One smart
speaker:
“The Sonos One [currently] supports Alexa. And the interesting thing about the
Sonos One is that they’re continuing Sonos’s classic agnosticism, because they’re
also going to be supporting Google Assistant a bit later this year. So, with that one
speaker, you will be able to talk to both of these virtual assistants.”—(E27, 2018)
Integrating multiple VAPAs into one device allows users to interact with a wider range of devices
that support each VAPA individually, without the need to physically relocate to interface with a
different set of devices. So, as another podcaster noted, “the fact that these things are starting to
all work together is just pretty cool.”—(E16, 2018).
In addition to using one VAPA to access many devices and many VAPAs to access one device,
several podcasters in our sample (four episodes) also expressed excitement about seeing VAPA
platforms integrating with other VAPA platforms. We refer to this as a “one-VAPA-to-many-VAPAs”
model. For example, in 2018, Amazon and Microsoft reached an agreement to integrate their indi-
vidual VAPAs, Alexa and Cortana, to support interactions with each other. Podcasters were widely
appreciative of this move, one even noting, “Amazon has a lot to benefit [by partnering] with Mi-
crosoft, and vice versa.”—(E05, 2017).
The cross-compatibility of multiple VAPAs was identified by some podcasters as a way to max-
imize the range of possible interactions, or even as a way to capitalize on the relative advantages
of each VAPA for more flexibility and control. One podcaster even likened these integrations be-
tween multiple VAPAs to being multilingual, saying, “You don’t just speak one language anymore.
You don’t just use one operating system. You don’t just use one assistant. You have your Google
Home [and] you have things from Microsoft and Amazon.”—(E16, 2018).
Despite the general excitement about these integrations expressed by blind podcasters, some
saw the potential for corporate relationships to stymie future integrations between multiple VA-
PAs. For example, one podcaster, who was interested in hearing Siri read a book from their Audible account,
noted that being able to do so was probably “never gonna happen because of, you know, the Apple
and Amazon relationship. So, that’s never going to happen, unfortunately.”—(E27, 2018).
While we can sense the excitement and newsworthiness of platform integrations through these
examples, podcasters rarely explained how these integrations might impact accessibility and user
experience. We speculate about these impacts in the Discussion.


4.3.4.2 They Offer Accessibility through Mainstream Tech. Offering accessibility in mainstream
technologies was identified as a significant benefit of VAPAs in five podcast episodes in our sample.
One podcaster noted how more and more appliances—like LG’s washers and dryers, as well as
smart TVs—are being shipped with voice assistants installed out-of-the-box; he enthusiastically
observed that this trend is “making the accessibility standard”—(E06, 2017). Another podcaster
made the case that mainstream voice assistants not only open accessibility for (younger) people
who are blind, but also for the significant subpopulation of older adults who are blind, who may
find other assistive technologies more challenging to use:
“We forget sometimes that 80% of the blind population is over the age of 65, and
it really troubles me that so much of that population continues to be left out of all
of this momentum. So, I think it’s just so exciting, on a bunch of levels, that we
have this mainstream piece of technology [Amazon Echo] that, just by virtue of it
being screenless, at least for some of these products, is 100% accessible and really
intuitive to use.”—(E27, 2018)
Simultaneously, several podcasters suggested that VAPAs make the knowledge of blind users rel-
evant and valuable for sighted users, who interact with VAPAs in the same way. This was par-
ticularly significant for the blind users in our sample—all technology podcasters—many of whom
were excited about the prospects of sighted users finding value in their tutorials. One noted:
“In this case, of course by virtue of the platform, everything’s accessible and so—
not only is this podcast of interest to the blind community—but it’s actually of
mainstream interest, because the exact way that you, as a blind person, interact with
the skill is the way that everybody interacts with the skill.”—(E27, 2018, emphasis
added)
Here, offering accessible interactions through mainstream technologies does not only benefit blind
users, it also allows sighted users to capitalize on valuable information provided by blind technol-
ogists, like this podcaster. Together, these examples show that the pervasive, intuitive nature of
VAPAs provides a universal experience, as opposed to one that slices and dices groups into dis-
parate categories based on visual ability.
4.3.4.3 They Are Affordable. One of the benefits of gaining access through a “mainstream de-
vice. . . [is] that it’s all done really well and really cheaply.”—(E01, 2018). Indeed, in seven episodes,
podcasters emphasized the affordability of VAPAs, using phrases like "not that expensive" and "cheap,"
especially when referring to the Amazon Echo Dot or the Google Home Mini. One podcaster jok-
ingly advised their listeners to “look out for [Google Home Minis in their] Christmas stocking”
because, at just $50, “it’s quite affordable.”—(E06, 2017). Another podcaster even identified VAPAs’
relatively low price-point as the reason they have become so popular for all users:
“. . . the rate of adoption of these smart assistants is actually quicker than the rate
of adoption of cell phones, when cell phones were first coming out, because the
entry level is so much more manageable from a cost point of view—you know 30
or 40 pounds [about 40–50 USD] for an Amazon Echo Dot.”—(E01, 2018)
It should be noted that earlier generations and larger smart speakers were often considered
“pricey” and “expensive,” with price points near $200 to $300, and podcasters tended to agree
that the improvements in speaker sound quality are not worth the extra money. When asked
whether he would be willing to buy a Harman Kardon Invoke (with Cortana),42 one co-host replied

42 https://www.harmankardon.com/invoke.html.

"depends on the price, because I have a Harman Kardon Studio and that cost me about 300-and-
some dollars, and that’s creeping up on the price of the Apple HomePod.”—(E05, 2017). So, while
we did see an overwhelming preference for the cheapest speakers on the market, some hosts were
willing and able to afford higher-end models.
4.3.4.4 They Are the Natural Interaction Mode for People Who Are Blind. Podcasters in five
episodes suggested that voice interactions are the most “natural,” “intuitive,” “frictionless” mode
of digital interaction for blind users. The sighted podcast host and his visually impaired guest from
(E01, 2018) went as far as to characterize voice assistants as the sort of pinnacle of human-computer
interaction, the “ultimate evolution,” removing the “last level of friction between a human being
interfacing with a computer.” In addition, voice-based interaction was seen as preferable to alter-
natives like desktop and mobile apps paired with a screen reader. The guest demonstrated this
point by juxtaposing his experience of using the Alexa app (“it’s not very accessible”) with his
experience of using the Echo smart speaker (“the Echo is incredibly suitable, applicable to people
with vision impairment, because most of the models don’t have a screen and most of the skills
aren’t utilizing visual stuff.”—E01, 2018).
Relatedly, another podcaster suggested that VAPAs on smart speakers without screens were
superior to those with screens (like Echo Show or Nest Hub):
“. . . the Google Home, the Amazon Echo, the Echo Dot. . . you can get [them] pretty
cheap and start your journey with a smart speaker, because as a visually impaired
person there’s no screen to worry about. So, you control it completely with your
voice.”—(E26, 2018, emphasis added)
The narratives around the preference for screenless VAPAs sometimes appear to conflict with
narratives about more universal accessibility, as in the following:
“These voice assistants, there’s a huge amount of expectation and excitement, be-
cause... the application for people who have disabilities—whether it’s vision, motor,
cognitive, even hearing—everything she [Alexa] says comes up on the screen of
your app, and if you can’t speak, you can still interact with her by typing into the
app.”—(E01, 2018)
However, on closer inspection, the real fear around VAPAs with screens is that designers have
not done their due diligence to make the voice interaction on par with the visual one. The blind
podcaster in (E01, 2018) convincingly explains that a key reason Siri's interface is less usable than
Alexa's is that it was initially developed with a screen:
“Siri often says, ‘I’ve done a web search for that’ and on-screen, a lot of, you know,
search results—and that’s really poor, that’s really not serving the user well. And,
the reason why she can do that is because she’s got a screen, and that’s the sort of
easy way out. Whereas the one reason why [Alexa’s] so much better in this area
is because she didn’t have that luxury of just dropping you to a web search on a
screen, because she didn’t have a screen. So, Amazon has put a lot of effort into
really making her smart and coming back with just the right bit of information
served up in a natural, you know, nugget.”—(E01, 2018)
An excellent example of how “gaps” in the auditory experience of VAPAs can be filled is the “ver-
bose mode” offered by Google Assistant and described in one podcast by a visually impaired Google
Accessibility Program Manager. This feature allows users to hear query responses read in detail
by Assistant, rather than sending the details of a response in visual form to a user’s chosen visual
display, which would then be accessed through a screen reader. The guest goes on to explain:

“So, these are small things that we are trying to do, to actually make the experience
really good for users with vision impairments. As it is, because it says it’s a voice-
based technology, the Google Assistant and the Google Home are already pretty
usable and accessible to visually impaired people, but whenever we have things like
these where we identify a gap, we actually are making sure that the experience is
actually on par with sighted users.”—(E19, 2018, emphasis added)
We see here that designing the voice interface such that it is as robust and continuous as any visual
interface experience is not only a matter of maintaining the natural mode of interaction brought
by speech, but also a matter of maintaining an interface that is egalitarian for all users across visual
ability.
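As a rough illustration of the behavior described, the sketch below contrasts a default response, which strands detail on a visual display, with a verbose response that keeps the entire answer in the auditory channel. The respond function and its sample data are our own reconstruction of the feature's described behavior, not Google's code:

# Our reconstruction (not Google's code) of "verbose mode": speak full detail
# rather than deferring part of the response to a visual display.
def respond(summary, detail, verbose_mode):
    if verbose_mode:
        # Everything is spoken; nothing is stranded on a screen.
        return f"SPOKEN: {summary} {detail}"
    # Default: a short spoken summary, with detail pushed to a display, where
    # a blind user must then retrieve it with a screen reader.
    return f"SPOKEN: {summary} | DISPLAYED: {detail}"

summary = "Your flight departs at 9:40 am."
detail = "Gate B12, boarding begins at 9:05 am, seat 14C."
print(respond(summary, detail, verbose_mode=True))
print(respond(summary, detail, verbose_mode=False))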
4.3.4.5 They Accommodate People with Other and Multiple Disabilities. In five episodes, podcast-
ers also noted the benefits of VAPAs for people with disabilities other than visual impairments, and
people with multiple disabilities. Podcasters described scenarios in which VAPAs might be valu-
able to people with vision, motor, cognitive, and hearing disabilities, including people who have
Alzheimer’s disease. One guest described the experience of their sister, who has both visual and
mobility impairments, with VAPAs:
“She suddenly was able to unlock loads of things—you know, loads of different
choices of media information and environmental control, which we could talk
about, being able to control different objects around her, all by voice, in a very
affordable mainstream technology.”—(E01, 2018)
Another podcast guest, who is blind, described how parts of the Alexa interface are actually ac-
cessible to her mother, who has Alzheimer’s disease:
“I was actually able to set up their Echo device in Las Vegas, with this call feature,
right here from my home [in New York] . . . We were able to go back-and-forth call-
ing, and luckily my mom is remembering the [‘Alexa’ wakeword], because we’ve
been drilling it into her for a few months now. So, she’s able to activate the system
and easily call me.”—(E25, 2017)
We see that VAPAs' voice-call feature allowed this guest to remotely set up an Amazon Echo for
her disabled mother from across the country. In addition to supporting users with visual impairments, these
examples show how VAPAs provide an accessible way to navigate the digital world for people with
a variety of abilities. Here, too, it is noteworthy that the simultaneous accessibility of VAPAs for
blind podcasters and their family members with other disabilities allowed these blind technology
experts to assist their family in setting up and using VAPAs.

5 DISCUSSION
In each of our two studies, we found that the voice-input/audio-output modality of VAPAs pro-
vides an accessible digital interaction paradigm for blind users. Voice inputs and audio outputs
have long been recognized as valuable alternative input and sensory substitution strategies for
blind users interacting with digital devices [e.g., Azenkot and Lee 2013; Lazar et al. 2007]. We
found that VAPAs’ use of both strategies provides promising digital accessibility for users with
visual impairments, echoing the conclusions of Pradhan, Mehta, and Findlater [2018]. Our find-
ings suggest that some tasks blind users perform with VAPAs—setting reminders, playing mu-
sic, and requesting other hands-free assistance (3.4.1)—are similar to those performed by sighted
users, as identified by Luger and Sellen [2016]. However, many of the tasks and motivations for
which blind users employ VAPAs are related to productivity and efficiency (4.3.3). These include

composing professional emails, creating complex calendar events, and managing appliances
around the house. Moreover, blind users were interested in interacting with VAPAs to save time by
circumventing screen readers and leveraging continued conversation and multiple queries. While
VAPAs are valuable for sighted users, we suggest that the value and utility of VAPAs for blind
users are likely higher (4.3.4), particularly because VAPAs are mainstream and thus robust and
affordable (4.3.4.3), voice is seen by many blind users as a “natural” mode of interaction (4.3.4.4),
and VAPAs sometimes provide the only accessible UI for certain apps (3.4.2.1).
Many of our findings identified areas in which VAPAs are being used as a tool for accessibility.
For instance, while the turn to touchscreens has rendered many home appliances inaccessible to
blind users [Branham and Kane 2015], the podcasters in our sample saw VAPAs as a viable alter-
native for interfacing with everything from thermostats to washers and dryers to other speakers
using voice (3.4.1 and 4.3.4.1). Still, our results support Branham and Mukkath Roy’s [2019] findings
that commercial VAPA designs do not adequately account for the particular needs of this group
(3.4.2 and 4.3.2), and Vtyurina et al.'s [2019a, 2019b] findings that voice assistants and screen readers have
complementary qualities (4.3.3.1). Table 4 provides an overview of accessibility issues identified
across our two studies and design recommendations our research team proposes to address them.
Among the most numerous and frustrating of challenges documented in our study are the in-
ability to recover from speech recognition errors (3.4.2.4), lack of personalization (3.4.2.5), and
inaccessibility due to visual content and companion apps (3.4.2.7 and 4.3.2.3). Some of the issues
we documented—like misinterpreting commands and challenges with voice activation—have been
previously reported to pose usability issues for sighted users [Luger and Sellen 2016]. Indeed, as the
podcaster from (E27, 2018) said so eloquently, the “mainstream” and “accessible” nature of VAPA
platforms means that the insights of blind podcasters are relevant to everyone. In this sense, al-
though our design recommendations are rooted in the experiences and insights of people who are
blind, in keeping with the notion of Universal Design [Story, Mueller, and Mace 1998], integrating
suggested improvements in these areas would likely improve the usability of VAPAs for all users.
We therefore see this body of work, and the work of fellow scholars in this space, as a particularly
fruitful instance of “the blind leading the sighted” 43 in technological improvement and innovation.
Below, we discuss these insights further to identify additional opportunities to make VAPAs more
non-visually accessible.

5.1 A VAPA Persona for Every Task


The findings of our studies suggest that, in contrast to sighted users who often approach VAPAs
as novelty features for entertainment purposes [Luger and Sellen 2016], blind users routinely em-
ploy VAPAs as a primary mode of interaction for completing productive tasks. Here, we define
“productivity” broadly to encompass the wide range of activities identified in which efficiency
was indicated as a primary concern in completing the activity, or tasks that were involved in a
user’s professional labor—for example, word processing a work-related document or maintaining
a calendar. In our interviews, voice interactions were frequently identified as a “time saver,” as
compared to other modes of input such as gestures or typing, supporting previous findings that
voice inputs are an efficient method of text entry for blind users [Azenkot and Lee 2013]. By con-
trast, difficulty recovering from errors, especially in more complex tasks like writing an email,
often decreased users’ efficiency.

43 The popular idiom “the blind leading the blind” is often used in a derogatory fashion to describe a situation in which an

ignorant party is providing misguided advice to another ignorant party. The authors do not condone this turn of phrase
and its implication of blindness as a condition of ignorance. Rather, our title suggests that there can be many occasions,
particularly in designing voice interface technology, where people who are blind can and should lead the way.


Similar to previous work [Luger and Sellen 2016; Pradhan, Mehta and Findlater 2018], we found
that the misinterpretation or misunderstanding of user inputs was a significant issue among par-
ticipants in Study 1. However, we note that these misinterpretations were more prevalent—and
consequential—when performing productive tasks where users’ inputs are complex, and inaccu-
rate interpretation of a user’s input may reflect poorly upon their professionalism. For instance,
interview participant P7 felt he needed to double-check the accuracy of emails and calendar ap-
pointments dictated to Siri to maintain a certain level of professionalism in his work life. When
using text dictation features, sighted users may be able to quickly confirm that VAPAs have un-
derstood their inputs visually. However, double-checking the accuracy of VAPA dictation is much
more laborious for blind users, who may have to employ an additional device or application to
verify their input. VAPAs could reduce this burden by providing spelling and grammar feedback
during dictation. Though this type of error-checking may be unnecessary, or even a nuisance in
casual conversations with personal contacts, it may be critical in other circumstances. We sug-
gest that providing multiple VAPAs would allow blind users to select a more appropriate level of
feedback and accuracy for their context. For example, we can imagine a persona for productive
email messaging that uses a faster, synthesized voice and includes voice commands (e.g., “Spell that
word for me,” “Change the last word to X”) and optional gestural input (e.g., swipe to navigate the
“cursor” back, double tap to edit) for finer-grained interaction.
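A minimal Python sketch of such an email persona's editing commands follows; the DictationBuffer class and its method names are hypothetical, intended only to show how the two voice commands above might operate on dictated text:

# Hypothetical dictation buffer supporting the voice editing commands proposed
# above ("Spell that word for me," "Change the last word to X").
class DictationBuffer:
    def __init__(self, text=""):
        self.words = text.split()

    def spell_last_word(self):
        """Read the most recent word back letter by letter, making errors audible."""
        if not self.words:
            return "There is nothing to spell."
        return ", ".join(self.words[-1].upper())

    def change_last_word(self, replacement):
        """Replace the most recent word, avoiding a full re-dictation."""
        if self.words:
            self.words[-1] = replacement
        return " ".join(self.words)

buf = DictationBuffer("Please send the report by Teusday")
print(buf.spell_last_word())            # T, E, U, S, D, A, Y -- the typo is audible
print(buf.change_last_word("Tuesday"))  # Please send the report by Tuesday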
Prior work has shown that many blind users prefer to receive audio outputs at faster speeds
than sighted users [Abdolrahmani et al. 2018], and we found many of the issues identified in using
VAPAs for productivity tasks were attributed to VAPAs’ output style mimicking human-human
conversation (3.4.2.5 and 4.3.3.1), echoing the findings of Mukkath Roy et al. [2019]. In many situ-
ations, like reading a received email, allowing VAPAs to speak faster than a human would increase
users’ efficiency. However, there are cases, like listening to digital books being read by a VAPA,
where human-like voices may be more appropriate and pleasing. The appropriateness of VAPAs’
output style is often dependent on environmental factors (3.4.2.2). For instance, seeking directions
in a high-noise environment may require a louder voice, while retrieving private or embarrassing
information in public, as in Study 1, may necessitate quieter feedback. Tailoring auditory outputs
to task and context, as can be done with a screen reader, would likely increase usability and acces-
sibility in a wider variety of situations. Since VAPAs are not primarily designed to be used as AT
[Branham and Mukkath Roy 2019], we do not argue that they should be viewed as replacements for
similar accessibility tools, like screen readers, nor should they necessarily behave in the same way.
Rather, we find similarly to Vtyurina and colleagues [2019a, 2019b] that VAPAs and screen readers
have many complementary strengths for blind users and, as such, are often used in conjunction.
So, it may be necessary for VAPA designers to recognize that many of their users have existing
expectations of auditory output styles, informed by their prior use of screen readers. Additionally,
we found that neither mimicking human-human conversation nor replicating the output of screen
readers was universally more acceptable. Instead, we found that whether rapid, mechanical out-
puts or human-like outputs were most appropriate depended heavily on the context and content
of the interaction, and offering simultaneous access to multiple, fundamentally different VAPA
personas would likely be an effective method for providing suitable support in variable contexts.
Additionally, our findings (3.4.3) suggest that blind VAPA users do not wholly prefer any one
VAPA platform over another, but instead choose different VAPAs for different types of tasks based
on their strengths (as briefly noted in Luger and Sellen [2016]). Rather than a singular, consistent,
and static VAPA personality, we suggest that VAPA designers could incorporate multiple conver-
sational personas into each VAPA. Users may benefit from having access to several customizable
VAPA personas—one for cooking, another for navigating, another for scheduling, yet another for
word processing, and so on. This is not dissimilar from the way that actual personal assistants

in real life operate, in that the person best suited to assist in the kitchen may have a very differ-
ent skillset and interaction style as compared to the person who provides accurate navigational
directions. Still, there are situations—like drafting an email as discussed above—where a human-
like agent is not ideal. Therefore, designers should allow users to flexibly configure the speed,
tone, volume, and other characteristics of each persona’s interaction style. As VAPAs continue
to increase in fidelity, designers might consider shifting goals from replicating the behaviors of a
Jane-of-all-trades assistant to an array of assistants that are tailored to the task at hand and their
users’ preferences.
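One lightweight way to realize such an array of assistants is a per-task persona configuration. In the Python sketch below, every name and parameter value is invented for illustration; the point is simply that speed, voice style, verbosity, and error-checking could vary by task rather than being global settings:

# Illustrative per-task persona configuration; all names and values are invented.
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    speech_rate: float       # multiple of the default speaking rate
    voice: str               # "synthesized" (screen-reader-like) or "human-like"
    verbosity: str           # "terse" or "conversational"
    confirm_dictation: bool  # spelling/grammar feedback for professional tasks

PERSONAS = {
    "email":     Persona("email", 1.8, "synthesized", "terse", True),
    "audiobook": Persona("audiobook", 1.0, "human-like", "conversational", False),
    "general":   Persona("general", 1.2, "synthesized", "conversational", False),
}

def persona_for(task):
    """Select the assistant persona suited to the task at hand."""
    return PERSONAS.get(task, PERSONAS["general"])

print(persona_for("email"))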

5.2 Continuity of the Voice Interface for Equitable Interactions


Several participants identified the importance of VAPAs being mainstream technology—not only
because they are cheaper and more readily available than custom technologies—but because they
provide more equitable interactions for blind and sighted users (4.3.4.2). That is, because VAPAs’
core voice-input/audio-out interactions are accessible to both blind and sighted users, their individ-
ual VAPA experiences are more comparable than blind and sighted users’ respective experiences
of many other digital devices, like touchscreen mobile phones. This lowers the potential for in-
ducing stigma that is known to be created when disabled users interact with separate assistive
technologies [Shinohara and Wobbrock 2011] and better ensures that there is no disparity in the
information available to blind and sighted users. Providing analogous user interactions for blind
and sighted users was seen as contributing to greater equality in the user experience of VAPAs
(4.3.4.4), in contrast to many approaches that layer accessibility features, like screen readers, on
top of interfaces that are designed from inception to cater to sighted users’ needs. For example,
the remark of one podcaster in Study 2 (E06, 2017), who described VAPA design as “making the
accessibility standard,” draws attention to the many technologies in which accessible interactions—if
included at all—are inferior to the “standard” experience.
Still, our findings show several areas where VAPA designs privilege the use of sight (3.4.2.7,
4.3.2.2, and 4.3.2.3). For instance, a pending notification on Amazon’s Echo is indicated ambiently
using only a light. Because there is no auditory equivalent, a blind user who was outside the
range of the initial auditory notification might not be aware of an item awaiting their attention.
While it may be inappropriate to provide an analogous always-on auditory notification, a periodic
auditory notification for this ambient information would reduce the disparity in information pro-
vided to blind and sighted users. Similarly, in the data collected in Study 2, we found many blind
podcasters devote a significant amount of time to describing the physical layout of VAPA devices.
Many home-based devices, like Amazon Echo, are physically symmetrical, and their buttons are
flush to their surface, making it difficult to tactilely determine their location or purpose. These
features, and their availability, are often hidden from blind VAPA users. Though these issues were
particularly problematic for blind users, they could be solved relatively easily by designers, for
instance, by including an option for VAPAs to provide an auditory description of their physical
layout modeled after those provided by many of the podcasters in Study 2.
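A minimal sketch of the periodic auditory cue proposed above follows; the re-announcement interval, chime routine, and dismissal mechanism are assumptions made for illustration and are not part of any device’s documented API.

```python
import threading

REANNOUNCE_SECONDS = 300  # illustrative: repeat the cue every five minutes

def play_chime() -> None:
    """Stand-in for the device's audio output, assumed for illustration."""
    print("soft chime: a notification is waiting")

def announce_until_dismissed(dismissed: threading.Event) -> None:
    """Periodically re-play an auditory cue until the user dismisses it.

    This is a non-visual analogue of a persistent notification light: a
    recurring sound ensures that a blind user who was out of earshot for
    the initial alert still learns that something awaits their attention.
    """
    while not dismissed.is_set():
        play_chime()
        # Event.wait doubles as an interruptible sleep; it returns as soon
        # as the notification is dismissed rather than blocking the full wait.
        dismissed.wait(timeout=REANNOUNCE_SECONDS)

# Usage: run in the background; set the event once the user checks the item.
dismissed = threading.Event()
threading.Thread(target=announce_until_dismissed, args=(dismissed,), daemon=True).start()
```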
While some VAPAs offer users the ability to ask for a tutorial or a list of available features, many
VAPAs provide visual responses to these requests. For instance, Alexa directs users to a list of
available skills that is presented visually in the Alexa mobile application. Similarly, Siri presents
a visual list of commands on the device’s screen. In these and other instances, the continuity of
the voice interface is disrupted, causing accessibility challenges for people who are blind (3.4.2.7
and 4.3.2.1), as well as sighted individuals who have their hands or eyes preoccupied [Luger and
Sellen 2016]. Additionally, previous research by Rodrigues et al. [2018] found that switching con-
texts between multiple devices increases users’ cognitive workload. So, providing a method for
more effectively browsing enabled features auditorily could better facilitate blind users’—and all
users’—initial acclimation to their device, while helping continuing users to recall unfamiliar
or infrequently used features. More generally, we suggest that designers provide continuous voice-
input/audio-output access to all VAPA features, including tutorials, help menus, personalization
interfaces, search results, and input verification, for both accessibility and consistency.
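To sketch what such auditory browsing might look like, the snippet below reads an enabled-features list aloud in small “pages” that the user steps through by voice, so the interaction never leaves the audio channel. The page size, prompts, skill names, and the speak/listen stand-ins are all hypothetical.

```python
def speak(text: str) -> None:
    """Stand-in for text-to-speech output."""
    print(f"[spoken] {text}")

def listen() -> str:
    """Stand-in for speech recognition; stdin substitutes for voice input."""
    return input("user says> ").strip().lower()

def browse_skills_by_voice(skills: list[str], page_size: int = 3) -> None:
    """Read enabled skills aloud in small auditory pages.

    Keeping the whole interaction in the audio channel avoids redirecting
    users to a visual companion app, preserving voice-interface continuity.
    """
    for start in range(0, len(skills), page_size):
        speak("; ".join(skills[start:start + page_size]))
        if start + page_size >= len(skills):
            speak("That is every enabled skill.")
            return
        speak('Say "next" to continue, or anything else to stop.')
        if listen() != "next":
            return

# Hypothetical enabled-skill list for demonstration.
browse_skills_by_voice(["Daily headlines", "Kitchen timer", "Transit times",
                        "Audiobooks", "Weather"])
```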
In addition to increasing accessibility, these challenges may take on new significance when
placed in the context of our finding that VAPAs are becoming increasingly central to blind users’
lives (3.4.1, 3.4.2.3, and 4.3.3). For instance, many of our informants used VAPAs to obtain navi-
gational information in public spaces. However, several preferred to interact with VAPAs through
headsets, to keep their mobile device in their pocket and reduce the risk of their phone being
stolen or of themselves being physically harmed. Yet, we found in Study 1 that VAPAs sometimes
produce outputs in a visual format on a mobile phone, which may require a blind user to hold
the device and navigate the output with a screen reader (3.4.2.7). In this light, inaccessible VAPA
outputs may not only be an inconvenience, but could pose threats to safety [Branham et al. 2017].
Collectively, understanding how, why, and in which contexts blind users employ VAPAs provides
new implications and responsibilities for their design. In the scenario just described, for example,
these findings emphasize the significance of consistent non-visual outputs for blind users.

5.3 Blind Tech Podcasters as a Source of Usability Expertise


In conducting this study, we found podcasts—an audio-only medium that lends itself to consump-
tion by people who are blind—to be a more plentiful source of content generated by blind users
than other sources of extant technology review data (e.g., YouTube videos). Our initial search for
podcasts retrieved 67 results concerning visual impairments and VAPA technology, and after
filtering, we deemed 28 unique episodes from nine podcasts, totaling six hours of audio, to
be directly relevant to our study. Despite this availability, we know of no previous research using
podcasts to investigate blind technology users.
When selecting data sources for studying the blind population, abundance is only one con-
sideration. As described in Charmaz [2006] and Reinharz [1992], it is also important to consider
aspects like the authorship and audience of the content, whose agendas the content foregrounds,
the degree to which the content can be “talked back to” with dialogic questioning, and how data
collection may adversely affect involved parties. Below, we compare the benefits and drawbacks
of podcasts in contrast to interviews, as these each represent one potential source of extant and
elicited data, respectively.
Extant data are often complementary to elicited data [Charmaz 2006, p. 37; Reinharz 1992,
p. 148]. Such was the case with our interview and podcast studies, and the interview data might
even be construed as supplementary to the podcast data. We found that our study of podcasts repli-
cated most of the themes (16 of 23) identified in our interview study, and it added another 13 unique
low-level and 3 high-level themes to our analysis. We posit that the podcast study represented a
near superset of interview themes, because podcasts had an array of purposes—news about and re-
views of assistive/accessible technology, career paths and success stories of blind individuals, and
general voice assistant features. Further, episodes were created in conversation with co-hosts and
guests, each bringing different perspectives. Finally, episodes were created over a period of more
than two years, as VAPA capabilities and public adoption were in flux. In contrast, interviews were
conducted by a single interviewer, with a single semi-structured interview script, over a period of
just one month in 2018. Although podcasts represented a smaller raw dataset in comparison to in-
terviews (6 hours of relevant podcast audio transcribed, as opposed to over 12 hours of interview
audio transcribed), the diversity of podcasts along dimensions described above may be the more
influential factor. This hypothesis warrants further investigation, especially as it may provide a
strong indicator of the value of seeking out podcast data before conducting interviews.


The identity of content producers is important in the analysis process [Charmaz 2006, p. 39;
Reinharz 1992, p. 145]. Identity comes into play in multiple respects in our studies, including one’s
socioeconomic status and degree of technical proficiency, as well as the status of their vision. Both
studies included highly proficient and enthusiastic technology users, many of whom worked as
professionals in the tech sector and were early adopters of VAPAs. Several podcasters reported
from the National Federation of the Blind’s (NFB’s) annual convention and/or the CSUN Assis-
tive Technology Conference, suggesting that they are people of financial means, and therefore
not representative of the many people who are blind who are un(der)employed and/or cannot
afford to attend. While we could have used additional inclusion criteria to diversify our sample
when recruiting for our interview study, we cannot control who chooses to become one of a hand-
ful of blind technology podcasters. It may be the case that choosing to draw from technology
podcasts hosted by individuals who are blind will always be biased towards technology-savvy
users—the technical labor involved in production of podcasts may deter people with low technical
self-efficacy from attempting to produce them.
In addition, eight of nine podcasts were created by blind people, about blind people, and for blind
people. Similarly, interviews were co-constructed by a blind researcher and blind participants with
content concerning blind people, although the audience was predominantly sighted researchers.
We did not see any qualitative differences relating to this distinction, but we might expect substan-
tive differences had interviews been conducted by a sighted person; for example, we can imagine
blind interviewees might emphasize their capabilities more than incapabilities to fend off ableist
presuppositions of a sighted conversational partner. Circling back to the importance of identity
in data analysis, Reinharz [1992] notes that content produced, for example, “by women, about
women, for women” (p. 156) holds significance in that it can raise the visibility of marginalized
populations and allow them to tell their own stories. This insight may be particularly beneficial
in the context of Assistive Technology research, where studying representative users is important
for a study’s accuracy [Sears and Hanson 2011], and the limited availability of users with a given
disability often results in methodological missteps, like simulating disability. Podcasts like those
in our study may circumvent these challenges.
The degree to which data are produced anonymously, privately, and with opportunities to fact-
check can alter the trustworthiness of resulting interpretations. Podcasts are publicly syndicated and
de-anonymized, whereas our research interviews were produced in a more intimate one-on-one
setting, and thereafter anonymized for broader audiences. While interviews may suffer from par-
ticipants’ desire to “appear more affable, intelligent, or politically correct” [Charmaz 2006, p. 36],
we can imagine podcasts bring with them a heightened need to put one’s best foot forward, as these
data will be widely distributed in a de-anonymized form and available for years after produc-
tion. Complicating this picture, we know that “on the Internet, participants may alter what we
define as basic information” [Charmaz 2006, p. 39] to project a desired persona. With inter-
views, a primary benefit in this respect is their dialogic nature, such that the interviewer may
ask probing questions to check the stories of participants [Charmaz 2006, p. 36]. We puzzled over
whether this may have altered participants’ self-reports of satisfaction with and utility of VAPAs.
However, we ultimately did not see any notable differences across our studies. Nevertheless,
researchers weighing the option between podcasts and interviews may want to take this into
consideration.
As described in our literature review, extant texts are produced outside a research context
[Charmaz 2006; Reinharz 1992] and therefore do not require researcher incentives, such as free
travel accommodations to the research site or monetary compensation for participants’ time.
Neither do researchers need to worry about ethical entanglements of overburdening members
of marginalized populations to produce research artifacts (like interviews) that may not directly
benefit them. However, new ethical questions arise about appropriating texts not intended for the
kind of close scrutiny and public dissemination that takes place when the data become the subject
of researchers.
We believe that the very public nature of podcasts, and thus the podcasters’ expectation that a
wide variety of people will have full access, helps to mitigate these ethical concerns. Additionally,
we believe that including people with disabilities in the research team—especially in leadership
positions—can support more accurate and acceptable translations of the original message. Our
research team is actively developing voice agents that can address some of the issues raised, and
we are working with corporate partners to improve mainstream VAPA technologies. We believe
that the case for studying podcasts hosted by people who are blind is best matched to the situation
where the research team is committed to amplifying and actively working towards the needs of
the marginalized groups they study. Finally, we will note that podcasters, especially those who
review technologies, may rely on outside incentives/sponsorships. Technology companies may
donate equipment or pay the podcasters to review or endorse their products, which may lead to
skewed positive reviews. At least one episode was sponsored by a VAPA vendor (E19, sponsored by
Google). However, several of the podcasters in our study explicitly noted that they did not receive
any outside sponsorship. Perhaps the bigger worry is that podcasters are not compensated at all
for their labor. In this case, we believe the onus is on the research team to report study results in
accessible formats for the benefit of the general public and to actively seek to improve the state of
technology.
One final observation we make about using podcasts over interviews is that podcasts are serial
productions and therefore a collection of episodes may represent multiple contributions from the
same podcaster and be distributed over a significant period of time. For example, four of our nine
podcasters contributed to 21 of the 28 total episodes. Our podcast episode release dates spanned
from March 4, 2016, to June 30, 2018, a period during which Amazon Echo alone released multiple
new hardware platforms and no doubt underwent several software updates. By contrast, our
interview study did not include follow-ups with the same participants, and data collection was
relatively contemporaneous. A challenge of analyzing podcast data is that it becomes difficult
to contextualize the data in terms of the cultural moment and the technical capabilities of the
platform at a particular time. This may also constitute a form of repeated sampling that is prevalent
among studies of disability [Dee and Hanson 2016]. However, this may also sidestep significant
ethical dilemmas related to overly burdening people with disabilities, who may be particularly
susceptible to “undue inducement,” when compensation for study participation is excessive to
the point of coercion [Ballantyne 2019], as inequitable economic structures disproportionately
impoverish people with disabilities [Palmer 2011]. A potential benefit of leveraging podcasts is
that data are longitudinal and may therefore allow researchers to place more confidence in results
and identify patterns over time.
In summary, podcasts as a data source for understanding technology design with the blind
population—just like any source of data—come with their own strengths and weaknesses. As “found”
texts, they offer a sort of persuasive credibility that must be questioned and contextualized with
respect to the authors, medium, and audience. With podcasts, some common ethical and method-
ological concerns are mitigated (such as undue inducement or recruiting challenges), and others
are raised (such as non-consensual data collection). We believe that future work may also bene-
fit from integrating podcasts, not simply as a means of triangulation, but perhaps as a primary
and preliminary data gathering mechanism. As such, podcasts may reduce the burden on blind
participants to generate elicited texts to address research questions for which extant data may be
sufficient.


6 CONCLUSION AND FUTURE WORK


Voice-Activated Personal Assistants (VAPAs) are increasingly part of the everyday lives of people
around the world, regardless of (dis)ability. As this technology emerges, studies routinely conclude
that its interfaces present myriad usability challenges and opportunities. Our study contributes
a foundational account of how people who are blind use these voice-based interfaces to
set reminders, add calendar events, access recipes while cooking—similar to many voice-assistant
users—and also stretch them to their limits by using multiple VAPAs to interact with a broad range
of applications and appliances and to accomplish work- and productivity-oriented tasks. Moreover,
we find that people who are blind—and especially blind technology podcasters—are expert voice-
interface users who can and should be consulted during VAPA design from the very beginning.
The design insights we gained from our study—that VAPAs should have multiple personas and
have continuity of voice interaction—are recommendations that have the potential to improve the
usability of VAPAs for blind and sighted individuals.

ACKNOWLEDGMENTS
We thank Cary Chin, Marc Lazaga, Priyanka Soni, Sidas Saulynas, and Mei-Lian Vader for their
help with data gathering and analysis. Additionally, we thank our generous blind participants, as
well as the blind podcasters and guests, for their labor in sharing their knowledge of voice interfaces.

REFERENCES
A. Abdolrahmani, R. Kuber, and A. Hurst. 2016. An empirical investigation of the situationally-induced impairments expe-
rienced by blind mobile device users. In Proceedings of the 13th Web for All Conference (W4A’16). 21, 1–8.
A. Abdolrahmani, R. Kuber, and S. M. Branham. 2018. Siri talks at you: An empirical investigation of voice-activated
personal assistant (VAPA) usage by individuals who are blind. In Proceedings of the 20th International ACM SIGACCESS
Conference on Computers and Accessibility (ASSETS’18). 249–258.
J. Albouys-Perrois, J. Laviole, C. Briant, and A. M. Brock. 2018. Towards a multisensory augmented reality map for blind
and low vision people: A participatory design approach. In Proceedings of the CHI Conference on Human Factors in
Computing Systems (CHI’18). ACM, New York, NY, 629:1–629:14. https://doi.org/10.1145/3173574.3174203
L. Anthony, Y. Kim, and L. Findlater. 2013. Analyzing user-generated YouTube videos to understand touchscreen use by
people with motor impairments. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 1223–
1232.
S. Azenkot and N. B. Lee. 2013. Exploring the use of speech input by blind people on mobile devices. In Proceedings of the
15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’13). 11, 1–8.
A. Ballantyne. 2019. Fair compensation or undue inducement? Yale Interdisciplinary Center for Bioethics. Retrieved
from: https://bioethics.yale.edu/research/irb-case-studies/irb-case-payments-subjects-who-are-substance-abusers/
fair-compensation-or.
M. Baldauf, R. Bösch, C. Frei, F. Hautle, and M. Jenny. 2018. Exploring requirements and opportunities of conversational
user interfaces for the cognitively impaired. In Proceedings of the 20th International Conference on Human-Computer
Interaction with Mobile Devices and Services Adjunct. 119–126.
F. Bentley, C. Luvogt, M. Silverman, R. Wirasinghe, B. White, and D. Lottridge. 2018. Understanding the long-term use of
smart speaker assistants. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 3 (2018), 91.
T. Bogers, A. A. A. Al-Basri, C. O. Rytlig, M. E. B. Møller, M. J. Rasmussen, N. K. B. Michelsen, and S. G. Jørgensen. 2019. A
study of usage and usability of intelligent personal assistants in Denmark. In Proceedings of the International iConference
(iConference’19).
E. C. Bouck, S. Flanagan, G. S. Joshi, W. Sheikh, and D. Schleppenbach. 2011. Speaking math—A voice input, speech output
calculator for students with visual impairments. J. Spec. Educ. Technol. 26, 4 (2011), 1–14.
S. M. Branham, A. Abdolrahmani, W. Easley, M. Scheuerman, E. Ronquillo, and A. Hurst. 2017. Is someone there? Do they
have a gun?: How visual information about others can improve personal safety management for blind individuals. In
Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’17). 260–269.
S. M. Branham and S. K. Kane. 2015. Collaborative accessibility: How blind and sighted companions co-create accessible
home spaces. In Proceedings of the 33rd ACM Conference on Human Factors in Computing Systems (CHI’15). ACM, New
York, NY, 2373–2382. https://doi.org/10.1145/2702123.2702511.
S. M. Branham and A. R. Mukkath Roy. 2019. Reading between the guidelines: How commercial voice assistant guidelines
hinder accessibility for blind users. In Proceedings of the ACM SIGACCESS Conference on Computers & Accessibility
(ASSETS’19).
R. N. Brewer, M. Cartwright, E. Karp, B. Pardo, and A. M. Piper. 2016. An approach to audio-only editing for visually
impaired seniors. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility
(ASSETS’16). ACM, New York, NY, 307–308. https://doi.org/10.1145/2982142.2982196
R. N. Brewer and A. M. Piper. 2017. Rethinking design for aging and accessibility through an IVR blogging system. Proc.
ACM Hum.-Comput. Interact. 1, CSCW, Article 26 (December 2017), 17 pages. https://doi.org/10.1145/3139354
C. Carroll, C. Chiodo, A. Lin, M. Nidever, and J. Prathipati. 2017. Robin: Enabling independence for individuals with cog-
nitive disabilities using voice assistive technology. In Proceedings of the CHI Conference Extended Abstracts on Human
Factors in Computing Systems. 46–53.
K. Charmaz. 2006. Constructing Grounded Theory: A Practical Guide through Qualitative Analysis. Sage.
M. L. Chen and H. C. Wang. 2018. How personal experience and technical knowledge affect using conversational agents.
In Proceedings of the 23rd International Conference on Intelligent User Interfaces Companion (IUI’18). 53:1–2.
C. Chhetri and V. G. Motti. 2019. Eliciting privacy concerns for smart home devices from a user centered perspective. In
Proceedings of the International iConference (iConference’19).
A. Coskun-Setirek and S. Mardikyan. 2017. Understanding the adoption of voice activated personal assistants. Int. J. E-Serv.
Mob. Appl. 9, 3 (2017), 1–21.
B. R. Cowan, N. Pantidi, D. Coyle, K. Morrissey, P. Clarke, S. Al-Shehri, D. Earley, and N. Bandeira. 2017. What can I
help you with?: Infrequent users’ experiences of intelligent personal assistants. In Proceedings of the 19th International
Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI’17). 43, 1–12.
A. E. Chung, A. C. Griffin, D. Selezneva, and D. Gotz. 2018. Health and fitness apps for hands-free voice-activated assistants:
Content analysis. JMIR mHealth uHealth 6, 9 (2018), e174.
M. Dee and V. L. Hanson. 2016. A pool of representative users for accessibility research: Seeing through the eyes of the
users. ACM Trans. Access. Comput. 8, 1 (2016), 4.
A. Easwara Moorthy and K. P. L. Vu. 2015. Privacy concerns for use of voice activated personal assistant in the public space.
Int. J. Hum.-Comput. Interact. 31, 4 (2015), 307–335.
C. Efthymiou and M. Halvey. 2016. Evaluating the social acceptability of voice based smartwatch search. In Proceedings of
the Asia Information Retrieval Symposium. 267–278.
I. Guy. 2016. Searching by talking: Analysis of voice queries on mobile web search. In Proceedings of the 39th International
ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’16). 35–44.
S. Han and H. Yang. 2018. Understanding adoption of intelligent personal assistants: A parasocial relationship perspective.
Industr. Manag. Data Syst. 118, 3 (2018), 618–636.
J. Lazar, A. Allen, J. Kleinman, and C. Malarkey. 2007. What frustrates screen reader users on the web: A study of 100 blind
users. Int. J. Hum.–Comput. Interact. 22, 3 (2007), 247–269.
Y. Liao, J. Vitak, P. Kumar, M. Zimmer, and K. Kritikos. 2019. Understanding the role of privacy and trust in intelligent
personal assistant adoption. In Proceedings of the International iConference (iConference’19).
I. Lopatovska, K. Rink, I. Knight, K. Raines, K. Cosenza, H. Williams, P. Sorsche, D. Hirsch, Q. Li, and A. Martinez. 2018.
Talk to me: Exploring user interactions with the Amazon Alexa. J. Librar. Inform. Sci. 51, 4 (2018), 984–997.
S. B. Lovato and A. M. Piper. 2015. Siri, is this you?: Understanding young children’s interactions with voice input systems.
In Proceedings of the 14th International Conference on Interaction Design and Children (IDC’15). ACM, New York, NY,
335–338. https://doi.org/10.1145/2771839.2771910
S. B. Lovato, A. M. Piper, and E. A. Wartella. 2019. Hey Google, do unicorns exist?: Conversational agents as a path to an-
swers to children’s questions. In Proceedings of the 18th ACM International Conference on Interaction Design and Children
(IDC’19). ACM, New York, NY, 301–313. https://doi.org/10.1145/3311927.3323150
E. Luger and A. Sellen. 2016. Like having a really bad PA: The gulf between user expectation and experience of conversa-
tional agents. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI’16). 5286–5297.
W. E. Mackay and A. L. Fayard. 1997. HCI, natural science and design: A framework for triangulation across disciplines. In
Proceedings of the 2nd Conference on Designing Interactive Systems: Processes, Practices, Methods and Techniques. 223–234.
N. Mallat, V. Tuunainen, and K. Wittkowski. 2017. Voice activated personal assistants–Consumer use contexts and usage
behavior. In Proceedings of the Americas Conference on Information Systems (AMCIS’17). Retrieved from: https://aisel.
aisnet.org/cgi/viewcontent.cgi?article=1548&context=amcis2017.
A. R. Mukkath Roy, A. Abdolrahmani, R. Kuber, and S. M. Branham. 2019. Beyond being human: The (In) accessibility
consequences of modeling VAPAs after human-human conversation. In Proceedings of the International iConference
(iConference’19).
A. M. Mulloy, C. Gevarter, M. Hopkins, K. S. Sutherland, and S. T. Ramdoss. 2014. Assistive technology for students with
visual impairments and blindness. Assistive Technologies for People with Diverse Abilities. Springer, 113–156.
E. Murphy, R. Kuber, G. McAllister, P. Strain, and W. Yu. 2008. An empirical investigation into the difficulties experienced
by visually impaired internet users. Univ. Access Inform. Soc. 7, 1–2 (2008), 79–91.
H. Nicolau, K. Montague, T. Guerreiro, A. Rodrigues, and V. L. Hanson. 2017. Investigating laboratory and everyday typing
performance of blind users. ACM Trans. Access. Comput. 10, 1, Article 4 (2017), 4:1–4:26. https://doi.org/10.1145/3046785
M. Palmer. 2011. Disability and poverty: A conceptual review. J. Disab. Policy Stud. 21, 4 (2011), 210–218.
F. Portet, M. Vacher, C. Golanski, C. Roux, and B. Meillon. 2013. Design and evaluation of a smart home voice interface for
the elderly: Acceptability and objection aspects. Person. Ubiq. Comput. 17, 1 (2013), 127–144.
A. Pradhan, K. Mehta, and L. Findlater. 2018. Accessibility came by accident: Use of voice-controlled intelligent personal
assistants by people with disabilities. In Proceedings of the CHI Conference on Human Factors in Computing Systems
(CHI’18). 459:1–13.
L. Ran, S. Helal, and S. Moore. 2004. Drishti: An integrated indoor/outdoor blind navigation system and service. In Pro-
ceedings of the 2nd IEEE Conference on Pervasive Computing and Communications (PerCom’04). 23–30. https://doi.org/
10.1109/PERCOM.2004.1276842
S. H. Reinharz and L. Davidman. 1992. Feminist Methods in Social Research. Oxford University Press.
A. Rodrigues, L. Camacho, H. Nicolau, K. Montague, and T. Guerreiro. 2018. Aidme: Interactive non-visual smartphone
tutorials. In Proceedings of the 20th International Conference on Human-Computer Interaction with Mobile Devices and
Services Adjunct (MobileHCI’18). ACM, New York, NY, 205–212. https://doi.org/10.1145/3236112.3236141
G. Schiavo, O. Mich, M. Ferron, and N. Mana. 2017. Mobile multimodal interaction for older and younger users: Exploring
differences and similarities. In Proceedings of the 16th International Conference on Mobile and Ubiquitous Multimedia
(MUM’17). 407–414.
A. Sears and V. Hanson. 2011. Representing users in accessibility research. In Proceedings of the SIGCHI Conference on
Human Factors in Computing Systems. 2235–2238.
Kristen Shinohara and Josh Tenenberg. 2007. Observing Sara: A case study of a blind person’s interactions with technology.
In Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’07). ACM,
New York, NY, 171–178. https://doi.org/10.1145/1296843.1296873
K. Shinohara and J. O. Wobbrock. 2011. In the shadow of misperception: Assistive technology use and social interactions.
In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’11). 705–714.
K. Shinohara, J. O. Wobbrock, and W. Pratt. 2018. Incorporating social factors in accessible design. In Proceedings of the 20th
International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’18). ACM, New York, NY, 149–160.
Kevin M. Storer and Stacy M. Branham. 2019. That’s the way sighted people do it: What blind parents can teach technology
designers about co-reading with children. In Proceedings of the Designing Interactive Systems Conference (DIS’19). ACM,
New York, NY, 385–398. https://doi.org/10.1145/3322276.3322374
M. F. Story, J. L. Mueller, and R. L. Mace. 1998. The universal design file: Designing for people of all ages and abilities.
School of Design, the Center for Universal Design, NC State, Raleigh.
A. Vtyurina, A. Fourney, M. R. Morris, L. Findlater, and R. W. White. 2019a. Bridging screen readers and voice assistants
for enhanced eyes-free web search. In Proceedings of the World Wide Web Conference. ACM, 3590–3594.
A. Vtyurina, A. Fourney, M. R. Morris, L. Findlater, and R. W. White. 2019b. VERSE: Bridging screen readers and voice
assistants for enhanced eyes-free web search. In Proceedings of the 21st International ACM SIGACCESS Conference on
Computers and Accessibility. ACM.
L. Wulf, M. Garschall, J. Himmelsbach, and M. Tscheligi. 2014. Hands free—care free: Elderly people taking advantage of
speech-only interaction. In Proceedings of the 8th Nordic Conference on Human-Computer Interaction: Fun, Fast, Founda-
tional (NordiCHI’14). 203–206.
C. Yu, H. Shane, R. W. Schlosser, A. O’Brien, A. Allen, J. Abramson, and S. Flynn. 2018. An exploratory study of speech-
language pathologists using the Echo Show™ to deliver visual supports. Adv. Neurodev. Disord. 2, 3 (2018), 286–292.
Y. Zhong, T. V. Raman, C. Burkhardt, F. Biadsy, and J. P. Bigham. 2014. JustSpeak: Enabling universal voice control on
Android. In Proceedings of the Web for All Conference (W4A’14). 36, 1–4.

Received April 2019; revised September 2019; accepted October 2019
