
Received: 14 June 2022 Revised: 1 December 2022 Accepted: 15 December 2022

DOI: 10.1002/widm.1487

ADVANCED REVIEW

Review of artificial intelligence-based question-answering systems in healthcare

Leona Cilar Budler 1 | Lucija Gosak 1 | Gregor Stiglic 1,2,3

1 Faculty of Health Sciences, University of Maribor, Maribor, Slovenia
2 Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
3 Usher Institute, University of Edinburgh, Edinburgh, UK

Correspondence
Gregor Stiglic, Faculty of Health Sciences, University of Maribor, Zitna ulica 15, 2000 Maribor, Slovenia.
Email: gregor.stiglic@um.si

Funding information
Javna Agencija za Raziskovalno Dejavnost RS, Grant/Award Numbers: N2-0101, P2-0057

Edited by: Elisa Bertino, Associate Editor and Witold Pedrycz, Editor-in-Chief

Abstract
Use of conversational agents, like chatbots, avatars, and robots is increasing worldwide. Yet, their effectiveness in health care is largely unknown. The aim of this advanced review was to assess the use and effectiveness of conversational agents in various fields of health care. A literature search, analysis, and synthesis were conducted in February 2022 in PubMed and CINAHL. The included evidence was analyzed narratively by employing the principles of thematic analysis. We reviewed articles on artificial intelligence-based question-answering systems in health care. Most of the identified articles report its effectiveness; less is known about its use. We outlined study findings and explored directions of future research, to provide evidence-based knowledge about artificial intelligence-based question-answering systems.

This article is categorized under:
Fundamental Concepts of Data and Knowledge > Human Centricity and User Interaction
Application Areas > Health Care
Technologies > Artificial Intelligence

KEYWORDS
artificial intelligence, conversational agents, ChatGPT, health care, machine learning

1 | INTRODUCTION

Conversational agents, like chatbots, avatars, robots, embodied conversational agents (ECAs), and virtual patients, are
increasingly substituting humans, especially in applications with simple interaction between the computer and the
end-user (Van Pinxteren et al., 2020). Car et al. (2020) defined conversational agents as “computer programs designed
to simulate human text or verbal conversations.” They use artificial intelligence (AI) algorithms to interpret user dialog
and conduct useful interactions with users (Montenegro et al., 2019). Users interact with conversational agents for
different reasons, as seen in the variety of available conversational agents, which range from popular
general-purpose voice assistants to domain-specific chatbots (van Heerden et al., 2017). Conversational agents are often
used to provide relevant information about a disease or to discuss the results of clinical tests (van Heerden et al., 2017).

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided
the original work is properly cited.
© 2023 The Authors. WIREs Data Mining and Knowledge Discovery published by Wiley Periodicals LLC.

WIREs Data Mining Knowl Discov. 2023;13:e1487. wires.wiley.com/dmkd
https://doi.org/10.1002/widm.1487

Employing conversational agents in clinical practice has various benefits for the health care system, such as support
of health care professionals and patients, disease screening, triage, counseling, health management, and training for
health care professionals (Milne-Ives et al., 2020). In recent years, health-focused apps with chatbots, also called
“healthbots,” have developed rapidly. Despite the marketing promises, a study by Parmar et al. (2022) revealed
that only a minority of applications use machine learning-based natural language processing techniques. Most applications
supported a finite-state input, in which the system guided the discourse and followed a preset procedure.
Healthbots have the potential to transform care by focusing it on the user; yet they are still at an early stage and
need further work on development, automation, and adoption to have a population-level health effect.
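
To make the finite-state design described above more concrete, the following minimal Python sketch shows a dialog in which the system guides the discourse along a preset procedure. The state names, prompts, transitions, and the `run_dialog` helper are hypothetical illustrations. They are not taken from any of the reviewed applications and carry no clinical meaning.

```python
# Hypothetical finite-state dialog: the system, not the user, steers the conversation.
# All state names, prompts, and transitions below are illustrative assumptions.
STATES = {
    "start": {
        "prompt": "Do you have a fever? (yes/no)",
        "next": {"yes": "duration", "no": "end_ok"},
    },
    "duration": {
        "prompt": "Has it lasted more than three days? (yes/no)",
        "next": {"yes": "end_contact", "no": "end_monitor"},
    },
    # Terminal states have no outgoing transitions.
    "end_ok": {"prompt": "No fever reported. Goodbye.", "next": {}},
    "end_contact": {"prompt": "Please consider contacting a health care professional.", "next": {}},
    "end_monitor": {"prompt": "Please keep monitoring your symptoms.", "next": {}},
}


def run_dialog(answers, states=STATES, start="start"):
    """Feed a list of user answers through the preset procedure.

    Returns the list of prompts the system would show, in order.
    """
    state = start
    transcript = [states[state]["prompt"]]
    for reply in answers:
        transitions = states[state]["next"]
        if not transitions:  # terminal state reached: ignore further input
            break
        # Unrecognized input keeps the user in the same state (the prompt is repeated).
        state = transitions.get(reply.strip().lower(), state)
        transcript.append(states[state]["prompt"])
    return transcript


if __name__ == "__main__":
    for line in run_dialog(["yes", "no"]):
        print(line)
```

Because every possible path is enumerated in `STATES`, such a system needs no natural language processing. It also cannot respond to anything outside the preset procedure.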

2 | CLASSIFICATION OF QUESTION–ANSWERING SYSTEMS

There are different classifications of question-answering systems (QAS) in use. Nißen et al. (2022) classify QAS based
on their “temporal profile,” basically dividing them into short-term and long-term QAS. Short-term QAS are systems with
which users usually communicate just once, for example, self-diagnosis health care chatbots, whereas long-term QAS support
repeated, ongoing interactions over a longer period, for example, a health care chatbot assisting patients with chronic
illness management (Kowatsch et al., 2018). It is also possible to classify QAS based on the
knowledge or answer representation, where two main categories are usually used: flat and hierarchical
taxonomies. Flat taxonomies have only one level of categories, while hierarchical taxonomies consist of super- and
subcategories (Sundblad, 2007). Recently, these two categories have been extended by knowledge graph-based
systems (Figure 1), where answers are provided based on data storage structures that rely on principles from graph theory
to represent information (Pietrasik & Reformat, 2020). Knowledge graphs are multi-relational graphs that use triples to
describe entities and their relationships. A triple is often stated as (head entity, relation, tail entity) to indicate that two
entities are associated by a certain connection, such as (Metformin, treats, Type 2 diabetes mellitus). More recently,
knowledge graph embedding algorithms have been gaining attention as an integral part of QAS. These algorithms
learn representations (i.e., embeddings) of entities and relations in low-dimensional vector spaces, and
these embeddings may be used for extremely quick question answering even in large knowledge graphs (Wu
et al., 2022).
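
As an illustration of the triple representation and of embedding-based answering, the sketch below stores a few (head entity, relation, tail entity) triples and answers a question in two ways: by symbolic lookup and by a translation-style score in which the head vector plus the relation vector should land near the tail vector. This scoring follows the well-known TransE idea, chosen here only as an example, since the text does not name a specific algorithm. The vectors are hand-picked two-dimensional values rather than learned embeddings, and every triple except the Metformin example from the text is a hypothetical addition.

```python
import math

# Triples as (head entity, relation, tail entity). The first triple is the example
# from the text; the other two are hypothetical additions for illustration only.
TRIPLES = [
    ("Metformin", "treats", "Type 2 diabetes mellitus"),
    ("Insulin", "treats", "Type 1 diabetes mellitus"),
    ("Metformin", "is_a", "Biguanide"),
]


def answer(head, relation, triples=TRIPLES):
    """Symbolic lookup: return every tail that completes (head, relation, ?)."""
    return [t for h, r, t in triples if h == head and r == relation]


# Hand-picked 2-D vectors (assumption: a real system would learn these from data).
# They are chosen so that head + relation lands on the correct tail.
ENTITY_VEC = {
    "Metformin": (1.0, 0.0),
    "Insulin": (0.0, 1.0),
    "Type 2 diabetes mellitus": (2.0, 1.0),
    "Type 1 diabetes mellitus": (1.0, 2.0),
    "Biguanide": (1.0, -1.0),
}
RELATION_VEC = {"treats": (1.0, 1.0), "is_a": (0.0, -1.0)}


def embedding_answer(head, relation):
    """Return the entity whose vector is closest to head + relation."""
    hx, hy = ENTITY_VEC[head]
    rx, ry = RELATION_VEC[relation]
    target = (hx + rx, hy + ry)
    candidates = [e for e in ENTITY_VEC if e != head]
    return min(candidates, key=lambda e: math.dist(target, ENTITY_VEC[e]))


if __name__ == "__main__":
    print(answer("Metformin", "treats"))
    print(embedding_answer("Metformin", "treats"))
```

The lookup scans the stored triples, while the embedding variant reduces answering to a nearest-neighbor search in vector space, which is what makes it attractive for large graphs.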

3 | AI-BASED QAS IN HEALTH CARE

To characterize the current state of research in QAS, we conducted a brief literature review, analysis, and synthesis
in February 2022. First, a study question was developed based on the preliminary literature search and
following recommendations by Stillwell et al. (2010). The preliminary search was conducted to identify relevant articles,
ensure the validity of the proposed idea, and avoid duplication. We included papers that described conversational agents
like chatbots, avatars, and robots and explored their use and/or effectiveness. The literature search was not limited by
study type or accessibility. One researcher conducted the literature search in CINAHL and PubMed using the following
search string: (question-answering system OR conversational agent) AND (health care OR nurs* OR medic*) AND
(evaluat* OR accepta*). Additionally, Google Scholar was used to identify more relevant studies.
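
The structure of the search string can be sketched in a few lines of Python: terms within a concept group are joined with OR, and the groups are joined with AND. The terms are copied from the string reported above. The `build_query` helper and the variable names are hypothetical, and database-specific details such as field tags or phrase quoting are deliberately left out.

```python
# Concept groups copied from the search string reported in the text.
GROUPS = [
    ["question-answering system", "conversational agent"],  # type of system
    ["health care", "nurs*", "medic*"],                     # setting (* = truncation)
    ["evaluat*", "accepta*"],                               # outcome of interest
]


def build_query(groups):
    """OR the terms inside each group, then AND the groups together."""
    return " AND ".join("(" + " OR ".join(terms) + ")" for terms in groups)


if __name__ == "__main__":
    print(build_query(GROUPS))
```

Keeping the groups as data makes it easy to document or adjust a single concept without retyping the whole string.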
A total of 65 papers were included in our analysis. Table A1 provides characteristics of included studies, such as
study aim, type of conversational agent, the field of health care, study sample, and main study findings.
It is evident that conversational agents are most frequently used in the field of psychiatry (n = 18), public health
and preventive medicine (n = 7), and general health care (n = 7) (Figure 2).
The effectiveness and use of conversational agents were checked among the general population (n = 23), patients
(n = 14), students (n = 8), health care professionals (n = 3), patient family members (n = 2), and other (health care
professionals and pharmacist, health care professionals and researchers, health care professionals and patients, and
patients, carer, general population; n = 6) (Figure 3).
Most QA systems were designed to empower or improve mental health in company employees/students/patients
with mental illness (n = 13; 23%), followed by mental illness screening systems, behavior change techniques and pro-
grams to reduce/treat smoking and/or alcohol dependence (n = 3; 5%) (Figure 4).
Conversational agents are seen as useful (Bibault et al., 2019; Inkster et al., 2018) and easy to use
(Cameron et al., 2019). However, Zuchowski and Göller (2022) reported that health care professionals preferred to type medical
notes rather than dictate them using speech recognition software. Speech recognition software reduces the amount of time needed
to complete medical documentation. Error rates were also lower with speech recognition software than with
the traditional methods. Nevertheless, acceptance of new technology among clinicians is limited, which can possibly be
explained by misconceptions about the accuracy of this technology. Mujeeb et al. (2017) show that conversational
agents can assist practicing psychologists and save time and resources.

FIGURE 1 Answer type taxonomies in question answering systems

Among students, conversational agents were useful in improving confidence in communication
skills (Borja-Hart et al., 2019), reducing symptoms of depression and anxiety (Fitzpatrick et al., 2017), and in the
development of clinical reasoning and history-taking skills (Isaza-Restrepo et al., 2018). Students' willingness to engage
with AI-led health chatbots depends on their IT skills, perceived utility, attitude, and perceived trustworthiness (Nadarzynski
et al., 2019).
Conversational agents were also useful for patients (Chaix et al., 2019; Denecke et al., 2020; Inkster et al., 2018;
Kowatsch et al., 2017; Philip et al., 2017) and helped them to track treatment effectively (Chaix et al., 2019; Dworkin
et al., 2019; Wang et al., 2018). They are most frequently studied in the field of psychiatry. The integration of conversational
agents for mental health was shown to be acceptable to both mental health professionals and users (Danieli
et al., 2021). Sun et al. (2018) reported that patients found the conversational agents easy to use and quick to learn, with
minimal required interaction. However, conversational agents are relatively new, and their usefulness
and effectiveness can be properly assessed only if users keep using them. Bickmore et al. (2018) reported that only 15% of patients used a
conversational assistant regularly, while Wolters et al. (2016) found that the intensity and style of the cognitive assistant's
voice must be adjusted to the user's wishes. The realistic appearance of a virtual agent and its responsiveness play a major
role in engaging users with the QAS (Ali et al., 2020). Milne-Ives et al. (2020) conducted a literature review and found
mixed evidence of the effectiveness and resilience of conversational agents.

FIGURE 2 Field of application of conversational agents

FIGURE 3 Proportion of question answering systems user groups

FIGURE 4 Different purposes for QA systems

4 | OBSERVATIONS, OPPORTUNITIES, AND CHALLENGES

Question answering is an active topic in the field of natural language processing. Building such systems poses a major challenge,
as traditional systems are not sufficient for some knowledge domains (Huang et al., 2021). The difficulty in the biomedical
field is that most current systems only handle a limited number of questions and responses, requiring further effort
to increase their effectiveness (Zhou et al., 2018). One of the current solutions is the mapping of new questions to
previously answered questions that are “similar” (Ben Abacha & Demner-Fushman, 2019).
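
The idea of mapping a new question to a similar, previously answered one can be sketched as a simple retrieval step. The example below uses Jaccard overlap of word sets as a deliberately crude stand-in for similarity. It is not the question-entailment approach of Ben Abacha and Demner-Fushman (2019). The stored questions, the placeholder answers, the threshold value, and all function names are hypothetical.

```python
import re

# Hypothetical store of previously answered questions. The answers are placeholders,
# not medical content.
ANSWERED = {
    "What are the side effects of metformin?": "answer A",
    "How is type 2 diabetes diagnosed?": "answer B",
    "What foods should I avoid with high blood pressure?": "answer C",
}


def tokens(text):
    """Lowercased set of alphanumeric words; a deliberately crude normalization."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(question, answered=ANSWERED, threshold=0.2):
    """Return (stored question, answer) for the best match, or None if too dissimilar."""
    q = tokens(question)
    best = max(answered, key=lambda stored: jaccard(q, tokens(stored)))
    if jaccard(q, tokens(best)) < threshold:
        return None  # no stored question is close enough to reuse its answer
    return best, answered[best]


if __name__ == "__main__":
    print(most_similar("Which side effects does metformin have?"))
    print(most_similar("Where is the nearest pharmacy?"))
```

Word overlap ignores paraphrase and negation, which is one reason entailment-based or embedding-based similarity is preferred in practice. The threshold guards against reusing an answer for an unrelated question.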
Conversational agents built to mimic human text or voice discussions are being employed in several industries,
including healthcare, as they can improve patient care by enhancing accessibility, personalization, and efficiency (Car
et al., 2020).
Most published reports on text-based conversational agents mention the use of AI and smartphone apps (Car
et al., 2020). The authors report that study participants who experienced interactions with speech and augmented
reality in addition to interactions with text-based conversational agents reported higher user engagement and involvement.
Human-like conversational agents with verbal and non-verbal cues improve user engagement through interactivity and
empathy (Chew, 2022).
In contrast to a review on a similar topic by Luo et al. (2022), we focused our review on the effectiveness and usage of
QAS rather than on their technical aspects.

5 | CONCLUSION

This review provides some characteristics of the current QAS in healthcare. It was demonstrated that conversational
agents are seen as practical and user-friendly. Additionally, they may save time and resources.
QAS have been successfully employed in many areas of health care. For example, conversational
agents were beneficial for enhancing students' confidence in their communication abilities, lowering symptoms of
depression and anxiety when delivering cognitive behavior therapy, and fostering clinical reasoning. Additionally, it was
shown that patients often utilize QAS to monitor their therapy. Engagement with AI-driven health chatbots relies on
the user's IT literacy, perceived benefit, attitude, and sense of trust.

AUTHOR CONTRIBUTIONS
Leona Cilar Budler: Data curation (equal); formal analysis (equal); investigation (equal); methodology (equal). Lucija
Gosak: Data curation (equal); formal analysis (equal); investigation (equal); methodology (equal). Gregor Stiglic:
Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal).

ACKNOWLEDGMENT
Not applicable.

FUNDING INFORMATION
This work was supported by the Slovenian Research Agency grants ARRS N2-0101, ARRS N3-0307, and ARRS P2-0057.

CONFLICT OF INTEREST
The authors have declared no conflicts of interest for this article.

DATA AVAILABILITY STATEMENT


Data sharing is not applicable to this article as no new data were created or analyzed in this study.

ORCID
Leona Cilar Budler https://orcid.org/0000-0002-6842-7751
Lucija Gosak https://orcid.org/0000-0002-8742-6594
Gregor Stiglic https://orcid.org/0000-0002-0183-8679

RELATED WIREs ARTICLE
A critical review of state-of-the-art chatbot designs and applications

FURTHER READING
Cheng, A., Raghavaraju, V., Kanugo, J., Handrianto, Y. P., & Shang, Y. (2018, January). Development and evaluation of a healthy coping
voice interface application using the Google home for elderly patients with type 2 diabetes. In Proceedings of the 2018 15th IEEE Annual
Consumer Communications & Networking Conference (CCNC) (pp. 1–5). IEEE.
Demner-Fushman, D., Mrabet, Y., & Ben Abacha, A. (2020). Consumer health information and question answering: Helping consumers find
answers to their health-related information needs. Journal of the American Medical Informatics Association, 27(2), 194–201.
Ghosh, S., Bhatia, S., & Bhatia, A. (2018). Quro: Facilitating user symptom check using a personalised chatbot-oriented dialogue system.
Studies in Health Technology and Informatics, 252, 51–56.
Greuter, S., Balandin, S., & Watson, J. (2019, October). Social games are fun: Exploring social interactions on smart speaker platforms for
people with disabilities. In Extended Abstracts of the Annual Symposium on Computer-Human Interaction in Play Companion Extended
Abstracts (pp. 429–435).
Kocaballi, A. B., Berkovsky, S., Quiroz, J. C., Laranjo, L., Tong, H. L., Rezazadegan, D., Briatore, A., & Coiera, E. (2019). The personalization
of conversational agents in health care: Systematic review. Journal of Medical Internet Research, 21(11), e15360.
Laranjo, L., Dunn, A. G., Tong, H. L., Kocaballi, A. B., Chen, J., Bashir, R., Surian, D., Gallego, B., Magrabi, F., Lau, A. Y. S., & Coiera, E.
(2018). Conversational agents in health care: A systematic review. Journal of the American Medical Informatics Association, 25(9),
1248–1258.
Lobo, J., Ferreira, L., & Ferreira, A. J. (2017). CARMIE: A conversational medication assistant for heart failure. International Journal of
E-Health and Medical Communications (IJEHMC), 8(4), 21–37.
Moher, D., Liberati, A., Tetzlaff, J., Altman, D., & PRISMA Group. (2009). Preferred reporting items for systematic reviews and
meta-analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097.

Preininger, A. M., Rosario, B. L., Buchold, A. M., Heiland, J., Kutub, N., Bohanan, B. S., South, B., & Jackson, G. P. (2021). Differences in
information accessed in a pharmacologic knowledge base using a conversational agent vs traditional search methods. International
Journal of Medical Informatics, 153, 104530.
Schachner, T., Keller, R., & von Wangenheim, F. (2020). Artificial intelligence-based conversational agents for chronic conditions: Systematic
literature review. Journal of Medical Internet Research, 22(9), e20701.
Tawfik, G. M., Dila, K. S., Mohamed, M. F., Tam, D. H., Kien, N. D., Ahmed, A. M., & Huy, N. T. (2019). A step by step guide for conducting
a systematic review and meta-analysis with simulation data. Tropical Medicine and Health, 47(1), 1–9.
The GRADE Working Group. (2019). Grading of recommendations, assessment, development and evaluations (GRADE). Retrieved from
http://www.gradeworkinggroup.org
Vaidyam, A. N., Linggonegoro, D., & Torous, J. (2021). Changes to the psychiatric chatbot landscape: A systematic review of conversational
agents in serious mental illness: Changements du paysage psychiatrique des chatbots: Une revue systématique des agents
conversationnels dans la maladie mentale sérieuse. The Canadian Journal of Psychiatry, 66(4), 339–348.

REFERENCES
Abdullah, A. S., Gaehde, S., & Bickmore, T. (2018). A tablet based embodied conversational agent to promote smoking cessation among
veterans: A feasibility study. Journal of Epidemiology and Global Health, 8(3–4), 225–230.
Ali, M. R., Razavi, S. Z., Langevin, R., Al Mamun, A., Kane, B., Rawassizadeh, R., Schubert, L., & Hoque, E. (2020, October). A virtual
conversational agent for teens with autism spectrum disorder: Experimental results and design lessons. In Proceedings of the 20th ACM
International Conference on Intelligent Virtual Agents (pp. 1–8).
Amith, M., Zhu, A., Cunningham, R., Lin, R., Savas, L., Shay, L., Gong, Y., Boom, J., Roberts, K., & Tao, C.
(2019). Early usability assessment of a conversational agent for HPV vaccination. Studies in Health Technology and Informatics, 257, 17.
Amith, M., Lin, R., Cunningham, R., Wu, Q. L., Savas, L. S., Gong, Y., Boom, J. A., Tang, L., & Tao, C. (2020). Examining potential usability
and health beliefs among young adults using a conversational agent for HPV vaccine counseling. AMIA Summits on Translational
Science Proceedings, 2020, 43.
Auriacombe, M., Moriceau, S., Serre, F., Denis, C., Micoulaud-Franchi, J. A., de Sevin, E., Bonhomme, E., Bioulac, S., Fatseas, M., &
Philip, P. (2018). Development and validation of a virtual agent to screen tobacco and alcohol use disorders. Drug and Alcohol
Dependence, 193, 1–6.
Beiley, M. R. (2019). Mental health and wellness Chatbot. The University of Arizona.
Ben Abacha, A., & Demner-Fushman, D. (2019). A question-entailment approach to question answering. BMC Bioinformatics, 20(1), 1–23.
Bennion, M. R., Hardy, G. E., Moore, R. K., Kellett, S., & Millings, A. (2020). Usability, acceptability, and effectiveness of web-based
conversational agents to facilitate problem-solving in older adults: A controlled study. Journal of Medical Internet Research, 22(5), e16794.
Bian, Y., Xiang, Y., Tong, B., Feng, B., & Weng, X. (2020). Artificial intelligence–assisted system in postoperative follow-up of orthopedic
patients: Exploratory quantitative and qualitative study. Journal of Medical Internet Research, 22(5), e16896.
Bibault, J.-E., Chaix, B., Guillemassé, A., Cousin, S., Escanade, A., Perrin, M., Pienkowski, A., Delamon, G., Nectoux, P., & Brouard, B.
(2019). A chatbot versus physicians to provide information for patients with breast cancer: Blind, randomized controlled noninferiority
trial. Journal of Medical Internet Research, 21(11), e15787.
Bickmore, T. W., Trinh, H., Olafsson, S., O'Leary, T. K., Asadi, R., Rickles, N. M., & Cruz, R. (2018). Patient and consumer safety risks when
using conversational assistants for medical information: An observational study of Siri, Alexa, and Google Assistant. Journal of Medical
Internet Research, 20(9), e11510.
Borja-Hart, N. L., Spivey, C. A., & George, C. M. (2019). Use of virtual patient software to assess student confidence and ability in
communication skills and virtual patient impression: A mixed-methods approach. Currents in Pharmacy Teaching & Learning, 11(7), 170–718.
Cameron, G., Cameron, D., Megaw, G., Bond, R., Mulvenna, M., O'Neill, S., Armour, C., & McTear, M. (2019). Assessing the usability of a
chatbot for mental health care. In INSCI 2018. Lecture Notes in Computer Science (pp. 121–132). Springer.
Cao, Y., Liu, F., Simpson, P., Antieau, L., Bennett, A., Cimino, J. J., Ely, J., & Yu, H. (2011). AskHERMES: An online question answering
system for complex clinical questions. Journal of Biomedical Informatics, 44(2), 277–288.
Car, L. T., Dhinagaran, D. A., Kyaw, B. M., Kowatsch, T., Joty, S., Theng, Y., & Atun, R. (2020). Conversational agents in health care: Scoping
review and conceptual analysis. Journal of Medical Internet Research, 22(8), e17158.
Chaix, B., Bibault, J.-E., Pienkowski, A., Delamon, G., Guillemasse, A., Nectoux, P., & Brouard, B. (2019). When chatbots meet patients:
One-year prospective study of conversations between patients with breast cancer and a chatbot. JMIR Cancer, 5(1), e12856.
Chavez-Yenter, D., Kimball, K. E., Kohlmann, W., Chambers, R. L., Bradshaw, R. L., Espinel, W. F., Flynn, M., Gammon, A., Goldberg, E.,
Hagerty, K. J., Hess, R., Kessler, C., Monahan, R., Temares, D., Tobik, K., Mann, D. M., Kawamoto, K., Fiol, G. D., Buys, S. S., …
Kaphingst, K. A. (2021). Patient interactions with an automated conversational agent delivering pretest genetics education: Descriptive
study. Journal of Medical Internet Research, 23(11), e29447.
Chew, H. S. J. (2022). The use of artificial intelligence–based conversational agents (chatbots) for weight loss: Scoping review and practical
recommendations. JMIR Medical Informatics, 10(4), e32578.
Chinkam, S., Steer-Massaro, C., Herbey, I., Zhang, Z., Bickmore, T., & Shorten, A. (2021). The perspectives of women and their health-care
providers regarding using an ECA to support mode of birth decisions. The Journal of Perinatal Education, 30(3), 135–144.
Danieli, M., Ciulli, T., Mousavi, S. M., & Riccardi, G. (2021). A conversational artificial intelligence agent for a mental health care app:
Evaluation study of its participatory design. JMIR Formative Research, 5(12), e30053.

Demirci, H. M. (2018). User experience over time with conversational agents: Case study of woebot on supporting subjective well-being [Master's
thesis]. Middle East Technical University.
Denecke, K., Vaaheesan, S., & Arulnathan, A. (2020). A mental health chatbot for regulating emotions (SERMO)-concept and usability test.
IEEE Transactions on Emerging Topics in Computing, 9(3), 1170–1182.
Dimeff, L. A., Jobes, D. A., Chalker, S. A., Piehl, B. M., Duvivier, L. L., Lok, B. C., Zalakea, M. S., Chung, J., & Koerner, K. (2020). A novel
engagement of suicidality in the emergency department: Virtual collaborative assessment and management of suicidality. General
Hospital Psychiatry, 63, 119–126.
Dworkin, M. S., Lee, S., Chakraborty, A., Monahan, C., Hightow-Weidman, L., Garofalo, R., Qato, D. M., Liu, L., & Jimenez, A. (2019).
Acceptability, feasibility, and preliminary efficacy of a theory-based relational embodied conversational agent mobile phone intervention
to promote HIV medication adherence in young HIV-positive African American MSM. AIDS Education and Prevention, 31(1), 17–37.
Easton, K., Potter, S., Bec, R., Bennion, M., Christensen, H., Grindell, C., Mirheidari, B., Weich, S., de Witte, L., Wolstenholme, D., &
Hawley, M. S. (2019). A virtual agent to support individuals living with physical and mental comorbidities: Co-design and acceptability
testing. Journal of Medical Internet Research, 21(5), e12996.
Fitzpatrick, K. K., Darcy, A., & Vierhile, M. (2017). Delivering cognitive behavior therapy to young adults with symptoms of depression and
anxiety using a fully automated conversational agent (Woebot): A randomized controlled trial. JMIR Mental Health, 4(2), e7785.
Fulmer, R., Joerin, A., Gentile, B., Lakerink, L., & Rauws, M. (2018). Using psychological artificial intelligence (Tess) to relieve symptoms of
depression and anxiety: Randomized controlled trial. JMIR Mental Health, 5(4), e64. https://doi.org/10.2196/mental.9782
Gardiner, P. M., McCue, K. D., Negash, L. M., Cheng, T., White, L. F., Yinusa-Nyahkoon, L., Jack, B. W., & Bickmore, T. W. (2017). Engaging
women with an embodied conversational agent to deliver mindfulness and lifestyle recommendations: A feasibility randomized control
trial. Patient Education and Counseling, 100(9), 1720–1729.
Ghandeharioun, A., McDuff, D., Czerwinski, M., & Rowan, K. (2018). EMMA: An emotionally intelligent personal assistant for improving
wellbeing. arXiv preprint arXiv:1812.11423.
Gong, E., Baptista, S., Russell, A., Scuffham, P., Riddell, M., Speight, J., Bird, D., Williams, E., Lotfaliany, M., & Oldenburg, B. (2020). My
diabetes coach, a mobile app–based interactive conversational agent to support type 2 diabetes self-management: Randomized
effectiveness-implementation trial. Journal of Medical Internet Research, 22(11), e20322.
Hauser-Ulrich, S., Künzli, H., Meier-Peterhans, D., & Kowatsch, T. (2020). A smartphone-based health care chatbot to promote
self-management of chronic pain (SELMA): Pilot randomized controlled trial. JMIR mHealth and uHealth, 8(4), e15806.
Håvik, R., Wake, J. D., Flobak, E., Lundervold, A., & Guribye, F. (2018). A conversational interface for self-screening for ADHD in adults. In
Internet science. INSCI 2018. Lecture Notes in Computer Science (pp. 133–144). Springer. https://doi.org/10.1007/978-3-030-17705-8_12
Huang, X., Zhang, J., Xu, Z., Ou, L., & Tong, J. (2021). A knowledge graph based question answering method for medical domain. PeerJ
Computer Science, 7, e667.
Inkster, B., Sarda, S., & Subramanian, V. (2018). An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental
well-being: Real-world data evaluation mixed-methods study. JMIR mHealth and uHealth, 6(11), e12106. https://doi.org/10.2196/12106
Isaza-Restrepo, A., Gomez, M. T., Cifuentes, G., & Argüello, A. (2018). The virtual patient as a learning tool: A mixed quantitative qualitative
study. BMC Medical Education, 18(1), 297. https://doi.org/10.1186/s12909-018-1395-8
Kadariya, D., Venkataramanan, R., Yip, H. Y., Kalra, M., Thirunarayanan, K., & Sheth, A. (2019, June). KBot: Knowledge-enabled
personalized chatbot for asthma self-management. In Proceedings of the 2019 IEEE international conference on smart computing (SMARTCOMP)
(pp. 138–143). IEEE.
Kamita, T., Ito, T., Matsumoto, A., Munakata, T., & Inoue, T. (2019). A chatbot system for mental healthcare based on SAT counseling
method. Mobile Information Systems, 2019, 2019–2011.
Kobori, Y., Osaka, A., Soh, S., & Okada, H. (2018). MP15-03 novel application for sexual transmitted infection screening with an AI chatbot.
The Journal of Urology, 199(4S), e189–e190.
Kocielnik, R., Xiao, L., Avrahami, D., & Hsieh, G. (2018). Reflection companion: A conversational system for engaging users in reflection on
physical activity. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(2), 1–26.
Kowatsch, T., Nißen, M., Rüegger, D., Stieger, M., Flückiger, C., Allemand, M., & von Wangenheim, F. (2018). The impact of interpersonal
closeness cues in text-based health care chatbots on attachment bond and the desire to continue interacting: An experimental design. In
ECIS Proceedings 2018. AIS eLibrary. 46 pp.
Kowatsch, T., Nißen, M., Shih, C. H. I., Rüegger, D., Volland, D., Filler, A., Künzler, F., Barata, F., Haug, S., Büchter, D., Brogle, B.,
Heldt, K., Gindrat, P., & Farpour-Lambert, N. (2017). Text-based health care chatbots supporting patient and health professional teams:
Preliminary results of a randomized controlled trial on childhood obesity. In Persuasive Embodied Agents for Behavior Change (PEACH
2017) Workshop, co-located with the 17th International Conference on Intelligent Virtual Agents (IVA 2017), Stockholm, Sweden. Centre
for Digital Health Interventions (CDHI), University of Zurich, University of St.Gallen & ETH Zurich.
Kowatsch, T., Schachner, T., Harperink, S., Barata, F., Dittler, U., Xiao, G., Stanger, C., Wangenheim, F. V., Fleisch, E., Oswald, H., &
Möller, A. (2021). Conversational agents as mediating social actors in chronic disease management involving health care professionals,
patients, and family members: Multisite single-arm feasibility study. Journal of Medical Internet Research, 23(2), e25060.
Luo, B., Lau, R. Y., Li, C., & Si, Y. W. (2022). A critical review of state-of-the-art chatbot designs and applications. WIREs Data Mining and
Knowledge Discovery, 12(1), e1434.
Ly, K. H., Ly, A.-M., & Andersson, G. (2017). A fully automated conversational agent for promoting mental well-being: A pilot RCT using
mixed methods. Internet Interventions, 10, 39–46. https://doi.org/10.1016/j.invent.2017.10.002

Martínez-Miranda, J., Martínez, A., Ramos, R., Aguilar, H., Jiménez, L., Arias, H., Rosales, G., & Valencia, E. (2019). Assessment of users'
acceptability of a mobile-based embodied conversational agent for the prevention and detection of suicidal behaviour. Journal of Medical
Systems, 43(8), 1–18.
Milne-Ives, M., de Cock, C., Lim, E., Harper Shehadeh, M., de Pennington, N., Mole, G., Normando, E., & Meinert, E. (2020). The effective-
ness of artificial intelligence conversational agents in health care: Systematic review. Journal of Medical Internet Research, 22(10),
e20346. https://doi.org/10.2196/20346
Montenegro, J. Z., da Costa, C. A., & da Rosa, R. R. (2019). Survey of conversational agents in health. Expert Systems with Applications, 129,
56–67.
Mujeeb, S., Javed, M. H., & Arshad, T. (2017). Aquabot: A diagnostic chatbot for achluophobia and autism. International Journal of Advanced
Computer Science and Applications, 8(9), 209–216.
Nadarzynski, T., Miles, O., Cowie, A., & Ridge, D. (2019). Acceptability of artificial intelligence (AI)-led chatbot services in health care: A
mixed-methods study. Digital Health, 5, 2055207619871808.
Nakagawa, S., Enomoto, D., Yonekura, S., Kanazawa, H., & Kuniyoshi, Y. (2018). A telecare system that estimates quality of life through
communication. In Proceedings of the 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS). IEEE.
https://doi.org/10.1109/CCIS.2018.8691360
Nißen, M., Selimi, D., Janssen, A., Cardona, D. R., Breitner, M. H., Kowatsch, T., & von Wangenheim, F. (2022). See you soon again, chatbot?
A design taxonomy to characterize user-chatbot relationships with different time horizons. Computers in Human Behavior, 127, 107043.
Owens, O. L., Felder, T., Tavakoli, A. S., Revels, A. A., Friedman, D. B., Hughes-Halbert, C., & Hébert, J. R. (2019). Evaluation of a
computer-based decision aid for promoting informed prostate cancer screening decisions among African American men: iDecide. Ameri-
can Journal of Health Promotion, 33(2), 267–278.
Parmar, D., Lin, L., Dsouza, N., Joerg, S., Leonard, A. E., Daily, S. B., & Babu, S. (2022). How immersion and self-avatars in VR affect
learning programming and computational thinking in middle school education. IEEE Transactions on Visualization and Computer Graphics,
1. https://doi.org/10.1109/TVCG.2022.3169426
Philip, P., Micoulaud-Franchi, J.-A., Sagaspe, P., De Sevin, E., Olive, J., Bioulac, S., & Sauteraud, A. (2017). Virtual human as a new
diagnostic tool, a proof of concept study in the field of major depressive disorders. Scientific Reports, 16, 42656. https://doi.org/10.1038/srep42656
Pietrasik, M., & Reformat, M. (2020, May). A simple method for inducing class taxonomies in knowledge graphs. In European semantic web
conference (pp. 53–68). Springer.
Ponathil, A., Ozkan, F., Welch, B., Bertrand, J., & Chalil Madathil, K. (2020). Family health history collected by virtual conversational agents:
An empirical study to investigate the efficacy of this approach. Journal of Genetic Counseling, 29(6), 1081–1092.
Rehman, U. U., Chang, D. J., Jung, Y., Akhtar, U., Razzaq, M. A., & Lee, S. (2020). Medical instructed real-time assistant for patient with
glaucoma and diabetic conditions. Applied Sciences, 10(7), 2216.
Sezgin, E., Oiler, B., Abbott, B., Noritz, G., & Huang, Y. (2022). “Hey Siri, help me take care of my child”: A feasibility study with caregivers
of children with special health care needs (CSHCN) using voice interaction and automatic speech recognition in remote care
management. Frontiers in Public Health, 10, 849322. https://doi.org/10.3389/fpubh.2022.849322
Spänig, S., Emberger-Klein, A., Sowa, J.-P., Canbay, A., Menrad, K., & Heider, D. (2019). The virtual doctor: An interactive
clinical-decision-support system based on deep learning for non-invasive prediction of diabetes. Artificial Intelligence in Medicine, 100, 101706.
https://doi.org/10.1016/j.artmed.2019.101706
Stein, N., & Brooks, K. (2017). A fully automated conversational artificial intelligence for weight loss: Longitudinal observational study
among overweight and obese adults. JMIR Diabetes, 2(2), e8590.
Stillwell, S. B., Fineout-Overholt, E., Melnyk, B. M., & Williamson, K. M. (2010). Evidence-based practice, step by step: Asking the clinical
question: A key step in evidence-based practice. The American Journal of Nursing, 110(3), 58–61.
Sun, O., Chen, J., & Magrabi, F. (2018). Using voice-activated conversational interfaces for reporting patient safety incidents: A technical
feasibility and pilot usability study. Studies in Health Technology and Informatics, 252, 139–144.
Sundblad, H. (2007). Question classification in question answering systems [Doctoral dissertation, Institutionen för datavetenskap].
Tanaka, H., Negoro, H., Iwasaka, H., & Nakamura, S. (2017). Embodied conversational agents for multimodal automated social skills
training in people with autism spectrum disorders. PLoS One, 12(8), e0182151.
van Heerden, A., Ntinga, X., & Vilakazi, K. (2017). The potential of conversational agents to provide a rapid HIV counseling and testing
services. In Proceedings of the 2017 International Conference on the Frontiers and Advances in Data Science (FADS) (pp. 80–85).
Van Pinxteren, M. M., Pluymaekers, M., & Lemmink, J. G. (2020). Human-like communication in conversational agents: A literature review
and research agenda. Journal of Service Management, 31, 203–225.
Vita, S., Marocco, R., Pozzetto, I., Morlino, G., Vigilante, E., Palmacci, V., Fondaco, L., Kertusha, B., Renzelli, M., Mercurio, V., Vullo, V.,
Mastroianni, C. M., & Lichtner, M. (2018). The 'doctor apollo' chatbot: A digital health tool to improve engagement of people living with
HIV. Journal of the International AIDS Society, 21(suppl 8), e25187.
Wang, H., Zhang, Q., Ip, M., & Lau, J. T. F. (2018). Social media-based conversational agents for health management and interventions.
Computer, 51(8), 26–33.
Wilson, N., MacDonald, E. J., Mansoor, O. D., & Morgan, J. (2017). In bed with Siri and Google Assistant: A comparison of sexual health
advice. BMJ, 359, j5635.
Wolters, M. K., Kelly, F., & Kilgour, J. (2016). Designing a spoken dialogue interface to an intelligent cognitive assistant for people with
dementia. Health Informatics Journal, 22(4), 854–866.
Wu, T., Khan, A., Yong, M., Qi, G., & Wang, M. (2022). Efficiently embedding dynamic knowledge graphs. Knowledge-Based Systems, 250,
109124.
Zhou, X., Wu, B., & Zhou, Q. (2018). A depth evidence score fusion algorithm for Chinese medical intelligence question answering system.
Journal of Healthcare Engineering, 2018, 1–8.
Zuchowski, M., & Göller, A. (2022). Speech recognition for medical documentation: An analysis of time, cost efficiency and acceptance in a
clinical setting. British Journal of Health Care Management, 28, 30–36.

How to cite this article: Budler, L. C., Gosak, L., & Stiglic, G. (2023). Review of artificial intelligence-based
question-answering systems in healthcare. WIREs Data Mining and Knowledge Discovery, 13(2), e1487.
https://doi.org/10.1002/widm.1487
APPENDIX

TABLE A1 Included study characteristics

Reference | Study design | Study aim | Conversational agent | Field of health care | Study sample | Main findings
(Zuchowski & Göller, 2022) | Prospective study | To evaluate time and cost savings associated with speech recognition technology, and potential for improving health care processes. | Speech recognition: Indicda easySpeak and Dragon NaturallySpeaking software | Nephrology, hematology, and emergency medicine | Clinicians (n = 15) participated in the study (313 samples were produced, 163 with speech recognition software and 150 by typing) | The majority preferred to type their medical notes, rather than dictating using speech recognition software. They also stated that they wanted to see greater use of speech recognition software in the hospital.
(Bibault et al., 2019) | Randomized controlled trial | To test whether an AI conversational agent can provide answers to patients with breast cancer with a level of satisfaction similar to answers given by physicians. | Chatbot Vik | Oncology | Patients (n = 142) | A conversational agent can be used to inform patients with cancer and save time otherwise spent visiting a doctor.
(Borja-Hart et al., 2019) | Mixed-methods study | To assess students' confidence and impressions in using their communication skills with VP and to evaluate their skills using this technology. | Virtual patient | Pharmacology | Pharmacy students (n = 205) | VP improved student confidence in verbal and written communication skills.
(Cameron et al., 2019) | Cross-sectional study | To assess the usability of a chatbot. | Chatbot iHelpr | Psychiatry | Health care professionals (n = 7) | Conversational agent was assessed as easy to use.
(Chaix et al., 2019) | Prospective study | To evaluate 1 year of conversations between patients with breast cancer and a chatbot. | Chatbot Vik | Oncology | Patients (n = 4737) | The more the patients used the chatbot, the more adherent they were. Patients were satisfied with the chatbot. Vik was their support and helped them to track treatment effectively.
(Dimeff et al., 2020) | Cross-sectional study | To develop and evaluate an avatar system for suicidal patients and medical workers in emergency departments. | Avatar Dr. Dave | Psychiatry, emergency medicine | Patients (n = 24) and health care professionals (n = 21) | Avatar Dr. Dave is a powerful method of facilitating suicide prevention interventions and point-of-care tools for suicidal patients.
(Fulmer et al., 2018) | Randomized controlled trial | To assess the feasibility and efficacy of using Tess to reduce self-identified symptoms of depression and anxiety in college students. | Chatbot Tess | Psychiatry, University setting | Students (n = 75) | There is a significant reduction in anxiety symptoms. No changes in depression symptoms.
(Håvik et al., 2018) | Cross-sectional study | To explore the potential of conversational interfaces in providing screening services for mental health care. | Chatbot Rob | Psychiatry | General population (n = 11) | Participants were satisfied with the chatbot Rob. It showed evident utility as a screening tool in the mental health domain and as a useful addition to a paper version when screening for ADHD.
(Inkster et al., 2018) | Mixed-methods study | To present preliminary data on effectiveness and engagement levels of the Wysa app on users with self-reported symptoms of depression. | Wysa app | Psychiatry | Global users (n = 129) | The high users' group had significantly higher average improvement. Users found the app experience helpful and encouraging.
(Isaza-Restrepo et al., 2018) | Mixed-methods study | To present evidence on the effectiveness of VP and the development of necessary medical skills. | Virtual patient | University setting | Medical students (n = 20) | VP is a valuable and useful tool for the development of clinical reasoning and history taking skills in medical students, as part of a constructivist learning course.
(Ly et al., 2017) | Mixed-methods study | To assess the effectiveness and adherence of chatbot Shim and to explore participants' views and experiences of interacting with a chatbot. | Chatbot Shim | Psychiatry | Adult population (n = 28) | Significant interaction effects of group and time on psychological well-being and perceived stress.
(Nakagawa et al., 2018) | Experimental study | To collect audio and video data from healthy participants during a conversation with a conversational agent and to evaluate the accuracy. | Talkbox Scikit | General health care | Adult population (n = 14) | The usefulness of the estimation system was reported using multimodal data.
(Philip et al., 2017) | Cross-sectional study | To test the performance of a diagnostic system for MDD and to evaluate the acceptability of ECA. | Embodied conversational agent | Psychiatry | Patients (n = 221) | Patients found the face-to-face interview with the ECA very acceptable.
(Spänig et al., 2019) | Prospective study | To develop an AI to interact with a patient (virtual doctor) by using a speech recognition and speech synthesis system. | Virtual doctor CMUSphinx | Endocrinology | Patients (n = 4814) | The system can predict type 2 diabetes mellitus based on non-invasive sensors and ChatGPT-like deep neural networks. It provides an easy-to-interpret probability estimation for T2DM for the patient.
(Kowatsch et al., 2017) | Randomized controlled trial | To develop and evaluate a mobile chat app for the open-source behavioral health intervention platform. | MobileCoach | Public health and preventive medicine | Patients (n = 10; n = 14) | MobileCoach is effective among young patients for the treatment of obesity.
(Bickmore et al., 2018) | Observational study | To determine the prevalence and nature of the harm that could result from patients or consumers using conversational assistants for medical information. | Siri, Alexa, Google Assistant | General health care | Patients (n = 54) | Only 15% reported using a conversational assistant regularly, 41% had never used one, and 44% had tried one a few times.
(Denecke et al., 2020) | Quasi-experimental study | To introduce a mobile application with an integrated chatbot that implements methods from cognitive behavior therapy to support mentally ill people. | SERMO | Psychiatry | Patients (n = 129) | Efficiency, perspicuity and attractiveness were rated as good.
(Fitzpatrick et al., 2017) | Randomized controlled trial | To determine the feasibility, acceptability, and preliminary efficacy of a conversational agent to deliver a self-help program for college students who are having symptoms of anxiety and depression. | Text-based conversational agent Woebot | Psychiatry, University setting | Students (n = 70) | Students in the Woebot group significantly reduced symptoms of depression and anxiety.
(Ghandeharioun et al., 2018) | Quasi-experimental study | To present the design and evaluation of EMMA. | EMMA (EMotion-aware mHealth agent) | Psychiatry | Generally mentally healthy population (n = 39) | Extraverts preferred EMMA significantly more than introverts. Personalized machine learning model worked as well as relying on gold-standard self-reports of emotion from users.
(Kamita et al., 2019) | Cross-sectional study | To assess the stress reduction effect by using the chatbot course with smartphones and the motivation to the courses and investigate the effectiveness of chatbot in the self-guided mental health care course. | Chatbot SAT BOT | Psychiatry | Students (n = 27) | The self-esteem score in the chatbot course group before the course is in the “lower” level and the STAI score is in the “much higher” level.
(Kobori et al., 2018) | Cross-sectional study | To investigate efficacy and usability of a chatbot. | Chatbot | Public health and preventive medicine | Patients (n = 70) | The accuracy rate of a diagnosis of sexually transmitted infection with chatbot was 77.7%. 97.7% of patients thought to visit clinic earlier after they used a chatbot.
(Kocielnik et al., 2018) | Qualitative study | To present reflection companion. | Reflection companion | General health care | Adult population (n = 33) | Mini-dialogues were successful in triggering reflection and this reflection led to increased motivation, empowerment, and adoption of new behaviors.
(Mujeeb et al., 2017) | Experimental study | To emphasize the use of a chatbot in the diagnosis of Achluophobia and autism disorder. | Chatbot Aquabot | Psychiatry | Adult population (n = 30) | Aquabot is useful for practitioner psychologists to assist a human psychologist. Aquabot saved time and resources, and also achieved an accuracy of 88 percent when compared against human psychologists' diagnosed results.
(Nadarzynski et al., 2019) | Mixed-methods study | To explore participants' willingness to engage with AI-led health chatbots. | AI-based chatbot system | University setting | Students (n = 29 in qualitative part; n = 215 in quantitative part) | There was moderate acceptability which was correlated negatively with perceived poorer IT skills and dislike for talking to computers as well as positively correlated with perceived utility, positive attitude, and perceived trustworthiness.
(Stein & Brooks, 2017) | Longitudinal observational study | To evaluate weight loss, changes in meal quality, and app acceptability among users of the Lark weight loss health coach AI (HCAI), with the overarching goal of increasing access to compassionate health care via mobile health. | Lark weight loss health coach AI (HCAI) | Public health and preventive medicine | Adult population (n = 70) | The use of an AI health coach is associated with weight loss comparable to in-person lifestyle interventions.
(Vita et al., 2018) | Mixed-methods study | To design, develop and evaluate a digital health personnel-patient interface. | “Doctor Apollo” chatbot | Infections | Patients (n = 34) | A chatbot interface between patients and clinical centers is an effective tool for its flexibility.
(Wang et al., 2018) | Experimental study | To evaluate the performance of their social media-based conversational agent in a smoking cessation program. | WeChat platform | Public health and preventive medicine | Adult population (n = 401) | The presence of a conversational agent effectively increased participant engagement and enhanced their smoking cessation outcomes.
(Wilson et al., 2017) | Cross-sectional study | To assess the quality of sexual health advice offered by Siri and Google Assistant in comparison to Google search. | Siri and Google Assistant | Public health and preventive medicine | Adult population (n = 3221) | 41% of internet users go online for health-related questions, 22% having done so in the previous week.
(Tanaka et al., 2017) | Experimental study | To develop and evaluate a social skills training system. | Computer avatar | Psychiatry | General population (n = 18) and people with ASD (n = 10) | Computer-based social skills training is useful for people who experience social difficulties.
(Ali et al., 2020) | Qualitative study | To present the design of an online social skills development interface for teenagers with ASD. | LISSA (Live interactive social skills assistance) | Psychiatry | Teenagers with autism (n = 9) | Realistic appearance of a virtual agent and responsiveness are important in engaging users. Users should be fully briefed at the outset about the purpose and limitations of the system, to avoid unrealistic expectations.
(Beiley, 2019) | Qualitative study | To examine the user experience with chatbot. | Chatbot Theodore | Psychiatry | Students (n = 3) | Initial results indicate a positive user experience.
(Demirci, 2018) | Qualitative study | To explore qualities of a conversational agent focused on subjective well-being and propose guidance for designing such agents. | Woebot | Psychiatry | General population (n = 16) | Improvements should be directed to system operations and functionality in terms of its attractiveness, perspicuity, effectiveness, stimulation, dependability, efficiency, and novelty.
(Dworkin et al., 2019) | Prospective pilot study | To evaluate the feasibility, acceptability, and preliminary efficacy of My personal health guide. | My personal health guide | Infections | Men (n = 43) | Pill count adherence was 80% improved. The acceptability of the app was high. Feasibility issues identified included loss of usage data from unplanned participant app deletion. Health literacy was improved.
(Gardiner et al., 2017) | Randomized controlled trial | To evaluate the feasibility of using an ECA to teach lifestyle modifications to urban women. | ECA | Public health and preventive medicine | Women (n = 61) | Women in the ECA group significantly decreased alcohol consumption to reduce stress and increased daily fruit consumption by an average of 2 servings.
(Danieli et al., 2021) | Randomized controlled trial | To describe and evaluate a protocol for the participatory design of mobile apps for mental health. | Mobile personal health care agent (m-PHA) | Psychiatry | Adult population (n = 21) | The integration into practice of an AI-based mobile app for mental health was shown to be acceptable to both mental health professionals and users.
(Amith et al., 2019) | Experimental study | To implement a Wizard of Oz experiment that counsels adults on the HPV vaccine, using an iPad tablet and dialog script developed by public health collaborators, and for early testing of a prospective conversational agent. | Wizard of Oz | Infections | Adults with children (n = 18) | Non-vaccine hesitant parents believed that the agent was easy to use and had the capabilities needed, despite the desire for additional features.
(Amith et al., 2020) | Cross-sectional study | To assess the conversational agent among young college adults. | Wizard of Oz | Infections | Adult population (n = 24) | Participants perceived the agent to have high usability that is slightly better or equivalent to other voice interactive interfaces. The agent impacted their beliefs concerning the harms, uncertainty, and risk denials for the HPV vaccine.
(Kadariya et al., 2019) | Experimental study | To present kBot for health applications and adapted to help pediatric asthmatic patients to better control their asthma. | Chatbot kBot | Infections | Clinicians (n = 8) and researchers (n = 8) | kBot achieved an overall technology acceptance value of greater than 8 (11-point scale) and a mean System Usability Score greater than 80.
(Rehman et al., 2020) | Experimental study | To present a medical instructed real-time assistant that listens to the user's chief complaint and predicts a specific disease. | Medical instructed real-time assistant (MIRA) | Internal medicine | Students (n = 119) | User experience shows relatively good results in all aspects. MIRA efficiently predicts a disease based on chief complaints and supports the user in decision making.
(Bennion et al., 2020) | Randomized controlled trial | To compare the system usability, acceptability, and effectiveness in older adults of two Web-based conversational agents that differ in theoretical orientation and approach. | Manage your life online (MYLO) | General health care | Adult population (n = 112) | MYLO was rated as significantly more helpful and likely to be used again. System usability of both the conversational agents was associated with the helpfulness of the agents and the willingness of the participants to reuse.
(Hauser-Ulrich et al., 2020) | Randomized controlled trial | To describe the design and implementation to promote the chatbot SELMA, and to present findings on effectiveness, influence of intention to change behavior, pain duration, working alliance, acceptance, and adherence. | Chatbot painSELfMAnagement (SELMA) | General health care | Adult population (n = 102) | SELMA is feasible, as revealed mainly by positive feedback and valuable suggestions for future revisions.
(Gong et al., 2020) | Randomized controlled trial | To evaluate the adoption, use, and effectiveness of the My diabetes coach (MDC) program, an app-based interactive conversational agent, Laura. | Laura | Endocrinology | Adult population (n = 187) | The MDC program was successfully adopted and used by individuals with type 2 diabetes and significantly improved their quality of life.
(Chinkam et al., 2021) | Qualitative study | To assess the feasibility and acceptability of a conversational agent to support decision-making about the type of birth after a previous cesarean. | ECA | Obstetrics | Women with previous cesareans (n = 12) and prenatal providers (n = 8) | Both groups state that the chat agent provides easy access to information for patients and could increase visits to providers.
(Chavez-Yenter et al., 2021) | Descriptive study | To assess user interactions with a conversational agent for pretest genetics education. | Automated conversational agent | Oncology | Patients (n = 103) | The majority of users who completed a chat with a chat agent wanted to continue testing (21/30, 70%). The interviewing agent has provided patients with sufficient information and may be an expandable alternative to pretest counseling for patients considering genetic testing for cancer.
(Ponathil et al., 2020) | Empirical study | Evaluate and compare the conversational approach with the standard interface for using the FHx collection. | Virtual conversational agent (VCA) | Family health history | General population (n = 50) | Participants spent 53 s longer using virtual conversational agent compared to the standard task completion interface. The virtual conversational agent was rated with a higher ram than other approaches.
(Auriacombe et al., 2018) | Qualitative study | Evaluate the acceptability of the embodied conversational agent Jeanne. | ECA Jeanne | Tobacco and alcohol use disorders | General population (n = 139) | The ECA was very well received with high marks.
(Kowatsch et al., 2021) | Single-arm feasibility study | To assess the reach of MAX, acceptance of MAX, and conversational agent-patient working alliance. | MAX | Chronic disease management | Patient-family member (n = 49) | Conversational agents designed as digital gadget represent the potential to improve skills in managing chronic diseases.
(Easton et al., 2019) | Cross-sectional study | Co-creation of the content of an autonomous virtual agent and assessment of acceptability. | Autonomous virtual agent | Self-management for patients with an exemplar long-term condition | Patient with COPD (n = 6) and health professionals (n = 5) | The system received a medium rating from the participants. 50% of participants answered that they would use the system often.
(Martínez-Miranda et al., 2019) | Cross-sectional study | Assessment of user acceptance, perception, and commitment embodied conversational agent. | ECA | Psychiatry | Individuals with suicidality antecedents (n = 32) | The authors note that emotional competence and level of attachment were assessed as positive.
(Bian et al., 2020) | Exploratory quantitative and qualitative study | The study aims to compare the cost-effectiveness of using AI, or accurate monitoring of the patient after surgery, and to compare the feedback given through AI or manual monitoring. | AI-assisted follow-up conversational agent | Postoperative care | Patients (n = 270) | The effectiveness of monitoring patients using artificial intelligence was not inferior to that of manual monitoring, and it incurred lower costs.
(Sun et al., 2018) | Experimental study | Evaluate the technical feasibility of a dialog interface application with speech recognition software and conduct a pilot study of its applicability to clinical contexts. | Voice-activated conversational interface | Patient safety | Specialist medical doctor and a pharmacist | Participants found the system to be easy to use and quick to learn, with minimal interaction required.
(Cao et al., 2011) | Experimental study | Present the AskHERMES system, compare it with Google (Google and Google Scholar) and UpToDate and evaluate them in terms of ease of use, quality of answers, time spent, overall efficiency. | AskHERMES | General health care | Clinicians (n = 3) | AskHERMES was comparable to others with the same median rating of 4, which tells us that the interface is efficient and user-friendly. AskHERMES received the lowest score in terms of response quality.
(Wolters et al., 2016) | Qualitative study | Determine whether intelligent cognitive assistants are acceptable for use among the elderly with dementia and how to adapt them to the patient's needs. | Intelligent cognitive assistant | Chronic disease management | Man with dementia (n = 1), man with dementia and vision impairment (n = 1), male carer (n = 1), older women with dementia (n = 4); older man (n = 2) and older women (n = 2) without dementia | The intensity and style of the intelligent cognitive assistant's voice must be adjusted to the user's wishes.
(Abdullah et al., 2018) | Prospective cohort study | Assess feasibility and acceptability to support users to quit smoking. | ECA | Tobacco and alcohol use disorders | Veterans (n = 6) | Veterans reported satisfaction with the use of ECA to set a smoking cessation date.
(Owens et al., 2019) | Pretest/posttest study | Assess the impact of iDecide on prostate cancer knowledge and self-efficacy in decision making. | iDecide | Oncology | Patients (n = 354) | Participants reported improving knowledge about prostate cancer and the self-efficacy of their decision-making.
(Sezgin et al., 2022) | Pretest/posttest study | Understand feasibility of voice interaction and automatic speech recognition (ASR) for medical note taking at home. | Siri | Public health and preventive medicine | Parents of children with special health care needs (n = 24) | Voice interaction and the incorporation of automated speech recognition into mobile apps are practical and useful for tracking symptoms and health events at home.

Abbreviations: AI, artificial intelligence; ASD, autism spectrum disorder; COPD, chronic obstructive pulmonary disease; ECA, embodied conversational agents; IT, innovative technology; MDD, major depressive
disorders; n, number of participants in the sample; STI, sexually transmitted infections; VP, virtual patient; WA, Watson Assistant.
