
CUSTOMER SERVICE CHATBOTS:

ANTHROPOMORPHISM, ADOPTION
AND WORD OF MOUTH

Benjamin Thomas Sheehan

Bachelor of International Business, Griffith University


Master of Business Administration (MBA), University of Queensland

Principal Supervisor: Dr Udo Gottlieb


Associate Supervisor: Dr HS Jin

Submitted in fulfilment of the requirements for the degree of


Master of Philosophy (Business)

School of Advertising, Marketing and Public Relations


Queensland University of Technology
2018
Abstract

Non-human entities can be configured to provide customer service. Most consumers are
accustomed to using an ATM or scanning their own groceries. But a new generation of
interfaces has emerged. Chatbots are similar to existing interfaces in that they replicate the
functional component of customer service. But they are new in that they may replicate the
social component as well. This is theorised to occur through the chatbot’s use of language,
generating anthropomorphism within the user.

The grey literature presents chatbots as ready to provide customer service across a wide range
of industries. Firms are rapidly deploying chatbots, driven by the growth of instant messaging
and advances in artificial intelligence. This thesis examines the relationship between a
chatbot’s humanlike cues and consumers’ behavioural intentions. Addressing these issues
will help modernise theory while guiding practitioners to develop better chatbots.

This thesis presents two quantitative studies to support the idea that a chatbot’s perceived
humanness is important to consumers. Preliminary results suggest that the anthropomorphism
of a chatbot leads to increases in adoption and recommendation intent. Furthermore, the
source of the anthropomorphic perceptions appears linked to a chatbot’s use of specific
linguistic stratagems which can be manipulated by practitioners and researchers.

Keywords: chatbots, artificial conversational entities, self-service technology, SST, anthropomorphism, perceived humanness, conversation repair, contextual awareness

Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
Statement of Original Authorship
Chapter 1: Introduction
1.1 Defining chatbots
1.2 Chatbots in customer service roles
1.3 Justification for the choice of theoretical lens
1.4 Rationale: summary of the research gap
1.5 Research aims and theoretical model
Chapter 2: Literature Review
2.1 Self-service technology
2.2 Justification for the choice of dependent variables
2.3 Justification for the choice of independent variables
2.4 Anthropomorphism
2.5 Measuring anthropomorphism
2.6 Effect of anthropomorphism on adoption and recommendation intent
2.7 Effect of conversational repair and contextual awareness on anthropomorphism
2.8 Summary and implications
Chapter 3: Study One
3.1 Method and design
3.2 Results
3.3 Discussion
Chapter 4: Study Two
4.1 Method and design
4.2 Results
4.3 Discussion
Chapter 5: Conclusions
Bibliography
Appendices

List of Figures

Figure 1. Scholarly works re: “natural language processing”, sorted by year on Web of Science

Figure 2. Scholarly works re: “natural language processing”, sorted by discipline on Web of Science

Figure 3. A theoretical model of the relationships explored

Figure 4. Relationships tested in Study One

Figure 5. Relationships tested in Study Two

Figure 6. Example of robotic embodiment and an avatar

Figure 7. Screenshot of the Facebook page created to introduce participants to the stimulus chatbot

Figure 8. Screenshot from Bot Society website, illustrating how animations were created

Figure 9. Screenshots from the stimulus animation, illustrating design elements

Figure 10. Anthropomorphism fully mediates the contextual awareness and adoption relationship

Figure 11. Anthropomorphism fully mediates the contextual awareness and recommendation intent relationship

Figure 12. Graphical representation of mean scores for anthropomorphism, adoption and recommendation intent across experimental conditions

Figure 13. Screenshot of an item from the anthropomorphism scale as presented to participants in study one

Figure 14. The original anthropomorphism instrument as presented by Bartneck et al. (2009)

Figure 15. Screenshot of an item from the anthropomorphism scale as presented to participants in study two

List of Tables

Table 1. Independent Variables from the Four Dominant Adoption Models

Table 2. Supplementary Independent Variables Used in Conjunction with the Four Dominant Adoption Models

Table 3. The Cooperative Principle: Gricean Maxims of Conversation

Table 4. Means, Standard Deviations, Correlations and Internal Consistency of All Variables

Table 5. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Adoption Intent

Table 6. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Recommendation Intent

Table 7. Sample Extract from Animations Illustrating Manipulation of Elicited Agent Knowledge Across the Experimental Conditions

Table 8. Distribution of Demographic Variables Across the Three Experimental Conditions

Table 9. Mean Comparison of the Dependent Variables Across the Three Experimental Conditions

Table 10. Results Relative to Hypotheses 3 Through 8

Table 11. Means, Standard Deviations, Correlations & Internal Consistency of All Variables

Table 12. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Adoption Intent

Table 13. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Recommendation Intent

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements
for an award at this or any other higher education institution. To the best of my knowledge and
belief, the thesis contains no material previously published or written by another person except
where due reference is made.

Signature: QUT Verified Signature

Date: August, 2018

Note. Professional editor, Robyn Kent (RAK Editing Services), provided copyediting services according
to the guidelines laid out in the university-endorsed national ‘Guidelines for editing research theses’.

Chapter 1: Introduction

Much of the customer service literature is predicated on the idea that service delivery is
an interpersonal interaction, occurring between two or more people (van Doorn et al., 2017).
However, non-human entities can be configured to provide customer service. Where a
technological interface allows a customer to produce a service, independent of human staff, it
is known as a self-service technology (SST) (Curran & Meuter, 2005). The SST literature
examines interfaces such as ATMs (Curran & Meuter, 2005), airport self-service kiosks (Fan,
Wu, & Mattila, 2016) and grocery store self-check-outs (Wang, Harris, & Patterson, 2013).

These interfaces are designed to replicate the functional output of a human employee.
ATMs dispense cash, airport kiosks print boarding passes and self-checkouts calculate the
cost of groceries. However, these SSTs lack the capacity to interact with consumers socially
(van Doorn et al., 2017). As a result, consumers may view SSTs as cold or impersonal
because the service interactions they facilitate lack customisation (Hsuan-Hsuan & Ko-Hsin,
2015).

Many variables used in SST studies view the SST interaction from this functional,
utilitarian perspective, framing the human user as focused on the goal of consumption. Users
are thought to want convenience and efficiency (Collier & Kimes, 2012; Meuter et al., 2000)
from SSTs that are useful and easy to use (Davis, 1989; Weijters et al., 2007). Where
consumer sociality and SSTs intersect, variables focus on the SSTs’ deficits. For example, the
human need for social interaction is considered an antecedent to dissatisfying SST
interactions (Dabholkar, 1996).

SSTs need not be cold and impersonal however. The social robotics literature supports
the idea that users may attribute human characteristics, traits or states to a machine (Bartneck,
Kulic, Croft & Zoghbi, 2009) in a response known as anthropomorphism (Epley, Waytz, &
Cacioppo, 2007). Similarly, social response theory has delivered a series of experiments
which support the idea that humans behave socially towards computers they perceive as social
actors (Nass & Moon, 2000). Chatbots are a form of technology with humanlike cues that
may elicit a psychological response such as anthropomorphism or even a corresponding
behavioural outcome as proposed by Nass and Moon (2000). Furthermore, chatbots can be
configured to provide customer service. Thus, chatbots as an SST may be able to replicate
both the functional and social elements of interpersonal service delivery.

This thesis contributes to the self-service literature by examining three broad research
questions, designed to test the antecedents and outcomes of anthropomorphism.

(RQ1): What is the relationship between anthropomorphism, adoption and recommendation intent for chatbots performing customer service?

(RQ2): What strategies can a chatbot employ in order to affect anthropomorphism?

(RQ3): Do these strategies have a subsequent effect on adoption and recommendation intent?

Anthropomorphism is a process of inductive inference in which the perceiver attributes
humanlike characteristics, motivations, intentions or underlying mental states to a non-human
entity (Waytz et al., 2010). Adoption in the SST context refers to a customer’s decision to
regularly use (or reject) a new technological interface (Walker et al., 2002). Recommendation
intent, also conceptualised as positive word-of-mouth, refers to the likelihood that a customer
would promote the SST to someone else (Huntley, 2006). Thus, both adoption and
recommendation intent are central to the study of SSTs. At the conclusion of the Literature
Review, these research questions are presented as a series of statistically testable hypotheses.

1.1 DEFINING CHATBOTS

Chatbots are computer programs with natural language capabilities, which can be
configured to converse with human users (Mauldin, 1994). Tintarev, O’Donovan, and
Felfernig (2016) conceptualise chatbots as automated advice givers in that they can “propose
and evaluate options while involving their human user in the decision-making process” (p.
26). Dale (2016) describes the commercial chatbot eco-system: “Most visible at the forefront
of the technology, we have the voice-driven digital assistants from the Big Four: Apple’s Siri,
Microsoft’s Cortana, Amazon’s Alexa and Google’s new Assistant. Following up behind, we
have many thousands of text-based chatbots that target specific functionalities, enabled by
tools that let you build bots for a number of widely used messaging platforms” (p. 811). The
text-based systems described by Dale are the focus of this thesis. Text-based chatbots can be
deployed to instant messaging services such as Facebook Messenger, Skype, Twitter, Viber,
WhatsApp and WeChat. Therefore, a text-based chatbot has a theoretical reach of over 2.5
billion people (Statista, 2018).

A chatbot is the combination of an interface, an intelligence and back-end systems
(Guzman & Pathania, 2016). The interface is the part of the chatbot that a user interacts with.
Interface access may occur via a phone, a computer or a dedicated device such as Amazon’s
Alexa. Interaction with the interface occurs through vocal or textual communication. The
intelligence and back-end systems facilitate the interaction process and are hidden from the
user. Chatbot intelligence occurs on a spectrum. A low-intelligence chatbot uses
simple rules to appear intelligent. For example, if the human user says X, the chatbot is
programmed to respond with Y. More advanced intelligences may learn from previous
conversations to automatically improve over time. Chatbots may also employ techniques such
as sentiment analysis to improve comprehension. Finally, a chatbot may be connected to other
ancillary, back-end systems, such as a knowledge base to enable Q&A, or a payment gateway
to process financial transactions.
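To make the rule-based end of this spectrum concrete, the sketch below shows what an “if the user says X, respond with Y” chatbot might look like. It is a minimal illustration only; the keyword table, replies and function names are hypothetical and not drawn from any system discussed in this thesis.

```python
# Minimal, hypothetical sketch of a rule-based ("low intelligence") chatbot.
RULES = {
    "opening hours": "We are open 9am to 5pm, Monday to Friday.",
    "refund": "You can request a refund within 30 days of purchase.",
}

FALLBACK = "Sorry, I don't understand. Could you rephrase that?"

def respond(user_input: str) -> str:
    """Return the first canned reply whose keyword appears in the input."""
    text = user_input.lower()
    for keyword, reply in RULES.items():
        if keyword in text:
            return reply
    return FALLBACK

print(respond("What are your opening hours?"))  # -> canned reply
print(respond("Can I talk to a person?"))       # -> fallback
```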

1.2 CHATBOTS IN CUSTOMER SERVICE ROLES

Forecasts suggest that “by 2020, 25% of customer service and support operations will
integrate chatbot technology across engagement channels” (Moore, 2018). The figure of 25%
is up from less than 2% at present. Since work on this thesis began, multinational technology
and social media firms have announced significant steps towards enabling a chatbot to
provide customer service. Google (Perez, 2018), Microsoft (Miller, 2017), Twitter (Perez,
2017) and Facebook (Constine, 2017) have all released developer tools that allow the
proliferation of chatbots on their platforms. At present, Facebook has over 100,000 text-based
chatbots operating on its Messenger service (Johnson, 2017). Research and advisory firm
Gartner (2016) predicts that by 2020 the average person will have more conversations per day
with a chatbot than with their partner. Gartner (2016) does not provide sufficient evidence to
support this claim. However, conversation with non-human entities is likely to affect the
provision of customer service.

Organisations have begun to experiment with chatbots in customer service roles. The
Chatfuel platform (Chatfuel, 2018) claims to support chatbots for clients including Adidas,
British Airways and Volkswagen. Google’s product, Dialogflow, is said to power chatbots for
brands such as Comcast, Giorgio Armani and Mercedes (Dialogflow, 2018). Despite the
spread of chatbots across the commercial landscape, empirical research into consumer
perceptions of chatbots is lacking, as illustrated in Figure 2.

1.3 JUSTIFICATION FOR THE CHOICE OF THEORETICAL LENS (SST)

The extant literature has only recently discussed chatbots as being conceptually related
to SSTs (van Doorn et al., 2017). But it appears that the marketing and customer service
literature has yet to concretely classify chatbots as belonging to any particular research
stream.

The broader chatbot literature is fractured across many disciplines. It includes
philosophical treatises on an artificial intelligence’s rights to free speech (Ondrej, 2018)
through to technical chatbot coding languages such as ALICE and AIML (AbuShawar &
Atwell, 2016). The majority of papers examining chatbots fall under the academic umbrella of
natural language processing (NLP), which itself is at the intersection of computer science and
linguistics. NLP can be defined as the study of a computer’s ability to understand and
manipulate language via text or speech (Chowdhury, 2003).

Research in NLP has increased sharply in recent years as shown in Figure 1. However,
the majority of papers are technical in nature, reading like patent applications or schematics
for a prototype. This is because the majority of the chatbot literature originates from the
computer science and artificial intelligence research streams, as shown in Figure 2.

Figure 1. Number of peer-reviewed, scholarly works for the keywords “natural language processing”, sorted by
year on Web of Science (1966-present).

Note. Web of Science does not include all scholarly works. Graph is presented as illustrative of a trend, that is,
an increase in NLP literature over time.

Figure 2. Number of peer-reviewed, scholarly works for the keywords “natural language processing”, sorted by
discipline on Web of Science (1966-present).

Note. Categories listed are not exhaustive. Approximately 80 different disciplines contribute to the body of
knowledge. Graph is presented as illustrative of a trend.

As shown in Figure 2, there is a deficit of knowledge about chatbot performance,
perceptions and applications from a management or business perspective. Data for the
marketing, customer service and consumer behaviour disciplines were not provided by Web
of Science (accessed 25 March 2018). Addressing this deficiency helps prepare the
customer service literature for an increase in anthropomorphic technologies.

Conceptualising chatbots as an SST is appropriate. Chatbots match the definition of
SST provided by Curran and Meuter (2005) because they can provide customer service
independent of a human employee. The fact that they exist online and may be accessed
through the consumer’s existing technology (as opposed to a kiosk in a public space) does not
invalidate the claim of ‘chatbot as SST’. Well-cited SST literature examines internet banking
(Eriksson & Nilsson, 2007; Nilsson, 2007), online restaurant reservation (Collier & Kimes,
2012) and internet investment software (van Beuningen, de Ruyter, Wetzels, & Streukens,
2008). Each of these examples is a private, online-only SST interface.

Scholars from the information systems discipline consider chatbots as automation, in
that they partially perform a function historically performed by a human (Parasuraman,
Sheridan, & Wickens, 2000). However, automation and SST differ in application. SST refers
to customer service whereas automation may occur in any process. For example, a printing
press could represent automation.

Where a chatbot performs inside an instant messaging platform, researchers may consider it
as a virtualised process (Overby, 2008) because the human-machine interaction occurs in a
non-local space. Finally, to the extent that a chatbot provides advice, the decision sciences
literature considers chatbots as intelligent decision aids (Arnold et al., 2004). In each of these
conceptualisations, the customer service element is lacking. Thus, the SST designation is
considered most appropriate.

1.4 RATIONALE: SUMMARY OF THE RESEARCH GAP

As discussed, this thesis presents the view that (a) chatbots can be an SST, but (b)
chatbots are different from the SSTs previously examined within the literature. Unlike other
SSTs, chatbots are capable of a free-form conversation, which may generate
anthropomorphism within the human user. It is this anthropomorphism or perceived
humanness that is missing from the SST literature. The Literature Review chapter will present
a number of variables that have been empirically linked to the adoption and recommendation
of an SST. Variables that address the social components of the human-machine interaction are
notably missing. The two published literature reviews dealing with variables known to predict
SST adoption (Blut, Wang, & Schoefer, 2016; Hoehle, Scornavacca, & Huff, 2012) do not
mention sociality of the interface.

The conceptual paper of van Doorn et al. (2017) suggests that automated social presence
will allow chatbots to develop relationships with consumers. However, implicit in this
proposition is the assumption that consumers will interact with a chatbot long enough for
relationship building to occur. On the other hand, a consumer might try a customer service
chatbot once, dislike the interaction and vow to use alternative means wherever possible.
Thus, adoption, or the behavioural intention to adopt a chatbot, is central to van Doorn et
als.’s (2017) proposition. Furthermore, van Doorn et al. (2017) imply that automated social
presence will improve customer engagement. However, other researchers have provided
evidence to support an alternative perspective. Ho and MacDorman’s (2017) uncanny-valley
theory demonstrates that the relationship between a social robot’s humanness and positive
user perceptions is non-linear. At some point, the relationship reverses direction and increases
in humanlike cues become unpleasant for users, resulting in perceptions of eeriness.

Addressing the role of anthropomorphism in customer service chatbots will help
integrate the next generation of SSTs into the literature. There is potential to improve the
predictive capabilities of the adoption and recommendation models. Furthermore, in
identifying mechanisms by which anthropomorphic perceptions of a chatbot may be
increased, practical recommendations may be offered.

1.5 RESEARCH AIMS AND THEORETICAL MODEL

This thesis attempts to identify a relationship between anthropomorphism as an
independent variable and (a) adoption and (b) recommendation intent as dependent variables.
The first study uses a correlational design. A survey was used to collect quantitative data from
a convenience sample. Multiple regression was used to identify significant relationships
between the variables of interest.
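As a rough sketch of this analysis strategy (the thesis does not reproduce its analysis code here, and the file and column names below are hypothetical), the multiple regression might be run as follows:

```python
# Illustrative sketch of a Study One-style analysis; the survey file and
# column names (anthropomorphism, age, adoption_intent) are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("study_one_responses.csv")  # hypothetical survey export

# Regress adoption intent on anthropomorphism plus a demographic control.
X = sm.add_constant(df[["anthropomorphism", "age"]])
model = sm.OLS(df["adoption_intent"], X).fit()
print(model.summary())  # coefficients, p-values and R-squared per predictor
```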

Assuming that anthropomorphism of a chatbot occurs and it can be linked to an
increased likelihood of SST adoption and recommendation intent, then it is possible that
consumers will engage with a chatbot often enough for relationship building to occur, as
proposed by van Doorn et al. (2017).

The second study aims to replicate and extend Study One. An attempt is made to
manipulate anthropomorphic perceptions within the participant. This is achieved by having
the chatbot display particular linguistic constructs: conversational repair and contextual
awareness. In doing so, Study Two demonstrates that anthropomorphic perceptions can be
empirically linked with specific chatbot features. Study Two demonstrates that the
antecedents of anthropomorphism can reside within the technology (as opposed to solely
within the mind of the individual, that is, a participant’s predisposition to anthropomorphise).
Consequently, practitioners could attempt to maximise the chatbot features that have been
identified as contributing to anthropomorphism.

Experimental conditions were used as a means of strengthening any claims made
regarding the relationships between the variables of interest. The second study is an attempt to
move up the hierarchy of evidence, from a correlational study to an experimental study. A
theoretical model emerging from the proposed relationships is shown in Figure 3.

Figure 3. A theoretical model of the relationships explored.

The following chapter presents a literature review, which is structured around the topics
of SST and anthropomorphism. The SST section provides justification for the choice of
dependent variables used, while elaborating on the research gap. The anthropomorphism
section discusses ways in which anthropomorphism may be measured or manipulated. Study
One methods, results and discussion are then presented as a single chapter, followed by a
chapter pertaining to the second study, which is presented using the same structure. A
conclusion chapter, discussing implications, limitations and avenues for future research, will
complete the thesis.

Chapter 2: Literature Review

The literature review begins with a summation of key findings related to self-service
technology (SST) (sections 2.1-2.3). The purpose of these sections is threefold. First, a
discussion of SSTs helps situate this work in the context of previous findings. Second, the
importance of adoption and recommendation intent as dependent variables is justified.
Finally, the research gap is brought into focus. Twenty-two independent variables from four
dominant adoption models are summarised. Although each of these variables may provide
insight into consumer perceptions of a chatbot, none of them captures a chatbot’s defining
feature. That defining feature is conceptualised here as a chatbot’s use of language, which
may generate anthropomorphism. Adjunct literature gaps are addressed where relevant.

Sections 2.4-2.6 of the literature review focus on anthropomorphism. The dominant
three-factor model of anthropomorphism is presented, with discussion. The various ways in
which previous research has attempted to measure anthropomorphism are discussed along
with a summary of methods employed. By this point, sufficient literature will have been
covered to explore a correlational relationship between variables (Study One). Given that
Study Two seeks to investigate a causal relationship, section 2.7 will discuss how a researcher
might manipulate anthropomorphism. This is crucial, because manipulation is required for
experimental methods (Levine & Parkinson, 2014).

2.1 SELF-SERVICE TECHNOLOGY

The world’s first automatic teller machine (ATM) was installed on 27 June 1967.
Today, there are more than 3,000,000 ATMs dispensing cash around the globe (Holden,
2017). Technology has rapidly changed the nature of service delivery. Many high-touch and
low-tech customer service operations, such as the provision of cash, have been overhauled so
that technology either supports or supplants the human employee (Wang et al., 2013). By
using an SST, the customer produces the service themselves, although interpersonal
interaction with staff may still occur when service delivery fails. For example, staff are often
on the periphery of grocery self-scan or airport kiosks so they can assist the consumer if
needed.



The SST literature is mature (Blut et al., 2016). In the case of banking SSTs, more than
30 years of empirical data has been compiled into a meta-analysis (Hoehle et al., 2012).
Wang, Harris and Patterson (2013) provide a chronological review of the major empirical
findings from SST studies, which suggests a lack of significant developments in recent years.
Of course, the present author contends that this is due to change with the proliferation of next-
generation SSTs such as chatbots.

As discussed, current SSTs lack customisation. The interfaces require the consumer to
select from pre-set menu options on a screen. Everyone has approximately the same service
experience, even if perceptions of that experience differ. This makes sense when the
consumer is viewed as decisive and economical with regards to time and effort. Knowing the
precise combination of buttons to press in order to purchase a train ticket each day is efficient.
But the standardised approach may be suboptimal in other scenarios. Selnes and Hansen
(2001) claim that SST usage has a negative impact on social attachment to the firm, reducing
customer loyalty.

Standardised SSTs may struggle with complex service delivery, yet a number of start-up
firms are demonstrating that chatbots are capable. For instance, Babylon is a chatbot that
provides triage advice to users who have a medical concern. The system was on trial with the
United Kingdom’s NHS in 2016-17 (Burgess, 2017). DoNotPay is the product of a Stanford
student who describes it as the world’s first robo-lawyer. It has overturned more than
160,000 parking tickets and is being modified to help refugees claim asylum (Cresci, 2017).
Both medical and legal advice are considered credence services, the successful delivery of
which draws on social constructs between parties (Ding, Verma, & Iqbal, 2007). Existing
SSTs may try to evoke sociality by welcoming or thanking a consumer for their patronage,
but the consumer is never under the illusion that the SST is human. Conversely, the “perfect”
chatbot is indistinguishable from a human.

The key dimension of an SST is that the service encounter lacks human involvement
(Curran & Meuter, 2005). However, it may be time to revisit this notion. The SST literature
could benefit from finding a way to reintroduce some approximation of “humanness” into the
discussion through variables such as anthropomorphism.



2.2 JUSTIFICATION FOR THE CHOICE OF DEPENDENT VARIABLES
(ADOPTION AND RECOMMENDATION INTENT)

The studies presented in this thesis use adoption and recommendation intent as
dependent variables. Adoption in the SST context refers to a customer’s decision to regularly
use (or reject) a new technological interface (Walker et al., 2002). Recommendation intent,
also conceptualised as positive word-of-mouth, refers to the likelihood of a customer
promoting the SST to someone else (Huntley, 2006). Thus, both adoption and
recommendation intent are central to SST studies.

From a firm’s perspective, deploying SSTs offers a number of benefits. First, SSTs can
increase efficiency and customer satisfaction (Bitner et al., 2002; Huang & Rust, 2013; Lee,
2014). Second, an SST can standardise service delivery (Selnes & Hansen, 2001). Third,
because an SST may supplement or act as a substitute for the human employee, empirical
research has linked investment in SSTs with a firm’s positive financial performance (Hung,
Yen, & Ou, 2012) and an increase in stock price (Yang & Klassen, 2008). Therefore, Meuter
et al. (2005) talk of the “tremendous lure” of automating service delivery. However, it is not
the act of deploying SSTs that delivers benefits to the firm. Rather, firms enjoy the benefits
once consumers try the SST and commit to future use (adoption). This process of trial,
evaluation and adoption often begins when the SST is recommended to the consumer by
someone else. Thus, recommendation intent or positive word-of-mouth is a dependent variable
discussed in conjunction with adoption (Cheung, 2008; Curran & Meuter, 2005; Safdar,
2018).

2.3 JUSTIFICATION FOR THE CHOICE OF INDEPENDENT VARIABLE
(ANTHROPOMORPHISM)

Four theoretical models have emerged as dominant in the study of technology adoption.
They are:

• The theory of reasoned action (TRA) (Fishbein & Ajzen, 1975) and the updated
version, the theory of planned behavior (TPB) (Ajzen, 1991)
• The diffusion of innovation (DOI) constructs (Rogers, 1983, 2003) conceptualised in
Cooper and Zmud’s theoretical model (Cooper & Zmud, 1990)
• The technology acceptance model (TAM) (Davis, 1989), also derived from the TRA
• The unified theory of acceptance and use (UTAUT) (Venkatesh et al., 2003)



Each of the models features adoption as the dependent variable. Given the maturity of
the field, the likelihood of a revolutionary new model is low (Blut et al., 2016; Hoehle et al.,
2012). Meta-analysis suggests that the best approach is to test new “application-specific”
variables in any of the dominant theories as required (Blut et al., 2016; Hoehle et al., 2012).

Rather than trying to determine the pre-eminence of one model, the literature was
examined to identify the range of independent variables that have been shown to influence
adoption or recommendation in previous studies. Tables 1 and 2 summarise the independent
variables identified. These variables are defined in Appendix Item 1.

Table 1.
Independent Variables from the Four Dominant Adoption Models

Independent variable predicting adoption intent    Source of the construct (model)

Perceived ease of use                              TAM / TRA
Perceived usefulness                               TAM / TRA
Performance expectancy                             UTAUT
Effort expectancy                                  UTAUT
Social influence                                   UTAUT
Facilitating conditions                            UTAUT
Relative advantage                                 DOI
Compatibility                                      DOI
Complexity                                         DOI
Trialability                                       DOI
Observability                                      DOI

Note. TAM = technology acceptance model (Davis, 1989), TRA = theory of reasoned action (Fishbein & Ajzen,
1975), UTAUT = unified theory of acceptance and use (Venkatesh et al., 2003), DOI = diffusion of innovation
(Rogers, 1983, 2003).

The independent variables (Table 1) from the dominant models have been augmented
by the independent variables (Table 2), as the technology or application dictates.



Table 2.
Supplementary Independent Variables Used in Conjunction with the Four Dominant Adoption Models

Variable                 Reference

Risk                     (Walker et al., 2002)
Technology anxiety       (Meuter, Ostrom, Bitner, & Roundtree, 2003)
Image                    (Moore & Benbasat, 1991)
Demonstrability          (Moore & Benbasat, 1991)
Voluntariness            (Moore & Benbasat, 1991)
Technology readiness     (Parasuraman & Colby, 2015)
Habit                    (Venkatesh et al., 2003; Limayem, Hirt, & Cheung, 2007)
Fun                      (Dabholkar, 1996b)
Privacy                  (Lallmahamood, 2007)
Subjective norms         (Venkatesh et al., 2003)
NFHI                     (Dabholkar, 1996b)

Note. NFHI = need for human interaction.

These variables can be classified as either measuring characteristics of the SST or
measuring characteristics of the consumer (Wang et al., 2013). However, this dichotomy is
questionable. For example, ease of use and relative advantage are considered elements of the
SST, even though the instruments capture user perceptions of these dimensions as opposed to
some objective, easily identifiable feature of the SST. Characteristics of the consumer are
measured by variables such as habit or the need for human interaction, along with
demographic variables such as age, gender and education, which have been shown to have an
impact (Nilsson, 2007).

Each of these variables could significantly predict adoption of a customer service
chatbot. For example, users who are high in technological anxiety are probably less likely to
adopt a customer service chatbot. However, none of the variables listed adequately captures
what is unique to a chatbot. None of these predictors addresses a chatbot’s defining feature,
that is the conversational element of the medium that may generate anthropomorphic
perceptions within the user.

If anthropomorphism of a chatbot contributes to adoption and recommendation
intention, then research should seek to identify the chatbot features responsible for this
perception. Marketing practitioners could then maximise those features, increasing the
likelihood of adoption and recommendation. Conversely, if anthropomorphism decreases
adoption and recommendation intent, it could suggest that consumers are uncomfortable with
chatbots, either finding automated social presence unpleasant or simply preferring to navigate
preset menu options when using an SST.

2.4 ANTHROPOMORPHISM

Humans anthropomorphise non-human entities from a young age (Barrett, Richert, &
Driesenga, 2001; Lane, Wellman, & Evans, 2010). This is because anthropomorphism ties
into a number of motivations that are central to the human experience. Epley, Waytz and
Cacioppo (2007) categorise these motivations as sociality motivation and effectance
motivation. Sociality motivation refers to the need to establish social connections, which
promotes co-operation. Co-operation is of evolutionary benefit to humans (Axelrod &
Hamilton, 1981). People who are more socially connected are said to have a lower level of
sociality motivation. In support of this idea, chronically lonely individuals are more likely to
anthropomorphise technology (Epley et al., 2008). Effectance motivation can be
conceptualised as the desire to understand and master one’s environment. Epley et al. (2007)
present effectance as a strategy to reduce uncertainty. By anthropomorphising the non-human,
a person can anticipate the entity’s behaviour, increasing the odds of a favourable interaction.
Evidence that effectance motivation contributes to anthropomorphism of technology is well
presented by Waytz et al. (2010) in five experiments, including measurement via
neuroimaging.

Both sociality and effectance motivation reside inside the mind of the human. That is,
they are qualities of the person who is anthropomorphising. However, Epley et al. (2007)
propose a third antecedent of anthropomorphism, known as elicited agent knowledge (EAK).
EAK refers to the actual humanlike cues projected by the non-human agent. A person
interacting with a non-human entity will examine the entity’s features and behaviour to check
for perceived similarity. For example, some vehicles are designed so that the headlights may
appear as eyes and the radiator grill as a mouth. In the same way that antecedents of
technology adoption are found within both the user and the technology, as Waytz et al. (2010)
explain, anthropomorphism appears to originate within both the perceiver and the perceived.



2.5 MEASURING ANTHROPOMORPHISM

If anthropomorphism is to be used as a theoretical extension or modernisation of the
SST literature, it must be measurable. This is challenging for several reasons. A number of
chatbot and robotics studies use variables related to anthropomorphism without explicitly
measuring it. For example, Cassell and Bickmore (2003) found that a chatbot’s use of small
talk increased user perceptions of the chatbot’s intelligence, engagement, credibility and trust.

Small talk could be seen as a feature that contributes to perceptions of
anthropomorphism, but anthropomorphism itself was not measured. Social robotics research
has examined a robot modelling the traits of humour and empathy (Niculescu et al., 2013).
Humour and empathy are central to the human condition (Davis, 1983; Feingold, 1992), but
they are not the same as anthropomorphism.

Where previous research has attempted to quantify anthropomorphic perceptions
directly, there is a lack of consistency in the constructs and instruments used. For example,
Salem, Eyssel, Rohlfing, Kopp & Joublin’s (2013) work with robots operationalised
humanness via an instrument designed to measure a person’s attribution of humanness to
another person. Thus, the instrument could be said to lack validity. MacDorman, co-creator of
the widely cited uncanny-valley theory (Ho & MacDorman, 2017; MacDorman & Ishiguro,
2006) has used a single item instrument on a Likert scale anchored with the words mechanical
and humanlike (MacDorman, 2006). However, a single item may not be sufficient to capture
the complexity of the anthropomorphism construct.

Bartneck et al. (2009, p. 72) suggest most engineers run naïve experiments to verify a
design and have “a tendency to cook up their own questionnaires”. Morrissey and Kirakowski
(2013) developed a scale to measure the naturalness of chatbots. They propose the naturalness
construct consists of conscientiousness, originality, politeness and thoroughness. But the
factors said to represent naturalness do not line up well with anthropomorphism. For example,
static signage can be seen as polite (Meis & Kashima, 2017) without ever being perceived as
human.

Other popular methods for investigating chatbots are the Turing Test and the Wizard of
Oz method, both of which may be suitable for investigating anthropomorphism. Turing was a
seminal scholar in the chatbot field. Before the 1950s the philosophical enquiry into minds
and machines was focused around the question: Can machines think? Turing found the
“thinking machine” question of little value from a deductive, positivist perspective, instead

asking “are there imaginable digital computers which would do well in the imitation game?”
(1950, p. 442). Thus, the Turing Test involves matching either a chatbot or human
interlocutor with a panel of human judges. If the human judges are unable to distinguish
between the chatbot and the human confederate, the chatbot is said to pass the Turing Test.
Turing was not concerned with the attribution of human characteristics to machines, as such,
but rather wholesale impersonation. Turing Tests typically use a panel of 3–5 human judges,
which is problematic when attempting to generalise results to a broader population.

The Wizard of Oz (WOZ) technique is an experimental method developed by the social
robotics discipline. WOZ has been used in many studies of chatbots and robots (André &
Pelachaud, 2010; Carberry & De Rosis, 2008; Jenkins et al., 2007; Knijnenburg & Willemsen,
2016; Miwa & Terai, 2012; Quarteroni & Manandhar, 2007; Stevens et al., 2016). Using the
WOZ method, the researcher takes on the role of a “wizard”, assuming control of the chatbot.
The participant believes they are communicating with an artificial intelligence when in fact
they are engaged in conversation with a human researcher (Costello & Donnellan, 2007). This
deception enables prototyping and the generation of ecologically valid data, years before the
required technology exists. Ensuring consistent exposure to the chatbot stimulus is impossible
when using the WOZ technique because every research participant will phrase their input
differently. Some participants will attempt to probe the capabilities of the chatbot, while
others will ask only basic questions.

In general, the measurement of anthropomorphic qualities is still developing. The most
recent work in the customer service discipline is that of van Doorn et al. (2017) with their
theoretical piece on automated social presence (ASP). ASP is proposed to be the degree to
which a machine (chatbots are specifically mentioned) can “make consumers feel that they are
in the company of another social entity” (van Doorn et al., 2017, p. 45). ASP is presented as a
conceptual combination of the variables competence and warmth from social psychology’s
stereotype content model (Fiske, Cuddy, & Glick, 2007).

This thesis uses the anthropomorphism construct from the Godspeed Questionnaire, one
of the most frequently cited instruments in the human-robot interaction literature (Weiss &
Bartneck, 2015). Full details of the instrument are provided in Study One. Steinfeld et al.
(2006) have suggested that the use of standardised instruments is essential in developing a
robust evidentiary body of knowledge regarding human perceptions of non-human entities. It
is hoped that any future research into the anthropomorphism of chatbots uses this scale, so
that results may be compared.



2.6 HYPOTHESIS DEVELOPMENT: EFFECT OF ANTHROPOMORPHISM ON
ADOPTION AND RECOMMENDATION INTENT

A number of scholarly works support the hypothesised relationships shown in Figure 4.

Figure 4. Relationships tested in Study One.

Previous research has linked anthropomorphism with positive attitudinal and
behavioural responses from consumers. For example, anthropomorphised brand mascots have
been shown to increase emotional engagement (Bennett & Thompson, 2016; Luo,
McGoldrick, Beatty, & Keeling, 2006). Anthropomorphic ascriptions to logos (Payne,
Hyman, Niculescu, & Huhmann, 2013) or vehicles (Aggarwal & McGill, 2007) are linked to
increases in consumer affect. These findings are conceptually congruent with Epley et al.’s
(2007) notion of sociality motivation as an antecedent of anthropomorphism, in that the
anthropomorphised product or brand allows the consumer to feel a sense of social
connectedness.

Consumers in other self-service domains prefer anthropomorphic interfaces, that is,
individuals who report a high degree of personal power perceive electronic poker machines as
easier to control and less risky when the interface includes anthropomorphic elements (Kim &
McGill, 2011). Furthermore, research has found a corresponding effect regarding SST
switching intentions during service delivery failure (Fan et al., 2016). In this way, Epley et
al.’s (2007) effectance motivation for anthropomorphism is related to positive attitudinal
responses in consumers, in that anthropomorphism increases a consumer’s sense of control
and confidence. Accordingly, in the examination of RQ1 (“What is the relationship between
anthropomorphism, adoption and recommendation intent for chatbots performing customer
service?”), the following hypotheses are proposed:



H1: Anthropomorphism will significantly predict adoption intent

H2: Anthropomorphism will significantly predict recommendation intent

2.7 HYPOTHESIS DEVELOPMENT: EFFECT OF CONVERSATIONAL REPAIR
AND CONTEXTUAL AWARENESS ON ANTHROPOMORPHISM

A number of scholarly works support the hypothesised relationships shown in Figure 5.

Figure 5. Relationships tested in Study Two.

By having participants interact with a chatbot and measuring anthropomorphism,
adoption and recommendation intent, Study One examines the data for a correlational link
between the variables. What remained unclear at the conclusion of Study One was whether
the anthropomorphism detected was the result of (a) properties unique to the individual or (b)
properties of the chatbot itself. Study Two addresses this issue by using an experimental
method, which included manipulating the chatbot’s behaviour and randomly assigning
participants to the experimental conditions.

The purpose here is not to engage in what Burger and Sheehy (2012) call the 40-year
battle between situationists and personality psychologists. This thesis takes at face value the
idea that both the person and the situation cause human behaviour. However, sociality and
effectance motivation are individual difference variables. Therefore, a marketing practitioner
cannot program a chatbot to leverage these user characteristics, unless, of course, the chatbot
was designed to specifically target those with unusually high or low sociality and effectance
motivation, such as the chronically lonely. The purpose of Study Two was to demonstrate that
certain chatbot behaviours are responsible for anthropomorphism (through access to EAK).

This suggests that practitioners can maximise those features in order to increase adoption and
positive word-of-mouth.

Determining which features of a textual chatbot might increase or decrease access to
EAK is challenging. To illustrate this, compare a text-only chatbot with a social robot or a
chatbot with an avatar, as displayed in Figure 6.

Figure 6. Example of robotic embodiment and an avatar.

Note. Left: A NAO robot, illustrative of physical embodiment. Right: Mitsuku, winner of the Loebner Prize
(2013, 2016, 2017) for artificial conversational entities, illustrative of an avatar.

The social robot on the left has a physical body (embodiment) and the chatbot on the
right has an avatar, or a graphical representation of a face or face and body (Blascovich et al.,
2002). The features of a social robot or avatar likely to influence access to EAK are
unambiguous. Previous research has considered the relationship between anthropomorphism
and humanlike cues such as the shape and characteristics of robotic heads (Duffy, 2003) or
the facial expression and gaze of a virtual agent (Marschner et al., 2015). Others have
examined the impact of cues such as vocal characteristics (Elkins & Derrick, 2013; Nass &
Lee, 2001), race (Marino, 2014) or gender (Nass, Moon, & Green, 1997). A textual chatbot
running through an instant messaging platform provides none of these anthropomorphic cues.

Nass (2004) suggests that humanlike cues stack synergistically to create perceptions of
humanness in computers that are perceived as social actors. The more humanlike cues a
computer can demonstrate, the stronger the perceptions of a social other. Humanlike
dimensions listed by Nass include language use, a voice (real or synthetic), a face via an
avatar, interactivity and unpredictability. Of Nass’s (2004) dimensions, a textual chatbot on
Facebook can demonstrate language use and interactivity. The human-robot interaction
literature also supports the idea of humanlike attributes increasing anthropomorphism in a
non-linear fashion. Kiesler et al. (2008) demonstrate that the most salient human
characteristics are appearance, back-story, observable behaviour and etiquette.

Whether using the dimensions provided by Nass (2004) for computers or by Kiesler
(2008) for robots, a textual chatbot is missing key humanlike cues. A consumer will not have
the same access to EAK with a textual chatbot as with a social robot. On one hand, a chatbot
cannot display non-verbal cues such as facial expressions. On the other hand, the lack of a
face may increase anthropomorphism, given that most robotic faces are clearly non-human
(Duffy, 2003). As was Turing’s (1950) original contention, the user of a “perfect” chatbot
would have no cues to suggest the other party was anything but human.

In the second study presented in this thesis, the stimulus materials focus on chatbot-human
interactions where miscommunication occurs. This increases ecological validity. As
Dale (2017) notes, “the barrier to [chatbot] entry is now very low, so that anyone can be a
chatbot developer”, (p. 644). This means that textual chatbots fail frequently. Even the best
chatbots, such as Mitsuku, fail (Worswick, 2018).

Miscommunication may occur when a conversational party violates the cooperative
principle (Grice, 1975, 1978). The cooperative principle describes how successful
interpersonal communication is achieved in social situations (Eunson, 2015), given that
communication is a collective activity, requiring coordination of both content and process
between parties (Clark & Brennan, 1991). The cooperative principle has four maxims (Grice,
1975), as summarised in Table 3.

Table 3.
The Cooperative Principle: Gricean Maxims of Conversation

Maxim Description

Quality Do not lie, do not make unsupported claims

Quantity Say neither more nor less than the discourse requires

Manner Be brief and orderly, avoid ambiguity and obscurity

Relevance Be relevant, appropriate to the immediate needs of the conversation

Note. The Manner and Relevance maxims form the theoretical basis for the experimental manipulation in
Study Two.

Study Two attempts to manipulate access to EAK (and subsequently
anthropomorphism) through (a) the chatbot’s use of conversation repair once the user has
violated the “manner maxim” by using obscure or ambiguous terms and (b) the chatbot’s
failure to demonstrate contextual awareness, violating the “relevance maxim”.



Conversational Repair: Conversational repair is a communication strategy used by people to
manage miscommunication (Fromkin, 1971), where the miscommunication is known as a
trouble source (Schegloff, 1992). Either the message sender or receiver can initiate the repair
(Hutchby & Wooffitt, 2008). A message sender, sensing the potential for miscommunication
may rephrase their previous statement, following an utterance such as “What I mean is…”.
Alternatively, the message receiver may seek clarification of a particular message using
“huh?”, “what?” or an apology-based format such as “I’m sorry. What do you mean?”
(Robinson, 2006).

The stimulus in Study Two used “other-initiated self-repair” (Schegloff, 2000). The
chatbot struggles to deduce meaning from the human’s input and initiates the repair dialogue.
The actual repair itself is provided by the human when further clarification is given in the
following utterance. Study Two assesses the impact of conversational repair on
anthropomorphism. For example, consider the following vignette:

Human user: “I am headed off on vacation.”

Chatbot: “What? I don’t understand.”

Human user: “I am going on holiday to the USA.”

The chatbot’s use of the word “what” constitutes initiation of conversation repair.
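The sketch below illustrates how such other-initiated self-repair might be implemented. The keyword classifier, confidence threshold and replies are hypothetical stand-ins, not the stimulus chatbot used in Study Two.

```python
# Illustrative other-initiated self-repair (Schegloff, 2000): when the bot
# cannot confidently classify the utterance, it signals trouble and lets the
# human supply the repair in the next turn. All names here are hypothetical.
CONFIDENCE_THRESHOLD = 0.6

def classify(utterance: str) -> tuple[str, float]:
    """Stand-in intent classifier returning (intent, confidence)."""
    text = utterance.lower()
    if "holiday" in text:
        return "travel_plans", 0.9
    if "vacation" in text:
        return "travel_plans", 0.4  # low confidence triggers repair below
    return "unknown", 0.0

def reply(utterance: str) -> str:
    intent, confidence = classify(utterance)
    if confidence < CONFIDENCE_THRESHOLD:
        return "What? I don't understand."   # chatbot initiates the repair
    return "Sounds fun! Where are you going?"  # normal next turn

print(reply("I am headed off on vacation."))       # -> repair initiation
print(reply("I am going on holiday to the USA."))  # -> normal reply
```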

Conversation repair has been linked to social coordination in that it demonstrates one’s
ability to use synchronised interaction strategies, such as turn-taking and role-switching (Corti
& Gillespie, 2016; Kaplan & Hafner, 2006). Corti and Gillespie (2016) discuss conversation
repair as being a fundamental component of intersubjective effort, where intersubjectivity is
defined as shared meaning, co-created and co-existing within two or more conscious minds
(Stahl, 2015). As such, the following hypotheses are proposed:

H3: A chatbot using conversation repair will be perceived as significantly more anthropomorphic

H4a & H4b: A chatbot using conversation repair will receive significantly higher scores for (a)
adoption intent and (b) recommendation intent



Contextual Awareness: Context is a set of shared assumptions between individuals, which
provide the basis for communication (Yus, 2006). More precise definitions of context vary
according to a researcher’s scholarly discipline. Linguistics, cognitive science and computer
science offer slightly modified accounts of what constitutes context (Bazire & Brézillon,
2005). Faber and León-Araúz (2016) attempt to unify the differences between disciplines,
presenting a taxonomy of context with two dimensions: local and global. Local context is said
to be shared meaning derived temporally, from utterances five turns before or after the present
utterance. Conversely, global context is the product of all utterances from a conversation as
well as situational dimensions surrounding the communicative act, such as an interlocutor’s
cultural background or the environment in which the act occurs. Flowerdew (2014) argues
that humans are innately skilled in crafting and interpreting context and that contextual
awareness requires little effort. However, developing conversational software that can
logically infer meaning from context is difficult (Scheutz, Schermerhorn, & Cantrell, 2011).
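As a loose illustration of the local/global distinction, the sketch below keeps only a five-turn window of the conversation, mirroring Faber and León-Araúz’s (2016) definition of local context. The class and method names are hypothetical, not drawn from any existing system.

```python
from collections import deque
from typing import Deque, Optional, Tuple

class LocalContext:
    """Keeps only the most recent turns (the five-turn local window)."""

    def __init__(self, window: int = 5):
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=window)

    def add(self, speaker: str, utterance: str) -> None:
        self.turns.append((speaker, utterance))

    def resolve(self, keyword: str) -> Optional[str]:
        # Scan the window, most recent turn first, for a mention of the
        # keyword; anything older has fallen out of local context.
        for _, utterance in reversed(self.turns):
            if keyword in utterance.lower():
                return utterance
        return None

ctx = LocalContext()
ctx.add("user", "I want to book a room for the 4th of July")
ctx.add("bot", "Sure, a room on the 4th of July. Anything else?")
print(ctx.resolve("room"))  # a recent turn supplies the missing referent
```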

A chatbot failing to track the context of a conversation over time would violate the
Gricean maxim of relevance, which states that a “partner’s contribution should be appropriate
to immediate needs at each stage of the transaction” (Grice, 1975, p. 47). The Gricean maxims, summarised as the “cooperative principle”, are essential to successful human communication; thus, a chatbot violating any of the maxims (relevance in this instance) should be perceived as presenting less human-like cues. Accordingly, the following hypotheses are proposed:

H5: A chatbot failing to maintain contextual awareness will be perceived as significantly less
anthropomorphic

H6a & H6b: A chatbot failing to maintain contextual awareness will receive significantly lower
scores for (a) adoption intent and (b) recommendation intent

Finally, Study Two provides the opportunity to re-test the relationships examined in
Study One and identify any consistent patterns across the three conditions. Consequently, the
following hypotheses are presented:

H7: A chatbot perceived as higher in anthropomorphism will receive significantly higher scores for adoption intent

H8: A chatbot perceived as higher in anthropomorphism will receive significantly higher scores for recommendation intent



2.8 SUMMARY AND IMPLICATIONS

This literature review highlighted a number of research gaps:

1. The SST literature has yet to sufficiently examine interfaces capable of natural
language use. These interfaces are unique among SSTs, in that users may
anthropomorphise them.
2. Extant adoption and recommendation models include a number of independent
variables, as summarised in Tables 1 and 2. However, the role of anthropomorphism
(or related constructs such as automated social presence) has yet to be investigated.
Anthropomorphism may correlate with (or perhaps cause) adoption and positive word-
of-mouth.
3. Pinpointing the features of a robot (that is, facial expression, gaze, movement) that produce anthropomorphism is possible. However, chatbot features that act as antecedents of anthropomorphism are harder to identify.

Addressing these research gaps will:

1. Represent a contribution towards the modernisation of the SST literature.
2. Extend existing SST theory to include the novel independent variable of
anthropomorphism. This will improve the predictive capabilities of existing models.
3. Suggest ways in which practitioners can increase access to elicited agent knowledge
(EAK), in order to increase adoption and recommendation intent via anthropomorphic
perceptions.



Chapter 3: Study One

The previous chapter introduced the concept of anthropomorphism and theoretical accounts for the phenomenon. The literature regarding chatbots, technology adoption,
recommendation intention and self-service technology (SST) was summarised. The following
chapter will present the method, results and discussion for Study One, the aim of which was
to investigate any correlational relationship between anthropomorphism, adoption and
recommendation intent.

3.1 METHOD & DESIGN

RQ1 (p. 3) was approached from a positivist epistemological paradigm, assuming knowledge is obtained through observation and measurement (O’Leary, 2004). A deductive,
top-down approach was used, where theories of anthropomorphism and technology adoption
informed specific hypotheses to be formally tested (Babbie, 2016). Therefore, the scope of
work and the methodology employed have methodological fit, categorised within the “mature
archetype” presented by Edmondson and McManus (2007). In other words, a specific
research question, combined with quantitative data and existing constructs, was used to draw
statistical inferences regarding a broader population. Edmondson and McManus describe each of these methodological choices as congruent and complementary. Neuman’s (2014)
discussion on the purpose of explanatory research illustrates the aims of the project as being
to (a) elaborate and enrich a theory, (b) extend a theory to new issues or topics and (c) support
or refute a prediction. This study aims to extend the theory of SST adoption to interfaces with
anthropomorphic qualities.

Study One used a correlational survey design to test the relationship between
anthropomorphism and the dependent variables of adoption and recommendation intent. Participants were asked to have a scripted conversation with a purpose-built chatbot on Facebook. Participants were advised that this would help to train the chatbot. Quantitative measurements were taken so that regression analysis could test the following hypotheses:

H1: Anthropomorphism will significantly predict adoption intent

H2: Anthropomorphism will significantly predict recommendation intent



3.1.1 Participants
A convenience sample of 60 participants from Amazon’s Mechanical Turk (MTurk)
website elected to participate in the study. All participants were over 18 years old, had
English as a primary language, were currently residing in the United States and had valid
Facebook log-in credentials and a positive user score for their previous work on MTurk.

MTurk is an online portal providing access to workers who complete human intelligence tasks (HITs). Goodman and Paolacci (2017) note that 43% of published
behavioural studies incorporated MTurk in the sampling strategy. More than 15,000 published
papers have referenced MTurk over the past 10 years (Chandler & Shapiro, 2016). Stewart et
al. (2015) estimate the MTurk population available to a typical study at 7,300 workers, based
on their assessment of 114,460 completed sessions. Anyone over the age of 18 can apply to
complete HITs.

It is impossible to calculate a margin of error when using MTurk because participants are self-selected. The likelihood of the sample being representative of a broader population
beyond MTurk cannot be determined. However, a number of successful replication studies
have compared data collected from crowd-sourced samples and traditional methods
(Goodman, Cryder, & Cheema, 2013; Horton, Rand, & Zeckhauser, 2010; Suri & Watts,
2011).

It is plausible to suggest that MTurk samples are biased in some ways. For example, the
average MTurk user may have (a) a more immediate need for financial resources, (b) more
free time, (c) more experience online or (d) a stronger desire to participate in research studies
than a general member of the broader population (Chandler, Mueller, & Paolacci, 2014).
None of these potential differences are theoretically linked to the variables of interest in this
study. Therefore, MTurk was considered an appropriate means of generating a sample. Any
sensitising effects, due to the participants having completed previous studies, are conceptually
counterbalanced against the decreased potential for experimenter expectancy that comes with
anonymous online samples.

Participants were recognised for their contribution with a payment of US$2.00 for a
completed survey submission. This amount was considered in-line with social norms and the principles of respect in research, striking a balance between compensation for the opportunity cost of one’s time and the subtle coercion that can accompany larger payments. For reference, the US federal minimum wage is US$7.25 per hour.



Assuming approximately 20-25 minutes were required for participation, remuneration was
significantly higher than what has been offered by previous social science studies (Bohannon,
2011; Buhrmester, Kwang, & Gosling, 2011).

Litman, Robinson and Rosenzweig (2015) discuss a number of MTurk studies, including tasks such as picture sorting, crossword puzzles, computer game performance,
survey studies and transcription tasks, to conclude that remuneration rates had almost no
measurable influence on data quality. Buhrmester, Kwang and Gosling (2011) explain these
results by examining the reasons MTurk users chose to participate in studies. The authors
conclude that non-financial motivations, including killing time, having fun, task enjoyment
and self-knowledge, rank alongside monetary reward in the top 5 reasons to participate.
Hence, the US$2.00 per complete submission was deemed appropriate.

The study’s cover page on MTurk explicitly stated that only MTurk users from the
United States could participate in this study. This was reiterated on the study’s Participant
Information Cover Sheet in Qualtrics. However, additional steps were taken to ensure
participants were actually from the United States and did not provide data under false
pretences or through inattention to selection criteria. First, the study was listed at 3:00AM local time (UTC+10), so that it would become visible to United States participants at 9AM in Los Angeles (UTC-8) and midday in New York (UTC-5). MTurk users elect to participate on a ‘first-come, first-served’ basis, so this increased the likelihood of sample validity.
Second, Qualtrics was configured to capture users’ meta-data, including geographic location
associated with their IP address. Participants were explicitly asked which US state they
resided in as part of the Qualtrics survey and their responses were matched with IP locations
to verify accuracy. Of course, a sophisticated user could circumvent this process by using a
virtual private network (VPN), but this was deemed low risk.

A sample size of 60 was considered appropriate for this survey. Power analysis for a
linear regression was conducted using G-POWER to determine a sufficient sample, given an
alpha of .05, a power of .80 and a medium effect size (f2 = .15) (Faul et al., 2009). Based on
these assumptions, G-POWER suggested a sample of 55.
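The same a-priori calculation can be reproduced outside G-POWER. The sketch below uses scipy and assumes a single-predictor regression (hence one numerator degree of freedom); under that assumption it converges on the reported sample of 55.

```python
# A minimal sketch of the a-priori power analysis (alpha = .05, power = .80,
# medium effect f^2 = .15), iterating N until the noncentral F test reaches
# the target power. Single predictor is an assumption of this sketch.
from scipy.stats import f as f_dist, ncf

alpha, target_power, f2 = 0.05, 0.80, 0.15
num_predictors = 1  # assumption: anthropomorphism as the sole predictor

n = num_predictors + 2
while True:
    df1, df2 = num_predictors, n - num_predictors - 1
    crit = f_dist.ppf(1 - alpha, df1, df2)       # critical F under H0
    power = 1 - ncf.cdf(crit, df1, df2, f2 * n)  # noncentrality = f^2 * N
    if power >= target_power:
        break
    n += 1

print(n, round(power, 3))  # ~55, matching the G-POWER result reported
```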

3.1.2 Stimulus
Participants were required to hold a scripted conversation with a purpose-built chatbot
deployed on Facebook Messenger. Participants were told that their task was to assist in
training a new chatbot. They were provided with a script, which included statements



contextualised around (a) ordering flowers for delivery and (b) reserving tickets for a sporting
event. Participants were instructed to type text from the script into the chat window, such as “I
forgot Mother’s Day. I need some flowers ASAP” and the chatbot was programmed to
respond, “Oh no. Ok. What is your budget?” Thus, the stimulus was designed to simulate a
short conversation. A complete transcript of the conversation is provided in Appendix Item 2.

3.1.3 Instruments
Anthropomorphism (IV): This variable was adapted from the godspeed questionnaire
(Bartneck et al., 2009), which provides a set of semantic differential scales to measure the
factors of anthropomorphism, animacy, likability, intelligence and safety of social robots. By
October 2014, the godspeed questionnaire had been cited in over 160 studies, many of which
used only a single factor from the godspeed questionnaire (Weiss & Bartneck, 2015), as was
the case here.

The anthropomorphism instrument includes 5 items: (a) fake – natural, (b) machinelike
– humanlike, (c) artificial – lifelike, (d) unconscious – conscious and (e) communicates
inelegantly – communicates elegantly. Response options are similar to a Likert scale, in that
7 response options are presented. However, the instrument is a semantic differential scale: at each end of the scale sits a pair of polar-opposite adjectives. The response options were originally designed to provide ordinal data, with participants circling an integer. However, this was modified via Qualtrics so that the data collected was a continuous value between -3 and +3.
Curdy (2014) provides instructions for modifying the HTML in Qualtrics to achieve this.

Adoption intention (DV): This study operationalised adoption with a simple single item
measured on a 7-point Likert scale. Participants were asked to indicate their agreement with
the following statement: “Please think about the chatbot you interacted with today. When
purchasing flowers for delivery (or tickets to a sporting event), I would use a chatbot like
this”. Response options were anchored at strongly agree and strongly disagree. This wording
was modified from similar single-item measures of SST adoption (Chiou & Shen, 2012;
Curran, Meuter, & Surprenant, 2003; Lee, Castellanos, & Chris Choi, 2012).

Recommendation intent (DV): Recommendation intent was measured with a single item
on a 7-point Likert scale with end points at strongly agree and strongly disagree. A number of
measurement instruments for recommendation intent exist; however, many are crafted around
recommendations made specifically via online channels. General purpose recommendation
measures include a single-item approach (Chiou & Shen, 2012; Meuter et al., 2003) or three



item measures (Zeithaml, Berry, & Parasuraman, 1996). The single-item measure was chosen
so that it was consistent with the single item measurement of adoption intent. Furthermore,
the three-item instrument discusses “brands” and would require significant modification to
suit the chatbot scenario.

Control variables used in this study included demographic variables such as age,
gender, race and education. The personality trait openness (Soto & John, 2017), technological
anxiety (Meuter et al., 2003) and the frequency of previous Facebook use were also measured,
to be used as control variables. Full details of all instruments are provided in Appendix Item
3. All items were monotone, avoiding extreme or suggestive language. None of the items was
likely to induce social desirability bias.

3.1.4 Procedure and Timeline


The chatbot was built using Chatfuel (www.chatfuel.com), a free to use, online
development platform. A number of alternatives are available, including api.ai (purchased by
Google in 2017), IBM Watson or Microsoft’s Bot Framework. Chatfuel was chosen because
it met specifications and was the easiest to configure. The Chatfuel website claims that the
program powers chatbots for many large brands, including the US National Football League
(NFL), TechCrunch and Forbes (Chatfuel, 2018).

The scripted participant inputs were created using rules. When the participant entered a
statement, a rule was triggered and the chatbot provided a pre-determined response. Thus,
pattern-matching is all that occurred, making future replication of the method straightforward.
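A minimal sketch of this rule-based behaviour is shown below. The flower-ordering exchange is quoted from the script excerpt above; the fallback message is an invented placeholder.

```python
# Illustrative sketch of Chatfuel-style rules: each scripted input triggers
# one pre-determined response, so the chatbot is doing pattern matching only.
RULES = {
    "i forgot mother's day. i need some flowers asap":
        "Oh no. Ok. What is your budget?",
}

def chatbot_reply(user_input: str) -> str:
    # Normalise and look up the utterance; no inference or context is used.
    return RULES.get(user_input.strip().lower(),
                     "Sorry, I didn't catch that.")  # placeholder fallback

print(chatbot_reply("I forgot Mother's Day. I need some flowers ASAP"))
```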

The chatbot was programmed to address participants by their first name and provide a
basic welcome message and thank you message. The purpose of the welcome message was to
allow participants time to adjust to the change of website (from Qualtrics to Facebook) before
accessing the stimulus. At the end of the chatbot conversation, the chatbot was designed to
give the participant a 4-digit code. Participants were asked to enter this code back into
Qualtrics to confirm that they had followed instructions and successfully completed exposure
to the stimulus materials.

A Facebook page was made for the chatbot (Figure 7). The page used a template
provided by Facebook. Logos and design elements were added, as well as some basic
instructions on how to begin the interaction, for example “click here to get started”. The tone
of the language and design elements was neutral.



Figure 7. Screenshot of the Facebook page created to introduce participants to the stimulus chatbot.

MTurk users were given the opportunity to provide informed consent via a detailed
participant information cover-sheet. This document outlined the risks and benefits of
participating as well as details regarding privacy and the future use of data. Participants were
expressly cautioned that while QUT Ethics and Privacy protocols covered this project,
interaction with the chatbot was to occur via Facebook. The research team could not
guarantee the privacy of any data generated during the chatbot interaction on the Facebook
website.

Those participants who elected to proceed responded to general demographic questions. Participants were then provided with a link to the Facebook chatbot and followed instructions
for exposure to the stimulus. When the conversation had been completed, participants
returned to the Qualtrics survey. Measurements of the dependent variables occurred next, followed immediately by measurement of the independent variables. This was the first time that participants were exposed to information that suggested the study’s purpose.

The limitations of this design are ecological validity and artificiality. Participants were
given scripts to use in their interaction with the chatbot. However, in a natural setting, chatbot
users may phrase their statements in any number of ways. For example, when tasked with
ordering flowers online, person A may state “I have a $50 budget” and person B may ask “What do you have for $50?”. However, the scripting of the conversation is also considered a strength of the design: scripted interactions ensure all participants have the same experience, which adds to internal validity and replicability. As discussed in the literature review, this level of control is not available in other commonly used methods such as the Wizard of Oz technique or a Turing Test.



3.2 RESULTS

The data was analysed using IBM’s SPSS version 25.0. The following sections detail
the data cleaning and assumption testing process, while providing results for the tested
hypotheses. The initial sample size for this study was 60 participants. Of the sample, 52%
were male, with the average age of the sample recorded at 36.33 years (SD = 9.875). More
than 70% of the sample identified as Caucasian Americans and all levels of education were
represented within the sample.

None of the respondents submitted incomplete data, as Qualtrics was configured such
that participants could not proceed without responding to all items presented. However, there
was evidence of careless responding (Curran, Kotrba, & Denison, 2010) or content non-responsivity (Meade & Craig, 2011). In two cases (3.3%), the participants entered a code into
Qualtrics that did not match the codes issued by the chatbot at the end of a successful
interaction. This suggests that these participants were incorrectly exposed to the stimulus
material. Their responses to the measurement instruments could not be based on the item
content and were therefore removed.

This failure rate is in-line with estimates of carelessness presented by other researchers
(Ehlers et al., 2009). The cause of this failure is unknown. Perhaps the participants were
distracted by accessing social media (Osgood, Ward, & Meade, 2015) or by the nature of an un-proctored online survey (Weigold, Weigold, & Russell, 2013), which lacks direct researcher involvement (Mohorko & Hlebec, 2016). Perhaps these respondents sought to
repair a perceived imbalance between required effort and reward as described by equity
theory (Adams, 1963) or their involvement in previous research projects on MTurk resulted in
ego-depletion (Meade & Craig, 2011). A third case was removed because the data met the
criteria for an extreme univariate outlier as described by Allen, Bennett and Heritage (2014).

3.2.1 Assumption testing and descriptive statistics


Cronbach’s alpha (α) was calculated to measure the internal consistency of the
independent and control variables. The internal consistency of the anthropomorphism and
openness personality trait instruments could be considered good (α > .80), while technological
anxiety was excellent (α > .90) (DeVellis, 2012; Kline, 2000). Refer to Table 4 for precise
values. Composite variables were created.
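For readers replicating the analysis outside SPSS, a minimal sketch of the α calculation follows; the simulated responses are placeholders, not the study’s data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items) matrix of scale responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Illustrative check on simulated correlated data (not the study's responses):
rng = np.random.default_rng(0)
latent = rng.normal(size=(57, 1))
responses = latent + rng.normal(scale=0.7, size=(57, 5))  # 5-item scale
print(round(cronbach_alpha(responses), 3))
```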



Table 4.
Means, Standard Deviations, Correlations and Internal Consistency of all Variables

Variable Mean (SD) 1 2 3 4 5 6 7


Adoption: Scenario 1 (DV) 5.84 (1.17) --
Adoption: Scenario 2 (DV) 5.28 (1.70) .660** --
Recommendation (DV) 5.76 (1.11) .651** .636** --
Anthropomorphism (IV) 4.82 (1.18) .440** .381** .300* (.882)
Openness (control) 4.45 (1.14) -.118 .121 -.214 .251 (.882)
Technological anxiety 2.29 (1.11) .353** .255 .154 .197 .031 (.919)
(control)
FB use frequency (control) 4.40 (1.46) .022 .061 .100 -.052 .078 .033 --

Note. Adoption: Scenario 1 was the purchase of flowers for delivery and Adoption: Scenario 2 was the purchase
of sporting event tickets.

** p < 0.01 level, two-tailed, * p < .05, two-tailed. Cronbach α reported diagonally in parentheses. A double dash
(--) indicates the variable was measured with single item; thus, α was not calculated.

All predictor and criterion variables were assessed for normality. Exploration of the data included examination of the 95% confidence interval for the mean, the magnitude of the gap between the mean and 5% trimmed mean, skewness and kurtosis via histograms and QQ plots, and the Shapiro-Wilk test of normality. All variables fell within the tolerances proposed by statisticians (Field, 2013; Pallant, 2013) of ±2.00 for skewness and 7.00 for kurtosis.

3.2.2 Hypothesis testing


H1: Anthropomorphism will significantly predict adoption intent

To estimate the proportion of variance in a participant’s intention to adopt a chatbot (adoption scenarios 1 and 2) that can be accounted for by anthropomorphism, a standard multiple regression analysis was performed. The multi-collinearity tolerances for the predictor
multiple regression analysis was performed. The multi-collinearity tolerances for the predictor
variables were inspected and below thresholds for concern. The Mahalanobis distance did not
exceed the critical χ² for df = 4 (α = .001) of 18.47 for any cases in the dataset, indicating that
multi-variate outliers would not interfere with the ability to interpret the outcome of the
multiple regression analysis.
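A minimal sketch of this outlier screen, using an illustrative predictor matrix rather than the study’s data:

```python
# Squared Mahalanobis distance of each case on the four predictors, compared
# with the critical chi-square (df = 4, alpha = .001). `X` is a placeholder.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
X = rng.normal(size=(57, 4))                  # 4 predictors, 57 cases

centered = X - X.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)  # squared distances

critical = chi2.ppf(1 - 0.001, df=4)          # = 18.47, as reported above
print((d2 > critical).sum(), "multivariate outliers")
```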

Adoption: Scenario 1 (flowers): In combination, anthropomorphism and the control variables accounted for a statistically significant 32.4% of the variability in adoption intention, R² = .324, adjusted R² = .272, F (4, 52) = 6.218, p < .001. By Cohen’s (1988) conventions, a combined effect of this magnitude can be considered “large” (f² = .479).



Standardised regression coefficients (ß) and squared semi-partial (or “part”) correlations (sr²) for each predictor in the regression model are reported in Table 5.
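For clarity, the f² values reported here follow from the model R² via Cohen’s (1988) formula, f² = R² / (1 − R²); for Scenario 1, f² = .324 / (1 − .324) ≈ .479.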

Adoption: Scenario 2 (sporting tickets): In combination, anthropomorphism and the control variables accounted for a statistically significant 18.5% of the variability in adoption
intention, R² = .185, adjusted R² = .122, F (4, 52) = 2.95, p = .029. By Cohen’s (1988)
conventions, a combined effect of this magnitude can be considered “medium” (f² = .226).
Standardised regression coefficients (ß) and squared semi-partial (or “part”) correlations (sr²)
for each predictor in the regression model are reported in Table 5.

Table 5.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Adoption Intent.

DV: Adoption intention (purchasing flowers for delivery)
Predictor ß 95% CI sr²
Anthropomorphism .451** .216, .716 .182
Openness -.244* -.515, -.007 .055
Tech. anxiety .269* .039, .552 .069
FB use freq. .056 -.146, .240 --

DV: Adoption intention (ordering tickets to a sporting event)
Predictor ß 95% CI sr²
Anthropomorphism .342* .108, .851 .105
Openness .024 -.343, .414 --
Tech. anxiety .185 -.107, .657 --
FB use freq. .071 -.207, .368 --
** p < 0.01 level, two-tailed, * p < .05, two-tailed.

Anthropomorphism significantly predicted adoption intent in both the flower scenario (ß = .451, t = 3.74, p < .001) and the sports ticket scenario (ß = .342, t = 2.58, p = .012). H1 is supported. The anthropomorphism standardised beta weights (.451, .342) indicate the predicted change (in units of standard deviation) in the dependent variable associated with a one standard deviation change in anthropomorphism. In other words, after controlling for the other predictors in the model, a one standard deviation increase in anthropomorphism is associated with a .451 standard deviation increase in adoption intention for purchasing flowers and a .342 standard deviation increase in adoption intention for purchasing sporting event tickets. The sr² values give the proportion of the variance in adoption intent that is uniquely explained by each predictor (Allen et al., 2014).
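To illustrate how such ß weights are obtained, the sketch below standardises all variables before fitting an ordinary least squares model, so the resulting slopes are beta weights. The data frame is simulated with illustrative column names; it does not reproduce the study’s coefficients.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "anthro": rng.normal(4.8, 1.2, 57),    # illustrative predictor columns
    "openness": rng.normal(4.5, 1.1, 57),
    "tech_anx": rng.normal(2.3, 1.1, 57),
    "fb_freq": rng.normal(4.4, 1.5, 57),
})
df["adoption"] = 0.45 * df["anthro"] + rng.normal(0, 1, 57)

z = (df - df.mean()) / df.std(ddof=1)      # z-scores -> slopes become betas
X = sm.add_constant(z[["anthro", "openness", "tech_anx", "fb_freq"]])
model = sm.OLS(z["adoption"], X).fit()
print(model.params, model.rsquared)        # standardised weights and R^2
```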



H2: Anthropomorphism will significantly predict recommendation intent.

The same procedure was followed to examine the data, using recommendation intention as the dependent variable. In combination, anthropomorphism and the control variables accounted for a statistically significant 20.7% of the variability in recommendation intention, R² = .207, adjusted R² = .146, F (4, 52) = 3.39, p = .015. By Cohen’s (1988) conventions, a
combined effect of this magnitude can be considered “medium” (f² = .261). Standardised
regression coefficients (ß) and squared semi-partial (or “part”) correlations (sr²) for each
predictor in the regression model are reported in Table 6.

Table 6.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Recommendation Intent

DV: Recommendation intention
Predictor ß 95% CI sr²
Anthropomorphism .370** .101, .592 .122
Openness -.321* -.561, -.062 .095
Tech. anxiety .086 -.166, .338 --
FB use freq. .141 -.082, .297 --
** p < 0.01 level (2 tailed), * p < .05 (2 tailed).

Anthropomorphism significantly predicted recommendation intent (ß = .370, t = 2.84, p = .006). H2 is supported. After controlling for the other variables in the model, a one standard deviation increase in anthropomorphism is associated with a .370 standard deviation increase in recommendation intent.

3.3 DISCUSSION

The first study presented in this thesis examined a chatbot performing customer service
work via Facebook Messenger. The purpose was to identify any correlational relationships
between anthropomorphism of the chatbot and adoption and recommendation intent as
dependent variables.

The average participant score for the anthropomorphism composite variable was 4.82 on
a 7-point scale. A score above 4.0 would suggest that users perceived the chatbot as having
some anthropomorphic qualities. A review of each of the individual scale items provides
context. The chatbot was perceived as more natural than fake, more human-like than
machinelike, more lifelike than artificial, more conscious than unconscious and
communicating more elegantly than inelegantly. The highest score of an individual item was



for natural versus fake, at 5.38. The lowest score was for unconscious versus conscious, at
4.49.

Both of the hypotheses tested in Study One were supported by the data, with results
significant at p less than .01. Users who anthropomorphised the chatbot indicated that they
were more likely to use a chatbot again to perform a similar task. Anthropomorphism of the
chatbot also predicted a participant’s intention to recommend the chatbot to others.

What remains unclear is whether the anthropomorphic perceptions of the chatbot actually affect adoption and recommendation or if the variables simply happen to covary as a
result of other confounding variables. Results from Study One only support a correlational
relationship. Furthermore, as anthropomorphism is generated via the union of the observer
and observed, it remains unclear whether the anthropomorphism detected was the result of
properties unique to the individual or the result of properties of the chatbot itself.

Given that adoption and recommendation intent have been important dependent
variables to SST practitioners and scholars for more than 30 years (Blut, Wang, & Schoefer,
2016; Hoehle, Scornavacca, & Huff, 2012), future research should examine which elements of the human-chatbot interaction contribute to this anthropomorphism. It cannot be a digital avatar, the movement of a body, facial expressions or an audible voice, because the chatbot has none of these features. Manipulating the language-based cues that remain would allow the findings from Study One to be replicated under experimental conditions.

None of the demographic variables measured accounted for the variance in scores. It appears as though age, gender, race and education did not affect perceptions of anthropomorphism, adoption or recommendation intent in this context. These results differ
from previous SST findings, which suggest adopters are predominantly younger, male and better educated (Meuter et al., 2003; Nilsson, 2007).

Frequency of a participant’s Facebook use was captured as a control variable and was
not significantly related to any of the other variables. All of the participants were Facebook
users and would know that interpersonal communication via instant messaging occurs on
Facebook. If a Facebook user’s mental model of Facebook Messenger is that it facilitates chat
between two humans, then anything “chatting” on Facebook could be seen as human.
However, a participant’s previous use of Facebook did not appear to have a sensitising effect
in this instance.



Future studies may wish to investigate the anthropomorphism of chatbots deployed on other
websites that are not associated with asynchronous chatting between two human interlocutors.

Both the human-computer and the human-robot interaction literatures state that individual human-like cues stack synergistically to create perceptions of a social other. Humanlike dimensions listed by Nass (2004) include language use, a voice (real or synthetic),
a face via an avatar, interactivity and unpredictability. Kiesler et al. (2008) have demonstrated
that the most salient human characteristics are appearance, back-story, observable behaviour
and etiquette. The chatbot in this study had only one of these characteristics: language use. This study therefore supports the notion that anthropomorphism can occur through language use alone.

The results from Study One present anthropomorphic perceptions as contributing to behavioural intentions, with benefits to the firm. However, anthropomorphism may result in
unintended consequences. Previous studies have linked the anthropomorphism of a
technology to a user’s expectations of its capabilities (Knijnenburg & Willemsen, 2016;
Nowak & Biocca, 2003). A very humanlike chatbot may be expected to have very humanlike
cognition. This may result in a user overestimating a chatbot’s abilities and subsequently
being disappointed or frustrated when those expectations are violated. In a customer service
context, this overestimation is important, as the gap between expectations and experience is a
major driver of customer dissatisfaction (Hill & Alexander, 2006).



Chapter 4: Study Two

4.1 METHOD AND DESIGN

Study Two could be described as a replication-extension study (Bonett, 2012) in that it aimed to build upon the findings from the first study. Where Study One identified a
correlational relationship between anthropomorphism, adoption and recommendation intent,
Study Two was designed to explore:

• the potential to manipulate anthropomorphism by adjusting a participant’s access to elicited agent knowledge and, in doing so,
• the potential for a causal relationship between anthropomorphism, adoption and
recommendation intent.

The stimulus material used in Study Two was an animation of a chatbot interaction.
This animation was contextualised around a consumer booking hotel accommodation via a
chatbot. Three versions of this stimulus were prepared in order to manipulate a participant’s
access to elicited agent knowledge (EAK), which is theorised to be an antecedent of
anthropomorphism (Epley et al., 2007). The animation was the same across all conditions,
except for the experimental manipulation. The chatbot shown in the conversation repair (CR)
animation failed to interpret elements of user input during the first pass and used conversation
repair strategies to remedy potential misunderstandings. Therefore, the CR chatbot was
repairing a violation of the Gricean maxim of manner. Consequently, the following
hypotheses were developed:

H3: A chatbot using conversation repair will be perceived as significantly more anthropomorphic

H4a and H4b: A chatbot using conversation repair will receive significantly higher scores for
(a) adoption intent and (b) recommendation intent

The chatbot in the contextual awareness (CA) animation failed to accurately interpret the user’s utterance. Therefore, the CA chatbot violated the Gricean maxim of relevance, responding with a statement considered inappropriate for the needs of the user at that stage of the transaction (Grice, 1975). Consequently, the following hypotheses were developed:



H5: A chatbot failing to maintain contextual awareness will be perceived as significantly less
anthropomorphic

H6a and H6b: A chatbot failing to maintain contextual awareness will receive significantly
lower scores for (a) adoption intent and (b) recommendation intent

Finally, a third chatbot animation was prepared to serve as a control condition. The chatbot in
this condition correctly interpreted user input in all instances. The following hypotheses were
developed to address broader patterns between the experimental conditions:

H7: A chatbot perceived as higher in anthropomorphism will receive significantly higher scores for adoption intent

H8: A chatbot perceived as higher in anthropomorphism will receive significantly higher scores for recommendation intent

The study was designed to meet the four requirements for demonstrating a causal
relationship. The experimental method was selected because experiments show that (a) the
cause and effect are connected, (b) the cause precedes the effect, (c) the cause and effect
relationship occurs consistently across participants and (d) that alternative explanations have
been accounted for (Babbie, 2016).

Providing evidentiary support for a causal relationship within the social sciences is
difficult; however, the use of an experimental design was chosen to maximise internal
validity. The three elements of an experiment are present: manipulation of the independent
variable (access to EAK as a component of anthropomorphism), comparison to control group
and random assignment to experimental conditions (Babbie, 2016). Of course, in the absence
of replication studies, inferences drawn from this study should be approached with care.

Deterministic causality versus probabilistic causality: The term causality in this thesis
refers to probabilistic causality, in that the cause is said to precede the effect and increase the
probability of the effect (Mellor, 1995). Deterministic causality, on the other hand, refers to
the notion that the cause precedes the effect in all observable instances (Hoefer, 2016). For example, smoking is widely accepted to cause lung cancer, yet not every smoker develops the disease: smoking raises the probability of the effect rather than guaranteeing it. This illustrates probabilistic, as opposed to deterministic, causality.



4.1.1 Participants
A convenience sample of 180 participants from Amazon’s Mechanical Turk (MTurk)
website elected to participate in the study. All participants were aged over 18 years, had
English as a primary language, were currently residing in the United States and had a positive
user score for their previous work on MTurk. Participants were recognised for their
contribution with a payment of US$0.80 for a completed survey submission. Refer to Study
One: Method for a discussion of MTurk as a sampling strategy.

A sample size of 180 was considered appropriate for this survey. Power analysis for Analysis of Variance (ANOVA) was conducted, given an alpha of .05, a power of .80 and a medium effect size (f² = .15) (Faul et al., 2009). Based on these assumptions, a power analysis for a 3-condition experiment suggested a sample size of n = 156, or 52 participants per condition.

Participants were randomly assigned to one of the experimental conditions (control, conversation repair or contextual awareness). Random assignment increases internal validity,
because randomisation means that the incidence of potentially confounding variables is
approximately evenly distributed across all of the conditions (Salkind, 2010).

4.1.2 Stimulus
Where Study One had participants interact with a chatbot on Facebook, Study Two had
participants watch a pre-recorded animation of a human–chatbot interaction. The decision to
change the way in which participants were exposed to the stimulus was made on the
following grounds:

1. Increased control: If participants interact with a chatbot directly on Facebook, then their typing speed may affect their perceptions of the experience. For example,
participants who can touch type may have a more fluid interaction and perceive
increased ease of use. Also, participants interacting with a chatbot on Facebook may
become distracted by Facebook itself. For example, they may receive an instant
message from a friend while the stimulus is being administered. Neither typing speed
nor distractions could be controlled for using the previous method.



2. Ethical considerations: Facebook is a third-party platform. Despite the researcher’s
assurances of confidentiality, Facebook itself would know that a user had been
involved in the study. Therefore, these participants may be exposed to certain types of
targeted advertising in the future. Given the subsequent Facebook privacy scandal of
early 2018 (Bogle, 2018), the decision seems appropriate.

Three animations were prepared, one for each experimental condition. To illustrate the
experimental manipulation, a textual extract of the animation is provided in Table 7. Full
transcripts of the animations are provided in Appendix Item 4 (p. 79).

Table 7.
Sample Extract from Animations Illustrating Manipulation of Elicited Agent Knowledge Across the Experimental
Conditions

Context: Booking hotel accommodation via a customer service chatbot

Condition 1 (Control):
Human: Room includes a washer and dryer?
Chatbot: Absolutely. We also offer an onsite laundry service upon request.

Condition 2 (Conversational repair):
Human: Room includes a washer and dryer?
Chatbot: Do you mean “clothes dryer” or “hair dryer”?
Human: Clothes dryer
Chatbot: Absolutely. We also offer an onsite laundry service upon request.

Condition 3 (Failure to maintain contextual awareness):
Human: Room includes a washer and dryer?
Chatbot: I thought we were talking about ‘room reservation’ on the ‘4th of July 2018’.
Human: Does the room include a clothes dryer?
Chatbot: Absolutely. We also offer an onsite laundry service upon request.

The animations were made using online software from Bot Society
(https://botsociety.io/). Bot Society provides chatbot prototyping and preview tools for
developers. Users of Bot Society can design text and voice chat interfaces for a range of
platforms, including Facebook Messenger, Google Home and Amazon Alexa. The Bot Society website claims the software has been used by brands including Microsoft, PricewaterhouseCoopers, AXA and Nestle (Botsociety, 2018). A screen shot of the Bot
Society animation process is provided in Figure 8.



Figure 8. Screenshot from Bot Society website, illustrating how animations were created.

The hotel accommodation context was chosen for the stimulus because it was considered to
be an appropriate use for a chatbot providing customer service. The chatbot was given a
fictitious name (Beachside Hotel) with a custom logo. The Facebook pages of real world
hotels were reviewed in order to determine an appropriate number of “likes” for the chatbot.
Setting the likes too high or low may have acted as social proof of the chatbot’s performance
and had an impact on the results (Lee, Lee, & Oh, 2015). Finally, the animation was set inside
a wireframe image of a white iPhone 6 to add to realism, as shown in Figure 9.

Figure 9. Screenshots from the stimulus animation, illustrating design elements.



The three animations were identical, with the exception of the experimental
manipulation. Care was taken to ensure the three animations were consistent in all other
respects. For example, the pause time between the human input and the chatbot response was
the same across all conditions.

4.1.3 Instruments
Measurements for the dependent variables (adoption and recommendation intent) were taken as described in the Study One methods section. The measurement of anthropomorphism was modified between studies. The items themselves were not changed; however, responses were recorded on a 7-point Likert scale rather than the slider-type scale used in Study One, which provided continuous data. The demographic variables were unchanged between the two studies. Age, gender, race and education were all recorded.

Study Two was designed to produce two sets of data. Dataset 1 includes scores for the
independent and dependent variables across each of the three conditions. These scores were
assessed using ANOVA to identify any significant mean differences resulting from
conversation repair or the failure to maintain contextual awareness. The control variables
measured (detailed below) were less important for Dataset 1. Participants were randomly
assigned to each experimental condition so the variance in control variables should have been
evenly distributed across the three conditions.

Dataset 2 includes data from the control condition only. The data generated by the control condition was assessed using regression in an attempt to replicate the findings from Study One. Perceived ease of use, perceived usefulness and the need for human interaction
were measured as control variables for Dataset 2. In this way, a correlational relationship
between the anthropomorphism, adoption and recommendation intent could be retested. If any
relationship was observed, it could then be said to exist when accounting for these new
control variables.

Perceived ease of use & perceived usefulness: Both variables are derived from Fred
Davis’s (1989) seminal technology acceptance model (p. 11). Davis devised the original
instrument to be used in a firm setting. Original items include statements such as
“effectiveness on the job”. This study used a modified version of the instruments, specifically
designed for self-service technology (SST) studies as presented by Curran and Meuter (2005).



Three items were used to measure perceived ease of use and three items were used to measure
perceived usefulness. All responses were recorded on a 7-point Likert scale, anchored with
strongly agree and strongly disagree. Full details are provided in Appendix Item 5.

Need for human interaction: This variable was operationalised using Dabholkar’s
(1996) four-item scale. Items such as “I like interacting with the person who provides the
service” focus on the social aspect of interacting with a human service employee. Responses
to the four items were recorded on a 7-point Likert scale, anchored with strongly agree and
strongly disagree.

To ensure a consistent experience for all participants, regardless of which experimental condition they were allocated to, everyone was asked to respond to these control variable instruments.

4.1.4 Procedure and timeline


Participants were randomly assigned to one of the three experimental conditions. First,
the demographic variables were measured. Participants then watched the animation assigned
to their condition. The animations were 1 minute and 31 seconds in duration. Measures of the
dependent variables were taken, followed by measures of the other variables. Finally,
measures of the control variables from the extant literature (perceived usefulness, perceived
ease of use and the need for human interaction) were taken.

4.2 RESULTS

The initial sample size for this study was 190 participants. Fifty-three percent of the sample were male, with the average age of the sample recorded at 37.21 years (SD = 11.597). Seventy-seven percent of the sample identified as Caucasian Americans. The demographic composition of the participants in Study Two was similar to the previous study.

The data was visually inspected for problematic responses. One case was content non-responsive: “consistent response” or “straight-lining” had occurred (Revilla & Ochoa, 2015). It appeared the participant had rushed through the survey, selecting the same response option
for all items on all instruments. Study Two incorporated the use of reverse-coded items to
identify inconsistent responses, although the efficacy of this strategy has been debated (Van
Sonderen, Sanderman, & Coyne, 2013).



Twelve participants (6.3%) were deleted for inconsistent responses identified through
these reverse coded items. A final sample size of 177 participants was retained for analysis.
Experimental results are presented first (n = 177), followed by correlational survey results
which were calculated using the control group only (n = 59).

4.2.1 Experimental results


ANOVA was performed to compare the demographic composition (gender, age,
education and race) of participants across the three experimental conditions. The results were
non-significant, indicating that all groups were similar, as shown in Table 8.

Table 8.
Distribution of Demographic Variables Across the Three Experimental Conditions

Gender Age M(SD) Race M(SD) Edu. M(SD)

Condition 1 (n=59) 50.8% Female 35.5 (10.4) 1.71 (1.5) 3.25 (1.3)

Condition 2 (n=60) 47.6% Female 37.43 (11.8) 1.41 (1.3) 2.79 (1.5)

Condition 3 (n=58) 42.9% Female 38.67 (12.4) 1.67 (1.6) 2.89 (1.4)

Next, ANOVA was performed to investigate the impact of conversational repair and
contextual awareness on anthropomorphism, adoption and recommendation intent. Levene’s
statistics for the dependent variables were non-significant, Adoption: F (2, 186) = 2.181, p =
.116 and Recommendation: F (2, 186) = .577, p = .563. The assumption of homogeneity of
variance was violated for anthropomorphism: F (2, 186) = 4.81, p = .009. However, ANOVA
is not sensitive to violations of equal variance when sample sizes are approximately equal
(Allen et al., 2014) as was the case here.

The three ANOVAs performed (anthropomorphism, adoption and recommendation intent) were all statistically significant, indicating that the variables are influenced by the chatbot’s behaviour. Results were as follows:

Anthropomorphism: F (2, 186) = 14.387, p < .001, η² = .134 (medium effect size)

Adoption: F (2, 186) = 10.798, p < .001, η² = .104 (medium effect size)

Recommendation: F (2, 186) = 6.604, p = .002, η² = .066 (medium effect size)
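The omnibus tests and the post-hoc comparisons reported below can be reproduced outside SPSS along the following lines. The group labels, cell sizes and means echo Tables 8 and 9, but the simulated scores are placeholders only, so the exact statistics will differ.

```python
# Minimal sketch: Levene's test, one-way ANOVA, and Tukey's HSD post-hoc
# comparisons on simulated anthropomorphism scores (illustrative data).
import numpy as np
from scipy.stats import f_oneway, levene
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)
groups = np.repeat(["control", "repair", "context_fail"], [59, 60, 58])
scores = np.concatenate([rng.normal(m, 1.3, n) for m, n in
                         [(5.09, 59), (5.11, 60), (4.01, 58)]])

by_cond = [scores[groups == g] for g in ("control", "repair", "context_fail")]
print(levene(*by_cond))                               # homogeneity check
print(f_oneway(*by_cond))                             # omnibus ANOVA
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))  # pairwise contrasts
```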



Post-hoc analysis was conducted using Tukey’s HSD, with an α of .05, to examine which of the conditions produced significantly different mean scores. Across all three variables of interest, the chatbot that failed to maintain contextual awareness had significantly lower scores than the control condition. However, there were no significant differences between scores for the control condition chatbot and the chatbot performing conversation repair. Using Cohen’s d, the size of the difference between means, expressed in standard deviations, was calculated for comparisons between Condition 1 and Condition 3. Refer to Table 9 for full details.

Table 9.

Mean Comparison of the Dependent Variables Across the Three Experimental Conditions

Variable Condition 1 (control) Condition 2 (conversation repair) Condition 3 (failing contextual awareness) Cohen’s d (cond. 1 vs. cond. 3)
Mean (SD) Mean (SD) Mean (SD)

Anthropomorphism 5.09 (1.03) 5.11 (1.42) 4.01 (1.43) -0.673 (large)
Adoption 5.63 (1.14) 5.75 (1.35) 4.73 (1.51) -0.553 (large)
Recommendation 5.46 (1.33) 5.46 (1.47) 4.65 (1.51) -0.461 (medium)

The ANOVA results presented in Table 9 suggest that anthropomorphism may mediate
the relationship between a chatbot’s failure to maintain contextual awareness and subsequent
behavioural intentions (adoption and recommendation intent). The mediating role of
anthropomorphism in the contextual awareness condition was formally tested in SPSS using
the PROCESS macro (Hayes, 2009). The effect of contextual awareness was isolated by creating a dummy-coded variable (Cond. 1 vs. Cond. 3).

Adoption intent: Regression analysis was used to investigate whether anthropomorphism mediated the effect of contextual awareness on adoption intent. In step 1 of the mediation model, the regression of contextual awareness on adoption intent, ignoring the mediator, was significant, ß = -.857, t(124) = -3.535, p < .001. Step 2 showed that the regression of contextual awareness on the mediator, anthropomorphism, was also significant, ß = -1.073, t(124) = -4.815, p < .001. Step 3 of the mediation process showed that the mediator’s (anthropomorphism) regression on adoption intent was significant, ß = .559, t(123) = 6.64, p < .001. Step 4 of the analyses revealed that, controlling for the mediator (anthropomorphism), contextual awareness scores were not a significant predictor of adoption intent, ß = -.257, t(123) = -1.13, p = .261.



This is consistent with full mediation. Approximately 33% of the variance in adoption was
accounted for by the predictors (R² = .331). The indirect effect was tested using a bootstrap
estimation approach with 5,000 samples. The unstandardized indirect coefficient was -.600
and the 95% confidence interval ranged from -.98 to -.29.

Figure 10. Anthropomorphism fully mediates the contextual awareness and adoption relationship.
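A minimal sketch of this bootstrap test of the indirect effect (path a × path b) follows. The three arrays are simulated stand-ins for the condition dummy, the mediator and the outcome, so the printed interval will not match the reported one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data standing in for the study's variables (illustrative only):
n = 126                                    # control + contextual-awareness cells
condition = rng.integers(0, 2, n)          # dummy: 0 = control, 1 = failed CA
anthro = 5.0 - 1.0 * condition + rng.normal(0, 1.2, n)   # mediator
adoption = 3.0 + 0.5 * anthro + rng.normal(0, 1.0, n)    # outcome

def indirect_effect(cond, med, out):
    # Path a: regress mediator on condition; path b: regress outcome on the
    # mediator while controlling for condition. Indirect effect = a * b.
    a = np.linalg.lstsq(np.column_stack([np.ones(len(cond)), cond]),
                        med, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([np.ones(len(cond)), cond, med]),
                        out, rcond=None)[0][2]
    return a * b

boot = np.empty(5000)
for i in range(5000):                      # resample cases with replacement
    idx = rng.integers(0, n, n)
    boot[i] = indirect_effect(condition[idx], anthro[idx], adoption[idx])

print(np.percentile(boot, [2.5, 97.5]))    # CI excluding zero -> mediation
```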

Recommendation intent: Regression analysis was used to investigate whether anthropomorphism mediated the effect of contextual awareness on recommendation intent. In step 1 of the mediation model, the regression of contextual awareness on recommendation intent, ignoring the mediator, was significant, ß = -.809, t(124) = -3.187, p < .001. Step 2 showed that the regression of contextual awareness on the mediator, anthropomorphism, was also significant, ß = -1.073, t(124) = -4.815, p < .001. Step 3 of the mediation process showed that the mediator’s (anthropomorphism) regression on recommendation intent was significant, ß = .599, t(123) = 6.86, p < .001. Step 4 of the analyses revealed that, controlling for the mediator (anthropomorphism), contextual awareness scores were not a significant predictor of recommendation intent, ß = -.166, t(123) = -.702, p = .483. This is consistent with full mediation. Approximately 33% of the variance in recommendation intent was accounted for by the predictors (R² = .331). The indirect effect was tested using a bootstrap estimation approach with 5,000 samples. The unstandardized indirect coefficient was -.643 and the 95% confidence interval ranged from -1.03 to -.34.



Figure 11. Anthropomorphism fully mediates the contextual awareness and recommendation relationship.

Results for the hypotheses unique to Study Two are presented in Table 10.

Table 10.
Results Relative to Hypotheses 3 Through 8

Hypotheses Detail Outcome


H3 A chatbot using conversation repair will be perceived Not supported
as significantly more anthropomorphic
H4a and H4b A chatbot using conversation repair will receive Not supported
significantly higher scores for (a) adoption intent and
(b) recommendation intent
H5 A chatbot failing to maintain contextual awareness Supported
will be perceived as significantly less
anthropomorphic
H6a and H6b A chatbot failing to maintain contextual awareness Supported
will receive significantly lower scores for (a)
adoption intent and (b) recommendation intent
H7 A chatbot perceived as higher in anthropomorphism Supported
will receive significantly higher scores for adoption
intent
H8 A chatbot perceived as higher in anthropomorphism Supported
will receive significantly higher scores for
recommendation intent



4.2.2 Survey results
Study Two provided the opportunity to replicate the findings of the first study, while
introducing new control variables from the extant literature. Is anthropomorphism still a
significant predictor of adoption and recommendation intent when accounting for perceived
usefulness, ease of use and a participant’s need for human interaction? Only data from
condition 1 (n = 59) is used in this section.

Cronbach’s α was calculated to measure the internal consistency of the independent and control variables. The internal consistency of the anthropomorphism, usefulness and need for human interaction (NFHI) instruments could be considered good (α > .80), while ease of use was acceptable (α > .70) (DeVellis, 2012; Kline, 2000). Refer to Table 11 for precise values. Composite variables were created.

Table 11.
Means, Standard Deviations, Correlations & Internal Consistency of all Variables.

Variable Mean (SD) 1 2 3 4 5 6


Adoption (DV) 5.54 (1.22) --
Recommendation (DV) 5.46 (1.33) .635** --
Anthropomorphism (IV) 5.09 (1.03) .558** .427** (.812)
Perceived usefulness 5.55 (1.05) .686** .532** .384** (.855)
(control)
Perceived ease of use 5.83 (.95) .397** .213 .239 .594** (.751)
(control)
Need for human interaction 4.18 (1.41) -.364** -.296* -.060 -.179 -.284* (.896)
(control)

Note. Control variables were treated as IVs in the analysis but were not variables of theoretical interest in the study; thus, conceptually they are control variables.
** p < 0.01 level, two-tailed, * p < .05, two-tailed. Cronbach α reported diagonally in parentheses. A double
dash (--) indicates the variable was measured with single item; thus, α not calculated.

Normality was assessed and multiple regression was performed, using the same
procedure as outlined in Study One: Results.



H1: Anthropomorphism will significantly predict adoption intent

In combination, anthropomorphism and the control variables accounted for a statistically significant 64.0% of the variability in adoption intention, R² = .640, adjusted R² = .615, F (4, 58) = 25.72, p < .001. By Cohen’s (1988) conventions, a combined effect of this magnitude can be considered “large” (f² = 1.77). Standardised regression coefficients (ß) and
squared semi-partial (or “part”) correlations (sr²) for each predictor in the regression model
are reported in Table 12.

Table 12.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Adoption Intent

DV: Adoption intent
Predictor ß 95% CI sr²
Anthropomorphism .349** .211, .617 .104
Usefulness .560** .413, .894 .184
Ease of use -.096 -.382, .136 --
NFHI -.270** -.377, -.091 .067
** p < 0.01 level, two-tailed, * p < .05, two-tailed.

Anthropomorphism significantly predicted adoption intent (ß = .349, t = 2.85, p = .006). The data from Study Two further support H1. A one standard deviation increase in anthropomorphism is associated with a .349 standard deviation increase in adoption intent, when controlling for usefulness, ease of use and the need for human interaction. The sr² value of .104 suggests that 10.4% of the variance in adoption intention was uniquely explained by anthropomorphism.

H2: Anthropomorphism will significantly predict recommendation intent.

In combination, anthropomorphism and the control variables accounted for a statistically significant 41.7% of the variability in recommendation intention, R² = .417, adjusted R² = .377, F (4, 58) = 10.381, p < .001. By Cohen’s (1988) conventions, a combined effect of this magnitude can be considered “large” (f² = .715). Standardised regression coefficients (ß) and squared semi-partial (or “part”) correlations (sr²) for each predictor in the regression model are reported in Table 13.



Table 13.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Recommendation Intent

DV: Recommendation intent
Predictor ß 95% CI sr²
Anthropomorphism .267* .064, .622 .061
Usefulness .522** .328, .989 .159
Ease of use -.233 -.680, .032 --
NFHI -.253* -.433, -.041 .058
** p < 0.01 level, two-tailed, * p < .05, two-tailed.

Anthropomorphism significantly predicted recommendation intent (ß = .267, t = 2.46, p = .017). The data from Study Two further support H2. A one standard deviation increase in anthropomorphism is associated with a .267 standard deviation increase in recommendation intent, when controlling for usefulness, ease of use and the need for human interaction. The sr² value of .061 suggests that 6.1% of the variance in recommendation intent was uniquely explained by anthropomorphism.

4.3 DISCUSSION

This study aimed to further clarify the relationship between anthropomorphism, adoption and recommendation intent for a chatbot performing customer service. This was
achieved through the statistical analysis of eight testable hypotheses. Where Study One
identified a correlational relationship between the variables of interest, Study Two was
designed to:
• Identify potential antecedents of a user’s access to elicited agent knowledge (EAK), a
known component of anthropomorphic perceptions. As such, conversation repair and
contextual awareness were examined.
• Investigate the potential for a causal relationship between anthropomorphism,
adoption and recommendation intent.

This discussion will elaborate on the findings from Study Two, situating the results
within the context of the extant literature. The discussion will move through the hypotheses
sequentially, while the following chapter (Conclusion) will present implications, limitations
and avenues for future research.



H1 and H2 were both supported by the data. These results replicate the findings
presented in Study One. Participants who anthropomorphised the chatbot indicated they were
more likely to use a chatbot again to perform a similar task. Anthropomorphism of the chatbot
also predicted a participant’s intention to recommend the chatbot to others.

These results represent a contribution to the SST literature. A substantial number of empirical studies have examined the adoption and recommendation of SSTs; however, none to date have considered the role of anthropomorphism (Blut et al., 2016; Hoehle et al., 2012). Furthermore, the results suggest that anthropomorphism outperforms the widely cited "perceived ease of use" (Davis, 1989) and "need for human interaction" (Dabholkar, 1996b) variables in both ß weights and unique variance explained. Therefore, anthropomorphism as an independent variable may enhance the predictive value of the four dominant theoretical models, namely the technology acceptance model (TAM) (Davis, 1989), the theory of planned behavior (TPB) (Ajzen, 1991), the unified theory of acceptance and use (UTAUT) (Venkatesh et al., 2003) and the diffusion of innovation constructs (DOI) (Rogers, 2003) as conceptualised by Cooper and Zmud (1990).

These findings suggest that the role of anthropomorphism in SST adoption or word-of-
mouth research needs to be considered wherever the SST is (a) targeting individuals theorised
to be high in sociality or effectance motivation or (b) accessed via an interface thought to be
capable of affecting a user’s access to EAK through language use.

Anthropomorphism of a firm's products and branding is empirically linked with positive attitudinal and behavioural responses from consumers (Aggarwal & McGill, 2007;
Bennett & Thompson, 2016; Payne et al., 2013). The findings regarding H1 and H2 provide
further evidence to support the notion that anthropomorphism in a marketing context can
provide benefits to the firm.

4.3.1 Conversation repair (CR):

H3 and H4a/b were not supported. Given the stimulus and instruments described, the use of CR did not significantly increase anthropomorphism, adoption intent or recommendation intent beyond the baseline scores recorded for the control condition. The mean scores for the CR condition were marginally higher; however, the differences were not statistically significant.



In accordance with the theoretical support for H3 and H4a/b, it is possible that
participants perceived the CR chatbot as demonstrating social coordination (Kaplan & Hafner,
2006) or intersubjective effort (Corti & Gillespie, 2016). However, those perceptions did not
translate into statistically different mean scores.

The use of CR may have both positive and negative aspects. A chatbot's use of CR could be considered a positive attribute in that CR is an attempt to manage miscommunication (Fromkin, 1971). Conversely, in responding to CR, the human user is required to re-state or clarify their previous utterance, which could negatively impact perceptions of the chatbot. Repeating oneself may be frustrating to the user (Srivastava, 2017) due to increased effort (Jenkins et al., 2007). Clark and Brennan (1991) would describe this effort as a 'cost', specifically, the costs of formulation and production, which are paid by the speaker. It appears as though these two opposing forces cancel each other out: while the chatbot's use of CR did not significantly increase scores for adoption and recommendation intent, it appears to mitigate the potentially deleterious effects of the costs paid when having to repeat oneself during the repair process. It is likely that the number of trouble sources and their perceived importance to the user affect scores for the dependent variables. For example, many attempts at CR to clarify unimportant matters would mean that the costs of formulation and production outweigh the benefits, leading to lower scores on the dependent variables. A minimal sketch of such a cost-aware repair policy follows.
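To make this cost–benefit reasoning concrete, the sketch below (Python; all names and thresholds are hypothetical) shows one way a chatbot could ration its repair attempts: it initiates repair only when its confidence in a parsed value is low and the trouble source is important enough to justify the user's formulation and production costs. This illustrates the argument above; it is not a description of the stimulus chatbot.

from dataclasses import dataclass

@dataclass
class SlotGuess:
    name: str          # e.g. "date" (hypothetical slot)
    value: str         # the parser's best guess
    confidence: float  # confidence in the guess, 0..1
    importance: float  # cost of acting on a wrong value, 0..1

def should_repair(slot: SlotGuess, min_confidence: float = 0.7,
                  min_importance: float = 0.5) -> bool:
    # Repair only when the benefit (avoiding an important misunderstanding)
    # plausibly outweighs the user's cost of re-stating their utterance.
    return slot.confidence < min_confidence and slot.importance > min_importance

guess = SlotGuess("date", "2018-07-04", confidence=0.55, importance=0.9)
if should_repair(guess):
    print(f"Do you mean {guess.value}?")  # cf. the repair turn in Appendix Item 4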

4.3.2 Contextual awareness (CA):


H5 and H6a/b were supported by the data. The chatbot that failed to maintain CA scored significantly lower for anthropomorphism, adoption and recommendation intent. It appears that CA is critical to both perceptions of humanness and behavioural intentions. A single instance of failure to maintain CA was sufficient to cause significant mean differences in the variables of interest.

A chatbot failing to track the context of a conversation over time would violate the Gricean maxim of relevance (Grice, 1975, p. 47). Violating the maxim of relevance is uncommon in interpersonal interaction (Grice, 1975, p. 54) but occurs frequently in human-chatbot interaction (Martin, 2017). Gricean maxims are said to generate implicature, where implicature is what is suggested as opposed to what is expressly stated (Blackburn, 1996). In failing to maintain CA, the chatbot appears to have provided the "implicature" (Grice, 1975) that it has non-human cognition, given that relevance comes naturally to human interlocutors.



Epley, Waytz, and Cacioppo's (2007) three-factor model of anthropomorphism states that antecedents of anthropomorphism reside within the perceiver (sociality and effectance motivation) and the perceived (elicited agent knowledge [EAK]). Given the random assignment of participants in Study Two, it is assumed that participants with either high or low sociality and effectance motivation were equally distributed across the experimental conditions. Therefore, differences in the mean scores for anthropomorphism across conditions can be attributed to the manipulation of EAK. Previous attempts to activate EAK in order to observe a corresponding effect in anthropomorphism have focused on vocal cues in robots (Eyssel, Kuchenbrandt, Hegel, & De Ruiter, 2012). The findings from Study Two suggest that CA in a textual chatbot performs similarly, influencing subsequent judgements of humanness.

H7 and H8 were supported. The chatbots perceived as higher in anthropomorphism received significantly higher scores for adoption and recommendation intent. By manipulating a participant's access to EAK, Study Two supports the notion that there is a causal relationship between anthropomorphism, adoption and recommendation intent. This was confirmed via mediation analysis, in which anthropomorphism was found to fully mediate the relationship between CA and the dependent variables (a sketch of this style of analysis follows Figure 12). Although CR failed to increase anthropomorphism above the scores for the control condition, the pattern between the three variables across the three conditions did not change, as shown in Figure 12.

[Figure: chart of mean scores (y-axis approximately 3 to 5.5) for Anthropomorphism, Adoption and Recommendation, plotted for Condition 1 (Control), Condition 2 (Conversational Repair) and Condition 3 (Contextual Awareness).]

Figure 12. Graphical representation of mean scores for anthropomorphism, adoption and recommendation intent across experimental conditions
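For illustration, the sketch below estimates an indirect effect of this kind with a percentile bootstrap, in the spirit of Hayes (2009). The data, variable names and condition coding are simulated placeholders; it demonstrates the logic of the mediation test rather than the exact procedure used in this thesis.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 120  # simulated sample
X = rng.integers(0, 2, n).astype(float)  # hypothetical coding: 1 = CA condition, 0 = control
M = 4.0 + 0.8 * X + rng.normal(0, 1, n)  # anthropomorphism (mediator)
Y = 3.5 + 0.6 * M + rng.normal(0, 1, n)  # adoption intent (outcome)

def indirect_effect(x, m, y):
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]                        # path a: X -> M
    b = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit().params[1]  # path b: M -> Y | X
    return a * b

boot = [indirect_effect(X[i], M[i], Y[i])
        for i in (rng.integers(0, n, n) for _ in range(5000))]
print(np.percentile(boot, [2.5, 97.5]))  # percentile bootstrap CI for the indirect effect a*b
# 'Full mediation' additionally requires the direct X -> Y path to be
# non-significant once M is controlled for.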



Chapter 5: Conclusions

Consumers view self-service technologies as cold, impersonal and lacking customisation (Hsuan-Hsuan & Ko-Hsin, 2015). This is because self-service technologies (SST) replicate the functionality of interpersonal service delivery but lack the capacity to engage consumers socially (van Doorn et al., 2017). Consequently, SST usage has been linked to reductions in both social attachment to the firm and customer loyalty (Selnes & Hansen, 2001). Chatbots are an SST believed to be capable of remedying these issues (van Doorn et al., 2017).

This thesis focused upon the intersection of two divergent theoretical positions. SSTs, by definition, lack human employee involvement (Curran et al., 2003). Conversely,
substantial literature is predicated on the notion that machines and computers can be
perceived as human-like (Epley et al., 2007; Moon & Nass, 1996; Weiss & Bartneck, 2015).
The objective of this thesis was to contribute to the SST literature through the partial
reconciliation of these competing views. What follows is a brief summary of findings as they
pertain to each of the research questions.

RQ1: What is the relationship between anthropomorphism, adoption and recommendation intent for chatbots performing customer service?

Both studies presented provide evidence to support the claim of a significant positive
correlation between anthropomorphic perceptions and consumers' behavioural intentions to
adopt and recommend a customer service chatbot. This relationship was supported by the data
across a range of customer service scenarios, including the purchase of flowers, the purchase
of tickets to a sporting event and the reservation of hotel accommodation. Furthermore, by
manipulating access to EAK, Study Two supports claims of a significant causal relationship
between the variables of interest. The importance of anthropomorphism in (a) the study of
artificial conversational entities and (b) SSTs theorised as capable of activating perceptions of
automated social presence has been demonstrated.

RQ2: What strategies can a chatbot employ in order to affect anthropomorphism?

RQ3: Do these strategies have a subsequent effect on adoption and recommendation intent?

Study Two found evidence to support the claim that chatbot behaviour (employing a particular linguistic stratagem, namely contextual awareness) can affect anthropomorphism by activating a user's access to elicited agent knowledge. The antecedents of anthropomorphism in this context were linked to features of the chatbot rather than to individual difference variables. This supports the notion that anthropomorphism is of practical value: marketing practitioners and chatbot programmers can take concrete steps to maximise anthropomorphic perceptions of an SST. Given the causal nature of the relationship between anthropomorphism and consumers' behavioural intentions, adjusting access to EAK is likely to have a corresponding impact on adoption and recommendation intent.

5.1 IMPLICATIONS

These findings have a number of theoretical and managerial implications, presented in the following sections.

5.1.1 Theoretical implications

1. This research project provides academic research with a newly proposed variable with
which to assess SSTs. Anthropomorphism as measured by the Godspeed
Questionnaire (Bartneck et al., 2009) captures a novel dimension of an SST. The
addition of anthropomorphism to existing adoption models such as TAM (Davis,
1989), the TPB (Ajzen, 1991), UTAUT (Venkatesh et al., 2003) or DOI (Rogers,
2003) may improve predictive capabilities, given anthropomorphism was shown to
outperform well-established variables, such as perceived ease of use (Davis, 1989),
technological anxiety (Meuter et al., 2003) and the need for human interaction
(Dabholkar, 1996b).
2. The thesis lends support to EAK as forming part of Epley et al.'s (2007) three-factor conceptualisation of anthropomorphism. In particular, the second study presented here is the first to (a) examine the concept with a text-only chatbot as opposed to a robot and (b) demonstrate that contextual awareness is capable of activating access to EAK.

5.1.2 Methodological implications

1. The relationships of interest were statistically significant, regardless of the means with
which the stimulus was administered. Participants having direct interaction with a live
chatbot on Facebook (Study One) reported similar perceptions to participants viewing
an animation of human-chatbot interaction (Study Two). Researchers aiming to
maximise ecological validity may wish to replicate the method outlined in Study One.
Researchers who wish to control for typing speed or have concerns regarding participant privacy may consider the method presented in Study Two.
2. Providing participants with a script to guide their interaction with a chatbot (Study One) is unique to the literature. This method, unlike techniques such as the Wizard of Oz (Steinfeld, Jenkins, & Scassellati, 2009), exposes participants to the stimulus material in a consistent way.
3. As discussed in Chapter Two, the anthropomorphism instrument was taken from the
more mature social robotics literature. The performance of the anthropomorphism
instrument in this context suggests additional instruments from the social robotics
field could be repurposed to further develop a theoretical understanding of human-
chatbot interaction. For example, the Godspeed Questionnaire (Bartneck et al., 2009)
includes validated instruments for measuring perceived intelligence, likeability and
animacy.

5.1.3 Managerial implications

The findings presented have several implications for firms seeking to develop chatbots to
perform in customer service roles.

1. Chatbots should be designed to provide humanlike cues in order to maximise anthropomorphism and subsequent adoption and recommendation intent. This can be achieved by having the chatbot maintain contextual awareness (a minimal sketch of this idea follows this list). Previous research has demonstrated that the use of an avatar (Blascovich et al., 2002) with a gender (Nass et al., 1997) and cultural background (Marino, 2014) will also increase perceived humanness. This thesis empirically connects this perceived humanness with advantages to the firm via consumers' behavioural intentions.

2. In contrast to the findings of Nilsson (2007), Meuter et al. (2005) and Ding et al. (2007), the demographic variables measured (age, gender, race and education) showed no relationship with adoption and recommendation intent. This suggests that chatbots may be an acceptable means of providing customer service to a wide variety of consumers.
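As referenced in point 1 above, the toy sketch below (Python, loosely modelled on the hotel scenario in Appendix Item 4; all names are invented) illustrates what maintaining contextual awareness means at the implementation level: the dialogue manager carries slots forward across turns instead of treating each message in isolation. It is a deliberately simplified illustration, not the implementation of the stimulus chatbot.

class BeachsideBot:
    # Toy dialogue manager: slots gathered in earlier turns (e.g. the date)
    # persist in self.context and keep later replies relevant.
    def __init__(self):
        self.context = {}  # survives across turns for the whole session

    def handle(self, utterance: str) -> str:
        text = utterance.lower()
        if "independence day" in text:
            self.context["date"] = "2018-07-04"
            return "We have some rooms still available. How many guests in your party?"
        if "adults" in text:
            self.context["guests"] = text
            date = self.context.get("date", "your dates")  # remembered context
            return f"Ok great. For {date} we have the Oceanview room for $209 per night."
        return "Hi. Welcome to the Beachside Hotel. How can I help you today?"

bot = BeachsideBot()
print(bot.handle("Any rooms left for Independence Day?"))
print(bot.handle("4 adults"))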

5.2 LIMITATIONS

As is the case with all research, this project is not without limitations. The following
limitations are grouped according to methodology, measures and sample.

5.2.1 Methodology
First, both studies presented were cross-sectional, collecting data from a representative
sub-set of a population at a single point in time (Carlson, Miller, Heth, Donahoe, & Martin,
2009). Thus, it is unknown at this stage whether (a) the statistically significant relationships
identified would hold over time (novelty effect) and (b) the behavioural intentions expressed
by participants translate into actual behaviours. Using longitudinal data could resolve this
limitation. Second, the studies presented maximised internal validity at the expense of
ecological validity. The studies could be said to lack mundane realism because the
circumstances in which exposure to the stimulus occurred were artificially developed for the
purpose of the research (Aronson, Wilson, & Brewer, 1998). Conditions encountered by
consumers in a natural setting may include distractions or time pressures, which could affect
results. Third, the stimulus in both studies was either (a) delivered by or (b) modelled upon a
chatbot performing within the Facebook instant messaging system. Chatbots can be deployed
to any website. Therefore, caution should be exercised in generalising findings beyond the
Facebook ecosystem.

5.2.2 Measures
Fourth, the way in which anthropomorphism was measured in Study One could be
considered a limitation. As discussed in Study One: Method, anthropomorphism was
measured using a semantic differential scale. The instrument’s designers had originally
presented it as a Likert scale (Bartneck et al., 2009), which would capture discrete data in
whole numbers. While planning the research design, it was decided to instead measure
anthropomorphism using a slider on a continuous scale. The hope was that this would produce

more accurate data, because a sliding scale would allow participants to indicate their position
anywhere between two values. The survey software used in this study (Qualtrics) does not
include a pre-formatted option for continuous slider type items. As a result, the slider was
coded into Qualtrics using the custom HTML tab via instructions provided by Curdy (2014).
An example of a finished item is provided in Figure 13.

Figure 13. Screenshot of an item from the anthropomorphism scale as presented to participants in Study One.

This modified instrument has three potential limitations. First, the labels above each of
the increments have not been validated. These labels were added because the original
instrument does not include or suggest increment labels, as shown in Figure 14.

Figure 14. The original anthropomorphism instrument as presented by Bartneck et al. (2009).

Second, the original instrument only provides 5 discrete response options for each item.
This study used 7 response options on a continuous scale. Seven response options were used
for consistency because the other instruments in the study also used 7-point scales. It was
hoped that participants would benefit from this uniformity. Finally, the anthropomorphism
items used in this study were formatted so that the blue circle, which a participant drags left or right, was set to appear in the neutral position (Figure 6). Perhaps this encouraged participants
to select a response closer to the centre of the scale. A number of these measurement
limitations were rectified in Study Two, as shown in Figure 15.

Figure 15. Screenshot of an item from the anthropomorphism scale as presented to participants in Study Two.

5.2.3 Sample
Finally, with regard to the participant samples, power analysis suggests that the sample sizes were adequate for the types of analysis employed. Nevertheless, the samples (n = 60 in Study One; n = 190 in Study Two) remain modest, and future studies should recruit larger samples. An illustrative power calculation is sketched below.
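For illustration, the sketch below computes the power of the overall regression F test from the noncentral F distribution, using the fixed-model convention (noncentrality λ = f² × N) adopted by G*Power (Faul et al., 2009). The inputs shown (a 'medium' f² of .15 at the Study Two regression sample size) are illustrative assumptions; the settings of the power analyses actually conducted are not reported here.

from scipy.stats import f as f_dist, ncf

def regression_power(f2: float, n: int, k: int, alpha: float = 0.05) -> float:
    # Power of the overall F test for a k-predictor multiple regression.
    dfn, dfd = k, n - k - 1
    crit = f_dist.ppf(1 - alpha, dfn, dfd)  # critical F under the null
    return ncf.sf(crit, dfn, dfd, f2 * n)   # P(F exceeds the critical value | f^2)

print(regression_power(f2=0.15, n=63, k=4))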

5.3 FUTURE RESEARCH

A number of avenues for future research exist. Future studies may seek to:

1. Identify additional chatbot behaviours capable of activating access to EAK. For instance, previous research from the study of computer-mediated communication lends theoretical support to EAK hypotheses involving chronemics, that is, pauses and response times (Kalman, Scissors, Gill, & Gergle, 2013), or the use of textual paralanguage such as emoji or animated gifs (Luangrath, Peck, & Barger, 2017).
2. Construct a more complete model of consumer perceptions regarding customer service
chatbots by including additional factors from the Godspeed Questionnaire (Bartneck et
al., 2009).
3. Examine the role of anthropomorphism in SSTs designed to assist consumers in
complex service scenarios. The service scenarios used in this thesis (flower delivery,
sporting tickets, hotel accommodation) are examples of experiential services, low in
credence qualities (Mazaheri, Richard, & Laroche, 2012). Future research may wish to
test the relationships identified within this thesis, against service scenarios high in
credence qualities, such as medical or legal advice. As discussed in the literature
review, chatbots are capable of providing triage advice to users with a medical
concern (Burgess, 2017) and legal advice to asylum seekers (Cresci, 2017). Both
medical and legal advice are considered high in credence qualities (Mitra, Reiss, &
Capella, 1999).

Bibliography

AbuShawar, B., & Atwell, E. (2016). Usefulness, localizability, humanness, and language-
benefit: additional evaluation criteria for natural language dialogue systems. International
Journal of Speech Technology, 19(2), 373–383.

Adams, J. (1963). Toward an understanding of inequity. Journal of Abnormal Psychology, 67, 422–436.

Aggarwal, P., & McGill, A. L. (2007). Is that car smiling at me? Schema congruity as a basis
for evaluating anthropomorphized products. Journal of Consumer Research, 34(4), 468–479.

Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.

Allen, P., Bennett, K., & Heritage, B. (2014). SPSS Statistics (22nd ed.). South Melbourne,
Victoria: Cengage Learning Australia.

Arnold, V., Collier, P., Leech, S., & Sutton, S. (2004). The impact of intelligent decision aids on expert and novice decision-makers' judgments. Accounting and Finance, 44(1), 1–26.

Aronson, E., Wilson, T., & Brewer, M. (1998). Experimentation in social psychology. In D.
Gilbert, S. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology (pp. 99–142).
New York, NY: McGraw-Hill.

Axelrod, R., & Hamilton, W. (1981). The evolution of cooperation. Science, 211, 1390–1396.

Babbie, E. (2016). The Practice of Social Research (14th ed.). Boston, MA: Cengage
Learning.

Barrett, J. L., Richert, R. A., & Driesenga, A. (2001). God’s beliefs versus mother’s: The
development of nonhuman agent concepts. Child Development, 72(1), 50–65.

Bartneck, C., Kulic, D., Croft, E., & Zoghbi, S. (2009). Measurement instruments for the
anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of
robots. International Journal of Social Robotics, 1(1), 71–81.

Bazire, M., & Brézillon, P. (2005). Understanding context before using it. In A. Dey, B.
Kokinov, D. Leake, & R. Turner (Eds.), Modeling and Using Context. Berlin: Springer.

Bennett, D. E., & Thompson, P. (2016). Use of anthropomorphic brand mascots for student
motivation and engagement: A promotional case study with Pablo the Penguin at the
University of Portsmouth Library. New Review of Academic Librarianship, 22(2–3), 225–237.

Bitner, M., Ostrom, A., Meuter, M., & Clancy, J. (2002). Implementing successful self-
service technologies. Academy of Management Executive, 16(4), 96–109.

Blascovich, J., Loomis, J., Beall, A., Swinth, K., Hoyt, C., & Bailenson, J. (2002). Immersive
virtual environment technology as a methodological tool for social psychology. Psychological
Inquiry, 13, 103–124.

Blut, M., Wang, C., & Schoefer, K. (2016). Factors influencing the acceptance of self-service
technologies: A Meta-Analysis. Journal of Service Research, 19(4), 396–416.

Bogle, A. (2018). Facebook after Cambridge Analytica: Is this the beginning of the end?
Retrieved April 4, 2018, from http://www.abc.net.au/news/science/2018-03-27/facebook-
after-cambridge-analytica:-what-now/9586604

Bohannon, J. (2011). Human subject research: Social science for pennies. Science, 334(6054),
307.

Buhrmester, M., Kwang, T., & Gosling, S. (2011). Amazon’s Mechanical Turk: A new source
of inexpensive, yet high-quality data? Perspectives on Psychological Science, 6(1), 3–5.

Burger, J., & Sheehy, D. (2012). Individual Differences and Social Influence: A Special Issue
of Social Influence. Milton Park, UK: Taylor & Francis.

Burgess, M. (2017). The NHS is trialling an AI chatbot to answer your medical questions.
Retrieved October 8, 2017, from http://www.wired.co.uk/article/babylon-nhs-chatbot-app

Carberry, S., & De Rosis, F. (2008). Introduction to special issue on "Affective modeling and adaptation." User Modeling and User-Adapted Interaction, 18, 1–9.

Carlson, N., Miller, H., Heth, C., Donahoe, J., & Martin, G. (2009). Psychology: The Science
of Behavior (7th ed.). London, UK: Pearson Education Limited.

Cassell, J., & Bickmore, T. (2003). Negotiated collusion: Modeling social language and its
relationship effects in intelligent agents. User Modeling and Adapted Interaction, 13(1), 89–
132.

Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon Mechanical
Turk workers: Consequences and solutions for behavioral researchers. Behavior Research
Methods, 46(1), 112–130.

Chandler, J., & Shapiro, D. (2016). Conducting clinical research using crowdsourced
convenience samples. Annual Review of Clinical Psychology, 12(1), 53–81.

Cheung, C. (2008). The impact of electronic word‐of‐mouth: The adoption of online opinions
in online customer communities. Internet Research, 18(3), 229–247.

Chiou, J., & Shen, C. (2012). The antecedents of online financial service adoption the impact
of physical banking services on Internet banking acceptance. Behaviour & Information
Technology, 31(9), 859–871.

Chowdhury, G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37, 51–89.

Clark, H., & Brennan, S. (1991). Grounding in Communication. In L. Resnick, J. Levine, &
S. Teasley (Eds.), Perspectives on socially shared cognition. (pp. 127–149). Washington:
APA Books.

Clark, H., & Schaefer, E. (1989). Contributing to discourse. Cognitive Science, 13, 259–294.

Collier, J. E., & Kimes, S. E. (2012). Only if it is convenient: Understanding how convenience influences self-service technology evaluation. Journal of Service Research, 16(1), 39–51.

Constine, J. (2017). Facebook will launch group chatbots at F8. Retrieved March 28, 2018,
from https://techcrunch.com/2017/03/29/facebook-group-bots/

Cooper, R., & Zmud, R. (1990). Information technology implementation research: A technological diffusion approach. Management Science, 36(2), 123–139.

Corti, K., & Gillespie, A. (2016). Co-constructing intersubjectivity with artificial conversational agents: People are more likely to initiate repairs of misunderstandings with agents represented as human. Computers in Human Behavior, 58, 431–442.

Costello, G., & Donnellan, B. (2007). The diffusion of WOZ: Expanding the topology of IS
innovations. Journal of Information Technology, 22(1), 79–86.

Cresci, E. (2017). Chatbot that overturned 160,000 parking fines now helping refugees claim
asylum. Retrieved October 8, 2017, from
https://www.theguardian.com/technology/2017/mar/06/chatbot-donotpay-refugees-claim-
asylum-legal-aid

Curdy, B. (2014). How to create semantic differential (EPA) scales using Qualtrics. Retrieved
August 10, 2017, from http://brentcurdy.net/qualtrics-tutorials/scales/

Curran, J., & Meuter, M. (2005). Self-service technology adoption: Comparing three
technologies. Journal of Services Marketing, 19(2), 103–113.

Curran, J., Meuter, M., & Surprenant, C. (2003). Intentions to use self-service technologies: A
confluence of multiple attitudes. Journal of Service Research, 5(3).

Curran, P., Kotrba, L., & Denison, D. (2010). Careless responding in surveys: Applying
traditional techniques to organizational settings. In 25th Annual Conference of the Society for
Industrial/Organizational Psychology. Atlanta, GA.

Dabholkar, P. (1996a). Consumer evaluations of new technology-based self-service options: An investigation of alternative models of service quality. International Journal of Research in Marketing, 13(1), 29–51.

Dabholkar, P. (1996b). Consumer evaluations of new technology-based self-service options: An investigation of alternative models of service quality. International Journal of Research in Marketing, 13(1), 29–51.

Dale, R. (2016). The return of the chatbots. Natural Language Engineering, 22(5), 811–817.

Davis, F. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319–340.

Davis, M. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44(1), 113.

DeVellis, R. (2012). Scale Development: Theory and Applications. Newbury Park, CA: Sage.

Ding, X., Verma, R., & Iqbal, Z. (2007). Self-service technology and online financial service
choice. International Journal of Service Industry Management, 18(3), 246–268.

Duffy, B. R. (2003). Anthropomorphism and the social robot. Robotics and Autonomous
Systems, 42(3–4), 177–190.

Edmondson, A., & McManus, S. (2007). Methodological fit in management field research. Academy of Management Review, 32(4), 1155–1179.

Ehlers, C., Greene-Shortridge, T., Weekley, J., & Zajack, M. (2009). The exploration of
statistical methods in detecting random responding. In 24th Annual Conference of the Society
for Industrial/Organisational Psychology. Atlanta, GA.

Elkins, A., & Derrick, D. (2013). The sound of trust: Voice as a measurement of trust during
interactions with embodied conversational agents. Group Decision and Negotiation, 22(5),
897–913.

Epley, N., Akalis, S., Waytz, A., & Cacioppo, J. (2008). Creating social connection through
inferential reproduction: Loneliness and perceived agency in gadgets, gods, and greyhounds.
Psychological Science, 19, 114–120.

Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of
anthropomorphism. Psychological Review, 114(4), 864–886.

Eriksson, K., & Nilsson, D. (2007). Determinants of the continued use of self-service
technology: The case of internet banking. Technovation, 27, 159–167.

Eunson, B. (2015). Communicating in the 21st Century (4th ed.). Milton, QLD: John Wiley &
Sons.

Eyssel, F., Hegel, F., Horstmann, G., & Wagner, C. (2010). Anthropomorphic inferences from
emotional nonverbal cues: A case study. In Proceedings of the IEEE international workshop
on robot and human interactive communication (pp. 646–651).

Eyssel, F., Kuchenbrandt, D., Hegel, F., & De Ruiter, L. (2012). Activating elicited agent
knowledge: How robot and user features shape the perception of social robots. Proceedings -
IEEE International Workshop on Robot and Human Interactive Communication, 851–857.

Faber, P., & León-Araúz, P. (2016). Specialized knowledge representation and the parameterization of context. Frontiers in Psychology, 7(196). https://doi.org/10.3389/fpsyg.2016.00196

Fan, A., Wu, L., & Mattila, A. S. (2016). Does anthropomorphism influence customers’
switching intentions in the self-service technology failure context? Journal of Services
Marketing, 30(7), 713–723.

Faul, F., Erdfelder, E., Buchner, A., & Lang, A. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160.

Feingold, A. (1992). Gender differences in mate selection preferences: A test of the parental
investment model. Psychological Bulletin, 112(1), 125–139.

Field, A. (2013). Discovering Statistics using IBM SPSS Statistics: And sex and drugs and
rock “n” roll (4th ed.). London, UK: Sage.

Fishbein, M., & Ajzen, I. (1975). Belief, Attitude, Intention & Behavior: An Introduction to
Theory & Research. Reading, MA: Addison Wesley.

Fiske, S. T., Cuddy, A. J. C., & Glick, P. (2007). Universal dimensions of social cognition:
warmth and competence. Trends in Cognitive Sciences, 11(2), 77–83.

Flowerdew, J. (2014). Discourse in Context: Contemporary Applied Linguistics Volume 3. London: Bloomsbury.

Fromkin, V. (1971). The non-anomalous nature of anomalous utterances. Language, 47(1), 27–52.

Gartner. (2016). Top Strategic Predictions for 2017 and Beyond: Surviving the Storm Winds
of Digital Disruption. Retrieved from
https://www.gartner.com/binaries/content/assets/events/keywords/cio/ciode5/top_strategic_pr
edictions_fo_315910.pdf

Goodman, J., Cryder, C., & Cheema, A. (2013). Data collection in a flat world: The strengths and weaknesses of Mechanical Turk samples. Journal of Behavioral Decision Making, 26(3), 213–224.

Goodman, J., & Paolacci, G. (2017). Crowdsourcing consumer research. Journal of Consumer
Research, 44(1), 196–210.

Grice, H. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Syntax and
Semantics 3: Speech Acts (pp. 41–58). New York, NY: Academic Press.

Grosz, B., & Sidner, C. (1986). Attention, intentions, and the structure of discourse. Journal
of Computational Linguistics, 12(3), 175–204.

Guzman, I., & Pathania, A. (2016). Chatbots in Customer Service [White Paper]. Retrieved
14th April, 2017, from Accenture: https://www.accenture.com/t00010101T000000__w__/br-
pt/_acnmedia/PDF-45/Accenture-Chatbots-Customer-Service.pdf

Hayes, A. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new
millennium. Communication Monographs, 76(4), 408–420.

Hill, N., & Alexander, J. (2006). The Handbook of Customer Satisfaction and Loyalty
Measurement (3rd ed.). London, UK: Routledge.

Ho, C. C., & MacDorman, K. F. (2017). Measuring the uncanny valley effect: refinements to
indices for perceived humanness, attractiveness, and eeriness. International Journal of Social
Robotics, 9(1), 129–139.

Hoefer, C. (2016). Causal Determinism. Retrieved April 4, 2018, from https://plato.stanford.edu/entries/determinism-causal/

Hoehle, H., Scornavacca, E., & Huff, S. (2012). Three decades of research on consumer
adoption and utilization of electronic banking channels: A literature analysis. Decision
Support Systems, 54(1), 122–132.

Holden, M. (2017). World’s first ATM machine turns to gold on 50th birthday. Retrieved July
27, 2017, from https://www.reuters.com/article/us-atm-anniversary/worlds-first-atm-machine-
turns-to-gold-on-50th-birthday-idUSKBN19I166

Horton, J., Rand, D., & Zeckhauser, R. (2010). The Online Laboratory: Conducting Experiments in a Real Labor Market (NBER No. 15961).

Hsuan-Hsuan, K., & Ko-Hsin, H. (2015). Effects of inviting customers to share responsibility
in the context of impersonal service. Journal of Service Theory and Practice, 25(3), 267–284.

Huang, M., & Rust, R. (2013). IT-related service: A multidisciplinary perspective. Journal of
Service Research, 16(3), 251–258.

Hung, C., Yen, D., & Ou, C. (2012). An empirical study of the relationship between a self-
service technology investment and firm financial performance. Journal of Engineering &
Technology Management, 29(1), 62–70.

Huntley, J. (2006). Conceptualization and measurement of relationship quality: Linking relationship quality to actual sales and recommendation intention. Industrial Marketing Management, 35, 703–714.

Hutchby, I., & Wooffitt, R. (2008). Conversation Analysis (2nd ed.). Boston, MA: Polity.

Jenkins, M., Churchill, R., Cox, S., & Smith, D. (2007). Analysis of user interaction with
service oriented chatbot systems. Human-Computer Interaction, 4552, 76–83.

Johnson, K. (2017). Facebook Messenger hits 100,000 bots. Retrieved November 22, 2017,
from https://venturebeat.com/2017/04/18/facebook-messenger-hits-100000-bots/

Kalman, Y. M., Scissors, L. E., Gill, A. J., & Gergle, D. (2013). Online chronemics convey
social information. Computers in Human Behavior, 29(3), 1260–1269.

Kaplan, F., & Hafner, V. (2006). The challenges of joint attention. Interaction Studies, 7(2),
135–169.

Kiesler, S., Powers, A., Fussell, S., & Torrey, C. (2008). Anthropomorphic interactions with a
robot and robot-like agent. Social Cognition, 26(2), 169–181.

Kim, S., & McGill, A. L. (2011). Gaming with Mr. Slot or gaming the slot machine? Power, anthropomorphism, and risk perception. Journal of Consumer Research, 38(1), 94–107.

Kline, P. (2000). The Handbook of Psychological Testing (2nd ed.). London, UK: Routledge.

Knijnenburg, B., & Willemsen, M. (2016). Inferring capabilities of intelligent agents from
their external traits. ACM Transactions on Interactive Intelligent Systems, 6(4), 1–25.

Lallmahamood, M. (2007). An examination of individual's perceived security and privacy of the Internet in Malaysia and the influence of this on their intention to use e-commerce: Using an extension of the technology acceptance model. Journal of Internet Banking and Commerce, 12(3), 1–26.

Lane, J., Wellman, H., & Evans, E. (2010). Children’s understanding of ordinary and
extraordinary minds. Child Development, 81(5), 1475–1489.

Lee, H.-J. (2014). Consumer-to-store employee and consumer-to-self-service technology (SST) interactions in a retail setting. International Journal of Retail and Distribution Management, 43(8), 676–692.

Lee, K., Lee, B., & Oh, W. (2015). Thumbs up, sales up? The contingent effect of Facebook
likes on sales performance in social commerce. Journal of Management Information Systems,
32(4), 109–143.

Lee, W., Castellanos, C., & Chris Choi, H. S. (2012). The effect of technology readiness on
customers’ attitudes toward self-service technology and its adoption; The empirical study of
U.S. airline self-service check-in kiosks. Journal of Travel & Tourism Marketing, 29(8), 731–
743.

Levine, G., & Parkinson, S. (2014). Experimental Methods in Psychology. New York, NY:
Psychology Press.

Litman, L., Robinson, J., & Rosenzweig, C. (2015). The relationship between motivation,
monetary compensation, and data quality among US- and India-based workers on Mechanical
Turk. Behavior Research Methods, 47(2), 519–528.

Look Who Else Uses Chatfuel. (2018). Retrieved March 19, 2018, from https://chatfuel.com/

Look who is using Bot Society. (2018). Retrieved April 4, 2018, from https://botsociety.io/

Luangrath, A. W., Peck, J., & Barger, V. A. (2017). Textual paralanguage and its implications
for marketing communications. Journal of Consumer Psychology, 27(1), 98–107.

Luo, J. T., McGoldrick, P., Beatty, S., & Keeling, K. A. (2006). On‐screen characters: Their
design and influence on consumer trust. Journal of Services Marketing, 20(2), 112–124.

MacDorman, K. (2006). Subjective ratings of robot video clips for human likeness,
familiarity, and eeriness: An exploration of the uncanny valley. In ICCS/CogSci-2006 long
symposium: toward social mechanisms of android science. Vancouver.

MacDorman, K., & Ishiguro, H. (2006). The uncanny advantage of using androids in social and cognitive science research. Interaction Studies, 7(3), 297–337.

Marino, M. (2014). The racial formation of chatbots. Comparative Literature and Culture,
16(5).

Marschner, L., Pannasch, S., Schulz, J., & Graupner, S.-T. (2015). Social communication
with virtual agents: The effects of body and gaze direction on attention and emotional
responding in human observers. International Journal of Psychology, 97(2), 85–92.

Mauldin, M. (1994). ChatterBots, TinyMuds, and the Turing Test: Entering the Loebner Prize
competition. In Proceedings of the Eleventh National Conference on Artificial Intelligence.
AAAI Press.

Mazaheri, E., Richard, M., & Laroche, M. (2012). The role of emotions in online consumer
behavior: a comparison of search, experience, and credence services. Journal of Services
Marketing, 26(7), 535–550.

Meade, A., & Craig, S. (2011). Identifying Careless Responses in Survey Data. In 26th
Annual Meeting of the Society for Industrial and Organizational Psychology. Chicago, IL.

Meis, J., & Kashima, Y. (2017). Signage as a tool for behavioral change: Direct and indirect
routes to understanding the meaning of a sign. PLoS ONE, 12(8).

Mellor, H. (1995). The Facts of Causation. London, UK: Routledge.

Meuter, M. L., Ostrom, A. L., Bitner, M. J., & Roundtree, R. (2003). The influence of
technology anxiety on consumer use and experiences with self-service technologies. Journal
of Business Research, 56(11), 899–906.

Meuter, M., Ostrom, A., Roundtree, R., & Bitner, M. (2000). Self-service technologies: Understanding customer satisfaction with technology-based service encounters. Journal of Marketing, 64(3), 50–64.

Miller, R. (2017). Microsoft makes Azure Bot Service generally available for developers.
Retrieved March 28, 2018, from https://techcrunch.com/2017/12/13/microsoft-makes-azure-
bot-service-generally-available/

Mitra, K., Reiss, M., & Capella, L. (1999). An examination of perceived risk, information
search and behavioral intentions in search, experience and credence services. Journal of
Services Marketing, 13(3), 208–228.

Miwa, K., & Terai, H. (2012). Impact of two types of partner, perceived or actual, in human-
human and human-agent interaction. Computers in Human Behavior, 28(4), 1286–1297.

Mohorko, A., & Hlebec, V. (2016). Degree of cognitive interviewer involvement in questionnaire pretesting on trending survey modes. Computers in Human Behavior, 62, 79–89.

Moon, Y., & Nass, C. (1996). How “real” are computer personalities? Psychological
responses to personality types in human-computer interaction. Communication Research,
23(6), 651–674.

Moore, G. C., & Benbasat, I. (1991). Development of an instrument to measure the perceptions of adopting an information technology innovation. Information Systems Research, 2(3), 192–222.

Moore, S. (2018). Gartner Says 25 Percent of Customer Service Operations Will Use Virtual
Customer Assistants by 2020. Retrieved February 20, 2018, from
https://www.gartner.com/newsroom/id/3858564

Morrissey, K., & Kirakowski, J. (2013). "Realness" in chatbots: Establishing quantifiable criteria. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8007 LNCS(PART 4), 87–96.

Most popular mobile messaging apps worldwide as of January 2018, based on number of
monthly active users (in millions). (2018). Retrieved April 4, 2018, from
https://www.statista.com/statistics/258749/most-popular-global-mobile-messenger-apps/

Nass, C. (2004). Etiquette equality: Exhibitions and expectations of computer politeness. Communications of the ACM, 47(4), 35–37.

Nass, C., & Lee, K. (2001). Does computer-synthesized speech manifest personality?
Experimental tests of recognition, similarity-attraction, and consistency-attraction. Journal of
Experimental Psychology: Applied, 7(3), 171–181.

Nass, C., & Moon, Y. (2000). Machines and mindlessness: Social responses to computers.
Journal of Social Issues, 56(1), 81–103.

Nass, C., Moon, Y., & Green, N. (1997). Are machines gender neutral? Gender-stereotypic
responses to computers with voices. Journal of Applied Social Psychology, 27(10), 864–876.

Neuman, W. (2014). Social Research Methods: Qualitative & Quantitative Approaches (7th
ed.). Essex, UK: Pearson Education Limited.

Niculescu, A., van Dijk, B., Nijholt, A., Li, H., & See, S. (2013). Making Social Robots More
Attractive: The Effects of Voice Pitch, Humor and Empathy. International Journal of Social
Robotics, 5(2), 171–191.

Nilsson, D. (2007). A cross‐cultural comparison of self‐service technology use. European Journal of Marketing, 41(3/4), 367–381.

Nowak, K. L., & Biocca, F. (2003). The effect of the agency and anthropomorphism on users’
sense of telepresence, copresence, and social presence in virtual environments. Presence-
Teleoperators and Virtual Environments, 12(5), 481–494.

Ondrej, B. (2018). An attitude towards an artificial soul? Responses to the “Nazi Chatbot.”
Philosophical Investigations, 41(1), 42–69.

Osgood, J., Ward, M., & Meade, A. (2015). The effects of environmental distractions on
careless responding in online surveys. In Annual Meeting of the Association for Psychological
Science. New York, NY.

Overby, E. (2008). Process virtualization theory and the impact of information technology.
Organization Science, 19(2).

Pallant, J. (2013). SPSS Survival Manual. London, UK: McGraw Hill.

Parasuraman, A., & Colby, C. L. (2015). An updated and streamlined technology readiness
index: TRI 2.0. Journal of Service Research, 18(1), 59–74.

Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 30(3), 286–297.

Payne, C. R., Hyman, M. R., Niculescu, M., & Huhmann, B. A. (2013). Anthropomorphic
responses to new-to-market logos. Journal of Marketing Management, 29(1–2), 122–140.

Perez, S. (2017). Twitter launches a new enterprise API to power customer service and
chatbots. Retrieved March 28, 2018, from https://techcrunch.com/2017/12/19/twitter-
launches-a-new-enterprise-api-to-power-customer-service-and-chatbots/

Perez, S. (2018). Google’s chatbot analytics platform Chatbase launches to public. Retrieved
March 28, 2018, from https://techcrunch.com/2017/11/16/googles-chatbot-analytics-platform-
chatbase-launches-to-public/

Quarteroni, S., & Manandhar, S. (2007). A chatbot-based interactive question answering system. In Decalog 2007: Proceedings of the 11th Workshop on the Semantics and Pragmatics of Dialogue (pp. 83–90).

Revilla, M., & Ochoa, C. (2015). What are the links in a web survey between response time,
quality and auto-evaluation of the efforts done? Social Science Computer Review, 33(1), 97–
114.

Robinson, J. (2006). Managing trouble responsibility and relationships during conversational repair. Communication Monographs, 73(2), 137–161.

Rogers, E. (2003). Diffusion of Innovations (5th ed.). Simon and Schuster.

Rushton, A., & Carson, D. (1985). The marketing of services: Managing the intangibles.
European Journal of Marketing, 23(8), 19–40.

Safdar, H. (2018). Consumers' online information adoption behavior: Motives and antecedents of electronic word of mouth communications. Computers in Human Behavior, 80, 22–32.

Salem, M., Eyssel, F., Rohlfing, K., Kopp, S., & Joublin, F. (2013). To err is human(-like):
Effects of robot gesture on perceived anthropomorphism and likability. International Journal
of Social Robotics, 5(3), 313–323.

Salkind, N. (2010). Encyclopedia of Research Design. Thousand Oaks, CA: Sage Publications.

Schegloff, E. (1992). Repair after next turn: The last structurally provided defense of intersubjectivity in conversation. American Journal of Sociology, 97(5), 1295–1345.

Schegloff, E. (2000). When “others” initiate repair. Applied Linguistics, 21(2), 205–243.

Scheutz, M., Schermerhorn, P., & Cantrell, R. (2011). Toward human-like task-based
dialogue processing for HRI. AI Magazine, 32(4), 77–84.

See who’s using Dialogflow. (2018). Retrieved April 1, 2018, from https://dialogflow.com/

Selnes, F., & Hansen, H. (2001). The potential hazard of self-service in developing customer
loyalty. Journal of Service Research, 4(2), 79–90.

Suri, S., & Watts, D. (2011). Cooperation and contagion in web-based, networked public goods experiments. PLoS ONE, 6(3), e16836.

Slobin, D. (2002). Language evolution, acquisition, diachrony: Probing the parallels. In T. Givon & B. Malle (Eds.), The Evolution of Language out of Pre-language (p. 375). John Benjamins Publishing.

Soto, C., & John, O. (2017). Short and extra-short forms of the Big Five Inventory–2: The
BFI-2-S and BFI-2-XS. Journal of Research in Personality, 68, 69–81.

Srivastava, K. (2017, May). Chatty chatbots and the "time to frustration." CIO. Retrieved from https://gateway.library.qut.edu.au/login?url=https://search.proquest.com/docview/1896760744?accountid=13380

Stahl, G. (2015). Conceptualizing the intersubjective group. International Journal of Computer-Supported Collaborative Learning, 10(3), 209–217.

Steinfeld, A., Fong, T., Kaber, D., Lewis, M., Scholtz, J., Schultz, A., & Goodrich, M. (2006).
Common metrics for human-robot interaction. In Proceedings of the 1st ACM
SIGCHI/SIGART Conference on Human- Robot Interaction (pp. 33–40).

Steinfeld, A., Jenkins, O. C., & Scassellati, B. (2009). The Oz of Wizard: Simulating the
human for interaction research. In Proceedings of the 4th ACM/IEEE International
Conference on Human- Robot Interaction (pp. 101–107).

Stevens, C., Pinchbeck, B., Lewis, T., Luerssen, M., Pfitzner, D., Powers, D., Abrahamyan, A., Leung, Y., & Gibert, G. (2016). Mimicry and expressiveness of an ECA in human-agent interaction: familiarity breeds content! Computational Cognitive Science, 2(1), 1–14.

Stewart, N., Ungemach, C., Harris, A., Bartels, D., Newell, B., Paolacci, G., & Chandler, J.
(2015). The average laboratory samples a population of 7,300 Amazon Mechanical Turk
workers. Judgment and Decision Making, 10(5), 479–491.

Tintarev, N., O’donovan, J., & Felfernig, A. (2016). Introduction to the Special Issue on
Human Interaction with Artificial Advice Givers. ACM Transactions on Interactive
Intelligent Systems (TiiS), 6(4), 26.

Turing, A. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.

van Beuningen, J., de Ruyter, K., Wetzels, M., & Streukens, S. (2008). Customer self-efficacy
in technology-based self-service: Assessing between and within person differences. Journal of
Service Research, 11(4), 407–428.

van Doorn, J., Mende, M., Noble, S. M., Hulland, J., Ostrom, A. L., Grewal, D., & Petersen,
J. A. (2017). Domo arigato Mr. Roboto. Journal of Service Research, 20(1), 43–58.

Van Sonderen, E., Sanderman, R., & Coyne, J. (2013). Ineffectiveness of reverse wording of
questionnaire items: Let's learn from cows in the rain. PLoS ONE, 8(7).

Venkatesh, V., Morris, M. G., Davis, G. B., & Davis, F. D. (2003). User acceptance of information technology: Toward a unified view. MIS Quarterly, 27(3), 425–478.

Walker, R., Craig-Lees, M., Hecker, R., & Francis, H. (2002). Technology-enabled service
delivery: An investigation of reasons affecting customer adoption and rejection. International
Journal of Service Industry Management, 13(1), 91–106.

Wang, C., Harris, J., & Patterson, P. (2013). The roles of habit, self-efficacy, and satisfaction
in driving continued use of self-service technologies: A longitudinal study. Journal of Service
Research, 16(3), 400–414.

Waytz, A., Morewedge, C. K., Epley, N., Monteleone, G., Gao, J.-H., & Cacioppo, J. T.
(2010). Making sense by making sentient: Effectance motivation increases
anthropomorphism. Journal of Personality and Social Psychology, 99(3), 410–435.

Web of Science. (2018). Retrieved March 25, 2018, from https://apps-webofknowledge-com

Weigold, A., Weigold, I. K., & Russell, E. J. (2013). Examination of the equivalence of self-
report survey-based paper-and-pencil and internet data collection methods. Psychological
Methods, 18(1), 53–70.

Weijters, B., Rangarajan, D., Falk, T., & Schillewaert, N. (2007). Determinants and outcomes
of customers’ use of self-service technology in a retail setting. Journal of Service Research,
10(1), 3–21.

Weiss, A., & Bartneck, C. (2015). Meta analysis of the usage of the Godspeed Questionnaire Series. In Proceedings - IEEE International Workshop on Robot and Human Interactive Communication (pp. 381–388).

Worswick, S. (2018). Mitsuku. Retrieved April 3, 2017, from https://www.pandorabots.com/mitsuku/

Yang, J., & Klassen, K. (2008). How financial markets reflect the benefits of self‐service
technologies. Journal of Enterprise Information Management, 21(5), 448–467.

Yus, F. (2006). Relevance theory. In K. Brown (Ed.), Encyclopedia of Language and
Linguistics (2nd ed., pp. 512–518). Amsterdam: Elsevier.

Zeithaml, V., Berry, L., & Parasuraman, A. (1996). The behavioral consequences of service
quality. Journal of Marketing, 60(2), 31–46.

Appendices

Appendix Item 1
Definitions for the independent variables from the extant literature which have been
shown to predict adoption intent. Definitions taken from Blut et al., (2016) and Hoehle et
al., (2012).

Perceived ease of use (Davis, 1989): The degree to which a user would find the use of a
technology to be free from effort.

Perceived usefulness (Davis, 1989): The subjective probability that using a technology would
improve the way a user could complete a given task.

Performance expectancy (Venkatesh et al., 2003): The degree to which an individual believes
that using the system will help him or her attain gains in job performance.

Effort expectancy (Venkatesh et al., 2003): The degree or ease associated with the use of the
system.

Social influence (Venkatesh et al., 2003): The degree to which an individual perceives that
important others believe he or she should use the new system.

Facilitating conditions (Venkatesh et al., 2003): Objective factors in the environment that
observers agree make an act easy to accomplish.

Relative advantage (Rogers, 1983, 2003): The degree to which an innovation is perceived as
being better than its precursor.

Compatibility (Rogers, 1983, 2003): The degree to which an innovation is perceived as being
consistent with existing values, needs and experiences of potential adopters.

Complexity (Rogers, 1983, 2003): The degree to which an innovation is perceived as being
difficult to use.

Trialability (Rogers, 1983, 2003): The degree to which an innovation may be experimented with before adoption.

Observability (Rogers, 1983, 2003): The degree to which the results of an innovation are
observable to others.

Risk (Walker et al., 2002): Customer concerns about security, system failure, reliability and
other personal, psychological or financial risks associated with a technology.

Technology anxiety (Meuter et al., 2003): The degree of an individual's apprehension, or even fear, when he or she is faced with the possibility of using technology.

Image (Moore & Benbasat, 1991): The degree to which an individual perceives that use of an
innovation will enhance his or her status in his or her social system.

Demonstrability (Moore & Benbasat, 1991): The degree to which an individual believes that
the result of using a system are tangible, observable and communicable.

Voluntariness (Moore & Benbasat, 1991): The degree to which use of the innovation is
perceived as being voluntary, or of free will.

Technology readiness (Parasuraman & Colby, 2015): People’s propensity to embrace new
technologies to accomplish goals in home life and at work.

Habit (Venkatesh et al., 2003, Limayem, Hirt, & Cheung, 2007): The extent to which people
tend to carry out behavior (using SSTs) automatically because of learning.

Fun (Dabholkar, 1996): The extent to which the activity of using a specific system is
perceived to be enjoyable in its own right, aside from any performance consequences
resulting from system use.

Privacy (Lallmahamood, 2007): The claim of individuals, groups or institutions to determine when and to what extent information about them is communicated to others.

Subjective norms (Venkatesh et al., 2003): A person’s perception that most people who are
important to them think that they should or should not perform the behavior in question.

Need for human interaction (Dabholkar, 1996): The desire to retain personal contact with
others (particularly frontline service employees) during a service encounter.

Appendix Item 2
Transcript of the stimulus used in Study 1. Participants were asked to type the ‘human’
text into the chatbot input window. The chatbot responded with the text in the ‘chatbot’
column. Note: All participants completed both scenarios.

Flower Delivery Scenario

Human I forgot Mother’s Day. I need flowers ASAP
Chatbot Oh no. Ok, what is your budget?
Human $100, but I need them to arrive tomorrow
Chatbot Too easy. What about some hot pink Gerberas and candy pink
Asiatic Lilies. (image of flowers)
Human Perfect
Chatbot What message do you want in the card?
Human I love you Mom and I always will. Happy Mother’s Day
Chatbot Ok. Ill prepare them this afternoon to deliver first thing tomorrow.
The address?
Human 1234 Petal Street, Grassland. Thank you so much. You’re the best
Chatbot So glad we could be of service. Talk soon.

Sports Ticket Purchase Scenario

Human I need two tickets to the football game
Chatbot Do you want the good seats or the best seats?
Human The best ones. I am bringing my boss
Chatbot Ok. Smart move. Are you trying to get a promotion?
Human I need one so I can keep buying things off you ShopBot. Thanks
for your help.
Chatbot You know I struggle with detecting sarcasm
Human That’s ok bot. I still love you.
Chatbot Really glad I could help. Do you like classical music? I have some
good tickets to the orchestra for Saturday night.
Human No. I hate classical music.
Chatbot Sorry. I’ll let you get back to it. Just message me if you need me.

Appendix Item 3
Items from the instruments used in Study One.

Anthropomorphism: “Please rate your impression of the chatbot on these scales by moving
the slider left or right to correspond with the words above the slider. Some of these items may
seem repetitive, please consider them all. Note: The slider does not have to sit directly below
a response option. You can move it to sit between 2 responses if that best reflects your
opinion”. Slider was anchored with the following word pairs: (i) Fake – Natural, (ii)
Machinelike – Human-like, (iii) Artificial – Lifelike, (iv) Unconscious – Conscious, (v)
Communicates Inelegantly – Communicates Elegantly.

Adoption Intent: “I would use a chatbot like this to purchase flowers online / purchase tickets
to a sporting event” (7 response options on Likert scale, anchored at strongly agree – strongly
disagree)

Recommendation Intent: “I would recommend such a chatbot to others” (7 response options on Likert scale, anchored at strongly agree – strongly disagree)

Openness: “Please indicate your agreement or disagreement with the following statements. I
see myself as someone who:” (7 response options on Likert scale, anchored at strongly agree
– strongly disagree): (i) Is curious about many different things, (ii) Values artistic, aesthetic
experiences, (iii) Has an active imagination, (iv) Likes to reflect, play with ideas, (v) Is
original, comes up with new ideas, (vi) Is inventive, (vii) Is ingenious, a deep thinker, (viii) Is
sophisticated in art, music or literature, (ix) Prefers work that is routine, (x) Has few artistic
interests.

Technology Anxiety: “Please indicate your agreement or disagreement with the following
statements. The following statements accurately describe my feelings about technology:” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) When
given the opportunity to use technology, I fear I might damage it in some way, (ii) I am
confident I can learn new technology related skills, (iii) I am able to keep up with important
technological advances, (iv) I feel apprehensive about using technology, (v) I am sure about
my ability to interpret technological output, (vi) Technological terminology sounds like
confusing jargon to me, (vii) I have difficulty understanding most technological matters, (viii)
I have avoided using technology because it is unfamiliar to me, (ix) I hesitate to use
technology for fear of making mistakes I cannot correct.
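
Several of these scales mix positively and negatively worded items, so a composite score would typically reverse-code one direction before averaging. The short Python sketch below is illustrative only: the thesis does not report its scoring procedure, and treating the confidence items (ii), (iii) and (v) of the technology anxiety scale as reverse-scored is an assumption for the example, not something stated above.

    # Hypothetical scoring helper; responses coded 1-7, reverse_items 0-based.
    def score_scale(responses, reverse_items=frozenset()):
        """Average 7-point responses after flipping reverse-worded items."""
        scored = [(8 - r) if i in reverse_items else r
                  for i, r in enumerate(responses)]
        return sum(scored) / len(scored)

    # Example: one respondent's nine technology anxiety answers. Items (ii),
    # (iii) and (v) express confidence, so they are flipped here (assumed).
    anxiety = score_scale([2, 6, 5, 3, 6, 2, 2, 1, 3],
                          reverse_items={1, 2, 4})
    print(round(anxiety, 2))  # 2.22 on the 1-7 anxiety scale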

Frequency of Facebook Use: “In the past month, I have used Facebook Messenger” (7 response
options: never, once or twice per month, once or twice every few weeks, once or twice every
week, once or twice every day, 3+ times per day).

Appendix Item 4
Transcript of the stimulus used in Study Two, illustrating the experimental manipulation across conditions. Each condition is reproduced as a separate transcript below.

Condition A: Control

Chatbot: Hi. Welcome to the Beachside Hotel. How can I help you today?
Human: Any rooms left for Independence Day?
Chatbot: We have some rooms still available. How many guests in your party?
Human: 4 adults
Chatbot: Ok great. We have the Oceanview room for $209 per night. It sleeps 4 adults in 2 separate rooms, each with a private bathroom. (image of room)
Human: internet included?
Chatbot: Yes. Rooms include unlimited wi-fi, air-conditioning, cable TV, a full kitchen, daily house-keeping and all the other extra comfort's afforded to our valued Beachside guests.
Human: Room includes a washer and dryer?
Chatbot: Absolutely. We also offer an onsite laundry service upon request.
Human: reserve the room for me.
Chatbot: Done. Click buttons below for copies of invoice and booking confirmation. We look forward to welcoming you and your guests at the Beachside on the 4th of July 2018.

Condition B: Contextual Awareness

Chatbot: Hi. Welcome to the Beachside Hotel. How can I help you today?
Human: Any rooms left for Independence Day?
Chatbot: We have some rooms still available. How many guests in your party?
Human: 4 adults
Chatbot: Ok great. We have the Oceanview room for $209 per night. It sleeps 4 adults in 2 separate rooms, each with a private bathroom. (image of room)
Human: internet included?
Chatbot: Yes. Rooms include unlimited wi-fi, air-conditioning, cable TV, a full kitchen, daily house-keeping and all the other extra comfort's afforded to our valued Beachside guests.
Human: Room includes a washer and dryer?
Chatbot: Sorry. I thought we were talking about ‘room reservation’ on the ‘4th of July 2018’.
Human: Does the room include a clothes dryer?
Chatbot: Absolutely. We also offer an onsite laundry service upon request.
Human: reserve the room for me.
Chatbot: Done. Click buttons below for copies of invoice and booking confirmation. We look forward to welcoming you and your guests at the Beachside on the 4th of July 2018.

Condition D: Conversation Repair

Chatbot: Hi. Welcome to the Beachside Hotel. How can I help you today?
Human: Any rooms left for Independence Day?
Chatbot: Do you mean 4th of July 2018?
Human: Yes
Chatbot: We have some rooms still available. How many guests in your party?
Human: 4 adults
Chatbot: Ok great. We have the Oceanview room for $209 per night. It sleeps 4 adults in 2 separate rooms, each with a private bathroom. (image of room)
Human: internet included?
Chatbot: Do you mean "wi-fi"?
Human: yes
Chatbot: Yes. Rooms include unlimited wi-fi, air-conditioning, cable TV, a full kitchen, daily house-keeping and all the other extra comfort's afforded to our valued Beachside guests.
Human: Room includes a washer and dryer?
Chatbot: Do you mean "clothes dryer" or "hair dryer"?
Human: Clothes dryer
Chatbot: Absolutely. We also offer an onsite laundry service upon request.
Human: reserve the room for me.
Chatbot: Done. Click buttons below for copies of invoice and booking confirmation. We look forward to welcoming you and your guests at the Beachside on the 4th of July 2018.
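
The conversation repair manipulation can be read as a simple clarification protocol: when a user turn contains a term with more than one plausible referent, the chatbot asks a “Do you mean …?” question before answering. The Python sketch below is a minimal illustration of that logic only; the thesis does not describe the software used to build the stimulus, and every identifier here (AMBIGUOUS_TERMS, repair_prompt) is hypothetical.

    from typing import Optional

    # Hypothetical lookup of ambiguous terms to their candidate senses,
    # taken from the three clarification turns in the transcript above.
    AMBIGUOUS_TERMS = {
        "independence day": ["4th of July 2018"],
        "internet": ["wi-fi"],
        "dryer": ["clothes dryer", "hair dryer"],
    }

    def repair_prompt(user_message: str) -> Optional[str]:
        """Return a clarifying question when the message contains an
        ambiguous term (the conversation repair condition); return None
        so the bot answers directly (as in the control condition)."""
        text = user_message.lower()
        for term, senses in AMBIGUOUS_TERMS.items():
            if term in text:
                quoted = '" or "'.join(senses)
                return f'Do you mean "{quoted}"?'
        return None

    # The "washer and dryer" turn triggers a two-option clarification:
    print(repair_prompt("Room includes a washer and dryer?"))
    # Do you mean "clothes dryer" or "hair dryer"?
    print(repair_prompt("reserve the room for me."))  # None

Under these assumptions, the control condition simply never asks for clarification, while the contextual awareness condition would instead restate the established topic (‘room reservation’ on the ‘4th of July 2018’) when an ambiguous turn arrives.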

Appendix Item 5
Items from the additional instruments used in Study Two.

Perceived Ease of Use: “Please indicate your agreement with the following statements” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) Learning
to use this chatbot would be easy for me, (ii) I would find it difficult to use this chatbot, (iii) It
would be easy for me to become skilful at using this chatbot

Perceived Usefulness: “Please indicate your agreement with the following statements” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) This
chatbot would be useful for booking a hotel room, (ii) Using this chatbot would improve the
way I book a hotel room, (iii) Using this chatbot would make booking a hotel room easier

Need for Human Interaction: “Please indicate your agreement with the following statements.
When purchasing consumer goods or services:” (7 response options on Likert scale, anchored
at strongly agree – strongly disagree): (i) I like interacting with the person who provides the
customer service, (ii) It bothers me to use a machine when I could talk to a person instead,
(iii) Personal attention by the service employee is important to me, (iv) Human contact makes
the process enjoyable to me
