ANTHROPOMORPHISM, ADOPTION
AND WORD OF MOUTH
Non-human entities can be configured to provide customer service. Most consumers are
accustomed to using an ATM or scanning their own groceries. But a new generation of
interfaces has emerged. Chatbots are similar to existing interfaces in that they replicate the
functional component of customer service. But they are new in that they may replicate the
social component as well. This is theorised to occur through the chatbot’s use of language,
generating anthropomorphism within the user.
The grey literature presents chatbots as ready to provide customer service across a wide range
of industries. Firms are rapidly deploying chatbots, driven by the growth of instant messaging
and advances in artificial intelligence. This thesis examines the relationship between a
chatbot’s humanlike cues and consumers’ behavioural intentions. Addressing these issues
will help modernise theory while guiding practitioners to develop better chatbots.
This thesis submits two quantitative studies to support the idea that a chatbot’s perceived
humanness is important to consumers. Preliminary results suggest that the anthropomorphism
of a chatbot leads to increases in adoption and recommendation intent. Furthermore, the
source of the anthropomorphic perceptions appears linked to a chatbot’s use of specific
linguistic stratagems which can be manipulated by practitioners and researchers.
Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables
Statement of Original Authorship
Chapter 1: Introduction
1.1 Defining chatbots
1.2 Chatbots in customer service roles
1.3 Justification for the choice of theoretical lens
1.4 Rationale: summary of the research gap
1.5 Research aims and theoretical model
Chapter 2: Literature Review
2.1 Self service technology
2.2 Justification for the choice of dependent variables
2.3 Justification for the choice of independent variables
2.4 Anthropomorphism
2.5 Measuring anthropomorphism
2.6 Effect of anthropomorphism on adoption and recommendation intent
2.7 Effect of conversational repair and contextual awareness on anthropomorphism
2.8 Summary and implications
Chapter 3: Study One
3.1 Method and design
3.2 Results
3.3 Discussion
Chapter 4: Study Two
4.1 Method and design
4.2 Results
4.3 Discussion
Chapter 5: Conclusions
Bibliography
Appendices
List of Figures
Figure 1. Scholarly works re: “natural language processing”, sorted by year on Web of Science
Figure 2. Scholarly works re: “natural language processing”, sorted by discipline on Web of Science
Figure 7. Screenshot of the Facebook page created to introduce participants to the stimulus chatbot
Figure 8. Screenshot from Bot Society website, illustrating how animations were created
Figure 9. Screenshots from the stimulus animation, illustrating design elements
Figure 10. Anthropomorphism fully mediates the contextual awareness and adoption relationship
Figure 11. Anthropomorphism fully mediates the contextual awareness and recommendation intent relationship
Figure 12. Graphical representation of mean scores for anthropomorphism, adoption and recommendation intent across experimental conditions
Figure 13. Screenshot of an item from the anthropomorphism scale as presented to participants in study one
Figure 14. The original anthropomorphism instrument as presented by Bartneck et al. (2009)
Figure 15. Screenshot of an item from the anthropomorphism scale as presented to participants in study two
List of Tables
Table 1. Independent Variables from the Four Dominant Adoption Models
Table 2. Supplementary Independent Variables Used in Conjunction with the Four Dominant Adoption Models
Table 4. Means, Standard Deviations, Correlations and Internal Consistency of All Variables
Table 5. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Adoption Intent
Table 6. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Recommendation Intent
Table 7. Sample Extract from Animations Illustrating Manipulation of Elicited Agent Knowledge Across the Experimental Conditions
Table 8. Distribution of Demographic Variables Across the Three Experimental Conditions
Table 9. Mean Comparison of the Dependent Variables Across the Three Experimental Conditions
Table 11. Means, Standard Deviations, Correlations & Internal Consistency of All Variables
Table 12. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Adoption Intent
Table 13. Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple Regression Analysis of Chatbot Recommendation Intent
Statement of Original Authorship
The work contained in this thesis has not been previously submitted to meet requirements
for an award at this or any other higher education institution. To the best of my knowledge and
belief, the thesis contains no material previously published or written by another person except
where due reference is made.
Note. Professional editor, Robyn Kent (RAK Editing Services), provided copyediting services according
to the guidelines laid out in the university-endorsed national ‘Guidelines for editing research theses’.
Chapter 1: Introduction
Much of the customer service literature is predicated on the idea that service delivery is
an interpersonal interaction, occurring between two or more people (van Doorn et al., 2017).
However, non-human entities can be configured to provide customer service. Where a
technological interface allows a customer to produce a service, independent of human staff, it
is known as a self-service technology (SST) (Curran & Meuter, 2005). The SST literature
examines interfaces such as ATMs (Curran & Meuter, 2005), airport self-service kiosks (Fan,
Wu, & Mattila, 2016) and grocery store self-check-outs (Wang, Harris, & Patterson, 2013).
These interfaces are designed to replicate the functional output of a human employee.
ATMs dispense cash, airport kiosks print boarding passes and self-checkouts calculate the
cost of groceries. However, these SSTs lack the capacity to interact with consumers socially
(van Doorn et al., 2017). As a result, consumers may view SSTs as cold or impersonal
because the service interactions they facilitate lack customisation (Hsuan-Hsuan & Ko-Hsin,
2015).
Many variables used in SST studies view the SST interaction from this functional,
utilitarian perspective, framing the human user as focused on the goal of consumption. Users
are thought to want convenience and efficiency (Collier & Kimes, 2012; Meuter et al., 2000)
from SSTs that are useful and easy to use (Davis, 1989; Weijters et al., 2007). Where
consumer sociality and SSTs intersect, variables focus on the SSTs’ deficits. For example, the
human need for social interaction is considered an antecedent to dissatisfying SST
interactions (Dabholkar, 1996).
SSTs need not be cold and impersonal, however. The social robotics literature supports
the idea that users may attribute human characteristics, traits or states to a machine (Bartneck,
Kulic, Croft & Zoghbi, 2009) in a response known as anthropomorphism (Epley, Waytz, &
Cacioppo, 2007). Similarly, social response theory has delivered a series of experiments
which support the idea that humans behave socially towards computers they perceive as social
actors (Nass & Moon, 2000). Chatbots are a form of technology with humanlike cues that
may elicit a psychological response such as anthropomorphism or even a corresponding
behavioural outcome as proposed by Nass and Moon (2000). Furthermore, chatbots can be
configured to provide customer service. Thus, chatbots as an SST may be able to replicate
both the functional and social elements of interpersonal service delivery.
This thesis contributes to the self-service literature by examining three broad research
questions, designed to test the antecedents and outcomes of anthropomorphism.
(RQ3): Do these strategies have a subsequent effect on adoption and recommendation intent?
1.1 DEFINING CHATBOTS
Chatbots are computer programs with natural language capabilities, which can be
configured to converse with human users (Mauldin, 1994). Tintarev, O’Donovan, and
Felfernig (2016) conceptualise chatbots as automated advice givers in that they can “propose
and evaluate options while involving their human user in the decision-making process” (p.
26). Dale (2016) describes the commercial chatbot eco-system: “Most visible at the forefront
of the technology, we have the voice-driven digital assistants from the Big Four: Apple’s Siri,
Microsoft’s Cortana, Amazon’s Alexa and Google’s new Assistant. Following up behind, we
have many thousands of text-based chatbots that target specific functionalities, enabled by
tools that let you build bots for a number of widely used messaging platforms” (p. 811). The
text-based systems described by Dale are the focus of this thesis. Text-based chatbots can be
deployed to instant messaging services such as Facebook Messenger, Skype, Twitter, Viber,
WhatsApp and WeChat. Therefore, a text-based chatbot has a theoretical reach of over 2.5
billion people (Statista, 2018).
A chatbot is the combination of an interface, an intelligence and back-end systems
(Guzman & Pathania, 2016). The interface is the part of the chatbot that a user interacts with.
Interface access may occur via a phone, a computer or a dedicated device such as Amazon’s
Alexa. Interaction with the interface occurs through vocal or textual communication. The
intelligence and back-end systems facilitate the interaction process and are hidden from the
user. Meanwhile, chatbot intelligence occurs on a spectrum. A low-intelligence chatbot uses
simple rules to appear intelligent. For example, if the human user says X, the chatbot is
programmed to respond with Y. More advanced intelligences may learn from previous
conversations to automatically improve over time. Chatbots may also employ techniques such
as sentiment analysis to improve comprehension. Finally, a chatbot may be connected to other
ancillary, back-end systems, such as a knowledge base, to enable Q&A or a payment gateway
to process financial transactions.
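The low-intelligence end of this spectrum can be illustrated with a minimal sketch. The rule table, trigger phrases and replies below are hypothetical and are not taken from the stimulus chatbot used in this thesis. The sketch assumes simple substring matching, which is only one of several ways the "if the user says X, respond with Y" logic could be implemented.

```python
# Hypothetical rule table: if the user's message contains X, reply with Y.
RULES = {
    "hello": "Hi there! How can I help you today?",
    "opening hours": "We are open 9am to 5pm, Monday to Friday.",
    "refund": "I can help with refunds. What is your order number?",
}

# Reply used when no rule matches (an assumption made for this sketch).
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


def respond(message: str) -> str:
    """Return the scripted reply for the first rule whose trigger appears in the message."""
    text = message.lower()
    for trigger, reply in RULES.items():
        if trigger in text:
            return reply
    return FALLBACK


if __name__ == "__main__":
    for line in ["Hello!", "What are your opening hours?", "Tell me a joke"]:
        print(f"User: {line}")
        print(f"Bot:  {respond(line)}")
```

A chatbot of this kind has no memory of earlier turns and cannot learn from previous conversations. More advanced intelligences would replace the fixed rule table with a model that is updated over time.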
1.2 CHATBOTS IN CUSTOMER SERVICE ROLES
Forecasts suggest that “by 2020, 25% of customer service and support operations will
integrate chatbot technology across engagement channels” (Moore, 2018). The figure of 25%
is up from less than 2% at present. Since work on this thesis began, multinational technology
and social media firms have announced significant steps towards enabling a chatbot to
provide customer service. Google (Perez, 2018), Microsoft (Miller, 2017), Twitter (Perez,
2017) and Facebook (Constine, 2017) have all released developer tools that allow the
proliferation of chatbots on their platforms. At present, Facebook has over 100,000 text-based
chatbots operating on its Messenger service (Johnson, 2017). Research and advisory firm
Gartner (2016) predicts that by 2020 the average person will have more conversations per day
with a chatbot than with their partner. Gartner (2016) does not provide sufficient evidence to
support this claim. However, conversation with non-human entities is likely to affect the
provision of customer service.
Organisations have begun to experiment with chatbots in customer service roles. The
Chatfuel platform (Chatfuel, 2018) claims to support chatbots for clients including Adidas,
British Airways and Volkswagen. Google’s product, Dialogflow, is said to power chatbots for
brands such as Comcast, Giorgio Armani and Mercedes (Dialogflow, 2018). Despite the
spread of chatbots across the commercial landscape, empirical research into consumer
perceptions of chatbots is lacking, as illustrated in Figure 2 (p. 5).
1.3 JUSTIFICATION FOR THE CHOICE OF THEORETICAL LENS (SST)
The extant literature has only recently discussed chatbots as being conceptually related
to SSTs (van Doorn et al., 2017). But it appears that the marketing and customer service
literature has yet to concretely classify chatbots as belonging to any particular research
stream.
Research in natural language processing (NLP) has increased sharply in recent years, as shown in Figure 1. However,
the majority of papers are technical in nature, reading like patent applications or schematics
for a prototype. This is because the majority of the chatbot literature originates from the
computer science and artificial intelligence research streams, as shown in Figure 2.
Figure 1. Number of peer-reviewed, scholarly works for the keywords “natural language processing”, sorted by
year on Web of Science (1966-present).
Note. Web of Science does not include all scholarly works. Graph is presented as illustrative of a trend, that is an
increase in NLP literature over time.
Figure 2. Number of peer-reviewed, scholarly works for the keywords “natural language processing”, sorted by
discipline on Web of Science (1966-present).
Note. Categories listed are not exhaustive. Approximately 80 different disciplines contribute to the body of
knowledge. Graph is presented as illustrative of a trend.
Where a chatbot performs inside an instant messaging platform, researchers may consider it
as a virtualised process (Overby, 2008) because the human-machine interaction occurs in a
non-local space. Finally, to the extent that a chatbot provides advice, the decision sciences
literature considers chatbots as intelligent decision aids (Arnold et al., 2004). In each of these
conceptualisations, the customer service element is lacking. Thus, the SST designation is
considered most appropriate.
1.4 RATIONALE: SUMMARY OF THE RESEARCH GAP
As discussed, this thesis presents the view that (a) chatbots can be an SST, but (b)
chatbots are different from the SSTs previously examined within the literature. Unlike other
SSTs, chatbots are capable of a free-form conversation, which may generate
anthropomorphism within the human user. It is this anthropomorphism or perceived
humanness that is missing from the SST literature. The Literature Review chapter will present
a number of variables that have been empirically linked to the adoption and recommendation
of an SST. Variables that address the social components of the human-machine interaction are
notably missing. The two published literature reviews dealing with variables known to predict
SST adoption (Blut, Wang, & Schoefer, 2016; Hoehle, Scornavacca, & Huff, 2012) do not
mention sociality of the interface.
The conceptual paper of van Doorn et al. (2017) suggests that automated social presence
will allow chatbots to develop relationships with consumers. However, implicit in this
proposition is the assumption that consumers will interact with a chatbot long enough for
relationship building to occur. On the other hand, a consumer might try a customer service
chatbot once, dislike the interaction and vow to use alternative means wherever possible.
Thus, adoption, or the behavioural intention to adopt a chatbot, is central to van Doorn et
al.’s (2017) proposition. Furthermore, van Doorn et al. (2017) imply that automated social
presence will improve customer engagement. However, other researchers have provided
evidence to support an alternative perspective. Ho and MacDorman’s (2017) uncanny-valley
theory demonstrates that the relationship between a social robot’s humanness and positive
user perceptions is non-linear. At some point, the relationship reverses direction and increases
in humanlike cues become unpleasant for users, resulting in perceptions of eeriness.
1.5 RESEARCH AIMS AND THEORETICAL MODEL
The second study aims to replicate and extend Study One. An attempt is made to
manipulate anthropomorphic perceptions within the participant. This is achieved by having
the chatbot display particular linguistic constructs: conversational repair and contextual
awareness. In doing so, Study Two demonstrates that anthropomorphic perceptions can be
empirically linked with specific chatbot features. Study Two demonstrates that the
antecedents of anthropomorphism can reside within the technology (as opposed to solely
within the mind of the individual, that is, a participant’s predisposition to anthropomorphise).
Consequently, practitioners could attempt to maximise the chatbot features that have been
identified as contributing to anthropomorphism.
Figure 3. A theoretical model of the relationships explored.
The following chapter presents a literature review, which is structured around the topics
of SST and anthropomorphism. The SST section provides justification for the choice of
dependent variables used, while elaborating on the research gap. The anthropomorphism
section discusses ways in which anthropomorphism may be measured or manipulated. Study
One methods, results and discussion are then presented as a single chapter, followed by a
chapter pertaining to the second study, which is presented using the same structure. A
conclusion chapter, discussing implications, limitations and avenues for future research, will
complete the thesis.
Chapter 2: Literature Review
The literature review begins with a summation of key findings related to self-service
technology (SST) (sections 2.1-2.3). The purpose of these sections is threefold. First, a
discussion of SSTs helps situate this work in the context of previous findings. Second, the
importance of adoption and recommendation intent as dependent variables is justified.
Finally, the research gap is brought into focus. Twenty-two independent variables from four
dominant adoption models are summarised. Although each of these variables may provide
insight into consumer perceptions of a chatbot, none of them captures a chatbot’s defining
feature. That defining feature is conceptualised here as a chatbot’s use of language, which
may generate anthropomorphism. Adjunct literature gaps are addressed where relevant.
2.1 SELF SERVICE TECHNOLOGY
The world’s first automatic teller machine (ATM) was installed on 27 June 1967.
Today, there are more than 3,000,000 ATMs dispensing cash around the globe (Holden,
2017). Technology has rapidly changed the nature of service delivery. Many high-touch and
low-tech customer service operations, such as the provision of cash, have been overhauled so
that technology either supports or supplants the human employee (Wang et al., 2013). By
using an SST, the customer produces the service themselves, although interpersonal
interaction with staff may still occur when service delivery fails. For example, staff are often
on the periphery of grocery self-scan or airport kiosks so they can assist the consumer if
needed.
As discussed, current SSTs lack customisation. The interfaces require the consumer to
select from pre-set menu options on a screen. Everyone has approximately the same service
experience, even if perceptions of that experience differ. This makes sense when the
consumer is viewed as decisive and economical with regards to time and effort. Knowing the
precise combination of buttons to press in order to purchase a train ticket each day is efficient.
But the standardised approach may be suboptimal in other scenarios. Selnes and Hansen
(2001) claim that SST usage has a negative impact on social attachment to the firm, reducing
customer loyalty.
Standardised SSTs may struggle with complex service delivery, yet a number of start-up
firms are demonstrating that chatbots are capable. For instance, Babylon is a chatbot that
provides triage advice to users who have a medical concern. The system was on trial with the
United Kingdom’s NHS in 2016-17 (Burgess, 2017). DoNotPay is the product of a Stanford
student who describes it as the world’s first robo-lawyer. It has overturned more than
160,000 parking tickets and is being modified to help refugees claim asylum (Cresci, 2017).
Both medical and legal advice are considered credence services, the successful delivery of
which draws on social constructs between parties (Ding, Verma, & Iqbal, 2007). Existing
SSTs may try to evoke sociality by welcoming or thanking a consumer for their patronage,
but the consumer is never under the illusion that the SST is human. Conversely, the “perfect”
chatbot is indistinguishable from a human.
The key dimension of an SST is that the service encounter lacks human involvement
(Curran & Meuter, 2005). However, it may be time to revisit this notion. The SST literature
could benefit from finding a way to reintroduce some approximation of “humanness” into the
discussion through variables such as anthropomorphism.
2.2 JUSTIFICATION FOR THE CHOICE OF DEPENDENT VARIABLES
The studies presented in this thesis use adoption and recommendation intent as
dependent variables. Adoption in the SST context refers to a customer’s decision to regularly
use (or reject) a new technological interface (Walker et al., 2002). Recommendation intent,
also conceptualised as positive word-of-mouth, refers to the likelihood of a customer
promoting the SST to someone else (Huntley, 2006). Thus, both adoption and
recommendation intent are central to SST studies.
From a firm’s perspective, deploying SSTs offers a number of benefits. First, SSTs can
increase efficiency and customer satisfaction (Bitner et al., 2002; Huang & Rust, 2013; Lee,
2014). Second, an SST can standardise service delivery (Selnes & Hansen, 2001). Third,
because an SST may supplement or act as a substitute for the human employee, empirical
research has linked investment in SSTs with a firm’s positive financial performance (Hung,
Yen, & Ou, 2012) and an increase in stock price (Yang & Klassen, 2008). Therefore, Meuter
et al. (2005) talk of the “tremendous lure” of automating service delivery. However, it is not
the act of deploying SSTs that delivers benefits to the firm. Rather, firms enjoy the benefits
once consumers try the SST and commit to future use (adoption). This process of trial,
evaluation and adoption often begins when the SST is recommended to the consumer by
someone else. Thus, recommendation intent, or positive word of mouth, is a dependent variable
discussed in conjunction with adoption (Cheung, 2008; Curran & Meuter, 2005; Safdar,
2018).
2.3 JUSTIFICATION FOR THE CHOICE OF INDEPENDENT VARIABLES
Four theoretical models have emerged as dominant in the study of technology adoption.
They are:
• The theory of reasoned action (TRA) (Fishbein & Ajzen, 1975) and the updated
version, the theory of planned behavior (TPB) (Ajzen, 1991)
• The diffusion of innovation (DOI) constructs (Rogers, 1983, 2003) conceptualised in
Cooper and Zmud’s theoretical model (Cooper & Zmud, 1990)
• The technology acceptance model (TAM) (Davis, 1989), also derived from the TRA
• The unified theory of acceptance and use of technology (UTAUT) (Venkatesh et al., 2003)
Rather than trying to determine the pre-eminence of one model, the literature was
examined to identify the range of independent variables that have been shown to influence
adoption or recommendation in previous studies. Tables 1 and 2 summarise the independent
variables identified. These variables are defined in Appendix Item 1.
Table 1.
Independent Variables from the Four Dominant Adoption Models
Note. TAM = technology acceptance model (Davis, 1989), TRA = theory of reasoned action (Fishbein & Ajzen,
1975), UTAUT = unified theory of acceptance and use of technology (Venkatesh et al., 2003), DOI = diffusion of innovation
(Rogers, 1983, 2003).
The independent variables (Table 1) from the dominant models have been augmented
by the independent variables (Table 2), as the technology or application dictates.
2.4 ANTHROPOMORPHISM
Humans anthropomorphise non-human entities from a young age (Barrett, Richert, &
Driesenga, 2001; Lane, Wellman, & Evans, 2010). This is because anthropomorphism ties
into a number of motivations that are central to the human experience. Epley, Waytz, and
Cacioppo (2007) categorise these motivations as sociality motivation and effectance
motivation. Sociality motivation refers to the need to establish social connections, which
promotes co-operation. Co-operation is of evolutionary benefit to humans (Axelrod &
Hamilton, 1981). People who are more socially connected are said to have a lower level of
sociality motivation. In support of this idea, chronically lonely individuals are more likely to
anthropomorphise technology (Epley et al., 2008). Effectance motivation can be
conceptualised as the desire to understand and master one’s environment. Epley et al. (2007)
present effectance as a strategy to reduce uncertainty. By anthropomorphising the non-human,
a person can anticipate the entity’s behaviour, increasing the odds of a favourable interaction.
Evidence that effectance motivation contributes to anthropomorphism of technology is well
presented by Waytz et al. (2010) in five experiments, including measurement via
neuroimaging.
Both sociality and effectance motivation reside inside the mind of the human. That is,
they are qualities of the person who is anthropomorphising. However, Epley et al. (2007)
propose a third antecedent of anthropomorphism, known as elicited agent knowledge (EAK).
EAK refers to the actual humanlike cues projected by the non-human agent. A person
interacting with a non-human entity will examine the entity’s features and behaviour to check
for perceived similarity. For example, some vehicles are designed so that the headlights may
appear as eyes and the radiator grill as a mouth. In the same way that antecedents of
technology adoption are found within both the user and the technology, as Waytz et al. (2010)
explain, anthropomorphism appears to originate within both the perceiver and the perceived.
Bartneck et al. (2009, p. 72) suggest most engineers run naïve experiments to verify a
design and have “a tendency to cook up their own questionnaires”. Morrissey and Kirakowski
(2013) developed a scale to measure the naturalness of chatbots. They propose the naturalness
construct consists of conscientiousness, originality, politeness and thoroughness. But the
factors said to represent naturalness do not line up well with anthropomorphism. For example,
static signage can be seen as polite (Meis & Kashima, 2017) without ever being perceived as
human.
Other popular methods for investigating chatbots are the Turing Test and the Wizard of
Oz method, both of which may be suitable for investigating anthropomorphism. Turing was a
seminal scholar in the chatbot field. Before the 1950s the philosophical enquiry into minds
and machines was focused around the question: Can machines think? Turing found the
“thinking machine” question of little value from a deductive, positivist perspective, instead
This thesis uses the anthropomorphism construct from the Godspeed Questionnaire, one
of the most frequently cited instruments in the human-robot interaction literature (Weiss &
Bartneck, 2015). Full details of the instrument are provided in Study One. Steinfeld et al.
(2006) have suggested that the use of standardised instruments is essential in developing a
robust evidentiary body of knowledge regarding human perceptions of non-human entities. It
is hoped that any future research into the anthropomorphism of chatbots uses this scale, so
that results may be compared.
The purpose here is not to engage in what Burger and Sheehy (2012) call the 40-year
battle between situationists and personality psychologists. This thesis takes at face value the
idea that both the person and the situation cause human behaviour. However, sociality and
effectance motivation are individual difference variables. Therefore, a marketing practitioner
cannot program a chatbot to leverage these user characteristics, unless, of course, the chatbot
was designed to specifically target those with unusually high or low sociality and effectance
motivation, such as the chronically lonely. The purpose of Study Two was to demonstrate that
certain chatbot behaviours are responsible for anthropomorphism (through access to EAK).
Left: A NAO robot illustrative of physical embodiment. Right: Mitsuku, winner of the Loebner Prize (2013,
2016, 2017) for artificial conversational entities, illustrative of an avatar.
The social robot on the left has a physical body (embodiment) and the chatbot on the
right has an avatar, or a graphical representation of a face or face and body (Blascovich et al.,
2002). The features of a social robot or avatar likely to influence access to EAK are
unambiguous. Previous research has considered the relationship between anthropomorphism
and humanlike cues such as the shape and characteristics of robotic heads (Duffy, 2003) or
the facial expression and gaze of a virtual agent (Marschner et al., 2015). Others have
examined the impact of cues such as vocal characteristics (Elkins & Derrick, 2013; Nass &
Lee, 2001), race (Marino, 2014) or gender (Nass, Moon, & Green, 1997). A textual chatbot
running through an instant messaging platform provides none of these anthropomorphic cues.
Nass (2004) suggests that humanlike cues stack synergistically to create perceptions of
humanness in computers that are perceived as social actors. The more humanlike cues a
computer can demonstrate, the stronger the perceptions of a social other. Humanlike
dimensions listed by Nass include language use, a voice (real or synthetic), a face via an
avatar, interactivity and unpredictability. Of Nass’s (2004) dimensions, a textual chatbot on
Facebook can demonstrate language use and interactivity. The human-robot interaction
literature also supports the idea of humanlike attributes increasing anthropomorphism in a
non-linear fashion. Kiesler et al. (2008) demonstrate that the most salient human
characteristics are appearance, back-story, observable behaviour and etiquette.
In the second study presented in this thesis, the stimulus materials focus on chatbot-
human interactions where miscommunication occurs. This increases ecological validity. As
Dale (2017) notes, “the barrier to [chatbot] entry is now very low, so that anyone can be a
chatbot developer” (p. 644). As a result, textual chatbots fail frequently. Even the best
chatbots, such as Mitsuku, fail (Worswick, 2018).
Table 3.
The Cooperative Principles, Gricean Maxims of Conversation
Maxim Description
Quantity Say neither more nor less than the discourse requires
Note. Highlighted cells (Manner & Relevance) form the theoretical basis for the experimental manipulation in
Study Two.
The stimulus in Study Two used “other-initiated self-repair” (Schegloff, 2000). The
chatbot struggles to deduce meaning from the human’s input and initiates the repair dialogue.
The actual repair itself is provided by the human when further clarification is given in the
following utterance. Study Two assesses the impact of conversational repair on
anthropomorphism. For example, consider the following vignette:
The chatbot’s use of the word “what” constitutes initiation of conversation repair.
Conversation repair has been linked to social coordination in that it demonstrates one’s
ability to use synchronised interaction strategies, such as turn-taking and role-switching (Corti
& Gillespie, 2016; Kaplan & Hafner, 2006). Corti and Gillespie (2016) discuss conversation
repair as being a fundamental component of intersubjective effort, where intersubjectivity is
defined as shared meaning, co-created and co-existing within two or more conscious minds
(Stahl, 2015). As such, the following hypotheses are proposed:
H4a & H4b: A chatbot using conversation repair will receive significantly higher scores for (a)
adoption intent and (b) recommendation intent
A chatbot failing to track the context of a conversation over time would violate the
Gricean maxim of relevance, which states that a “partner’s contribution should be appropriate
to immediate needs at each stage of the transaction” (Grice, 1975, p. 47). Gricean maxims,
summarised as the “cooperative principle”, are essential to successful human communication;
thus, a chatbot violating any of the maxims (relevance in this instance) should be perceived
as presenting fewer humanlike cues. Accordingly, the following hypotheses are proposed:
H5: A chatbot failing to maintain contextual awareness will be perceived as significantly less
anthropomorphic
H6a & H6b: A chatbot failing to maintain contextual awareness will receive significantly lower
scores for (a) adoption intent and (b) recommendation intent
Finally, Study Two provides the opportunity to re-test the relationships examined in
Study One and identify any consistent patterns across the three conditions. Consequently, the
following hypotheses are presented:
1. The SST literature has yet to sufficiently examine interfaces capable of natural
language use. These interfaces are unique among SSTs, in that users may
anthropomorphise them.
2. Extant adoption and recommendation models include a number of independent
variables, as summarised in Tables 1 and 2. However, the role of anthropomorphism
(or related constructs such as automated social presence) has yet to be investigated.
Anthropomorphism may correlate with (or perhaps cause) adoption and positive word-
of-mouth.
3. Pinpointing the features of a robot (that is, facial expression, gaze, movement) that
produce anthropomorphism is possible. However, chatbot features that act as
antecedents of anthropomorphism are harder to identify.
Study One used a correlational survey design to test the relationship between
anthropomorphism and the dependent variables of adoption and recommendation intent.
Participants were asked to have a scripted conversation with a purpose-built chatbot on
Facebook. Participants were advised that this would help to train the chatbot. Quantitative
measurements were taken so that regression analysis could test the following hypotheses:
It is plausible to suggest that MTurk samples are biased in some ways. For example, the
average MTurk user may have (a) a more immediate need for financial resources, (b) more
free time, (c) more experience online or (d) a stronger desire to participate in research studies
than a general member of the broader population (Chandler, Mueller, & Paolacci, 2014).
None of these potential differences are theoretically linked to the variables of interest in this
study. Therefore, MTurk was considered an appropriate means of generating a sample. Any
sensitising effects, due to the participants having completed previous studies, are conceptually
counterbalanced against the decreased potential for experimenter expectancy that comes with
anonymous online samples.
Participants were recognised for their contribution with a payment of US$2.00 for a
completed survey submission. This amount was considered in line with social norms and the
principles of respect in research, striking a balance between compensation for the opportunity
cost of one’s time and the subtle coercion that accompanies larger payments. For reference,
the US federal minimum wage is US$7.25 per hour.
The study’s cover page on MTurk explicitly stated that only MTurk users from the
United States could participate in this study. This was reiterated on the study’s Participant
Information Cover Sheet in Qualtrics. However, additional steps were taken to ensure
participants were actually from the United States and did not provide data under false
pretences or through inattention to selection criteria. First, the study was listed at 3:00AM
local time (UTC+10), so that it would be visible to United States participants between 9AM
(UTC-8 in Los Angeles) and midday (UTC-5 in New York). MTurk users elect to participate
on a ‘first-come, first-served’ basis, so this increased the likelihood of sample validity.
Second, Qualtrics was configured to capture users’ meta-data, including geographic location
associated with their IP address. Participants were explicitly asked which US state they
resided in as part of the Qualtrics survey and their responses were matched with IP locations
to verify accuracy. Of course, a sophisticated user could circumvent this process by using a
virtual private network (VPN), but this was deemed low risk.
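The listing time can be checked with a short calculation. The sketch below is illustrative only: it uses the fixed UTC offsets quoted in the text, ignores daylight saving, and assumes a hypothetical calendar date, since only the 3:00 AM clock time is reported.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the text; daylight saving is ignored (assumption).
LOCAL = timezone(timedelta(hours=10), "UTC+10")
LOS_ANGELES = timezone(timedelta(hours=-8), "UTC-8")
NEW_YORK = timezone(timedelta(hours=-5), "UTC-5")

# Hypothetical date; only the 3:00 AM listing time comes from the text.
listed_at = datetime(2018, 1, 15, 3, 0, tzinfo=LOCAL)

la_time = listed_at.astimezone(LOS_ANGELES)
ny_time = listed_at.astimezone(NEW_YORK)

print("Los Angeles:", la_time.strftime("%Y-%m-%d %H:%M"))
print("New York:   ", ny_time.strftime("%Y-%m-%d %H:%M"))
```

Under these assumptions the listing appears at 9:00 AM in Los Angeles and at midday in New York on the previous calendar day, which matches the window described above.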
A sample size of 60 was considered appropriate for this survey. Power analysis for a
linear regression was conducted using G*Power to determine a sufficient sample, given an
alpha of .05, a power of .80 and a medium effect size (f² = .15) (Faul et al., 2009). Based on
these assumptions, G*Power suggested a sample of 55.
3.1.2 Stimulus
Participants were required to hold a scripted conversation with a purpose-built chatbot
deployed on Facebook Messenger. Participants were told that their task was to assist in
training a new chatbot. They were provided with a script, which included statements
3.1.3 Instruments
Anthropomorphism (IV): This variable was adapted from the Godspeed Questionnaire
(Bartneck et al., 2009), which provides a set of semantic differential scales to measure the
factors of anthropomorphism, animacy, likability, intelligence and safety of social robots. By
October 2014, the Godspeed Questionnaire had been cited in over 160 studies, many of which
used only a single factor from the questionnaire (Weiss & Bartneck, 2015), as was
the case here.
The anthropomorphism instrument includes 5 items: (a) fake – natural, (b) machinelike
– humanlike, (c) artificial – lifelike, (d) unconscious – conscious and (e) communicates
inelegantly – communicates elegantly. Response options are similar to a Likert scale, in that
7 response options are presented. However, the instrument is a semantic differential scale. At
each end of the x axis are polar opposite adjectives. The response options were originally
designed to provide ordinal data, with participants circling an integer. However, this was
modified via Qualtrics so that the data collected was a continuous value between -3 and +3.
Curdy (2014) provides instructions for modifying the HTML in Qualtrics to achieve this.
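To illustrate how the five slider responses might be combined into a single composite, a minimal sketch follows. The item labels, the example responses and the optional shift of +4 (which maps the -3 to +3 slider range onto a 1 to 7 range) are assumptions made for illustration; the exact scoring procedure is not detailed in this section.

```python
from statistics import mean

# Hypothetical short labels for the five semantic differential items.
ITEMS = ("natural", "humanlike", "lifelike", "conscious", "elegant")


def composite_score(responses, shift=4.0):
    """Average five slider responses in [-3, +3] and shift the result.

    shift=4.0 maps the slider range onto 1-7 (an assumption, not the
    author's documented procedure); use shift=0.0 to keep the raw range.
    """
    if set(responses) != set(ITEMS):
        raise ValueError("expected exactly the five anthropomorphism items")
    for item, value in responses.items():
        if not -3.0 <= value <= 3.0:
            raise ValueError(f"{item} is outside the -3 to +3 slider range")
    return mean(responses.values()) + shift


# Made-up responses from one hypothetical participant.
example = {"natural": 1.2, "humanlike": 0.4, "lifelike": 0.9,
           "conscious": -0.5, "elegant": 1.5}
score = composite_score(example)
print(round(score, 2))
```

Averaging the items, rather than summing them, keeps the composite on the same scale as the individual items.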
Adoption intention (DV): This study operationalised adoption with a simple single item
measured on a 7-point Likert scale. Participants were asked to indicate their agreement with
the following statement: “Please think about the chatbot you interacted with today. When
purchasing flowers for delivery (or tickets to a sporting event), I would use a chatbot like
this”. Response options were anchored at strongly agree and strongly disagree. This wording
was modified from similar single-item measures of SST adoption (Chiou & Shen, 2012;
Curran, Meuter, & Surprenant, 2003; Lee, Castellanos, & Chris Choi, 2012).
Recommendation intent (DV): Recommendation intent was measured with a single item
on a 7-point Likert scale with end points at strongly agree and strongly disagree. A number of
measurement instruments for recommendation intent exist; however, many are crafted around
recommendations made specifically via online channels. General purpose recommendation
measures include a single-item approach (Chiou & Shen, 2012; Meuter et al., 2003) or three
Control variables used in this study included demographic variables such as age,
gender, race and education. The personality trait openness (Soto & John, 2017), technological
anxiety (Meuter et al., 2003) and the frequency of previous Facebook use were also measured,
to be used as control variables. Full details of all instruments are provided in Appendix Item
3. All items were neutral in tone, avoiding extreme or suggestive language. None of the items was
likely to induce social desirability bias.
The chatbot’s responses to the scripted participant inputs were created using rules. When the
participant entered a statement, a rule was triggered and the chatbot provided a pre-determined
response. Thus, only pattern-matching occurred, making future replication of the method straightforward.
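The rule-based approach can be sketched in a few lines. The example below is not the chatbot used in the study: the patterns, replies and fallback message are hypothetical, and the platform on which the actual rules were built is not described here. It only shows the general technique of mapping a matched pattern to a pre-determined response.

```python
import re

# Hypothetical rules: each pairs a pattern with a pre-determined reply.
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE),
     "Hello! What can I help you with today?"),
    (re.compile(r"\bflowers?\b", re.IGNORECASE),
     "Sure. What is your budget?"),
    (re.compile(r"\$\s?(\d+)"),
     "I have several bouquets in that price range."),
]
FALLBACK = "Sorry, I did not understand that."


def reply(message):
    """Return the canned response of the first rule whose pattern matches."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK


for line in ("Hello", "I want to order flowers", "I have a $50 budget"):
    print(f"> {line}\n{reply(line)}")
```

Because every scripted input triggers exactly one rule, every participant following the script sees the same sequence of replies.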
The chatbot was programmed to address participants by their first name and provide a
basic welcome message and thank you message. The purpose of the welcome message was to
allow participants time to adjust to the change of website (from Qualtrics to Facebook) before
accessing the stimulus. At the end of the chatbot conversation, the chatbot was designed to
give the participant a 4-digit code. Participants were asked to enter this code back into
Qualtrics to confirm that they had followed instructions and successfully completed exposure
to the stimulus materials.
A Facebook page was made for the chatbot (Figure 7). The page used a template
provided by Facebook. Logos and design elements were added, as well as some basic
instructions on how to begin the interaction, for example “click here to get started”. The tone
of the language and design elements was neutral.
MTurk users were given the opportunity to provide informed consent via a detailed
participant information cover-sheet. This document outlined the risks and benefits of
participating as well as details regarding privacy and the future use of data. Participants were
expressly cautioned that while QUT Ethics and Privacy protocols covered this project,
interaction with the chatbot was to occur via Facebook. The research team could not
guarantee the privacy of any data generated during the chatbot interaction on the Facebook
website.
The limitations of this design are ecological validity and artificiality. Participants were
given scripts to use in their interaction with the chatbot. However, in a natural setting, chatbot
users may phrase their statements in any number of ways. For example, when tasked with
ordering flowers online, person A may state “I have a $50 budget” and person B may ask
“What do you have for $50”. Here the scripting of the conversation is considered a strength of
the design. Scripted interactions ensure all participants have the same experience, which adds
to internal validity and replicability.
As discussed in the literature review, this level of control is not available in other commonly
used methods such as the Wizard of Oz technique or a Turing Test.
The data was analysed using IBM’s SPSS version 25.0. The following sections detail
the data cleaning and assumption testing process, while providing results for the tested
hypotheses. The initial sample size for this study was 60 participants. Of the sample, 52%
were male, with the average age of the sample recorded at 36.33 years (SD = 9.875). More
than 70% of the sample identified as Caucasian Americans and all levels of education were
represented within the sample.
None of the respondents submitted incomplete data, as Qualtrics was configured such
that participants could not proceed without responding to all items presented. However, there
was evidence of careless responding (Curran, Kotrba, & Denison, 2010) or content non-responsivity
(Meade & Craig, 2011). In two cases (3.2%), the participants entered a code into
Qualtrics that did not match the codes issued by the chatbot at the end of a successful
interaction. This suggests that these participants were incorrectly exposed to the stimulus
material. Their responses to the measurement instruments could not be based on the item
content and were therefore removed.
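The completion-code check lends itself to a simple screening step. The sketch below assumes a hypothetical set of issued codes and hypothetical participant identifiers; it illustrates the logic of the check rather than the procedure actually run in Qualtrics or SPSS.

```python
# Hypothetical 4-digit codes issued by the chatbot after a successful interaction.
ISSUED_CODES = {"4821", "7306"}


def split_by_code(submissions):
    """Separate participants whose entered code was issued from those whose code was not."""
    valid, invalid = [], []
    for participant_id, code in submissions:
        target = valid if code.strip() in ISSUED_CODES else invalid
        target.append(participant_id)
    return valid, invalid


# Made-up submissions: (participant id, code entered in the survey).
submissions = [("P01", "4821"), ("P02", "0000"), ("P03", " 7306 ")]
valid_ids, invalid_ids = split_by_code(submissions)
print("retain:", valid_ids)
print("remove:", invalid_ids)
```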
This failure rate is in line with estimates of carelessness presented by other researchers
(Ehlers et al., 2009). The cause of this failure is unknown. Perhaps the participants were
distracted by accessing social media (Osgood, Ward, & Meade, 2015) or by the nature of an
unproctored online survey (Weigold, Weigold, & Russell, 2013), which lacks cognitive
researcher involvement (Mohorko & Hlebec, 2016). Perhaps these respondents sought to
repair a perceived imbalance between required effort and reward as described by equity
theory (Adams, 1963) or their involvement in previous research projects on MTurk resulted in
ego-depletion (Meade & Craig, 2011). A third case was removed because the data met the
criteria for an extreme univariate outlier as described by Allen, Bennett and Heritage (2014).
Note. Adoption: Scenario 1 was the purchase of flowers for delivery and Adoption: Scenario 2 was the purchase
of sporting event tickets.
** p < .01, two-tailed; * p < .05, two-tailed. Cronbach’s α is reported on the diagonal in parentheses. A double dash
(--) indicates the variable was measured with a single item; thus, α was not calculated.
All predictor and criterion variables were assessed for normality. Exploration of the data
included examination of the 95% confidence interval for the mean, the magnitude of the gap
between the mean and 5% trimmed mean as well as skewness and kurtosis via histograms and
QQ plots. The Shapiro-Wilk test of normality suggested that all variables were within the
tolerances proposed by statisticians (Field, 2013; Pallant, 2013) of ≤±2.00 for skewness and
≤7.00 for kurtosis.
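A skewness and kurtosis screen of this kind can be sketched as follows. The example uses simple moment-based estimators without the small-sample corrections that SPSS applies, treats the kurtosis tolerance as applying to excess kurtosis (an assumption) and runs on made-up scores, so it illustrates the check rather than reproducing the reported values.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    centre = fmean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def within_tolerance(values, max_skew=2.0, max_kurtosis=7.0):
    """Apply the tolerances quoted in the text to one variable."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) <= max_skew and abs(kurt) <= max_kurtosis


scores = [1, 2, 3, 4, 5]  # made-up data for illustration
skew, kurt = skewness_and_excess_kurtosis(scores)
print(round(skew, 3), round(kurt, 3), within_tolerance(scores))
```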
Table 5.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Adoption Intent.
The same procedure was followed to examine the data, using recommendation intention
as the dependent variable. In combination, anthropomorphism and the control variables
accounted for a statistically significant 20.7% of the variability in recommendation intention, R² =
.207, adjusted R² = .146, F (4, 52) = 3.39, p = .015. By Cohen’s (1988) conventions, a
combined effect of this magnitude can be considered “medium” (f² = .261). Standardised
regression coefficients (β) and squared semi-partial (or “part”) correlations (sr²) for each
predictor in the regression model are reported in Table 6.
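The summary statistics in this paragraph are linked by standard formulas, which the sketch below applies to the reported R² of .207. The sample size of 57 is inferred from the residual degrees of freedom (52) and the four predictors implied by F (4, 52); it is not stated directly in this paragraph.

```python
def regression_summary(r_squared, n, k):
    """Derive adjusted R², the omnibus F statistic and Cohen's f² from R².

    n is the number of cases and k the number of predictors.
    """
    df_resid = n - k - 1
    adjusted = 1 - (1 - r_squared) * (n - 1) / df_resid
    f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)
    f_squared = r_squared / (1 - r_squared)
    return adjusted, f_stat, f_squared


# n = 57 is inferred from F (4, 52): 52 = n - 4 - 1.
adjusted_r2, f_stat, cohen_f2 = regression_summary(0.207, n=57, k=4)
print(f"adjusted R2 = {adjusted_r2:.3f}, F = {f_stat:.2f}, f2 = {cohen_f2:.3f}")
```

These values agree with the adjusted R² of .146, F of 3.39 and f² of .261 reported above. Cohen’s (1988) benchmarks of .02, .15 and .35 for small, medium and large effects place this f² in the medium range.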
Table 6.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Recommendation Intent
3.3 DISCUSSION
The first study presented in this thesis examined a chatbot performing customer service
work via Facebook Messenger. The purpose was to identify any correlational relationships
between anthropomorphism of the chatbot and adoption and recommendation intent as
dependent variables.
The average participant score for the anthropomorphism composite variable was 4.82 on
a 7-point scale. A score above 4.0 would suggest that users perceived the chatbot as having
some anthropomorphic qualities. A review of each of the individual scale items provides
context. The chatbot was perceived as more natural than fake, more human-like than
machinelike, more lifelike than artificial, more conscious than unconscious and
communicating more elegantly than inelegantly. The highest score of an individual item was
Both of the hypotheses tested in Study One were supported by the data, with results
significant at p < .01. Users who anthropomorphised the chatbot indicated that they
were more likely to use a chatbot again to perform a similar task. Anthropomorphism of the
chatbot also predicted a participant’s intention to recommend the chatbot to others.
Given that adoption and recommendation intent have been important dependent
variables to SST practitioners and scholars for more than 30 years (Blut, Wang, & Schoefer,
2016; Hoehle, Scornavacca, & Huff, 2012), future research should examine which elements
of the human-chatbot interaction contribute to this anthropomorphism. It cannot be a digital
avatar, the movement of a body, facial expressions or an audible voice because the chatbot
has none of these features. Identifying and manipulating the features that do contribute would
allow the findings from Study One to be replicated under experimental conditions.
None of the demographic variables measured account for the variance in scores. It
appears as though age, gender, race and education did not affect perceptions of
anthropomorphism, adoption or recommendation intent in this context. These results differ
from previous SST findings, which suggest adopters are predominantly younger, male and
better educated (Meuter et al., 2003; Nilsson, 2007).
Frequency of a participant’s Facebook use was captured as a control variable and was
not significantly related to any of the other variables. All of the participants were Facebook
users and would know that interpersonal communication via instant messaging occurs on
Facebook. If a Facebook user’s mental model of Facebook Messenger is that it facilitates chat
between two humans, then anything “chatting” on Facebook could be seen as human.
However, a participant’s previous use of Facebook did not appear to have a sensitising effect
in this instance.
Both the human-computer and the human-robot interaction literatures state that
individual humanlike cues stack synergistically to create perceptions of a social other.
Humanlike dimensions listed by Nass (2004) include language use, a voice (real or synthetic),
a face via an avatar, interactivity and unpredictability. Kiesler et al. (2008) have demonstrated
that the most salient human characteristics are appearance, back-story, observable behaviour
and etiquette. The chatbot in this study had only one of these characteristics – language use.
This study supports the notion that anthropomorphism could occur from language use only.
The stimulus material used in Study Two was an animation of a chatbot interaction.
This animation was contextualised around a consumer booking hotel accommodation via a
chatbot. Three versions of this stimulus were prepared in order to manipulate a participant’s
access to elicited agent knowledge (EAK), which is theorised to be an antecedent of
anthropomorphism (Epley et al., 2007). The animation was the same across all conditions,
except for the experimental manipulation. The chatbot shown in the conversation repair (CR)
animation failed to interpret elements of user input during the first pass and used conversation
repair strategies to remedy potential misunderstandings. Therefore, the CR chatbot was
repairing a violation of the Gricean maxim of manner. Consequently, the following
hypotheses were developed:
H4a and H4b: A chatbot using conversation repair will receive significantly higher scores for
(a) adoption intent and (b) recommendation intent
The chatbot in the contextual awareness (CA) animation failed to accurately interpret the
user’s utterance. Therefore, the CA chatbot violated the Gricean maxim of relevance,
responding with a statement considered inappropriate for the needs of user at that stage of the
transaction (Grice, 1975). Consequently, the following hypotheses were developed:
H6a and H6b: A chatbot failing to maintain contextual awareness will receive significantly
lower scores for (a) adoption intent and (b) recommendation intent
Finally, a third chatbot animation was prepared to serve as a control condition. The chatbot in
this condition correctly interpreted user input in all instances. The following hypotheses were
developed to address broader patterns between the experimental conditions:
The study was designed to meet the four requirements for demonstrating a causal
relationship. The experimental method was selected because experiments show that (a) the
cause and effect are connected, (b) the cause precedes the effect, (c) the cause and effect
relationship occurs consistently across participants and (d) that alternative explanations have
been accounted for (Babbie, 2016).
Providing evidentiary support for a causal relationship within the social sciences is
difficult; however, the use of an experimental design was chosen to maximise internal
validity. The three elements of an experiment are present: manipulation of the independent
variable (access to EAK as a component of anthropomorphism), comparison to control group
and random assignment to experimental conditions (Babbie, 2016). Of course, in the absence
of replication studies, inferences drawn from this study should be approached with care.
Deterministic causality versus probabilistic causality: The term causality in this thesis
refers to probabilistic causality, in that the cause is said to precede the effect and increase the
probability of the effect (Mellor, 1995). Deterministic causality, on the other hand, refers to
the notion that the cause precedes the effect in all observable instances (Hoefer, 2016). For
example, it is widely accepted that smoking causes lung cancer, but not all lung cancers are
temporally preceded by smoking. This statement illustrates probabilistic causality as opposed
to deterministic causality.
A sample size of 180 was considered appropriate for this study. Power analysis for
Analysis of Variance (ANOVA) was conducted, given an alpha of .05, a power of .80 and a
medium effect size (f² = .15) (Faul et al., 2008). Based on these assumptions, a power analysis
for a 3-condition experiment suggested a sample size of n = 156, or 52 participants per
condition.
4.1.2 Stimulus
Where Study One had participants interact with a chatbot on Facebook, Study Two had
participants watch a pre-recorded animation of a human–chatbot interaction. The decision to
change the way in which participants were exposed to the stimulus was made on the
following grounds:
Three animations were prepared, one for each experimental condition. To illustrate the
experimental manipulation, a textual extract of the animation is provided in Table 7. Full
transcripts of the animations are provided in Appendix Item 4 (p. 79).
Table 7.
Sample Extract from Animations Illustrating Manipulation of Elicited Agent Knowledge Across the Experimental
Conditions
The animations were made using online software from Bot Society
(https://botsociety.io/). Bot Society provides chatbot prototyping and preview tools for
developers. Users of Bot Society can design text and voice chat interfaces for a range of
platforms, including Facebook Messenger, Google Home and Amazon Alexa. The Bot
Society website claims the software has been used by brands including Microsoft,
PricewaterhouseCoopers, AXA and Nestlé (Botsociety, 2018). A screenshot of the Bot
Society animation process is provided in Figure 8.
The hotel accommodation context was chosen for the stimulus because it was considered to
be an appropriate use for a chatbot providing customer service. The chatbot was given a
fictitious name (Beachside Hotel) with a custom logo. The Facebook pages of real-world
hotels were reviewed in order to determine an appropriate number of “likes” for the chatbot.
Setting the likes too high or low may have acted as social proof of the chatbot’s performance
and had an impact on the results (Lee, Lee, & Oh, 2015). Finally, the animation was set inside
a wireframe image of a white iPhone 6 to add to realism, as shown in Figure 9.
4.1.3 Instruments
Measurements for the dependent variables (adoption and recommendation intent) were
taken as described in Study One, methods section. The measurement of anthropomorphism
was modified between studies. The items themselves were not changed; however, responses
were recorded on a 7-point Likert scale rather than a slider-type scale, which could provide
continuous data. The demographic variables were unchanged between the two studies. Age,
gender, race and education were all recorded.
Study Two was designed to produce two sets of data. Dataset 1 includes scores for the
independent and dependent variables across each of the three conditions. These scores were
assessed using ANOVA to identify any significant mean differences resulting from
conversation repair or the failure to maintain contextual awareness. The control variables
measured (detailed below) were less important for Dataset 1. Participants were randomly
assigned to each experimental condition so the variance in control variables should have been
evenly distributed across the three conditions.
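Random assignment with equal group sizes can be sketched as below. The mechanism used in the study is not described in this section, so the shuffle-and-deal approach, the participant identifiers and the seed are assumptions made for illustration.

```python
import random
from collections import Counter

CONDITIONS = ("conversation repair", "contextual awareness", "control")


def assign(participant_ids, seed=None):
    """Shuffle participants, then deal them to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: CONDITIONS[i % len(CONDITIONS)]
            for i, pid in enumerate(shuffled)}


# Hypothetical identifiers for a planned sample of 180.
assignment = assign(range(1, 181), seed=42)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Dealing from a shuffled list keeps the groups the same size while leaving each participant’s condition to chance, which is what allows individual differences to be treated as evenly spread across conditions in expectation.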
Dataset 2 includes data from the control condition only. The data generated by the
control condition was assessed using regression in an attempt to replicate the findings from
Study One. Perceived ease of use, perceived usefulness and the need for human interaction
were measured as control variables for Dataset 2. In this way, a correlational relationship
between anthropomorphism, adoption and recommendation intent could be retested. If any
relationship was observed, it could then be said to exist when accounting for these new
control variables.
Perceived ease of use & perceived usefulness: Both variables are derived from Fred
Davis’s (1989) seminal technology acceptance model (p. 11). Davis devised the original
instrument to be used in a firm setting. Original items include statements such as
“effectiveness on the job”. This study used a modified version of the instruments, specifically
designed for self-service technology (SST) studies as presented by Curran and Meuter (2005).
Need for human interaction: This variable was operationalised using Dabholkar’s
(1996) four-item scale. Items such as “I like interacting with the person who provides the
service” focus on the social aspect of interacting with a human service employee. Responses
to the four items were recorded on a 7-point Likert scale, anchored with strongly agree and
strongly disagree.
4.2 RESULTS
The initial sample size for this study was 190 participants. Fifty-three percent of the
sample were male, with the average age of the sample recorded at 37.21 years (SD = 11.597).
Seventy-seven percent of the sample identified as Caucasian Americans. The demographic
composition of the participants in Study Two was similar to the previous study.
The data was visually inspected for problematic responses. One case was content non-
responsive. “Consistent response” or “straight-lining” had occurred (Revilla & Ochoa, 2015).
It appeared the participant had rushed through the survey, selecting the same response option
for all items on all instruments. Study Two incorporated the use of reverse-coded items to
identify inconsistent responses, although the efficacy of this strategy has been debated (Van
Sonderen, Sanderman, & Coyne, 2013).
Table 8.
Distribution of Demographic Variables Across the Three Experimental Conditions
Condition 1 (n=59) 50.8% Female 35.5 (10.4) 1.71 (1.5) 3.25 (1.3)
Condition 2 (n=60) 47.6% Female 37.43 (11.8) 1.41 (1.3) 2.79 (1.5)
Condition 3 (n=58) 42.9% Female 38.67 (12.4) 1.67 (1.6) 2.89 (1.4)
Next, ANOVA was performed to investigate the impact of conversational repair and
contextual awareness on anthropomorphism, adoption and recommendation intent. Levene’s
statistics for the dependent variables were non-significant, Adoption: F (2, 186) = 2.181, p =
.116 and Recommendation: F (2, 186) = .577, p = .563. The assumption of homogeneity of
variance was violated for anthropomorphism: F (2, 186) = 4.81, p = .009. However, ANOVA
is not sensitive to violations of equal variance when sample sizes are approximately equal
(Allen et al., 2014) as was the case here.
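For readers unfamiliar with Levene’s statistic, a minimal sketch of the mean-centred version follows. It runs on made-up data and returns only the test statistic and its degrees of freedom; obtaining a p value would additionally require the F distribution, which is outside the scope of this illustration.

```python
from statistics import fmean


def levene_w(groups):
    """Levene's W based on absolute deviations from each group mean.

    Returns W and its degrees of freedom (k - 1, N - k).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = []
    for g in groups:
        centre = fmean(g)
        z.append([abs(y - centre) for y in g])
    z_means = [fmean(zi) for zi in z]
    z_grand = fmean([v for zi in z for v in zi])
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    w = (n_total - k) / (k - 1) * between / within
    return w, (k - 1, n_total - k)


# Made-up scores for two groups with visibly different spread.
w_stat, df = levene_w([[1, 2, 3], [0, 5, 10]])
print(round(w_stat, 3), df)
```

With three conditions and 189 cases, the same formula gives the degrees of freedom (2, 186) reported above.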
Table 9.
Mean Comparison of the Dependent Variables Across the Three Experimental Conditions
The ANOVA results presented in Table 9 suggest that anthropomorphism may mediate
the relationship between a chatbot’s failure to maintain contextual awareness and subsequent
behavioural intentions (adoption and recommendation intent). The mediating role of
anthropomorphism in the contextual awareness condition was formally tested in SPSS using
the PROCESS macro (Hayes, 2009). The effect of contextual awareness was isolated by creating
a dummy-coded variable (Cond. 1 vs. Cond. 3).
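The logic of a simple mediation model with a dummy-coded predictor can be sketched as follows. This is not the SPSS procedure used in the study: the data are made up and noise-free, the condition labels are hypothetical, and the bootstrap confidence interval normally reported for the indirect effect is omitted. The sketch only shows how the a, b and c' paths relate to the indirect and total effects.

```python
from statistics import fmean


def _centred(values):
    m = fmean(values)
    return [v - m for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def simple_mediation(x, m, y):
    """Ordinary least squares estimates for a single-mediator model."""
    xc, mc, yc = _centred(x), _centred(m), _centred(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    a = sxm / sxx                              # X -> M
    det = sxx * smm - sxm ** 2
    c_prime = (smm * sxy - sxm * smy) / det    # X -> Y, controlling for M
    b = (sxx * smy - sxm * sxy) / det          # M -> Y, controlling for X
    return {"a": a, "b": b, "c_prime": c_prime,
            "indirect": a * b, "total": sxy / sxx}


# Hypothetical labels; 1 marks the condition being contrasted with control.
conditions = ["control"] * 3 + ["contextual awareness"] * 3
x = [1 if c == "contextual awareness" else 0 for c in conditions]
anthropomorphism = [5, 4, 3, 3, 2, 1]            # made-up mediator scores
adoption = [0.5 * m + 1 for m in anthropomorphism]  # made-up outcome

paths = simple_mediation(x, anthropomorphism, adoption)
print(paths)
```

In this constructed example the direct effect c' is zero and the total effect equals the indirect effect, which is the pattern usually described as full mediation.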
Figure 10. Anthropomorphism fully mediates the contextual awareness and adoption relationship.
Results for the hypotheses unique to Study Two are presented in Table 10.
Table 10.
Results Relative to Hypotheses 3 Through 8
Cronbach’s α was calculated to measure the internal consistency of the independent and
control variables. The internal consistency of the anthropomorphism, usefulness and NFHI
instruments could be considered good (α > .80), while ease of use was acceptable (α > .70)
(DeVellis, 2012; Kline, 2000). Refer to Table 11 for precise values. Composite variables were
created.
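Cronbach’s α can be computed directly from the item variances and the variance of the summed scale. The sketch below uses made-up responses to a hypothetical three-item instrument; it illustrates the formula rather than reproducing the values in Table 11.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses: four respondents, three items.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [3, 2, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

For these illustrative data α is approximately .88, which would fall in the range described above as good.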
Table 11.
Means, Standard Deviations, Correlations & Internal Consistency of all Variables.
Note. Control variables were treated as IVs in the analysis but were not variables of theoretical interest in the
study; conceptually, they are control variables.
** p < .01, two-tailed; * p < .05, two-tailed. Cronbach’s α is reported on the diagonal in parentheses. A double
dash (--) indicates the variable was measured with a single item; thus, α was not calculated.
Normality was assessed and multiple regression was performed, using the same
procedure as outlined in Study One: Results.
Table 12.
Regression Coefficients and Squared Semi-partial Correlations (sr²) for Each Predictor Variable in a Multiple
Regression Analysis of Chatbot Adoption Intent
4.3 DISCUSSION
This discussion will elaborate on the findings from Study Two, situating the results
within the context of the extant literature. The discussion will move through the hypotheses
sequentially, while the following chapter (Conclusion) will present implications, limitations
and avenues for future research.
These findings suggest that the role of anthropomorphism in SST adoption or word-of-
mouth research needs to be considered wherever the SST is (a) targeting individuals theorised
to be high in sociality or effectance motivation or (b) accessed via an interface thought to be
capable of affecting a user’s access to EAK through language use.
The use of CR may have both positive and negative aspects. A chatbot’s use of CR
could be considered a positive attribute in that CR is an attempt to manage miscommunication
(Fromkin, 1971). Conversely, in responding to CR, the human user is required to re-state or
clarify their previous utterance, which could negatively impact perceptions of the chatbot.
Repeating oneself may be frustrating to the user (Srivastava, 2017) due to increased effort
(Jenkins et al., 2007). Clark & Brennan (1991) would describe this effort as a ‘cost’,
specifically, the costs of formulation and production, which are paid by the speaker. It appears
as though these two opposing forces cancel each other out. While the chatbot’s use of CR did
not significantly increase scores for adoption and recommendation intent, it appears the use of
CR mitigates the potentially deleterious effects of the costs paid when having to repeat
oneself during the repair process. It is likely the number of trouble sources and their perceived
importance to the user affect scores for the dependent variables. For example, many attempts
at CR to clarify unimportant matters would mean that the costs of formulation and production
outweigh the benefits, leading to lower scores on the dependent variables.
A chatbot failing to track the context of a conversation over time would violate the
Gricean maxim of relevance (Grice, 1975, p. 47). Violating the maxim of relevance is
uncommon in interpersonal interaction (Grice, 1975, p. 54) but occurs frequently in human-
chatbot interaction (Martin, 2017). Gricean maxims are said to generate implicature, where
implicature is what is suggested as opposed to what is expressly stated (Blackburn, 1996). In
failing to maintain CA, the chatbot appears to have provided the “implicature” (Grice, 1975)
that it has non-human cognition, given that relevance comes naturally to human interlocutors.
Figure 12. Graphical representation of mean scores for anthropomorphism, adoption and recommendation intent
across experimental conditions
This thesis focused upon the intersection of two divergent theoretical positions. SSTs,
by definition, lack human employee involvement (Curran et al., 2003). Conversely,
substantial literature is predicated on the notion that machines and computers can be
perceived as human-like (Epley et al., 2007; Moon & Nass, 1996; Weiss & Bartneck, 2015).
The objective of this thesis was to contribute to the SST literature through the partial
reconciliation of these competing views. What follows is a brief summary of findings as they
pertain to each of the research questions.
Both studies presented provide evidence to support the claim of a significant positive
correlation between anthropomorphic perceptions and consumers’ behavioural intentions to
adopt and recommend a customer service chatbot. This relationship was supported by the data
across a range of customer service scenarios, including the purchase of flowers, the purchase
of tickets to a sporting event and the reservation of hotel accommodation. Furthermore, by
manipulating access to EAK, Study Two supports claims of a significant causal relationship
between the variables of interest. The importance of anthropomorphism in (a) the study of
artificial conversational entities and (b) SSTs theorised as capable of activating perceptions of
automated social presence has been demonstrated.
Chapter 5: Conclusions 55
RQ2: What strategies can a chatbot employ in order to affect anthropomorphism?
Study Two found evidence to support the claim that chatbot behaviour (employing
particular linguistic stratagems, that is, contextual awareness) can affect anthropomorphism
by activating a user’s access to elicited agent knowledge. The antecedence of
anthropomorphism in this context was linked to features of the chatbot as opposed to
individual difference variables. This supports the notion that anthropomorphism is of practical
value. Marketing practitioners and chatbot programmers can take certain steps to maximise
anthropomorphic perceptions of an SST. Due to the causal nature of the relationship between
anthropomorphism and consumers’ behavioural intentions, adjusting access to EAK is likely
to have a corresponding impact on adoption and recommendation intent.
5.1 IMPLICATIONS
1. This research project provides academic research with a newly proposed variable with
which to assess SSTs. Anthropomorphism as measured by the Godspeed
Questionnaire (Bartneck et al., 2009) captures a novel dimension of an SST. The
addition of anthropomorphism to existing adoption models such as TAM (Davis,
1989), the TPB (Ajzen, 1991), UTAUT (Venkatesh et al., 2003) or DOI (Rogers,
2003) may improve predictive capabilities, given anthropomorphism was shown to
outperform well-established variables, such as perceived ease of use (Davis, 1989),
technological anxiety (Meuter et al., 2003) and the need for human interaction
(Dabholkar, 1996b).
2. The thesis lends support to EAK as forming part of Epley et al.’s (2007) three-factor
conceptualisation of anthropomorphism. In particular, the second study presented here
is the first to (a) examine the concept with a text-only chatbot as opposed to a robot
and (b) demonstrate that contextual awareness is capable of activating access to EAK.
5.1.2 Methodological implications
1. The relationships of interest were statistically significant, regardless of the means with
which the stimulus was administered. Participants having direct interaction with a live
chatbot on Facebook (Study One) reported similar perceptions to participants viewing
an animation of human-chatbot interaction (Study Two). Researchers aiming to
maximise ecological validity may wish to replicate the method outlined in Study One.
Researchers who wish to control for typing speed or have concerns regarding
participant privacy may consider the method presented in Study Two.
2. Providing participants with a script to guide their interaction with a chatbot (Study
One) is unique to the literature. This method, unlike techniques such as the Wizard of
Oz (Steinfeld, Jenkins, & Scassellati, 2009), exposes participants to the
stimulus material in a consistent way.
3. As discussed in Chapter Two, the anthropomorphism instrument was taken from the
more mature social robotics literature. The performance of the anthropomorphism
instrument in this context suggests additional instruments from the social robotics
field could be repurposed to further develop a theoretical understanding of human-
chatbot interaction. For example, the Godspeed Questionnaire (Bartneck et al., 2009)
includes validated instruments for measuring perceived intelligence, likeability and
animacy.
The findings presented have several implications for firms seeking to develop chatbots to
perform in customer service roles.
2. Unlike the work of Nilsson (2007), Meuter et al. (2005) and Ding et al. (2007), the
demographic variables measured, including age, gender, race and education, were
shown to have no relationship with adoption and recommendation intent. This
suggests that chatbots may be an acceptable means of providing customer service to a
wide variety of consumers.
5.2 LIMITATIONS
As is the case with all research, this project is not without limitations. The following
limitations are grouped according to methodology, measures and sample.
5.2.1 Methodology
First, both studies presented were cross-sectional, collecting data from a representative
sub-set of a population at a single point in time (Carlson, Miller, Heth, Donahoe, & Martin,
2009). Thus, it is unknown at this stage whether (a) the statistically significant relationships
identified would hold over time (novelty effect) and (b) the behavioural intentions expressed
by participants translate into actual behaviours. Using longitudinal data could resolve this
limitation. Second, the studies presented maximised internal validity at the expense of
ecological validity. The studies could be said to lack mundane realism because the
circumstances in which exposure to the stimulus occurred were artificially developed for the
purpose of the research (Aronson, Wilson, & Brewer, 1998). Conditions encountered by
consumers in a natural setting may include distractions or time pressures, which could affect
results. Third, the stimulus in both studies was either (a) delivered by or (b) modelled upon a
chatbot performing within the Facebook instant messaging system. Chatbots can be deployed
to any website. Therefore, caution should be exercised in generalising findings beyond the
Facebook ecosystem.
5.2.2 Measures
Fourth, the way in which anthropomorphism was measured in Study One could be
considered a limitation. As discussed in Study One: Method, anthropomorphism was
measured using a semantic differential scale. The instrument’s designers had originally
presented it as a Likert scale (Bartneck et al., 2009), which would capture discrete data in
whole numbers. While planning the research design, it was decided to instead measure
anthropomorphism using a slider on a continuous scale. The hope was that this would produce
more accurate data, because a sliding scale would allow participants to indicate their position
anywhere between two values. The survey software used in this study (Qualtrics) does not
include a pre-formatted option for continuous slider type items. As a result, the slider was
coded into Qualtrics using the custom HTML tab via instructions provided by Curdy (2014).
An example of a finished item is provided in Figure 13.
Figure 13. Screenshot of an item from the anthropomorphism scale as presented to participants in Study One.
This modified instrument has three potential limitations. First, the labels above each of
the increments have not been validated. These labels were added because the original
instrument does not include or suggest increment labels, as shown in Figure 14.
Figure 14. The original anthropomorphism instrument as presented by Bartneck et al. (2009).
Second, the original instrument only provides 5 discrete response options for each item.
This study used 7 response options on a continuous scale. Seven response options were used
for consistency because the other instruments in the study also used 7-point scales. It was
hoped that participants would benefit from this uniformity. Finally, the anthropomorphism
items used in this study were formatted so that the blue circle, which a participant drags left or
right, was set to appear in the neutral position (Figure 6). Perhaps this encouraged participants
to select a response closer to the centre of the scale. A number of these measurement
limitations were rectified in Study Two, as shown in Figure 15.
Figure 15. Screenshot of an item from the anthropomorphism scale as presented to participants in Study Two.
5.2.3 Sample
Finally, with regard to the participant samples, power analysis suggests that the sample sizes
were adequate for the types of analysis employed. However, the sample sizes (n = 60 in Study
One, n = 190 in Study Two) are still small. Future studies should use larger sample sizes.
A number of avenues for future research exist. Future research may seek to;
Bibliography
AbuShawar, B., & Atwell, E. (2016). Usefulness, localizability, humanness, and language-
benefit: additional evaluation criteria for natural language dialogue systems. International
Journal of Speech Technology, 19(2), 373–383.
Aggarwal, P., & Mcgill, A. L. (2007). Is that car smiling at me? Schema congruity as a basis
for evaluating anthropomorphized products. Journal of Consumer Research, 34(4), 468–479.
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human
Decision Processes, 50, 179–211.
Allen, P., Bennett, K., & Heritage, B. (2014). SPSS Statistics (22nd ed.). South Melbourne,
Victoria: Cengage Learning Australia.
Arnold, V., Collier, P., Leech, S., & Sutton, S. (2004). The impact of intelligent decision aids
on expert and novice decision-makers’ and finance judgments. Accounting and Finance,
44(1), 1–26.
Aronson, E., Wilson, T., & Brewer, M. (1998). Experimentation in social psychology. In D.
Gilbert, S. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology (pp. 99–142).
New York, NY: McGraw-Hill.
Axelrod, R., & Hamilton, W. (1981). The evolution of cooperation. Science, 211, 1390–1396.
Babbie, E. (2016). The Practice of Social Research (14th ed.). Boston, MA: Cengage
Learning.
Barrett, J. L., Richert, R. A., & Driesenga, A. (2001). God’s beliefs versus mother’s: The
development of nonhuman agent concepts. Child Development, 72(1), 50–65.
Bartneck, C., Kulic, D., Croft, E., & Zoghbi, S. (2009). Measurement instruments for the
anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of
robots. International Journal of Social Robotics, 1(1), 71–81.
Bazire, M., & Brézillon, P. (2005). Understanding context before using it. In A. Dey, B.
Kokinov, D. Leake, & R. Turner (Eds.), Modeling and Using Context. Berlin: Springer.
Bennett, D. E., & Thompson, P. (2016). Use of anthropomorphic brand mascots for student
motivation and engagement: A promotional case study with Pablo the Penguin at the
University of Portsmouth Library. New Review of Academic Librarianship, 22(2–3), 225–237.
Bitner, M., Ostrom, A., Meuter, M., & Clancy, J. (2002). Implementing successful self-
service technologies. Academy of Management Executive, 16(4), 96–109.
Blascovich, J., Loomis, J., Beall, A., Swinth, K., Hoyt, C., & Bailenson, J. (2002). Immersive
virtual environment technology as a methodological tool for social psychology. Psychological
Inquiry, 13, 103–124.
Blut, M., Wang, C., & Schoefer, K. (2016). Factors influencing the acceptance of self-service
technologies: A Meta-Analysis. Journal of Service Research, 19(4), 396–416.
Bogle, A. (2018). Facebook after Cambridge Analytica: Is this the beginning of the end?
Retrieved April 4, 2018, from http://www.abc.net.au/news/science/2018-03-27/facebook-
after-cambridge-analytica:-what-now/9586604
Bohannon, J. (2011). Human subject research: Social science for pennies. Science, 334(6054),
307.
Buhrmester, M., Kwang, T., & Gosling, S. (2011). Amazon’s Mechanical Turk: A new source
of inexpensive, yet high-quality data? Perspectives on Psychological Science, 6(1), 3–5.
Burger, J., & Sheehy, D. (2012). Individual Differences and Social Influence: A Special Issue
of Social Influence. Milton Park, UK: Taylor & Francis.
Burgess, M. (2017). The NHS is trialling an AI chatbot to answer your medical questions.
Retrieved October 8, 2017, from http://www.wired.co.uk/article/babylon-nhs-chatbot-app
Carberry, S., & De Rosis, F. (2008). Introduction to special Issue on “Affective modeling and
adaptation.” User Modeling and User-Adapted Interaction, 18, 1–9.
Carlson, N., Miller, H., Heth, C., Donahoe, J., & Martin, G. (2009). Psychology: The Science
of Behavior (7th ed.). London, UK: Pearson Education Limited.
Cassell, J., & Bickmore, T. (2003). Negotiated collusion: Modeling social language and its
relationship effects in intelligent agents. User Modeling and User-Adapted Interaction, 13(1), 89–132.
Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon Mechanical
Turk workers: Consequences and solutions for behavioral researchers. Behavior Research
Methods, 46(1), 112–130.
Chandler, J., & Shapiro, D. (2016). Conducting clinical research using crowdsourced
convenience samples. Annual Review of Clinical Psychology, 12(1), 53–81.
Cheung, C. (2008). The impact of electronic word‐of‐mouth: The adoption of online opinions
in online customer communities. Internet Research, 18(3), 229–247.
Chiou, J., & Shen, C. (2012). The antecedents of online financial service adoption: The impact
of physical banking services on Internet banking acceptance. Behaviour & Information
Technology, 31(9), 859–871.
Clark, H., & Brennan, S. (1991). Grounding in Communication. In L. Resnick, J. Levine, &
S. Teasley (Eds.), Perspectives on socially shared cognition. (pp. 127–149). Washington:
APA Books.
Clark, H., & Schaefer, E. (1989). Contributing to discourse. Cognitive Science, 13, 259–294.
Constine, J. (2017). Facebook will launch group chatbots at F8. Retrieved March 28, 2018,
from https://techcrunch.com/2017/03/29/facebook-group-bots/
Costello, G., & Donnellan, B. (2007). The diffusion of WOZ: Expanding the topology of IS
innovations. Journal of Information Technology, 22(1), 79–86.
Cresci, E. (2017). Chatbot that overturned 160,000 parking fines now helping refugees claim
asylum. Retrieved October 8, 2017, from
https://www.theguardian.com/technology/2017/mar/06/chatbot-donotpay-refugees-claim-
asylum-legal-aid
Curdy, B. (2014). How to create semantic differential (EPA) scales using Qualtrics. Retrieved
August 10, 2017, from http://brentcurdy.net/qualtrics-tutorials/scales/
Curran, J., & Meuter, M. (2005). Self-service technology adoption: Comparing three
technologies. Journal of Services Marketing, 19(2), 103–113.
Curran, J., Meuter, M., & Surprenant, C. (2003). Intentions to use self-service technologies: A
confluence of multiple attitudes. Journal of Service Research, 5(3).
Curran, P., Kotrba, L., & Denison, D. (2010). Careless responding in surveys: Applying
traditional techniques to organizational settings. In 25th Annual Conference of the Society for
Industrial/Organizational Psychology. Atlanta, GA.
Dale, R. (2016). The return of the chatbots. Natural Language Engineering, 22(5), 811–817.
Davis, F. (1989). Perceived usefulness, perceived ease of use, and user acceptance of
information technology. MIS Quarterly, 13(3), 319–340.
DeVellis, R. (2012). Scale Development: Theory and Applications. Newbury Park, CA: Sage.
Ding, X., Verma, R., & Iqbal, Z. (2007). Self-service technology and online financial service
choice. International Journal of Service Industry Management, 18(3), 246–268.
Duffy, B. R. (2003). Anthropomorphism and the social robot. Robotics and Autonomous
Systems, 42(3–4), 177–190.
Edmondson, A., & McManus, S. (2007). Methodological fit in management field research.
Journal of Management Review, 33(4), 1155–1179.
Ehlers, C., Greene-Shortridge, T., Weekley, J., & Zajack, M. (2009). The exploration of
statistical methods in detecting random responding. In 24th Annual Conference of the Society
for Industrial/Organisational Psychology. Atlanta, GA.
Elkins, A., & Derrick, D. (2013). The sound of trust: Voice as a measurement of trust during
interactions with embodied conversational agents. Group Decision and Negotiation, 22(5),
897–913.
Epley, N., Akalis, S., Waytz, A., & Cacioppo, J. (2008). Creating social connection through
inferential reproduction: Loneliness and perceived agency in gadgets, gods, and greyhounds.
Psychological Science, 19, 114–120.
Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of
anthropomorphism. Psychological Review, 114(4), 864–886.
Eriksson, K., & Nilsson, D. (2007). Determinants of the continued use of self-service
technology: The case of internet banking. Technovation, 27, 159–167.
Eunson, B. (2015). Communicating in the 21st Century (4th ed.). Milton, QLD: John Wiley &
Sons.
Eyssel, F., Hegel, F., Horstmann, G., & Wagner, C. (2010). Anthropomorphic inferences from
emotional nonverbal cues: A case study. In Proceedings of the IEEE international workshop
on robot and human interactive communication (pp. 646–651).
Eyssel, F., Kuchenbrandt, D., Hegel, F., & De Ruiter, L. (2012). Activating elicited agent
knowledge: How robot and user features shape the perception of social robots. Proceedings -
IEEE International Workshop on Robot and Human Interactive Communication, 851–857.
Faber, P., & León-Araúz, P. (2016). Specialized knowledge representation and the
parameterization of context. Frontiers in Psychology, 7(196).
https://doi.org/10.3389/fpsyg.2016.00196
Fan, A., Wu, L., & Mattila, A. S. (2016). Does anthropomorphism influence customers’
switching intentions in the self-service technology failure context? Journal of Services
Marketing, 30(7), 713–723.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. (2009). Statistical power analyses using
GPower 3.1: Tests for correlation and regression analyses. Behavior Research Methods,
41(4), 1149–1160.
Feingold, A. (1992). Gender differences in mate selection preferences: A test of the parental
investment model. Psychological Bulletin, 112(1), 125–139.
Field, A. (2013). Discovering Statistics using IBM SPSS Statistics: And sex and drugs and
rock “n” roll (4th ed.). London, UK: Sage.
Fishbein, M., & Ajzen, I. (1975). Belief, Attitude, Intention & Behavior: An Introduction to
Theory & Research. Reading, MA: Addison Wesley.
Fiske, S. T., Cuddy, A. J. C., & Glick, P. (2007). Universal dimensions of social cognition:
warmth and competence. Trends in Cognitive Sciences, 11(2), 77–83.
Gartner. (2016). Top Strategic Predictions for 2017 and Beyond: Surviving the Storm Winds
of Digital Disruption. Retrieved from
https://www.gartner.com/binaries/content/assets/events/keywords/cio/ciode5/top_strategic_pr
edictions_fo_315910.pdf
Goodman, J., Cryder, C., & Cheema, A. (2013). Data collection in a flat world: The strengths
and weaknesses of Mechanical Turk samples. Behavioral Decision Making, 26(3), 213–224.
Goodman, J., & Paolacci, G. (2017). Crowdsourcing consumer research. Journal of Consumer
Research, 44(1), 196–210.
Grice, H. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Syntax and
Semantics 3: Speech Acts (pp. 41–58). New York, NY: Academic Press.
Grosz, B., & Sidner, C. (1986). Attention, intentions, and the structure of discourse. Journal
of Computational Linguistics, 12(3), 175–204.
Guzman, I., & Pathania, A. (2016). Chatbots in Customer Service [White Paper]. Retrieved
14th April, 2017, from Accenture: https://www.accenture.com/t00010101T000000__w__/br-
pt/_acnmedia/PDF-45/Accenture-Chatbots-Customer-Service.pdf
Hayes, A. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new
millennium. Communication Monographs, 76(4), 408–420.
Hill, N., & Alexander, J. (2006). The Handbook of Customer Satisfaction and Loyalty
Measurement (3rd ed.). London, UK: Routledge.
Ho, C. C., & MacDorman, K. F. (2017). Measuring the uncanny valley effect: refinements to
indices for perceived humanness, attractiveness, and eeriness. International Journal of Social
Robotics, 9(1), 129–139.
Hoehle, H., Scornavacca, E., & Huff, S. (2012). Three decades of research on consumer
adoption and utilization of electronic banking channels: A literature analysis. Decision
Support Systems, 54(1), 122–132.
Holden, M. (2017). World’s first ATM machine turns to gold on 50th birthday. Retrieved July
27, 2017, from https://www.reuters.com/article/us-atm-anniversary/worlds-first-atm-machine-
turns-to-gold-on-50th-birthday-idUSKBN19I166
Horton, J., Rand, D., & Zeckhauser, R. (2010). The Online Laboratory: Conducting
Experiments in a Real Labor Market (NBER No. 15961).
Hsuan-Hsuan, K., & Ko-Hsin, H. (2015). Effects of inviting customers to share responsibility
in the context of impersonal service. Journal of Service Theory and Practice, 25(3), 267–284.
Huang, M., & Rust, R. (2013). IT-related service: A multidisciplinary perspective. Journal of
Service Research, 16(3), 251–258.
Hung, C., Yen, D., & Ou, C. (2012). An empirical study of the relationship between a self-
service technology investment and firm financial performance. Journal of Engineering &
Technology Management, 29(1), 62–70.
Hutchby, I., & Wooffitt, R. (2008). Conversation Analysis (2nd ed.). Boston, MA: Polity.
Jenkins, M., Churchill, R., Cox, S., & Smith, D. (2007). Analysis of user interaction with
service oriented chatbot systems. Human-Computer Interaction, 4552, 76–83.
Johnson, K. (2017). Facebook Messenger hits 100,000 bots. Retrieved November 22, 2017,
from https://venturebeat.com/2017/04/18/facebook-messenger-hits-100000-bots/
Kalman, Y. M., Scissors, L. E., Gill, A. J., & Gergle, D. (2013). Online chronemics convey
social information. Computers in Human Behavior, 29(3), 1260–1269.
Kaplan, F., & Hafner, V. (2006). The challenges of joint attention. Interaction Studies, 7(2),
135–169.
Kiesler, S., Powers, A., Fussell, S., & Torrey, C. (2008). Anthropomorphic interactions with a
robot and robot-like agent. Social Cognition, 26(2), 169–181.
Kim, S., & McGill, A. L. (2011). Gaming with Mr. Slot or gaming the slot machine? Power,
anthropomorphism, and risk perception. Journal of Consumer Research, 38(1), 94–107.
Kline, P. (2000). The Handbook of Psychological Testing (2nd ed.). London, UK: Routledge.
Knijnenburg, B., & Willemsen, M. (2016). Inferring capabilities of intelligent agents from
their external traits. ACM Transactions on Interactive Intelligent Systems, 6(4), 1–25.
Lane, J., Wellman, H., & Evans, E. (2010). Children’s understanding of ordinary and
extraordinary minds. Child Development, 81(5), 1475–1489.
Lee, K., Lee, B., & Oh, W. (2015). Thumbs up, sales up? The contingent effect of Facebook
likes on sales performance in social commerce. Journal of Management Information Systems,
32(4), 109–143.
Lee, W., Castellanos, C., & Chris Choi, H. S. (2012). The effect of technology readiness on
customers’ attitudes toward self-service technology and its adoption; The empirical study of
U.S. airline self-service check-in kiosks. Journal of Travel & Tourism Marketing, 29(8), 731–
743.
Levine, G., & Parkinson, S. (2014). Experimental Methods in Psychology. New York, NY:
Psychology Press.
Litman, L., Robinson, J., & Rosenzweig, C. (2015). The relationship between motivation,
monetary compensation, and data quality among US- and India-based workers on Mechanical
Turk. Behavior Research Methods, 47(2), 519–528.
Look Who Else Uses Chatfuel. (2018). Retrieved March 19, 2018, from https://chatfuel.com/
Look who is using Bot Society. (2018). Retrieved April 4, 2018, from https://botsociety.io/
Luangrath, A. W., Peck, J., & Barger, V. A. (2017). Textual paralanguage and its implications
for marketing communications. Journal of Consumer Psychology, 27(1), 98–107.
Luo, J. T., McGoldrick, P., Beatty, S., & Keeling, K. A. (2006). On‐screen characters: Their
design and influence on consumer trust. Journal of Services Marketing, 20(2), 112–124.
MacDorman, K. (2006). Subjective ratings of robot video clips for human likeness,
familiarity, and eeriness: An exploration of the uncanny valley. In ICCS/CogSci-2006 long
symposium: toward social mechanisms of android science. Vancouver.
MacDorman, K., & Ishiguro, H. (2006). The uncanny advantage of using androids in social
and cognitive science research. Interaction Studies, 7(3), 297–337.
Marino, M. (2014). The racial formation of chatbots. Comparative Literature and Culture,
16(5).
Marschner, L., Pannasch, S., Schulz, J., & Graupner, S.-T. (2015). Social communication
with virtual agents: The effects of body and gaze direction on attention and emotional
responding in human observers. International Journal of Psychology, 97(2), 85–92.
Mauldin, M. (1994). ChatterBots, TinyMuds, and the Turing Test: Entering the Loebner Prize
competition. In Proceedings of the Eleventh National Conference on Artificial Intelligence.
AAAI Press.
Mazaheri, E., Richard, M., & Laroche, M. (2012). The role of emotions in online consumer
behavior: a comparison of search, experience, and credence services. Journal of Services
Marketing, 26(7), 535–550.
Meade, A., & Craig, S. (2011). Identifying Careless Responses in Survey Data. In 26th
Annual Meeting of the Society for Industrial and Organizational Psychology. Chicago, IL.
Meis, J., & Kashima, Y. (2017). Signage as a tool for behavioral change: Direct and indirect
routes to understanding the meaning of a sign. PLoS ONE, 12(8).
Meuter, M. L., Ostrom, A. L., Bitner, M. J., & Roundtree, R. (2003). The influence of
technology anxiety on consumer use and experiences with self-service technologies. Journal
of Business Research, 56(11), 899–906.
Meuter, M., Ostrom, A., Roundtree, R., & Bitner, M. (2000). Self-service technologies:
Understanding customer satisfaction with technology-based service encounters. Journal of
Marketing, 64(3), 50–64.
Miller, R. (2017). Microsoft makes Azure Bot Service generally available for developers.
Retrieved March 28, 2018, from https://techcrunch.com/2017/12/13/microsoft-makes-azure-
bot-service-generally-available/
Mitra, K., Reiss, M., & Capella, L. (1999). An examination of perceived risk, information
search and behavioral intentions in search, experience and credence services. Journal of
Services Marketing, 13(3), 208–228.
Miwa, K., & Terai, H. (2012). Impact of two types of partner, perceived or actual, in human-
human and human-agent interaction. Computers in Human Behavior, 28(4), 1286–1297.
Moon, Y., & Nass, C. (1996). How “real” are computer personalities? Psychological
responses to personality types in human-computer interaction. Communication Research,
23(6), 651–674.
Moore, S. (2018). Gartner Says 25 Percent of Customer Service Operations Will Use Virtual
Customer Assistants by 2020. Retrieved February 20, 2018, from
https://www.gartner.com/newsroom/id/3858564
Most popular mobile messaging apps worldwide as of January 2018, based on number of
monthly active users (in millions). (2018). Retrieved April 4, 2018, from
https://www.statista.com/statistics/258749/most-popular-global-mobile-messenger-apps/
Nass, C., & Lee, K. (2001). Does computer-synthesized speech manifest personality?
Experimental tests of recognition, similarity-attraction, and consistency-attraction. Journal of
Experimental Psychology: Applied, 7(3), 171–181.
Nass, C., & Moon, Y. (2000). Machines and mindlessness: Social responses to computers.
Journal of Social Issues, 56(1), 81–103.
Nass, C., Moon, Y., & Green, N. (1997). Are machines gender neutral? Gender-stereotypic
responses to computers with voices. Journal of Applied Social Psychology, 27(10), 864–876.
Neuman, W. (2014). Social Research Methods: Qualitative & Quantitative Approaches (7th
ed.). Essex, UK: Pearson Education Limited.
Niculescu, A., van Dijk, B., Nijholt, A., Li, H., & See, S. (2013). Making Social Robots More
Attractive: The Effects of Voice Pitch, Humor and Empathy. International Journal of Social
Robotics, 5(2), 171–191.
Nowak, K. L., & Biocca, F. (2003). The effect of the agency and anthropomorphism on users’
sense of telepresence, copresence, and social presence in virtual environments. Presence-
Teleoperators and Virtual Environments, 12(5), 481–494.
Ondrej, B. (2018). An attitude towards an artificial soul? Responses to the “Nazi Chatbot.”
Philosophical Investigations, 41(1), 42–69.
Osgood, J., Ward, M., & Meade, A. (2015). The effects of environmental distractions on
careless responding in online surveys. In Annual Meeting of the Association for Psychological
Science. New York, NY.
Overby, E. (2008). Process virtualization theory and the impact of information technology.
Organization Science, 19(2).
Parasuraman, A., & Colby, C. L. (2015). An updated and streamlined technology readiness
index: TRI 2.0. Journal of Service Research, 18(1), 59–74.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of
human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics,
Part A: Systems and Humans, 30(3), 286–297.
Payne, C. R., Hyman, M. R., Niculescu, M., & Huhmann, B. A. (2013). Anthropomorphic
responses to new-to-market logos. Journal of Marketing Management, 29(1–2), 122–140.
Perez, S. (2017). Twitter launches a new enterprise API to power customer service and
chatbots. Retrieved March 28, 2018, from
https://techcrunch.com/2017/12/19/twitter-launches-a-new-enterprise-api-to-power-customer-service-and-chatbots/
Perez, S. (2018). Google’s chatbot analytics platform Chatbase launches to public. Retrieved
March 28, 2018, from
https://techcrunch.com/2017/11/16/googles-chatbot-analytics-platform-chatbase-launches-to-public/
Revilla, M., & Ochoa, C. (2015). What are the links in a web survey between response time,
quality and auto-evaluation of the efforts done? Social Science Computer Review, 33(1), 97–
114.
Rushton, A., & Carson, D. (1985). The marketing of services: Managing the intangibles.
European Journal of Marketing, 23(8), 19–40.
Salem, M., Eyssel, F., Rohlfing, K., Kopp, S., & Joublin, F. (2013). To err is human(-like):
Effects of robot gesture on perceived anthropomorphism and likability. International Journal
of Social Robotics, 5(3), 313–323.
Salkind, N. (2010). Encyclopedia of Research Design. Thousand Oaks, CA: Sage
Publications.
Schegloff, E. (1992). Repair after next turn: the last structurally provided defense of
intersubjectivity in conversation. American Journal of Sociology, 97(5), 1295–1345.
Schegloff, E. (2000). When “others” initiate repair. Applied Linguistics, 21(2), 205–243.
Scheutz, M., Schermerhorn, P., & Cantrell, R. (2011). Toward human-like task-based
dialogue processing for HRI. AI Magazine, 32(4), 77–84.
See who’s using Dialogflow. (2018). Retrieved April 1, 2018, from https://dialogflow.com/
Selnes, F., & Hansen, H. (2001). The potential hazard of self-service in developing customer
loyalty. Journal of Service Research, 4(2), 79–90.
Siddharth, S., & Watts, D. (2011). Cooperation and contagion in web-based, Networked
Public Goods Experiments. PLOS ONE, 6(3), e16836.
Soto, C., & John, O. (2017). Short and extra-short forms of the Big Five Inventory–2: The
BFI-2-S and BFI-2-XS. Journal of Research in Personality, 68, 69–81.
Srivastava, K. (2017, May). Chatty chatbots and the “time to frustration.” CIO. Retrieved
from
https://gateway.library.qut.edu.au/login?url=https://search.proquest.com/docview/1896760744?accountid=13380
Steinfeld, A., Fong, T., Kaber, D., Lewis, M., Scholtz, J., Schultz, A., & Goodrich, M. (2006).
Common metrics for human-robot interaction. In Proceedings of the 1st ACM
SIGCHI/SIGART Conference on Human-Robot Interaction (pp. 33–40).
Steinfeld, A., Jenkins, O. C., & Scassellati, B. (2009). The Oz of Wizard: Simulating the
human for interaction research. In Proceedings of the 4th ACM/IEEE International
Conference on Human-Robot Interaction (pp. 101–107).
Stevens, C., Pinchbeck, B., Lewis, T., Luerssen, M., Pfitzner, D., Powers, D., Abrahamyan,
A., Leung, Y., & Gibert, G. (2016). Mimicry and expressiveness of an ECA in human-agent
interaction: familiarity breeds content! Computational Cognitive Science, 2(1), 1–14.
Stewart, N., Ungemach, C., Harris, A., Bartels, D., Newell, B., Paolacci, G., & Chandler, J.
(2015). The average laboratory samples a population of 7,300 Amazon Mechanical Turk
workers. Judgment and Decision Making, 10(5), 479–491.
Tintarev, N., O’donovan, J., & Felfernig, A. (2016). Introduction to the Special Issue on
Human Interaction with Artificial Advice Givers. ACM Transactions on Interactive
Intelligent Systems (TiiS), 6(4), 26.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
van Beuningen, J., de Ruyter, K., Wetzels, M., & Streukens, S. (2008). Customer self-efficacy
in technology-based self-service: Assessing between and within person differences. Journal of
Service Research, 11(4), 407–428.
van Doorn, J., Mende, M., Noble, S. M., Hulland, J., Ostrom, A. L., Grewal, D., & Petersen,
J. A. (2017). Domo arigato Mr. Roboto. Journal of Service Research, 20(1), 43–58.
Van Sonderen, E., Sanderman, R., & Coyne, J. (2013). Ineffectiveness of reverse wording of
questionnaire items: Let’s learn from cows in the rain. PLOS ONE, 8(7).
Venkatesh, V., Morris, M. G., Davis, G. B., & Davis, F. D. (2003). User acceptance of
information technology: Toward a unified view. MIS Quarterly, 27(3), 425–478.
Walker, R., Craig-Lees, M., Hecker, R., & Francis, H. (2002). Technology-enabled service
delivery: An investigation of reasons affecting customer adoption and rejection. International
Journal of Service Industry Management, 13(1), 91–106.
Wang, C., Harris, J., & Patterson, P. (2013). The roles of habit, self-efficacy, and satisfaction
in driving continued use of self-service technologies: A longitudinal study. Journal of Service
Research, 16(3), 400–414.
Waytz, A., Morewedge, C. K., Epley, N., Monteleone, G., Gao, J.-H., & Cacioppo, J. T.
(2010). Making sense by making sentient: Effectance motivation increases
anthropomorphism. Journal of Personality and Social Psychology, 99(3), 410–435.
Weigold, A., Weigold, I. K., & Russell, E. J. (2013). Examination of the equivalence of self-
report survey-based paper-and-pencil and internet data collection methods. Psychological
Methods, 18(1), 53–70.
Weijters, B., Rangarajan, D., Falk, T., & Schillewaert, N. (2007). Determinants and outcomes
of customers’ use of self-service technology in a retail setting. Journal of Service Research,
10(1), 3–21.
Weiss, A., & Bartneck, C. (2015). Meta analysis of the usage of the Godspeed Questionnaire
Series. Proceedings - IEEE International Workshop on Robot and Human Interactive
Communication, 2015–Novem, 381–388.
Yang, J., & Klassen, K. (2008). How financial markets reflect the benefits of self‐service
technologies. Journal of Enterprise Information Management, 21(5), 448–467.
Yus, F. (2006). Relevance theory. In K. Brown (Ed.), Encyclopedia of Language and
Linguistics (2nd ed., pp. 512–518). Amsterdam: Elsevier.
Zeithaml, V., Berry, L., & Parasuraman, A. (1996). The behavioral consequences of service
quality. Journal of Marketing, 60(2), 31–46.
Appendices
Appendix Item 1
Definitions for the independent variables from the extant literature which have been
shown to predict adoption intent. Definitions taken from Blut et al. (2016) and Hoehle et
al. (2012).
Perceived ease of use (Davis, 1989): The degree to which a user would find the use of a
technology to be free from effort.
Perceived usefulness (Davis, 1989): The subjective probability that using a technology would
improve the way a user could complete a given task.
Performance expectancy (Venkatesh et al., 2003): The degree to which an individual believes
that using the system will help him or her attain gains in job performance.
Effort expectancy (Venkatesh et al., 2003): The degree of ease associated with the use of the
system.
Social influence (Venkatesh et al., 2003): The degree to which an individual perceives that
important others believe he or she should use the new system.
Facilitating conditions (Venkatesh et al., 2003): Objective factors in the environment that
observers agree make an act easy to accomplish.
Relative advantage (Rogers, 1983, 2003): The degree to which an innovation is perceived as
being better than its precursor.
Compatibility (Rogers, 1983, 2003): The degree to which an innovation is perceived as being
consistent with existing values, needs and experiences of potential adopters.
Complexity (Rogers, 1983, 2003): The degree to which an innovation is perceived as being
difficult to use.
Trialability (Rogers, 1983, 2003): The degree to which an innovation may be experimented
with before adoption.
Observability (Rogers, 1983, 2003): The degree to which the results of an innovation are
observable to others.
Risk (Walker et al., 2002): Customer concerns about security, system failure, reliability and
other personal, psychological or financial risks associated with a technology.
Image (Moore & Benbasat, 1991): The degree to which an individual perceives that use of an
innovation will enhance his or her status in his or her social system.
Demonstrability (Moore & Benbasat, 1991): The degree to which an individual believes that
the results of using a system are tangible, observable and communicable.
Voluntariness (Moore & Benbasat, 1991): The degree to which use of the innovation is
perceived as being voluntary, or of free will.
Technology readiness (Parasuraman & Colby, 2015): People’s propensity to embrace new
technologies to accomplish goals in home life and at work.
Habit (Venkatesh et al., 2003; Limayem, Hirt, & Cheung, 2007): The extent to which people
tend to carry out behavior (using SSTs) automatically because of learning.
Fun (Dabholkar, 1996): The extent to which the activity of using a specific system is
perceived to be enjoyable in its own right, aside from any performance consequences
resulting from system use.
Subjective norms (Venkatesh et al., 2003): A person’s perception that most people who are
important to them think that they should or should not perform the behavior in question.
Need for human interaction (Dabholkar, 1996): The desire to retain personal contact with
others (particularly frontline service employees) during a service encounter.
Appendix Item 2
Transcript of the stimulus used in Study One. Participants were asked to type the ‘human’
text into the chatbot input window. The chatbot responded with the text in the ‘chatbot’
column. Note: All participants completed both scenarios.
Appendix Item 3
Items from the instruments used in Study One.
Anthropomorphism: “Please rate your impression of the chatbot on these scales by moving
the slider left or right to correspond with the words above the slider. Some of these items may
seem repetitive, please consider them all. Note: The slider does not have to sit directly below
a response option. You can move it to sit between 2 responses if that best reflects your
opinion”. The slider was anchored with the following word pairs: (i) Fake – Natural, (ii)
Machinelike – Human-like, (iii) Artificial – Lifelike, (iv) Unconscious – Conscious, (v)
Communicates Inelegantly – Communicates Elegantly.
Adoption Intent: “I would use a chatbot like this to purchase flowers online / purchase tickets
to a sporting event” (7 response options on Likert scale, anchored at strongly agree – strongly
disagree).
Openness: “Please indicate your agreement or disagreement with the following statements. I
see myself as someone who:” (7 response options on Likert scale, anchored at strongly agree
– strongly disagree): (i) Is curious about many different things, (ii) Values artistic, aesthetic
experiences, (iii) Has an active imagination, (iv) Likes to reflect, play with ideas, (v) Is
original, comes up with new ideas, (vi) Is inventive, (vii) Is ingenious, a deep thinker, (viii) Is
sophisticated in art, music or literature, (ix) Prefers work that is routine, (x) Has few artistic
interests.
Technology Anxiety: “Please indicate your agreement or disagreement with the following
statements. The following statements accurately describe my feelings about technology:” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) When
given the opportunity to use technology, I fear I might damage it in some way, (ii) I am
confident I can learn new technology related skills, (iii) I am able to keep up with important
technological advances, (iv) I feel apprehensive about using technology, (v) I am sure about
my ability to interpret technological output, (vi) Technological terminology sounds like
confusing jargon to me, (vii) I have difficulty understanding most technological matters, (viii)
I have avoided using technology because it is unfamiliar to me, (ix) I hesitate to use
technology for fear of making mistakes I cannot correct.
Frequency of Facebook Use: “In the past month, I have used Facebook Messenger” (7 response
options: never, once or twice per month, once or twice every few weeks, once or twice every
week, once or twice every day, 3+ times per day).
Appendix Item 4
Transcript of the stimulus used in Study Two, illustrating the experimental
manipulation across conditions.
Appendix Item 5
Items from the additional instruments used in Study Two.
Perceived Ease of Use: “Please indicate your agreement with the following statements” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) Learning
to use this chatbot would be easy for me, (ii) I would find it difficult to use this chatbot, (iii) It
would be easy for me to become skilful at using this chatbot.
Perceived Usefulness: “Please indicate your agreement with the following statements” (7
response options on Likert scale, anchored at strongly agree – strongly disagree): (i) This
chatbot would be useful for booking a hotel room, (ii) Using this chatbot would improve the
way I book a hotel room, (iii) Using this chatbot would make booking a hotel room easier.
Need for Human Interaction: “Please indicate your agreement with the following statements.
When purchasing consumer goods or services:” (7 response options on Likert scale, anchored
at strongly agree – strongly disagree): (i) I like interacting with the person who provides the
customer service, (ii) It bothers me to use a machine when I could talk to a person instead,
(iii) Personal attention by the service employee is important to me, (iv) Human contact makes
the process enjoyable to me.
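Illustrative scoring sketch. The appendices list the items but do not state how responses were coded or combined, so the following Python sketch rests on assumptions: responses are coded 1 (strongly disagree) to 7 (strongly agree), negatively worded items such as Perceived Ease of Use item (ii) are reverse-scored, and a scale score is the mean of its items. The function names and the example respondent are hypothetical.

```python
from statistics import mean

SCALE_POINTS = 7  # seven response options, as stated for the Likert items


def reverse_score(response: int, points: int = SCALE_POINTS) -> int:
    """Flip a response so that 1 <-> 7, 2 <-> 6, and so on."""
    if not 1 <= response <= points:
        raise ValueError(f"response must be between 1 and {points}")
    return points + 1 - response


def composite_score(responses, reversed_items=()):
    """Mean of the item responses after reverse-scoring the flagged items.

    `reversed_items` holds 0-based positions of negatively worded items.
    Which items are reverse-scored is an assumption, not taken from the thesis.
    """
    adjusted = [
        reverse_score(r) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return mean(adjusted)


# Hypothetical respondent answering Perceived Ease of Use items (i), (ii), (iii).
# Item (ii), "I would find it difficult to use this chatbot", is treated as
# negatively worded and therefore reverse-scored (assumed).
peou = composite_score([6, 2, 7], reversed_items={1})
print(round(peou, 2))
```

Under these assumptions a higher composite indicates greater perceived ease of use. The same helper could be applied to the other multi-item instruments once their negatively worded items are identified.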