Kruijff Robbert Willem de 11861452 MSC BA

Technology Acceptance of the Smart Speaker
Exploring factors affecting the Use Intention of an
emerging technology
MSc in Business Administration – Digital Business Track
University of Amsterdam – Amsterdam Business School
Supervisor: Andrea Ganzaroli
June 2018
Robbert Willem de Kruijff 11861542

Statement of originality
This document is written by student Robbert Willem de Kruijff who declares to take full
responsibility for the contents of this document.
I declare that the text and the work presented in this document is original and that no sources other
than those mentioned in the text and its references have been used in creating it.
The Faculty of Economics and Business is responsible solely for the supervision of completion of the
work, not for the contents.
Preface
This thesis is written as part of the Master’s program Business Administration – Digital
Business track at the University of Amsterdam. The Digital Business track is considered to
be a boundary spanner between the digital world, the market and other disciplines. This thesis
presents an empirical study of the acceptance of the Smart Speaker, with the goal of
bringing the knowledge acquired during the Master’s program into practice.
I am very grateful to my supervisor, Andrea Ganzaroli, for his valuable comments on earlier
drafts of this thesis and for his clarifying insights into the field of
Technology Acceptance. Furthermore, I would like to thank him for the great support he
provided throughout the process.
Abstract
This research contributes to the rational understanding of the acceptance of the
technology known as the Smart Speaker defined as: “A hands-free speaker powered with
digital voice assistant using two-way voice computing technology that is highly connected
(based on Koo & Nam, 2017).”
In the literature review, two fields of theory have been selected and applied: first, the
technologies embedded in the Smart Speaker (Spoken Language Dialog System,
Voice Search as an application and Smart Technologies) and, second, the Technology
Acceptance Model (TAM) and further developments of this concept (TAM2, UTAUT and
HMSAM), including a case study using the TAM.
Factors affecting one’s Use Intention based on the TAM and the development of the
TAM, substantiated by the technology background, are explored by means of a survey
amongst 182 respondents with the following research question in mind: What are motivations
and perceptions that affect people’s intention of adopting the AI-based Smart Speaker?
This resulted in several factors that were found to significantly affect (directly
and indirectly) the Use Intention of the Smart Speaker, such as Social Influence, Perceived
Entertainment, Perceived Usefulness and Perceived Ease of Use. Interesting observations
concern Interface Familiarity (the new era of human-computer interfaces with voice
control) and Apprehensiveness (trust and privacy issues when using this AI-based
technology).
Finally, future research could address Virtual Assistant software in general, not
restricted to the Smart Speaker. Furthermore, one could look at specific contexts (such as
the elderly or a work environment), at off-screen advertising, at the Smart Speaker within the
ecosystems of large technology firms, at the actual use of the Smart Speaker in the
Netherlands and at more experimental research designs (truly experiencing the Smart
Speaker for a longer period of time).
Table of Contents
Preface
Abstract
Abbreviations
1. Introduction
1.1. A prosperous technology
1.2. Google, Amazon…
1.3. Expectations
1.4. Barriers
1.5. This Research
1.6. Adoption of Spoken Language Dialog System (SLDS)
2. Literature Review
2.1. Technology review
2.1.1. Spoken Language Dialog System
2.1.2. Voice Search and Smart Technologies
2.2. Evolution of the Technology Acceptance Model
2.2.1. Technology Acceptance Model
2.2.2. TAM 2
2.2.3. UTAUT
2.2.4. HMSAM
2.2.5. A previous case study
2.3. Summary of the literature review
3. Variables and Research Model
3.1. Variables
3.1.1. Perceived Usefulness
3.1.2. Perceived Ease of Use
3.1.3. Social Influence
3.1.4. Perceived Entertainment
3.1.5. Apprehensiveness
3.1.6. Interface Familiarity
3.1.7. Web Skills
3.1.8. Use Intention
3.2. Research Model
3.3. Hypotheses
4. Method
4.1. Sampling
4.2. Measures
4.3. Control Variables
4.4. Limitations of the design
4.5. Tools for Analysis
5. Results
5.1 Demographics and response rate
5.1.1 Demographics
5.1.1. Response rate
5.2. Analytical Strategy
5.2.1. Data
5.2.2. Normality
5.2.3. Computing means
5.2.4. Outliers check
5.2.5. Reliability
5.2.6. Correlation
5.3. Data Analysis direct effects
5.3.1. Result Multiple Regression Analysis (direct effects)
5.3.2. Conclusions from Multiple Regression Analysis
5.4. Data Analysis Indirect effects
5.4.1. Results of the mediation analysis
5.4.2. Conclusions from mediated effects analysis
5.5. Outcome model
6. Discussion
6.1. Interpreting the Results
6.1.1. Interface Familiarity
6.1.2. Web Skills
6.2. Limitations
7. Conclusions
7.1. Contributions
7.1. Managerial Implications
7.2. Future research
8. Bibliography
9. Appendices
Appendix 1: The Sun’s article: Echo Breach
Appendix 2: Gartner’s Hype Cycles
Appendix 3: Description of the Smart Speaker Technology
Appendix 4: Adapted Smart Speaker Use Intention Measure
Appendix 5: Hierarchical multiple regression
Appendix 6: PROCESS model 4 and output SPSS
Appendix 7: PROCESS model 6 and output SPSS
Abbreviations
APP Apprehensiveness
DV Dependent Variable
HMSAM Hedonic-Motivation System Adoption Model
IF Interface Familiarity
IV Independent Variable
H3 Hypothesis 3
M Mediator / Mean
PE Perceived Entertainment
PEU Perceived Ease of Use
PU Perceived Usefulness
SD Standard Deviation
SI Social Influence
SLDS Spoken Language Dialog System
TAM Technology Acceptance Model
TAM 2 Extended Technology Acceptance Model
UI Use Intention
UTAUT Unified Theory of Acceptance and Use of Technology
WS Web Skills
1. Introduction
1.1. A prosperous technology
“‘OK Google’ … ‘What’s playing tonight?’, Google Assistant will show films at your local
cinema. And if you add ‘We’re planning on bringing the kids’, Google Assistant will know to
serve up show times for kid-friendly films. You could then say ‘Let’s see Jungle Book’, and
the assistant will purchase tickets for you” (Dale, 2016). This statement in the research of
Dale (2016) is a great example of voice-controlled technologies becoming more and more
intelligent in mimicking human interaction.
Within this piece of innovation, several promising technologies come together in the
comfort of our homes: the Smart Speaker is voice-controlled, smart (meaning that it can be
connected to other smart devices) and connected to the internet (providing a
doorway to endless possibilities). A collaborative study by NPR and Edison Research (The
Smart Audio Report, 2017) shows that 16% of Americans older than 18 already
own a Smart Speaker (about 39 million people).
These people do not only own one, they also seem highly satisfied: the study shows that
65% of the surveyed Smart Speaker owners could not
imagine a life without one. According to NPR and Edison Research, “Smart Speakers are changing
behaviours and forming new habits” (The Smart Audio Report, 2017).
1.2. Google, Amazon…
In 2014, Amazon was the first to introduce a commercial wireless playback device
featuring a voice-activated digital assistant. This so-called Smart Speaker is on the rise, and
the numbers in the Smart Audio Report tell the same story. Since then,
Google has introduced its Google Home, with the Google Assistant as digital assistant software
(Lerner, 2017). Apple could not stay behind and recently launched its Apple HomePod
(Jaffe, 2018).
Not only the big technology firms but also several entrepreneurs tap into this technology with
more “niche” applications of the Smart Speaker, such as SMARTY. SMARTY is a virtual
assistant created by a startup called Siliconic Home and uniquely based on the voice of children;
it positions itself as a kid-friendly Smart Speaker. Its patented natural language
processing technology can recognize the voices of children, which have a significantly different
pitch from those of adults (Montgomery, 2016).
Another example is Olly, which is being created by a London startup called Emotech. Olly
differs from the other Smart Speakers in that this virtual assistant has a
personality that can develop and evolve as a result of conversations with the consumer. This
means Olly understands people’s way of communicating, including context and
whether or not information is appropriate for the user (Montgomery, 2016).
Both these entrepreneurs, focusing on specific niches, confirm the great possibilities and the
rapid development of such technologies.
1.3. Expectations
Research by Gartner shows that expectations are growing significantly. On their
Hype Cycle, the Virtual Digital Assistant (the software, and thus the backbone, of the
Smart Speaker) went from “Innovation Trigger” to “Peak of Inflated Expectations”, as
presented in Appendix 2. Gartner predicts that within the next 5-10 years the technology will
reach the “Plateau of Productivity” (see Appendix 2). The prospects of the Smart Speaker are
tremendous and strong growth is already visible: Gartner (2016) forecasts that, by the end of
2020, end-user spending on VPA (Virtual Personal Assistant)-enabled wireless speakers will
reach $2.1 billion.
1.4. Barriers
A white paper by Symantec explains that the main benefit of the Smart Speaker is that the
voice-activated assistant can access all the intelligence in the backend, translating every
request into an appropriate task. The downside, as the same report
mentions, lies in privacy and trust risks. These are illustrated with metaphors such as “The attack of
the curious child” (children ordering items online) and “the tale of the mischievous neighbor”
(whispering requests through the window), both explaining the issues around trust in these
devices (Wueest, 2017). See Appendix 1 for a news article illustrating such trust
issues.
The Smart Speaker can create great opportunities, but there are also barriers to the
adoption of this technology. For instance, research by Voice Labs states that the Smart Speaker is
still seen as a luxury product and that consumers often see it as yet another confusing or
redundant platform or interface (Marchick, 2017). Moreover, a chart from Statista (based on
data from NPR and Edison Research; The Smart Audio Report, 2017) shows reasons for not
owning a Smart Speaker such as: “too expensive, not enough information available about the
Smart Speaker, not going to use it enough, worried about hackers, bothered that the speaker is
always listening, spend more money with it and listening by the government”. Thus, it can be
concluded that barriers to the adoption of this technology relate, among
others, to concerns about the usefulness of the device and about trust and privacy risks (Cakebread, 2017).
1.5. This Research
When it comes to the Smart Speaker, there is a gap: technological feasibility and market
expectations do not match well. People do not know the technology very
well yet and might expect something different from, or more than, what is actually possible. This is why
the Smart Speaker calls for a rational understanding. In other words, the objective of this
thesis is to assess the extent to which people intend to adopt the Smart Speaker and to
explore important factors that may drive this intention. With this in mind, the following
research question is formulated:
What are motivations and perceptions that affect people’s intention of adopting the AI-based
Smart Speaker?
In order to answer this research question, the Smart Speaker is assessed by
means of an adjusted Technology Acceptance Model (TAM) analysis, an assessment not
made in the literature so far. Holistically, the model helps the researcher to weigh
the perceived “benefits” and “costs” of the technology in order to understand the attitude
towards using the Smart Speaker and eventually adopting the technology. The model,
first introduced by Davis, et al. (1989), is shown in simplified form in Figure 1. In this figure,
Perceived Usefulness can be seen as a benefit and Perceived Ease of Use as a
cost. More detailed explanations of the model and the variables used will be
presented in the Literature Review and the Research Model.
Figure 1: The Technology Acceptance Model (Davis, et al., 1989)
1.6. Adoption of Spoken Language Dialog System (SLDS)
The so-called Virtual Digital Assistants, such as Google Assistant and Amazon Alexa, are the
software embedded in the Smart Speaker. This software is the backbone of the Smart Speaker
technology: it turns spoken language into a task and gives feedback by
speech. This technology, in research more often called a Spoken Language Dialog
System (SLDS), is explained further in the literature review.
For now, the question is what previous research says about the adoption of such
SLDSs. Over the years, much research has been done on such voice-controlled systems,
and the conclusion has been largely the same: in theory the idea of an SLDS is there and
the outlook for applications of the technology is very promising (Joshi, 1991; Liddy, 2001;
Chowdhury, 2003).
The Technology Acceptance Model has been used before to analyse speech recognition
technology or SLDS. A study on the user acceptance of voice recognition technology by
Simon & Paper (2007) suggests that their adapted TAM had great predictive power
(incorporating a “subject or social norm factor” next to “Perceived Usefulness” and
“Perceived Ease of Use”). They furthermore mention that the rapidly evolving speech
recognition technology will become less prone to error, more sophisticated, more powerful
and more user-friendly. This can lead to fewer fluctuations in Perceived Usefulness, Perceived
Ease of Use and, in the case of that research, the social norm. According to Simon & Paper (2007),
this also affects the intention to use and the actual use of the system. This calls for new research
testing the most recent technologies with an adapted TAM: is the Smart Speaker the more stable SLDS
technology that people are willing to adopt?
Six years later, not much had changed. Dahl (2013) states that, looking at the future, applications
of natural language processing would become more and more capable. This development
will be based on factors such as the increasing power of devices, the development of new
techniques for exploiting the vast amounts of data available on the Internet, and related
technologies like speech recognition. In other words, Dahl (2013) says that the synergy of
these factors makes future applications of natural language processing very likely to
become a part of our lives.
More recently, Dale (2016) identified an important gap. He says that it makes sense that research is
ahead of actual products, because the commercial benefits for
companies are not always clear. Dale (2016) mentions the risk, seen in the past, that the newest
technologies remain in research. He adds that the next milestone for the Big Four of
technology (Apple, Google, Amazon and Facebook) is to move towards truly conversational
interactions, taking context into account rather than merely analyzing a sequence of
independent conversational pairs.
To summarize, many studies mention the promise of SLDS technology, point out
its developments and call for actual products. Did the Big Four of Technology create these
kinds of products with their Smart Speakers? And could this hardware become something
tangible and feasible in this line of thought? The missing link is clear and so is the question
concerning this part of the Smart Speaker: is the improvement of SLDS technology
(such as better translation of requests into tasks, better understanding of conversations including
context, and greater robustness against noise) going to improve people’s attitude towards
the adoption of this technology?
Since the Smart Speaker is still quite young and in the early stages of its promising
development, the research currently available is limited. However, the available research
was a good basis for defining the Smart Speaker for this thesis. In the literature review, the
technologies embedded in the Smart Speaker are reviewed and combined with the
development and various versions of the TAM. From this theoretical background, variables
are extracted for the research framework (and its hypotheses), which are then examined by
means of a questionnaire. The reported results are followed by a discussion (including some
striking results and limitations of the research) and finally the conclusion (including
contributions, managerial implications and suggestions for further research).
2. Literature Review
The literature review consists of two parts that are relevant for the remainder
of the research. Firstly, the literature on the technologies embedded in the Smart
Speaker is explored, in order to understand what research has
found on these technologies and which challenges, and thus relevant motivations or barriers, can
be formulated as predictors of adopting the Smart Speaker.
Secondly, the TAM and its later adjusted models are reviewed in order to find the variables
that are both relevant and applicable for the research model. These variables will be the basis
for the research model, the corresponding hypotheses and eventually the questions for the
survey.
2.1. Technology review
The Smart Speaker has been given several names (as also seen in the introduction), such as a
voice-controlled speaker, an Artificial Intelligence speaker or a Virtual Assistant speaker (Koo
& Nam, 2017). To clarify what the Smart Speaker is, the technology review starts by
giving a definition of the Smart Speaker, which also serves as the foundation for, and supports the
cohesion of, the reviewed literature.
“A hands-free speaker powered with digital voice assistant using two-way voice
computing technology based on cloud computing (Koo & Nam, 2017).”
Based on this definition, three interesting features come together in the Smart Speaker, and their
combination makes it different from other technologies. The Smart Speaker is a Spoken
Language Dialog System (a digital voice assistant using two-way voice computing), a Smart
Technology (connected to other devices) and a cloud-computed technology (meaning it is connected to
the internet, making voice search possible and making applications and updates
available). In the literature review these subjects are discussed and developed towards
a more elaborate research gap.
2.1.1. Spoken Language Dialog System
As mentioned briefly in the introduction, a Smart Speaker is a new version of a so-called
Spoken Language Dialog System (SLDS). Figure 2 creates a general understanding of what
such a system implies. In the definition this is described as a “digital voice assistant using
two-way voice computing technology”.
Figure 2: Spoken Language Dialogue System (Bertrand, et al., 2010)
In other words; the system translates your spoken language into text, the dialogue manager
translates that text into meaning and a certain task (this is the work of the digital voice
assistant, possibly with help of its connections or applications), and finally gives a response or
feedback by generating and synthesizing spoken text.
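The processing chain described above can be sketched in a few lines of Python. This is a toy
illustration under stated assumptions, not how Google Assistant or Amazon Alexa are implemented:
all function names are hypothetical, the "audio" is simply a string standing in for a recording,
and the dialogue manager is reduced to keyword matching.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A recognised intent plus the argument it applies to."""
    intent: str
    argument: str


def recognize_speech(audio: str) -> str:
    """Stand-in for speech recognition: the 'audio' is already a transcript."""
    return audio.strip().lower()


def manage_dialogue(text: str) -> Task:
    """Toy dialogue manager: map the recognised text to a task by keywords."""
    if text.startswith("play "):
        return Task("play_music", text[len("play "):])
    if "weather" in text:
        return Task("weather_report", "today")
    return Task("unknown", text)


def generate_response(task: Task) -> str:
    """Turn the task (hypothetically executed) into a textual reply."""
    if task.intent == "play_music":
        return f"Playing {task.argument}."
    if task.intent == "weather_report":
        return f"Here is the weather for {task.argument}."
    return "Sorry, I did not understand that."


def synthesize_speech(text: str) -> str:
    """Stand-in for speech synthesis: wrap the reply in a marker."""
    return f"<speech>{text}</speech>"


def handle_utterance(audio: str) -> str:
    """Run one request through the whole chain: speech in, speech out."""
    text = recognize_speech(audio)
    task = manage_dialogue(text)
    reply = generate_response(task)
    return synthesize_speech(reply)


print(handle_utterance("Play jazz"))
```

Each function corresponds to one stage of the system in Figure 2; in a real Smart Speaker the
recognition, dialogue management and synthesis stages would each be far more complex and would
typically run in the cloud.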
The reason why the Smart Speaker, and thus the SLDS, has such great potential can be
explained by taking a step back and looking at the general development of the human-computer
interface (the interaction between human and computer), which is well illustrated in
Figure 3. The research explains: “the desktop, browser and search metaphors of the last
decades leads to a new solve metaphor focused on context and tasks” (Bellegarda, 2013).
Figure 3: Natural stages in the evolution of the user interface (Bellegarda, 2013)
Bellegarda (2013) concludes from this figure that the user will get more used to
expressing a more general need and then letting a system fulfill (or solve) this need.
With regard to expressing a more general need, research states that speech is the
most essential and primary way of communicating for human beings (Prabhakar & Sahu,
2013). It follows that spoken language has the potential to be an important mode of
interaction with computers. Furthermore, the research says that speech technologies are now
commercially available for a limited but interesting range of tasks. These technologies enable
machines to respond correctly and reliably to human voices and provide useful and valuable
services and tasks (Prabhakar & Sahu, 2013).
2.1.2. Voice Search and Smart Technologies
Two further fields of technology research and application worth examining are Voice
Search and Smart Technologies, since these are also unique qualities that come together
within the Smart Speaker. The Smart Speaker empowers Voice Search by making it possible
at home and enables Smart Technologies to be controlled with the Smart Speaker as
one central point.
One specific application of voice-based human-computer interaction is voice web search.
It is mentioned separately since it illustrates the trend of voice-controlled technology.
Schalkwyk, et al. (2010) found that voice (web) search is growing rapidly and many users
intend to become frequent users. When it comes to mobile voice searches, the following
results are interesting for this research. Voice Search:
• Is more popular for “on-the-go” topics such as food and drink or local businesses;
• Is less likely to be used for potentially sensitive subjects (adult, social network,
health) relative to typed searches;
• Is less likely when searching for a website that requires more intensive interaction.
The question for now is what influence Smart Speakers have on this growth and whether these
issues would be different in the privacy of our homes with an intelligent speaker. In answer to
this, Moorthy & Vu (2014) found that their participants preferred
using a Voice Activated Personal Assistant (VAPA) (another term for the Virtual Assistant or
SLDS) in private locations (at home). On the other hand, even in their homes people are
skeptical about using a VAPA when it concerns more private input, compared to more general
(less personal) input.
As the definition and the name of the device indicate, “smart” refers to the
cloud-computed connections the Smart Speaker is capable of. This is an interesting and
unique function for its context. The Smart Speaker can be connected to other devices in the home
that have smart features (such as lights, curtains or the thermostat). In other words, it
could be part of a smart home or so-called “Ambient Intelligence”, or even
be the central device controlling the smart home. That is why this part focuses more
on the possible contribution a Smart Speaker could make in these environments than on
the technology itself.
Within the research of Chan, et al. (2009) the test case for Ambient Intelligence (AmI) is
based on the elderly, who, with the help of AmI, could be assisted and remain independent.
The article of Chan, et al. (2009) reviews various technologies available for smart homes:
“The devices to monitor health and activity and provide assistance in the home must be
non-obtrusive and acceptable to users. The needs of users require more research.” Chan, et al.
(2009) furthermore state: “AmI means making the environment sensitive to the user by using
technology”.
In similar research, one statement points more in the direction of the Smart Speaker:
Cook, et al. (2009) mention that such systems should understand when to interrupt a user,
when to suggest something and when not to. This suggests that more control should be possible
in AmI systems.
Balta-Ozkan, et al. (2013) state the following about the smart home industry. Their context is
energy, but the outcomes are still very relevant for this research:
• Households say they would adopt such technologies in large quantities if these people
would not have to change daily routines;
• The usefulness and benefits of the smart technology will have to be clearly stated and
demonstrated;
• Increased control of technologies such as these can help to counteract consumer
resistance;
• Finally data privacy is an issue in smart home technologies. This could be dealt with
by privacy friendly techniques. But on the other hand, experts say too much
regulation could kill this whole industry.
2.2. Evolution of the Technology Acceptance Model
In order to measure and validate the separate motivations and issues in adopting the Smart
Speaker, a rational research model has to be formulated and interpreted for the Smart
Speaker. As mentioned in the introduction, the Technology Acceptance Model, or an
adjusted version of the TAM, will be presented as the research model. This model examines
specific factors that may influence technology adoption, such as those presented in the first version
of the TAM: Perceived Usefulness and Perceived Ease of Use. The development of these
models and their applicable factors will be discussed and used as the basis
for the adapted version of the TAM, the research model. The following research and models are
explored:
• Original TAM by Davis, et al. (1989)
• Advanced TAM by Venkatesh & Davis (2000)
• UTAUT model by Venkatesh, et al. (2003)
• HMSAM model by Lowry, et al. (2012)
• A previous Case study by Kwon & Chidambaram (2000)
2.2.1. Technology Acceptance Model
Research on the acceptance of information technology has delivered many different models,
all containing factors that measure the acceptance of a certain technology. The first and, up until
now, most widely accepted model is, as mentioned in the introduction, the Technology
Acceptance Model (TAM) introduced by Davis, et al. in 1989.
Davis, et al. (1989) based the model on earlier research by Fishbein & Ajzen (1975), who
created the Theory of Reasoned Action (TRA). The theory of Fishbein & Ajzen says that
behavioral intention is determined by both the attitude towards that behavior and a subjective
norm concerning the behavior in question.
The TAM presented in Figure 4 is then an adaptation of the TRA, specifically created for
measuring the acceptance of end-user computing technologies in an organizational context.
As mentioned in the introduction, the first TAM made the sum of “cost” and “benefit” factors
by looking at respectively the Perceived Ease of Use and the Perceived Usefulness.
Figure 4: Technology Acceptance Model (TAM) (Davis, et al., 1989)
In the TAM Davis, et al. (1989, page 985) mention:
• “Perceived Usefulness is the prospective user’s subjective probability that using a
specific application system will increase his or her job performance within an
organizational context”
• “Perceived Ease of Use is the degree to which the prospective user expects the target
system to be free of effort.”
As mentioned before, this was the basis for many more models and extensions of the idea of
measuring technology acceptance. Important for the construction of the research model is the
indirect effect of Perceived Usefulness on Behavioral Intention through Perceived Ease of
Use. This construct is used in the research model and is also confirmed by the case study
described in paragraph 2.2.5.
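An indirect effect of this kind is commonly quantified as the product of two regression slopes:
path a (predictor on mediator) and path b (mediator on outcome, controlling for the predictor).
The sketch below illustrates this on made-up numbers; the variable names x, m and y, the helper
functions and the data are hypothetical and do not come from the survey in this thesis.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def cov(u, v):
    """Population covariance of two equally long lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def mediation_paths(x, m, y):
    """Least-squares paths for the simple mediation model x -> m -> y.

    Returns (a, b, c_prime, c):
      a       slope of m regressed on x
      b       slope of m when y is regressed on x and m
      c_prime slope of x when y is regressed on x and m (direct effect)
      c       slope of y regressed on x alone (total effect)
    """
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx
    c = sxy / sxx
    det = sxx * smm - sxm ** 2          # zero only if x and m are collinear
    c_prime = (sxy * smm - smy * sxm) / det
    b = (smy * sxx - sxy * sxm) / det
    return a, b, c_prime, c


# Hypothetical illustration data (not survey data).
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [1.2, 1.9, 3.1, 3.8, 5.3, 5.9, 7.2, 7.7]
y = [2.0, 2.7, 4.1, 4.4, 6.2, 6.5, 8.1, 8.3]

a, b, c_prime, c = mediation_paths(x, m, y)
indirect = a * b
print(f"indirect effect a*b = {indirect:.3f}, direct effect c' = {c_prime:.3f}")
```

For ordinary least squares the total effect decomposes as c = c' + a*b, so a*b is the part of the
effect of x on y that runs through the mediator m. A full analysis would additionally test
whether a*b differs significantly from zero, for example with bootstrapped confidence intervals.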
2.2.2. TAM 2
The further exploration of the TAM in the context of the Smart Speaker brings us to the TAM 2,
shown in Figure 5. This is an extended version of what Davis, et al. introduced in 1989,
created by Venkatesh & Davis (2000). They found several additional factors that significantly
influence user acceptance. The findings of this research improved the understanding of user
adoption.
Figure 5: TAM 2 (Venkatesh & Davis, 2000)
One relevant factor extracted for this thesis is the Experience an individual has with comparable
technologies. As mentioned in the introduction, voice-controlled technologies are becoming more
and more accepted, with the rising usage of voice search as an example. In this
research, Experience (also present in the UTAUT model by Venkatesh, et al. (2003), see
Figure 6) will be divided into two different factors, which will be further explained in the next
chapter.
2.2.3. UTAUT
In 2003 Venkatesh, et al. analysed a wide range of frameworks concerning
the acceptance of information technologies, which resulted in the Unified Theory of
Acceptance and Use of Technology (UTAUT), shown in Figure 6. From this model, the most
important factor extracted for this thesis is Social Influence. Again, the direct effect of Social
Influence on use intention is shown to be of importance, confirming the earlier finding of
the TAM 2 research of Venkatesh & Davis (2000).
Figure 6: UTAUT (Venkatesh, et al., 2003)
2.2.4. HMSAM
The models discussed so far all have an organizational context. But
the Smart Speaker is also meant for personal (private, in-home) use, which means there is also
a hedonic factor to be considered. In other words, how entertaining does a user think the
Smart Speaker is? This Enjoyment is also explained in the research of Mun & Hwang (2003,
page 435). They state that prior research had already proposed Enjoyment as a determinant of
behavioral intentions.
The importance of this hedonic factor was also acknowledged in the research of Lowry, et al.
(2012), who further extended the TAM with a hedonic factor in the Hedonic-Motivation
System Adoption Model (HMSAM). Figure 7 shows that Joy (or Perceived Entertainment,
as used in the case study model in the next paragraph as well as in the research model of this
study) is introduced and further explored in the research of Lowry, et al. (2012).
Figure 7: Van der Heijden's Model as the Baseline for the HMSAM (Lowry, et al. 2012)
2.2.5. A previous case study
As stated in the introduction, issues such as trust and privacy also come with a technology
that has voice control as its interface. The case study of cellular telephones using the TAM
mentions a related factor (Kwon & Chidambaram, 2000): “the anxiety about using a new
medium or technology”. This inspires the last factor influencing the behavioral
intention towards the use of technology: Apprehensiveness. The same research mentions
innate fear and intrusion into personal privacy as part of Apprehensiveness. As seen in
Figure 8, this is one of the few models that includes both Apprehensiveness and Perceived
Entertainment. The structure of this model is similar to the research model of this
thesis.
Figure 8: Research model used by Kwon & Chidambaram (2000)
2.3. Summary of the literature review
Combining the technology literature with the development of the TAM, and starting from
the original TAM (Davis, et al., 1989), the following factors are important to take into account
when assessing the intention to use the Smart Speaker:
• Social Influence;
• Experience (later divided into Web Skills and Interface Familiarity);
• Perceived Entertainment;
• Apprehensiveness.
The motivation and explanation for these variables will be presented in chapter 3.
Finally, the Smart Speaker is currently not being sold in the
Netherlands, where this research is conducted. This means that the
actual behavior of purchasing this type of information technology is difficult to measure. As a
consequence, and considering the limited time frame of this research, only
the behavioral intention (Use Intention) towards this technology will be measured as the
dependent variable.
3. Variables and Research Model
In this section, all variables are first defined and motivated based on the previous research
presented in the literature review. The motivation connects the literature on the technology
with the models frequently used to measure technology acceptance, so that each variable is
defined, motivated and connected to the literature. Based on these variables, the research
model is then given, together with the hypotheses that will be analyzed in this quantitative
research.
3.1. Variables
3.1.1. Perceived Usefulness
For the sake of structure, the definition of Perceived Usefulness is presented again: “degree to
which a person believes that using a particular system would enhance his or her job
performance” (Davis, et al., 1989). This means, for example, that a person might perceive the
Smart Speaker as useful because it can read a recipe aloud, control the thermostat or report
their daily schedule. Perceived Usefulness is one of the basic elements of the original TAM
by Davis, et al. (1989). It also appears in the technology review: Balta-Ozkan, et al. (2013)
state that the acceptance of technology will be enhanced if the
usefulness and benefits of the smart technologies are clearly stated and demonstrated.
The same research states that increased control of technologies such as these can help to
counteract consumer resistance. The more proactive approach provided by the Smart
Speaker might change the Perceived Usefulness or at least enhance the Use Intention.
3.1.2. Perceived Ease of Use
Just like Perceived Usefulness, Perceived Ease of Use is defined again: “degree to which a
person believes that using a particular system would be free from effort” (Davis, et al., 1989).
This concerns, for example, learning how to communicate with the Smart Speaker or connecting
it to the Internet and other devices. It is another basic element of the TAM (Davis, et al., 1989)
and its importance is seen in practice in the research of Balta-Ozkan, et al. (2013): “a household
would adopt smart technologies in large quantities if they wouldn’t have to change their daily
routines.”
3.1.3. Social Influence
Social Influence is defined by Cho (2011) as: “a person’s perception that most people who
are important to him think he should or should not perform the behavior in question.” Cho
(2011) also acknowledges what many TAM studies have shown regarding the direct effect of
Social Influence on the behavioral intention (in this research the Use Intention). What one's
surroundings think about a certain technology is important, as also confirmed in the TAM 2
(Venkatesh & Davis, 2000) and in the research model testing the acceptance of the cellular
telephone (Kwon & Chidambaram, 2000). The Ambient Intelligence literature states
that the elderly could have the opportunity to live independently, a development that
potentially also influences their social surroundings (Chan, et al., 2009).
3.1.4. Perceived Entertainment
In the research of Lowry, et al. (2012), the HMSAM describes Joy (or Perceived
Entertainment) as: “the extent to which the activity of using the computer is perceived to
bring about pleasure and Joy for their own sake, apart from any anticipated performance
consequences”. The Smart Speaker is a technology that can also be used for
reasons of Joy or fun, not only in a professional context, which is why this hedonic factor is
included in the research model. Perceived Entertainment has not been explored in the
technology review, which makes sense since the technology research focuses on the
feasibility of creating these technologies or on a context of utility, not a context of fun.
This makes the variable even more important to explore, since there seems to be a gap in
practice.
3.1.5. Apprehensiveness
Kwon & Chidambaram (2000) describe Apprehensiveness in their case study as: “anxiety
about using a new medium or technology”. Apprehensiveness in this thesis is focussed on
whether people’s personal data and privacy are safe. This is because, as already explained
in the introduction, privacy is a defining concern of the 21st century. Everything that has to do
with gathering data is rather sensitive nowadays and thus affects one’s intention to use a
technology. Apprehensiveness is also one of the main concerns in the technology review, which
makes it an interesting variable to include in the research model in order to examine whether
and how it predicts Use Intention.
The research of Moorthy & Vu (2014) already found that people prefer using a voice-
controlled assistant at private locations. Moreover, voice search is hardly used when it
concerns sensitive subjects (Schalkwyk, et al., 2010). The smart home industry faces a dilemma:
people are worried about the privacy of their data, yet too much regulation and
control will destroy the industry (Balta-Ozkan, et al., 2013).
3.1.6. Interface Familiarity
The development of human-computer interaction shows that in practice the interface has
always been changing (Bellegarda, 2013). The challenge for now is knowing what the
background of a respondent is, how familiar he or she is with SLDS (Bertrand et al., 2010)
and how that influences the Use Intention. This is also why Interface Familiarity in the model
has an indirect effect on Use Intention via Perceived Usefulness. A person who is more
familiar with this form of human-computer interaction may have a higher appreciation for the
usefulness of the Smart Speaker.
This is why the first variable in this research is Interface Familiarity, derived from what
Venkatesh, et al. (2003) in their research call Experience. Even though the original definition
as stated in the literature (Gefen, 2000) is slightly different, being not so much focussed on
the interface, it does form the basis for the definition used in this thesis: “one's understanding
of an interface based on prior interactions or experiences.”
The definition of Gefen (2000) concerns familiarity with other people rather than with
a certain way of communicating with a computer. Nevertheless, it is well
applicable to this research and underlies the definition of Interface Familiarity given above:
“familiarity is an understanding, often based on previous interactions, experiences, and
learning of what, why, where and when others do what they do.”
3.1.7. Web Skills
Web Skills are defined as “an individual judgment of one’s capability to use a computer”
(Koufaris, 2002). This variable is likewise derived from the Experience construct of the UTAUT
(Venkatesh, et al., 2003). It is important to clarify the difference between the variables Web
Skills and Interface Familiarity. Web Skills concerns a person’s self-perception of his or her
skills on the web and, more generally, computer skills on the
Internet, whereas Interface Familiarity concerns the interaction between
human and computer.
The review of the technologies found that people are less likely to search for a website that
requires more intensive interaction (Schalkwyk, et al., 2010). It is in this line of thought that
the decision has been made for Web Skills to be connected to Perceived Entertainment, i.e.
assuming that the more experienced and skilled a person is on the web, the better he or she
knows what a Smart Speaker is capable of, especially when in search of entertainment, which
is perceived to be a more intensive way of interacting.
3.1.8. Use Intention
Finally, Use Intention is the dependent variable in the research model and is defined as: “the
degree to which a person has formulated conscious plans to perform or not perform some
specified future behavior” (Venkatesh, et al., 2003). In this research it means whether
someone intends to use a Smart Speaker in the future or not.
As mentioned before, it is practically too difficult to measure the actual use of the Smart
Speaker since it is not officially launched in The Netherlands yet. That is why it is important to
know that Use Intention is strongly connected to actual use of a technology, as seen in the
original TAM (Davis, et al., 1989), meaning that Use Intention is likely to be a predictor of
adopting the Smart Speaker. Or as Davis, et al. (1989) stated: “People’s computer use can be
predicted reasonably well from their intentions”.
3.2. Research Model
Based on the explored variables with their effect found in theory, the following research
model for this thesis is presented in Figure 9.
Figure 9: Research model of adoption of the smart speaker (based on Technology Acceptance Model)
3.3. Hypotheses
The following hypotheses, as presented in Figure 9, are stated and will be tested in this research:
H1. Perceived Usefulness has a direct positive effect on Use Intention.
H2. Perceived Entertainment has a direct positive effect on Use Intention.
H3. Interface Familiarity has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H4. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H5. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Entertainment.
H6. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H7. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Entertainment.
H8. Web Skills has an indirect positive effect on Use Intention via Perceived Entertainment.
H9. Social Influence has a direct effect on Use Intention.
4. Method
The thesis and the collection of the data are based on quantitative research. The data is
gathered by conducting a survey, which means the data is cross-sectional. In the introduction
of the survey, a short but clear explanation of the technology was provided to the respondent.
Not everybody is familiar with the term “Smart Speaker”, which is why this same
introduction gave every respondent the same background information: two examples of the
device, a photo of several Smart Speakers and a clear text including a definition, as
presented in Appendix 3. This information was provided so that respondents could form
a good idea of concepts such as Perceived Usefulness and Apprehensiveness.
4.1. Sampling
The non-probability convenience sample consists of users and non-users of the
Smart Speaker. Since the Smart Speaker has not been introduced in the Netherlands and the
survey was spread via e-mail and social media (Facebook and LinkedIn), starting in the
Netherlands, the expectation was that most respondents would not be using the product.
The survey was carried out in the period from the 23rd of April until the 10th of May (2018).
By means of an extra incentive, the researcher tried to maximize the number of respondents:
one Smart Speaker (Amazon Alexa Dot) was randomly given away among the respondents. This
reward was combined with content that is relevant for the interested respondent, by sharing
the research on relevant platforms where this is the main interest. The minimum number of
cases for such a research would be 200, based on the rule of thumb that this is a good sample
size. The response rates of comparable previous studies vary widely, ranging from 16.2 percent
(Cho, 2011) and 37 percent (Kwon & Chidambaram, 2000) to over 90 percent (Pavlou, 2003), which
makes it hard to estimate the response rate in this thesis. In addition, within the social
platforms it is not possible to estimate how many people actually received the survey.
4.2. Measures
The measures of the survey are presented in Table 1. All measurements are treated as interval
data and use a 7-point Likert scale (completely disagree to completely agree). The Cronbach’s
alpha reported in the cited paper is included to justify the use of the variables and items by
showing their reliability in previous research. Apart from the example items presented,
the questions used in the survey are listed in Appendix 4.
Note: the items chosen, and thus the questions asked, for the Apprehensiveness variable are
worded in such a way that the effect is positive, as stated in H6 and H7.
Table 1: Used measurements and items and Cronbach’s α
Measure                        Paper (citations)                               Items                  Example item                                                             Cronbach’s α (or R^2)
Perceived Usefulness (PU)      Hong & Tam (2006) (cited 635)                   3                      I would find Smart Speaker to be useful in my daily life.                0.88
Perceived Entertainment (PE)   Lowry, et al. (2012) (cited 116)                3                      I would have fun using the Smart Speaker                                 0.93 - 0.98
Perceived Ease of Use (PEU)    Venkatesh & Davis (2000) (cited 13829 times)    4                      I find the Smart Speaker to be easy to use                               0.86 - 0.98
Apprehensiveness (APP)         Kwon & Chidambaram (2000) (cited 337)           3 (altered)            I would trust my data and information to be secure in a Smart Speaker   R^2 = 0.02
Social Influence (SI)          Cho (2011) (cited 42)                           3                      People who influence me think I should use Smart Speaker                0.926
Interface Familiarity (IF)     Gefen (2000) (cited 3257)                       4 (slightly altered)   I am familiar with controlling a device with my voice                    0.89
Use Intention (UI)             Venkatesh, et al. (2003) (cited 20083)          3                      I plan to use this Smart Speaker in the future.                          0.935
Web Skills (WS)                Koufaris (2002) (cited 3062)                    3                      I am very skilled at using the Web.                                      0.918
4.3. Control Variables
As already explored in the literature review, not a lot of research has been done specifically
on the acceptance of the Smart Speaker. To select control variables, previous similar studies
using the TAM were examined. Based on these, three control variables
have been chosen: Age, Gender and Educational Level.
4.4. Limitations of the design
The design of the study is not longitudinal but cross-sectional (only a snapshot), meaning the
issue of reversed causality cannot be ruled out. The technology is not for sale in the
Netherlands, meaning the research depends on respondents' interpretation of the description
of the product. In addition, the actual use cannot be properly measured; only the intention
towards its use can be, and is, measured.
Considering the sample, the frame is not based on the complete population of potential users
of the Smart Speaker; the respondents are found through convenience sampling. This
means the results are not guaranteed to be representative or generalizable. Furthermore,
the response rate in previous research has fluctuated, meaning there is no guarantee
of a high response rate.
Finally, there is always a risk of common method bias and social desirability in answers due
to the self-created and reported survey. Hopefully this will be as low as possible due to the
relevance of the subject, the extra incentive (lottery of a Smart Speaker) and mentioning the
issues of common method bias and social desirability prior to the questions.
4.5. Tools for Analysis
The following tools have been used to carry out the analysis:
• Qualtrics: shaping and designing the online survey in order to gather the data of the
respondents;
• SPSS: the statistical program used to test, clean and analyze the data and give it
statistical meaning; and
• PROCESS: a macro for SPSS used to analyze the mediated
effects in chapter 5.4.
5. Results
5.1. Demographics and response rate
5.1.1. Demographics
In Figures 10, 11 and 12 the demographic data is presented. As can be seen, the educational
level and age are rather concentrated: 80% of the respondents have either a
Bachelor’s or a Master’s degree and 80% of the respondents are between 18 and 25 years of
age.
Figure 10: Educational level of respondents (pie chart; categories: high school degree, some college without degree, Bachelor's degree, Master's degree; slice values 1%, 9%, 10%, 23% and 57%)
Figure 11: Gender of the respondents (pie chart; 36% male, 64% female)
Figure 12: The age groups of the respondents (pie chart; 18-25: 80%, 26-35: 16%, 36-45: 3%, 46-55: 1%, other)
5.1.2. Response rate
The survey was distributed across large networks, so it is hard to determine how many people
actually saw it and thus to calculate a true response rate. However, 182 people completed the
questionnaire, whereas 217 are registered to have started filling it out. This makes an 82%
“completion” rate.
5.2. Analytical Strategy
Before doing any analysis with the data in order to test hypotheses, a couple of checks and
modifications have been done.
5.2.1. Data
First, a frequency check was done, which showed no errors in any of the items.
Some of the measurements had missing values for certain variables;
these cases were deleted listwise. For the sake of a better overview, data that is
irrelevant to the analysis, such as IP address and start date, was ignored. Finally, the variable
“Sex” has been recoded into “SexNew”: Male was changed from 1 to 0 and Female from 2
to 1, giving a dummy-coded nominal variable.
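The cleaning steps above were carried out in SPSS. Purely as an illustration, they could be sketched in Python as follows; the example records, the use of `None` for a missing answer and the helper names are assumptions, not part of the original analysis.

```python
# Hypothetical raw survey records; None marks a missing answer.
raw_rows = [
    {"Sex": 1, "PE_1": 6, "PE_2": 5, "PE_3": 7},
    {"Sex": 2, "PE_1": 4, "PE_2": None, "PE_3": 5},
    {"Sex": 2, "PE_1": 7, "PE_2": 6, "PE_3": 6},
]


def listwise_delete(rows):
    """Drop every respondent that has at least one missing value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def recode_sex(row):
    """Add SexNew, mapping the original codes 1 (male) / 2 (female) to 0 / 1."""
    recoded = dict(row)
    recoded["SexNew"] = {1: 0, 2: 1}[row["Sex"]]
    return recoded


cleaned_rows = [recode_sex(row) for row in listwise_delete(raw_rows)]
print(len(cleaned_rows), [row["SexNew"] for row in cleaned_rows])
```

Listwise deletion removes the whole respondent rather than a single answer, which keeps every later analysis on the same set of cases at the cost of a smaller sample.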
5.2.2. Normality
Both the Kolmogorov-Smirnov and the Shapiro-Wilk tests are significant for every variable
(p<.05), indicating that none of the variables is normally distributed, as presented in Table 2.
Table 2: Normality check using Kolmogorov-Smirnov & Shapiro-Wilk tests
                           Kolmogorov-Smirnov           Shapiro-Wilk
Variable                   Statistic   Significance     Statistic   Significance
Perceived Usefulness       .111        .000             .967        .000
Perceived Entertainment    .218        .000             .902        .000
Interface Familiarity      .123        .000             .942        .000
Perceived Ease of Use      .142        .000             .955        .000
Apprehensiveness           .123        .000             .954        .000
Web Skills                 .101        .000             .951        .000
Social Influence           .124        .000             .964        .000
Use Intention              .086        .003             .957        .000
5.2.3. Computing means
Before running the analyses, a new variable was created for each construct as the mean of its
items. For instance, the items PE_1, PE_2 and PE_3 (all items concerning Perceived
Entertainment) were combined into one variable, the mean of all items of Perceived
Entertainment, called PE_TOT ((PE_1+PE_2+PE_3)/3 = PE_TOT). These computed means are then
used to analyze all direct and indirect effects that are hypothesized in paragraph 3.3.
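As a minimal sketch of this computation (the thesis used SPSS; the function name and the example answers below are assumptions), the scale mean for one respondent can be written as:

```python
from statistics import mean


def scale_mean(row, items):
    """Return the mean of the listed item scores for one respondent."""
    return mean(row[item] for item in items)


# Hypothetical answers of one respondent on the 7-point scale.
respondent = {"PE_1": 6, "PE_2": 5, "PE_3": 7}
pe_tot = scale_mean(respondent, ["PE_1", "PE_2", "PE_3"])
print(pe_tot)
```

The same helper could be applied to the items of every other construct by passing a different list of item names.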
5.2.4. Outliers check
Outliers were checked by standardizing the means of the variables, inspecting the
frequencies of those standardized variables for possible outliers (cases with |z| > 3) and
finally examining the distributions to see whether these possible outliers are isolated cases.
Outliers within the variables Perceived Entertainment and Perceived Ease of Use were excluded.
This left a final sample size for data analysis of N=171.
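The standardization step can be sketched as below. This is an illustration only: the scores are invented, the function names are hypothetical, and the sample standard deviation is assumed as the scaling factor.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    center = mean(values)
    spread = stdev(values)
    return [(value - center) / spread for value in values]


def flag_outliers(values, threshold=3.0):
    """Return the positions of cases whose absolute z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Hypothetical scale means for 20 respondents; the last one is far below the rest.
scores = [5.5, 6.0, 5.0, 6.5, 5.5, 6.0, 5.0, 6.0, 5.5, 6.5,
          6.0, 5.5, 5.0, 6.0, 6.5, 5.5, 6.0, 5.5, 6.0, 1.0]
print(flag_outliers(scores))
```

Flagged cases are only candidates; as described above, the distribution is then inspected to decide whether they are isolated cases that should be excluded.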
5.2.5. Reliability
The results of the reliability analysis are presented between brackets on the diagonal of the
correlation matrix in Table 3. All Cronbach’s α’s are above 0.7, meaning the scales are
reliable.
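For reference, Cronbach’s α for a scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch of that formula follows; the item scores are invented and the function name is an assumption, since the thesis obtained the values from SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-item score lists, aligned by respondent."""
    k = len(item_scores)
    totals = [sum(answers) for answers in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents on three items of one scale.
pe_items = [
    [6, 5, 7, 4, 6],
    [5, 5, 7, 3, 6],
    [6, 4, 7, 4, 5],
]
alpha = cronbach_alpha(pe_items)
print(round(alpha, 2), "reliable" if alpha > 0.7 else "not reliable")
```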
5.2.6. Correlation
Table 3 shows a matrix of correlation coefficients between the variables used in this
report. Strong, positive correlations can be seen between several variables (apart from the
control variables age, education level and sex), for instance
between Perceived Usefulness and Use Intention. The important correlations for the further
analyses of the data are presented bold and underlined.
One result that stands out in Table 3 is that the intended control variables do not
significantly correlate with the dependent variable. This suggests that using them in a control
model (in a hierarchical multiple regression) will not have any significant effect.
Table 3: Correlation matrix including means and Cronbach’s α
Means, Standard Deviation and Correlations
Variables   M      SD     1.      2.      3.      4.      5.      6.      7.      8.      9.      10.     11.
1. Age      1.24   .59    -
2. Sex      .66    .48    .05     -
3. Edu      4.17   1.18   .02     -.06    -
4. PU       4.58   1.31   .01     .169*   -.06    (.89)
5. PEU      5.67   .75    -.16*   -.08    -.03    .304**  (.82)
6. PE       5.65   .97    -.10    -.01    -.002   .59**   .40**   (.92)
7. APP      3.56   1.59   -.073   -.065   .02     .42**   .12     .24**   (.93)
8. WS       5.78   .86    -.124   -.08    .01     .08     .44**   .12     .07     (.75)
9. IF       4.68   1.64   -.08    -.003   -.08    .27**   .37**   .25**   .20**   .27**   (.90)
10. SI      3.66   1.43   .04     .1      -.02    .50**   .14     .31**   .43**   .07     .27**   (.95)
11. UI      4.71   1.56   -.052   -.003   -.088   .62**   .25**   .57**   .38**   .10     .45**   .53**   (.94)
** Correlation is significant at the 0.01 level (2-tailed)
* Correlation is significant at the 0.05 level (2-tailed)
5.3. Data Analysis direct effects
In order to measure and explain the direct effects of all independent variables on Use
Intention, a multiple regression was performed. Since the intended control variables are not
correlated with the dependent variable (Use Intention) and the hierarchical multiple regression
showed no significance for the control variables, a regular multiple regression was performed
without the control variables. (See Appendix 5 for the outcome of the
hierarchical multiple regression, which includes the control variables in step 1 of the
model).
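Written out, the regular multiple regression described above has the following form. The notation is an assumption added for clarity (respondent index i, coefficients β, error term ε); the variable abbreviations follow Table 1.

```latex
\[
\mathrm{UI}_i = \beta_0 + \beta_1\,\mathrm{PE}_i + \beta_2\,\mathrm{WS}_i + \beta_3\,\mathrm{PU}_i
              + \beta_4\,\mathrm{PEU}_i + \beta_5\,\mathrm{IF}_i + \beta_6\,\mathrm{SI}_i
              + \beta_7\,\mathrm{APP}_i + \varepsilon_i
\]
```

Each coefficient expresses the association of one independent variable with Use Intention while the other six are held constant.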
5.3.1. Result Multiple Regression Analysis (direct effects)
Tables 4a & 4b: Multiple Regression
Dependent variable         Adjusted R^2
Use Intention              .573 ***

Independent variables      Beta     T        Significance
Perceived Entertainment    .298     4.587    .000
Web Skills                 -.028    -.496    .621
Perceived Usefulness       .320     4.36     .000
Perceived Ease of Use      -.085    -1.368   .173
Interface Familiarity      .272     4.807    .000
Social Influence           .179     2.892    .004
Apprehensiveness           .062     1.074    .285
5.3.2. Conclusions from Multiple Regression Analysis
To examine the direct effects of the independent variables on the Use Intention of the Smart
Speaker, a multiple regression was performed. As can be seen in Tables 4a and 4b in
paragraph 5.3.1., four variables have a significant direct effect on Use Intention.
These are Perceived Entertainment (β = 0.30, p < .001), Perceived Usefulness (β = 0.32, p <
.001), Interface Familiarity (β = 0.27, p < .001) and Social Influence (β = 0.18, p < .005). This
means, for instance, that if someone’s Perceived Entertainment increases by one unit (one
standard deviation if the reported Beta is a standardized coefficient), the Use Intention
is expected to increase by 0.30, holding the other variables constant.
These results support three hypotheses. For each of them the null hypothesis (H0)
can be rejected, and therefore the following hypotheses are accepted:
H1: Perceived Usefulness has a direct positive effect on Use Intention.
H2: Perceived Entertainment has a direct positive effect on Use Intention.
H9: Social Influence has a direct effect on Use Intention.
Other than that, it is striking that Interface Familiarity also turns out to have a direct
positive effect on Use Intention. Nevertheless, the theory-based H3: “Interface Familiarity
has an indirect, positive effect on Use Intention via Perceived Usefulness.” is tested in
chapter 5.4.
5.4. Data Analysis Indirect effects
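The following indirect (mediated) effects are analyzed with the method created by
Hayes (2013); the model used for this analysis (Model 4, as seen in Figure 13) is
presented in paragraph 5.4.1. An example of the SPSS output testing this indirect effect (or
the amount of mediation) can be found in Appendix 6.
The following indirect effects and thus hypotheses are tested:
H3. Interface Familiarity has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H4. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H5. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Entertainment.
H6. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H7. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Entertainment.
H8. Web Skills has an indirect positive effect on Use Intention via Perceived Entertainment.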
5.4.1. Results of the mediation analysis
The first indirect hypothesis (H3) is explained in more detail in Tables 5a and 5b.
The first table presents two steps. First, the mediator is used as outcome,
measuring the relation between the independent variable and the mediator (A1 in
Model 4 in Figure 13). Thereafter the dependent variable is used as outcome: the effect of
the mediator (Perceived Usefulness) on the dependent variable (Use Intention) is analyzed
(B1), as well as the direct effect of the independent variable (Interface Familiarity) on the
dependent variable (C1’). The latter, if significant, suggests a direct
effect.
Figure 13: Model 4 template for PROCESS for SPSS and SAS by Andrew F. Hayes and The Guilford Press
Table 5a: Indirect effect Interface Familiarity on Use Intention through Perceived Usefulness
First part: separate effects A1, B1, C1’
                 Consequent
                 (M) Perceived Usefulness              (Y) Use Intention
Antecedent       Path   Coeff.   SE     p              Path   Coeff.   SE     p
IF (X)           A1     .21      .059   <.001          C1’    .28      .06    <.001
PU (M)           -      -        -      -              B1     .64      .07    <.001
Constant         I1     3.6      .3     <.001          I2     .39      .37    .28
                 R2 = .072                             R2 = .47
                 F(1,170) = 13.2, p<.001               F(2,169) = 73.9, p<.001
Table 5b, shown below, reports the total effect (C1) of the model and the
indirect effect (A1B1), which is the effect of interest (also known as the mediated effect). If the
bootstrapped confidence interval (based on 5,000 samples) lies completely above zero, there is
a significant positive indirect effect. The interval bounds are presented as Boot LLCI and Boot
ULCI in Table 5b.
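The bootstrap logic can be sketched as follows. This is not the PROCESS macro used in this thesis: the sketch uses a plain percentile bootstrap rather than the bias-corrected interval, runs on simulated data, and all function names are assumptions. Passing `n_boot=5000` would mirror the number of samples used here.

```python
import random


def _sum_products(u, v):
    """Sum of cross-products of deviations from the mean."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """OLS estimate of a*b: a from M ~ X, b from Y ~ X + M."""
    sxx, smm = _sum_products(x, x), _sum_products(m, m)
    sxm, sxy, smy = _sum_products(x, m), _sum_products(x, y), _sum_products(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=1):
    """Percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples without variance
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * len(estimates))]
    upper = estimates[int((1 - tail) * len(estimates)) - 1]
    return lower, upper


# Simulated data (not the thesis data) with a built-in mediated effect.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.8 * xi + rng.gauss(0, 0.5) for xi in x]
y = [0.8 * mi + rng.gauss(0, 0.5) for mi in m]
lower, upper = bootstrap_ci(x, m, y, n_boot=1000)
print(lower, upper, "significant" if lower > 0 or upper < 0 else "not significant")
```

Bootstrapping is used because the product of two coefficients is generally not normally distributed, so an interval built from resampled estimates is preferred over a normal-theory test.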
Table 5b: Indirect effect Interface Familiarity on Use Intention through Perceived Usefulness
Second part: Total, direct and indirect effect
                          Effect (B)   SE        p        LLCI        ULCI
Total effect      C1      .42          .07       <.001
Direct effect     C1’     .28          .06       <.001
                          Effect (B)   Boot SE            Boot LLCI   Boot ULCI
Indirect effect   A1B1    .14          .046               .047        .23
For H3 (Interface Familiarity has an indirect, positive effect on Use Intention via Perceived
Usefulness), the following can be concluded from the results presented in Tables 5a and
5b. The mediation is analyzed with an ordinary least squares path analysis, exploring the
indirect effect of Interface Familiarity on Use Intention through Perceived Usefulness. As seen
in Table 5a, if a person is more familiar with the voice-control interface, the Perceived
Usefulness is also estimated to be higher (A1 = .21, p < .001). Likewise, when a respondent has
a higher Perceived Usefulness, the Use Intention increases (B1 = .64, p < .001), as already
shown in the previous chapter.
Finally, in order to test the mediated effect (A1B1 = .14), a bias-corrected bootstrap
confidence interval was computed based on 5,000 bootstrap samples; it lies entirely above
zero (.047 to .23), as seen in Table 5b. This indicates that people who are familiar with the
voice-control interface perceive the Smart Speaker to be more useful, which in turn results
in a larger Use Intention.
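The two regressions behind Tables 5a and 5b, and the way the total effect splits into a direct and an indirect part, can be summarized as follows. The lower-case symbols and error terms are notation added here; the fitted values are the rounded coefficients from Tables 5a and 5b.

```latex
\begin{align*}
\mathrm{PU} &= i_1 + a_1\,\mathrm{IF} + e_1,
  & \widehat{\mathrm{PU}} &= 3.6 + .21\,\mathrm{IF} \\
\mathrm{UI} &= i_2 + c_1'\,\mathrm{IF} + b_1\,\mathrm{PU} + e_2,
  & \widehat{\mathrm{UI}} &= .39 + .28\,\mathrm{IF} + .64\,\mathrm{PU} \\
c_1 &= c_1' + a_1 b_1,
  & .42 &= .28 + .14
\end{align*}
```

Multiplying the rounded coefficients gives .21 × .64 ≈ .13; the reported indirect effect of .14 is presumably computed from the unrounded estimates.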
The model also suggests a direct effect of Interface Familiarity on Use Intention (C1’ = .28, p
< .001). This is consistent with the direct effect found in the multiple regression analysis in
paragraph 5.3. The analyses for H4 through H8 were carried out in the same way as the
analysis for H3. An overview is given in Table 6.
Table 6: Overview of all hypotheses with indirect effects and output
Hypothesis          A1              B1              A1B1, indirect                C1’, direct     C1
(IV→M→DV)           (IV→M)          (M→DV)          (IV→M→DV)                     (IV→DV)         (Total effect)
H3: IF→PU→UI        b=.21, p<.001   b=.64, p<.001   b=.14, Boot [.047 to .23]     b=.28, p<.001   b=.42, p<.001
H4: PEU→PU→UI       b=.52, p<.001   b=.71, p<.001   b=.38, Boot [.19 to .60]      b=.15, p=.25    b=.52, p<.001
H5: PEU→PE→UI       b=.51, p<.001   b=.89, p<.001   b=.45, Boot [.26 to .69]      b=.07, p=.61    b=.53, p<.001
H6: APP→PU→UI       b=.34, p<.001   b=.69, p<.001   b=.24, Boot [.15 to .35]      b=.13, p=.04    b=.37, p<.001
H7: APP→PE→UI       b=.15, p=.0017  b=.83, p<.001   b=.12, Boot [.04 to .22]      b=.25, p<.001   b=.37, p<.001
H8: WS→PE→UI        b=.12, p=.18    b=.91, p<.001   b=.11, Boot [-.05 to .29]     b=.076, p=.52   b=.18, p=.2
5.4.2. Conclusions from mediated effects analysis
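Table 6, presented in paragraph 5.4.1., shows how each hypothesis should be
concluded. For every hypothesized mediated effect an ordinary least squares path analysis is
used to analyze the indirect effect. The main focus is the indirect effect with a bias-
corrected bootstrap confidence interval, based on 5,000 bootstrap samples. The requirement
for the indirect effect to be significant is that this interval lies completely above zero.
This means that the conclusion drawn for H3 in paragraph 5.4.1. holds for the following
hypotheses; i.e. the H0’s can be rejected and the hypotheses are accepted:
H3. Interface Familiarity has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H4. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H5. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived
Entertainment.
H6. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Usefulness.
H7. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived
Entertainment.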
From Table 6 it can also be concluded that a significant relationship between the independent
variable and the mediator can be seen for each of these hypotheses (A1).
Something noticeable is that the strongest indirect effects are the indirect effects starting with
the Perceived Ease of Use; H4 (A1B1 = .38, Boot [.19 to .60]) and H5 (A1B1 = .45, Boot [.26
to .69]).
Furthermore, interesting is that in this model analyzing the indirect effect of H3, a direct
effect (C1’= .28, p < .001) of Interface Familiarity on Use Intention can be suggested. This
effect was already noticed in the multiple regression analysis in paragraph 5.3.1. The same
could be said for H7, suggesting a direct effect of Apprehensiveness on Use Intention (C1’=
.25, p < .001), but in this case the multiple regression in paragraph 5.3.1. proved otherwise.
Finally, for H8 (Web Skills has an indirect positive effect on Use Intention via Perceived
Entertainment), the H0 cannot be rejected and thus the effect is not supported. The
indirect effect (A1B1 = .11, Boot [-.05 to .29]) is not significant since the interval contains
zero. Neither the effect of Web Skills on Perceived Entertainment (A1 = .12, p = .18) nor
the total effect of the model (C1 = .18, p = .2) is significant.
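The decision rule used throughout this paragraph can be made explicit with a short sketch. The interval bounds are copied from Table 6; the dictionary layout and the function name are assumptions for illustration.

```python
# Bootstrap confidence intervals of the indirect effects as reported in Table 6.
intervals = {
    "H3": (0.047, 0.23),
    "H4": (0.19, 0.60),
    "H5": (0.26, 0.69),
    "H6": (0.15, 0.35),
    "H7": (0.04, 0.22),
    "H8": (-0.05, 0.29),
}


def excludes_zero(lower, upper):
    """An indirect effect is treated as significant when its interval excludes zero."""
    return lower > 0 or upper < 0


supported = [name for name, (lo, hi) in intervals.items() if excludes_zero(lo, hi)]
not_supported = [name for name in intervals if name not in supported]
print("supported:", supported)
print("not supported:", not_supported)
```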
5.5. Outcome model
In Figure 14 an overview is given of all the relevant results, giving the answers to all the
hypotheses. As shown, Web Skills is the only “red” variable, since this is the only hypothesis
that is rejected.
Figure 14: Outcome model
6. Discussion
“What are motivations and perceptions that affect people’s intention of adopting the AI-based
Smart Speaker?” This research question has been answered (partly) with this research based
on the Technology Acceptance Model. Under some conditions and having a rather
concentrated response group, several motivations and perceptions can be significantly proven
to be influencing the intention of using the Smart Speaker. First the detailed interpretation of
the results will be explained and thereafter the limitations that were part of this study will be
elaborated upon.
6.1. Interpreting the Results
First of all, the Technology Acceptance Model and its further extensions formed the basis
for isolating the right variables and items, which in general were very good predictors of
the Use Intention (the dependent variable). Furthermore, the literature review on
the technologies that are an integrated part of the Smart Speaker provided a good motivation
for adding those variables to the research model that are specific to this technology.
The hypotheses including the variables Perceived Usefulness, Perceived Entertainment,
Perceived Ease of Use, Apprehensiveness and Social Influence acted as expected; the H0’s
were rejected and the hypotheses were accepted (H1, H2, H4, H5, H6, H7 and H9). The
research model (based on the literature review and introduction) as presented turned out to be
fit for purpose and offered an adequate framework for carrying out this research. This means
that, looking back at the research question, several motivations and perceptions of people are
found which appear to be affecting the Use Intention.
6.1.1. Interface Familiarity
One of the two hypotheses not yet mentioned in the discussion, H3, was in fact
supported, as shown in paragraph 5.4.1. Nevertheless, this
variable needs mentioning since the multiple regression analysis showed that Interface
Familiarity also has a significant (p < .001) direct effect on Use Intention. This is interesting
since the original variable “Experience”, as seen in the TAM 2 model (Venkatesh & Davis, 2000)
as well as the UTAUT model (Venkatesh, et al., 2003), was always used as a moderator, not as a
variable with a direct effect on Use Intention.
This may be due to the development of Spoken Language Dialog Systems (SLDS; the Smart
Speaker is such a system), as explained by Bertrand et al. (2010) in Figure 2. The technology
review describes the development of the human-computer interface, which is increasingly
focused on solving problems and tasks. The value of an SLDS rests on a very simple and
rational understanding (Prabhakar & Sahu, 2013): speech is the most essential, efficient
and primary way of communication for a human being. In this line of thought, typing on a
computer can be seen as an inefficient detour. This explains the confirmed third hypothesis:
Interface Familiarity has an indirect effect on Use Intention through Perceived Usefulness.
It could also explain why Interface Familiarity has a larger impact on Use Intention: speech
as a way of communicating potentially makes the use of any system (not only the Smart
Speaker) more efficient, as also stated in the previously mentioned research of Prabhakar &
Sahu (2013). An interesting direction for future research would be to investigate the
perception of this new human-computer interface on its own, regardless of the hardware in
which the software is embedded.
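The relation between the direct and the indirect effect of Interface Familiarity can be
checked with a short calculation. The sketch below, in Python, takes the coefficients
reported in the PROCESS model 4 output in Appendix 6 and verifies that the total effect
equals the direct effect plus the product of the two mediation paths. The names a, b,
c_prime and c_total are labels chosen here for illustration and do not appear in the SPSS
output.

```python
# Coefficients copied from the PROCESS model 4 output in Appendix 6.
# The variable names are illustrative labels, not names from the output.
a = 0.2132        # IF_TOT -> PU_TOT
b = 0.6471        # PU_TOT -> UI_TOT, controlling for IF_TOT
c_prime = 0.2873  # direct effect of IF_TOT on UI_TOT
c_total = 0.4253  # total effect of IF_TOT on UI_TOT

indirect = a * b                          # product-of-coefficients indirect effect
reconstructed_total = c_prime + indirect  # should match the reported total effect
share_mediated = indirect / c_total       # ratio of indirect to total effect

print(round(indirect, 4))             # 0.138
print(round(reconstructed_total, 4))  # 0.4253
print(round(share_mediated, 4))       # 0.3244
```

Roughly a third of the total effect thus runs through Perceived Usefulness, which agrees
with the ratio of indirect to total effect (.3244) reported in Appendix 6.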
6.1.2. Web Skills
The last hypothesis that needs mentioning is that of the indirect effect of Web Skills on Use
Intention through Perceived Entertainment (H8), which was rejected. In the correlation
matrix in Table 3, the correlation between Web Skills and Perceived Entertainment was not
significant, which already suggests that the effect is absent.
Looking further at the correlation matrix, one correlation that stands out is that between
Web Skills and Perceived Ease of Use (which in its turn has a proven indirect effect on Use
Intention). Even though theory on the TAM does not suggest such an effect, Appendix 7
presents model 6 from PROCESS (Hayes, 2013) together with the output of the analysis of the
following multiple mediated effect: the indirect effect of Web Skills on Use Intention
through Perceived Ease of Use and Perceived Usefulness. The output is significant (Effect
Ind2 = .155, bootstrap interval [.07, .29], based on 5,000 bootstrapped samples), which
supports the presence of this effect. Rationally this makes sense, since certain Web Skills
will influence one’s Perceived Ease of Use. This suggests the outcome model should be closer
to the TAM 2 presented in Figure 5 (Venkatesh & Davis, 2000). In this model, Experience (or
at least Web Skills as part of one’s experience) contributes to the effect on the Perceived
Usefulness of a technology (just like Perceived Ease of Use).
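The reported value of Ind2 can be reproduced from the path coefficients in Appendix 7, since
a serial indirect effect is the product of the coefficients along its path. The following
Python sketch does this for all three indirect paths of model 6. The names a1, a2, d21, b1
and b2 are labels chosen here for illustration, not names used in the PROCESS output.

```python
# Path coefficients copied from the PROCESS model 6 output in Appendix 7.
# The variable names are illustrative labels, not names from the output.
a1 = 0.3706    # WS_TOT  -> PEU_TOT
d21 = 0.5725   # PEU_TOT -> PU_TOT
a2 = -0.0945   # WS_TOT  -> PU_TOT
b1 = 0.1068    # PEU_TOT -> UI_TOT
b2 = 0.7323    # PU_TOT  -> UI_TOT

ind1 = a1 * b1          # via Perceived Ease of Use only
ind2 = a1 * d21 * b2    # via Perceived Ease of Use, then Perceived Usefulness
ind3 = a2 * b2          # via Perceived Usefulness only
total_indirect = ind1 + ind2 + ind3

print(round(ind1, 4), round(ind2, 4), round(ind3, 4), round(total_indirect, 4))
# 0.0396 0.1554 -0.0692 0.1257
```

The three products agree with the Ind1, Ind2 and Ind3 rows of the output. Only the bootstrap
interval of the second path excludes zero, which is why the discussion singles out this path.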
6.2. Limitations
Apart from the limitations of the research design mentioned in paragraph 4.4, further
limitations of this research are set out below.
First of all, the intended control variables were not significant, so the analysis was
carried out without them. Furthermore, several cases unfortunately had to be removed or
ignored in the analysis. As mentioned in paragraph 5.1.1, 217 people started the
questionnaire, whereas 182 finished it. Of these 182, some outliers and cases with missing
data were deleted listwise, which left 171 data points to analyze. Reasons for the missing
data could, for instance, be sensitive questions (especially the Apprehensiveness
questions), the survey being too lengthy, or questions being irrelevant to respondents.
The final important factor to mention is that the demographic data are rather unbalanced.
80% of the respondents were between 18 and 25 years old, and 80% of the respondents had
either a Bachelor’s degree or a Master’s degree (57% had a Bachelor’s degree and 23% a
Master’s degree). This makes the results and analysis not representative of the whole
population, which is the risk of using a non-probability convenience sample.
7. Conclusions
7.1. Contributions
The purpose of this thesis was to create a rational understanding of why a person would or
would not adopt the Smart Speaker. Once more, the following research question was formulated:
What are the main motivations and perception of people in their process of adopting the
AI-based Smart Speaker? Since the technology is rather young and very little research has
been done on the acceptance of the Smart Speaker, this study contributes to the understanding
of the motivations, perceptions and barriers in one’s intention to use a Smart Speaker, as
well as to an understanding of what exactly a Smart Speaker is.
Starting from the basic TAM, a research model appropriate for the Smart Speaker was created.
By combining managerial reports (in the introduction), the literature review (covering both
the technology and the evolution of the TAM) and the basis of the TAM, a number of variables
were identified and their expected effects on the dependent variable (Use Intention) were
determined. In a survey among a convenience-sampled group of respondents, the variables
were measured with items taken from previous research in which they had proven reliable.
This resulted in a data analysis, a summary of which can be seen in Figure 14 in paragraph 5.5.
As expected on the basis of the literature, H1 through H7 and H9 were supported by the
results (these hypotheses can be found in paragraph 3.3). Apart from these hypotheses, two
observations stood out. Firstly, H8 could not be supported, i.e. there was no significant
indirect effect of Web Skills on Use Intention through Perceived Entertainment. Secondly,
the discussion argued for another effect that could be interesting to consider (Web Skills
indirectly affecting Use Intention through Perceived Ease of Use and Perceived Usefulness).
Finally, Interface Familiarity turned out to be an interesting factor. The original
hypothesis was supported (an indirect effect on Use Intention through Perceived
Usefulness) and, apart from that, a direct effect of Interface Familiarity on Use
Intention was found, which suggests a greater impact of this new human-computer interface.
7.1. Managerial Implications
In general, the understanding of what a Smart Speaker is and what technologies it contains
can already be put to practical use, apart from the answers to the research question.
Several factors were shown to affect one’s intention to use this piece of technology, both
directly and indirectly, e.g. Social Influence, Perceived Entertainment, or Interface
Familiarity through Perceived Usefulness.
The hedonic factor is something the original models for acceptance of information systems
did not consider. This research has shown that Perceived Entertainment (Joy) does have an
impact on one’s Use Intention when it concerns the Smart Speaker. This could be of great
importance for any implementation of the Smart Speaker.
A good example that is relevant for today is the effect of Apprehensiveness. As mentioned in
the introduction and later proven in the results, the amount of trust people have and the issues
they might have with regard to their privacy are an important factor for them to use or not use
a Smart Speaker. It is important to know this when working with this technology in whatever
context; e.g. developing an application for the Smart Speaker, using a Smart Speaker in any
context (Smart Home for instance) or developing the Smart Speaker as a technology.
Finally, it is important to understand the context in which you want to use the technology,
i.e. who will be using it, where it is used and with what aim. Good examples of variables to
take into account when thinking about the context are the previously mentioned Interface
Familiarity and Web Skills (which might be lower for the elderly than for younger people, or
different in a work context than at home). All these factors might be different for every
person and every context.
7.2. Future research
For further research the following suggestions can be made:
• Researching the human-computer interface that is the SLDS as it is nowadays, not
restricted to the Smart Speaker but also in contexts such as cars and phones, and the
acceptance of such Virtual Digital Assistants in general;
• Research in different contexts, such as different age groups, e.g. the elderly (as
seen in the smart home theory, the research by Chan, et al. (2009)) or children, who
have a higher-pitched voice, as seen with SMARTY (Montgomery, 2016), or in a work
environment where efficiency and usefulness are more valued;
• A focus on marketing, where entertainment is more important. How does one create an
advertisement on something that is off-screen? And how do the Smart Speakers work
in the “ecosystem” of the technological Big Four (e.g. Amazon and Google)?
• Looking at the actual use of Smart Speakers once they are released in The
Netherlands, and at what the influence of the language will be;
• Research with a more experimental design: let someone use the Smart Speaker for a
period of time and do a longitudinal study on the perceptions, motivations and
barriers.
8. Bibliography
Balta-Ozkan, N., Davidson, R., Bicket, M., & Whitmarsh, L. (2013). Social barriers to the
adoption of smart homes. Energy Policy, 63, 363-374.
Bellegarda, J. R. (2013). Spoken Language Understanding for Natural Interaction: The Siri
Experience. (J. Mariani, S. Rosset, M. Garnier-Rizet, & L. Devillers, Eds.) Natural
Interaction with Robots, Knowbots and Smartphones.
Cakebread, C. (2017). The Google Home Mini secretly recorded peoples' conversations and
played into a big fear about Smart Speakers. Retrieved June 1, 2018 from Business Insider:
http://www.businessinsider.com/consumers-say-no-thanks-to-expensive-smart-speakers-
chart-2017-10
Chan, M., Campo, E., Esteve, D., & Fourniols, J. Y. (2009). Smart homes - current features
and future perspectives. Maturitas, 64 (2), 90-97.
Cho, H. (2011). Theoretical intersections among social influences, beliefs, and intentions in
the context of 3G mobile services in Singapore: Decomposing perceived critical mass and
subjective norms. Journal of Communication, 61 (2), 283-306.
Chowdhury, G. (2003). Natural Language Processing. Annual Review of Information Science
and Technology, 37, 51-89.
Cook, D. J., Augusto, J. C., & Jakkula, V. R. (2009). Ambient intelligence: Technologies,
applications and opportunities. Pervasive and Mobile Computing, 5 (4), 277-298.
Dahl, D. A. (2013). Natural Language Processing: Past, Present and Future. (A. Neustein, &
J. Markowitz, Eds.) Mobile Speech and Advanced Natural Language Solutions, 49-73.
Dale, R. (2016). The return of the chatbots. Natural Language Engineering, 22 (5), 811-817.
Davis, F. D., Bagozzi, R. P., & Warshaw, P. R. (1989). User acceptance of computer
technology: a comparison of two theoretical models. Management Science, 35 (8), 982-1003.
Gartner Inc. (2017, February). Control the Connected Home with Virtual Personal Assistants.
Retrieved February 2018 from Gartner: https://www.gartner.com/smarterwithgartner/control-
the-connected-home-with-virtual-personal-assistants/
Gartner Inc. (2016, October). Gartner Says Worldwide Spending on VPA-Enabled Wireless
Speakers Will Top $2 Billion by 2020. Retrieved February 2018 from Gartner:
https://www.gartner.com/newsroom/id/3464317
Gartner. (2017, August). Top Trends in the Gartner Hype Cycle for Emerging Technologies,
2017. Retrieved February 2018 from Gartner:
https://www.gartner.com/smarterwithgartner/top-trends-in-the-gartner-hype-cycle-for-
emerging-technologies-2017/
Gefen, D. (2000). E-commerce: the role of familiarity and trust. Omega, 28 (6), 725-737.
Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process
analysis: A regression-based approach. The Guilford Press.
Hong, S. J., & Tam, K. Y. (2006). Understanding the adoption of multipurpose information
appliances: The case of mobile data services. Information Systems Research, 17 (2), 162-179.
Jaffe, J. (2018, January). Apple HomePod: Everything we know about the launch date, specs
and price. From cnet.com: Apple HomePod: Everything we know about the launch date,
specs and price
Jeffs, M. (2017, January). OK Google, Siri, Alexa, Cortana; Can you tell me some stats on
voice search? Retrieved February 2018 from Branded3:
https://www.branded3.com/blog/google-voice-search-stats-growth-trends/
Joshi, A. (1991). Natural Language Processing. Science, 253 (5025), 1242-1249.
Koo, H., S, K., & Nam, C. (2017). Speaker Wars begins: Which applications will be the killer
content for smart speaker? 14th International Telecommunications Society (ITS) Asia-Pacific
Regional Conference, (pp. 24-27). Kyoto.
Koufaris, M. (2002). Applying the technology acceptance model and flow theory to online
consumer behavior. Information Systems Research, 13 (2), 205-223.
Kwon, H. S., & Chidambaram, L. (2000). A test of the technology acceptance model: the case
of cellular telephone adoption. Proceedings of the 33rd Annual Hawaii International
Conference on System Sciences, 1, pp. 7-pp.
Lerner, R. (2017, June). Smart Speakers Are The Future Of Audio. Retrieved February 2018
from Forbes: https://www.forbes.com/sites/rebeccalerner/2017/06/23/smart-speakers-are-the-
future-of-audio/#54b03e0966a9
Liddy, E. D. (2001). Natural Language Processing. In Encyclopedia of Library and
Information Science (2nd ed.). Marcel Dekker.
Lowry, P. B., Gaskin, J., Twyman, N., Hammer, B., & Roberts, T. (2012). Taking ‘fun and
games’ seriously: proposing the hedonic-motivation system adoption model (HMSAM).
Journal of the Association for Information Systems, 14, 617-671.
Marchick, A. (2017, January). The 2017 Voice Report by VoiceLabs. From VoiceLabs:
http://voicelabs.co/2017/01/15/the-2017-voice-report/
Montgomery, L. (2016, October). Amazon Alexa, Google Home: The Virtual Assistant Arms
Race. Electronic House. From Electronic House: https://www.electronichouse.com/smart
home/amazon-alexa-google-home-virtualassistant-arms-race/
Moorthy, A. E., & Vu, K. P. (2014). Voice activated personal assistant: Acceptability of use
in the public space. In International Conference on Human Interface and the Management of
Information (pp. 324-334). Cham: Springer.
Mun, Y. Y., & Hwang, Y. (2003). Predicting the use of web-based information systems: self-
efficacy, enjoyment, learning goal orientation, and the technology acceptance model.
International Journal of Human-Computer Studies, 59 (4), 431-449.
NPR & Edison Research. (2017). National Public Media. From The Smart Audio Report
from NPR and Edison Research Fall-Winter: http://nationalpublicmedia.com/smart-audio-
report-fall-winter-2017/
Pavlou, P. A. (2003). Consumer acceptance of electronic commerce: Integrating trust and risk
with the technology acceptance model. International Journal of Electronic Commerce, 7 (3),
101-134.
Prabhakar, O., & Sahu, N. (2013). A survey on: Voice command recognition technique.
International Journal of Advanced Research in Computer Science And Software Engineering,
3 (5), 576-585.
Qualtrics. (2018, June). Qualtrics Platform. From Qualtrics.com:
https://www.qualtrics.com/platform/
Schalkwyk, J., Beeferman, D., Beaufays, F., Byrne, B., Chelba, C., Cohen, M., et al. (2010).
“Your Word is my Command”: Google Search by Voice: A Case Study. In A. Neustein (Ed.),
Advances in Speech Recognition: Mobile Environments, Call Centers and Clinics (pp. 61-90).
Boston, MA: Springer.
Simon, J., & Paper, D. (2007). User Acceptance of Voice Recognition Technology: An
Empirical Extension of the Technology Acceptance Model. JOEUC, 19, 24-50.
Venkatesh, V., & Davis, F. (2000). A theoretical extension of the Technology Acceptance
Model: Four longitudinal field studies. Management Science, 46 (2), 186-204.
Venkatesh, V., Morris, M., Davis, G., & Davis, F. (2003). User Acceptance of Information
Technology: Toward a Unified View. MIS Quarterly, 27 (3), 425-478.
Wueest, C. (2017). A guide to the security of voice-activated smart speakers. From Symantec:
https://www.symantec.com/blogs/threat-intelligence/security-voice-activated-smart-speakers
9. Appendices
Appendix 1: The Sun’s article: Echo Breach
Cops raid music fan’s flat after Alexa Amazon Echo device ‘holds a party on its own’ while
he was out. Oliver Haberstroh's door was broken down by irate cops after neighbours
complained about deafening music blasting from Hamburg flat.
A music fan has been left with a huge bill after his voice-operated Amazon Echo device
'threw a house party' while he was away.
Cops were forced to break into Oliver Haberstroh's flat in Hamburg, Germany, after
neighbours complained about deafening music blasting from inside - but found the apartment
empty.
Mr Haberstroh claims he walked out of his flat to meet a friend on Friday night after checking
that the lights and music were switched off.
He wrote on Facebook: "While I was relaxed and enjoying a beer, Alexa managed on her
own, without command and without me using my mobile phone, to switch on at full volume
and have her own party in my apartment"
"She decided to have it at a very inconvenient time, between 1.50am and 3am. My neighbours
called the police."
After knocking on the door, the officers called an expert to break the lock open - and refused
to hand over keys for the replacement until they'd been paid for the locksmith.
A police spokesman said the source of the noise was "a black jukebox which is usually
activated by voice control".
An Amazon spokesman said: "Working directly with the customer, we have identified the
reason for the incident.
"Echo was remotely activated and the volume increased through the customer’s third party
mobile music-streaming app.
"Although the Alexa cloud service worked flawlessly, Amazon has offered the customer to
cover the cost for the incident."
It comes just weeks after a mischievous parrot used Alexa to order itself a set of ten gift
boxes while the owner was away.
The Amazon Echo is an intelligent personal assistant, which allows owners to play music and
control the lighting and thermostat by voice command.
Source: https://www.thesun.co.uk/news/4873155/cops-raid-german-blokes-house-after-his-
alexa-music-device-held-a-party-on-its-own-while-he-was-out/
Appendix 2: Gartner’s Hype Cycles
Figure 15: Gartner's Hype Cycle 2016 (Gartner 2016)
Figure 16: Gartner's Hype Cycle 2017 (Gartner 2017)
Appendix 3: Description of the Smart Speaker Technology
The smart speaker (such as Google Home and Amazon Alexa and Apple HomePod as seen in
the pictures)
A hands-free speaker powered with digital voice assistant using two-way voice
computing technology (meaning you can talk to it by starting with a wake up word
such as “OK Google”, and it gives feedback by talking back) that is highly connected
and capable of tasks such as:
• Smart connection (think of light, curtains, thermostat)

• Search Online
• Entertain i.e. jokes, games
• Play music
• Assist: reading recipes, save to do lists, read sport fixtures, the weather, your
calendar, any traffic on the way to work
Appendix 4: Items for survey
In this Appendix all the variables and corresponding items used for the survey of this thesis
are presented (each based on the items of the article cited in parentheses). The questions
were asked in this order.
Perceived Entertainment (Lowry, et al., 2012)
• I would have fun using the Smart Speaker
• The Smart Speaker experience seems pleasurable
• I will find the Smart Speaker to be enjoyable
Web Skills (Koufaris, 2002)
• I am very skilled at using the Web
• I know how to find what I want on the Web
• I know more about using the Web than most users
Perceived Usefulness (Hong & Tam, 2006)
• I would find Smart Speaker to be useful in my daily life
• Using Smart Speaker would help me accomplish things more quickly
• Using Smart Speaker would increase my chances of achieving things that are
important to me
Perceived Ease of Use (Venkatesh, et al., 2003)
• I would find the Smart Speaker easy to use
• Learning to operate the Smart Speaker would be easy for me
• It would be easy for me to become skilful at using the Smart Speaker
• My interaction with the Smart Speaker would be clear and understandable
Interface Familiarity (Gefen, 2000)
• I am familiar with controlling a device with my voice
• I am familiar with searching with voice control
• I am familiar with voice assistants (such as Siri, Google Assistant or Alexa)
Social Influence (Cho, 2011)
• People who influence me think I should use Smart Speaker
• People who are important to me think I should use a Smart Speaker
• People whose opinion I value prefer me to use a Smart Speaker
Use Intention (Venkatesh, et al., 2003)
• I predict I will use a Smart Speaker in the future
• I plan to use a Smart Speaker in the future.
• I expect my use of a Smart Speaker to continue in the future
Apprehensiveness (Kwon & Chidambaram, 2000)
• I would be comfortable using a Smart Speaker for storing personal information
• I would trust my data and information to be secure in a Smart Speaker
• I would be comfortable about my privacy being affected by using a Smart Speaker.
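The items above are combined into one score per variable before analysis; the names UI_TOT,
IF_TOT and so on in Appendices 6 and 7 refer to these scores. The Python sketch below
illustrates that step under two assumptions: that each score is the mean of its items, and
that a respondent with a missing item is excluded for that variable (listwise deletion).
The item keys, the example answers and the function name scale_score are hypothetical.

```python
from statistics import mean

def scale_score(responses, prefix):
    """Mean of all items whose key starts with `prefix`; None if any item is missing."""
    items = [value for key, value in responses.items() if key.startswith(prefix + "_")]
    if not items:
        raise ValueError(f"no items found for prefix {prefix!r}")
    if any(value is None for value in items):
        return None  # listwise deletion: incomplete respondents are excluded
    return mean(items)

# Hypothetical answers of two respondents (illustrative values, not survey data).
complete = {"UI_1": 5, "UI_2": 6, "UI_3": 4, "IF_1": 2, "IF_2": 3, "IF_3": 4}
incomplete = {"UI_1": 5, "UI_2": None, "UI_3": 4}

print(scale_score(complete, "UI"))    # 5
print(scale_score(complete, "IF"))    # 3
print(scale_score(incomplete, "UI"))  # None
```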
Appendix 5: Hierarchical multiple regression
This table shows that, in a hierarchical multiple regression, the intended control variables
(Age, Education and Gender) are not significant within the control model (step 1). This
means a hierarchical multiple regression analysis will not provide the desired results.
Thus, in the thesis a multiple regression has been done without any control variables.
Predictor                   R      R^2      ΔR^2     B      SE     β        t
Step 1                      .104   .01
  Age                                                -.13   .21    -.05     -.65
  Education                                          -.12   .10    -.09     -1.13
  Gender                                             -.02   .25    -.01     -.07
Step 2                      .68    .47***   .46***
  Age                                                -.04   .14    -.02     -.32
  Education                                          -.07   .07    -.05     -1.06
  Gender                                             -.36   .19    -.11     -1.87
  Perceived Usefulness                               .39    .09    .33***   4.40
  Perceived Ease of Use                              -.19   .13    -.09     -1.50
  Perceived Entertainment                            .46    .11    .29***   4.42
  Apprehensiveness                                   .05    .06    .05      .87
  Web Skills                                         -.05   .10    -.03     -.52
  Interface Familiarity                              .25    .05    .27***   4.70
  Social Influence                                   .20    .07    .19**    3.00
Statistical significance: *p<.05; **p<.01; ***p<.001
Appendix 6: PROCESS model 4 and output SPSS
Given below is the outcome of the first mediated effect in this thesis: the indirect effect
of Interface Familiarity on Use Intention through Perceived Usefulness.
Run MATRIX procedure:
************* PROCESS Procedure for SPSS Release 2.16.3 ******************
Written by Andrew F. Hayes, Ph.D. www.afhayes.com
**************************************************************************
Model = 4
Y = UI_TOT
X = IF_TOT
M = PU_TOT
Sample size
172
**************************************************************************
Outcome: PU_TOT
Model Summary
R R-sq MSE F df1 df2 p
.2687 .0722 1.5931 13.2338 1.0000 170.0000 .0004
Model
coeff se t p
constant 3.5889 .2908 12.3406 .0000
IF_TOT .2132 .0586 3.6378 .0004
**************************************************************************
Outcome: UI_TOT
Model Summary
.6831 .4666 1.3347 73.9080 2.0000 169.0000 .0000
Model
coeff se t p
constant .3917 .3665 1.0688 .2867
PU_TOT .6471 .0702 9.2178 .0000
IF_TOT .2873 .0557 5.1588 .0000
************************** TOTAL EFFECT MODEL **************************
**
Outcome: UI_TOT
Model Summary
.4454 .1984 1.9939 42.0691 1.0000 170.0000 .0000
Model
coeff se t p
constant 2.7141 .3254 8.3419 .0000
IF_TOT .4253 .0656 6.4861 .0000
***************** TOTAL, DIRECT, AND INDIRECT EFFECTS ******************
**
Total effect of X on Y
Effect SE t p
.4253 .0656 6.4861 .0000
Direct effect of X on Y
Effect SE t p
.2873 .0557 5.1588 .0000
Indirect effect of X on Y
Effect Boot SE BootLLCI BootULCI
PU_TOT .1380 .0463 .0471 .2334
Partially standardized indirect effect of X on Y
PU_TOT .0877 .0279 .0309 .1426
Completely standardized indirect effect of X on Y
PU_TOT .1445 .0466 .0506 .2373
Ratio of indirect to total effect of X on Y
PU_TOT .3244 .0958 .1443 .5274
Ratio of indirect to direct effect of X on Y
PU_TOT .4802 .2459 .1687 1.1160
R-squared mediation effect size (R-sq_med)
PU_TOT .1144 .0466 .0343 .2182
Normal theory tests for indirect effect
Effect se Z p
.1380 .0410 3.3667 .0008
******************** ANALYSIS NOTES AND WARNINGS *********************
****
Number of bootstrap samples for bias corrected bootstrap confidence intervals:
5000
Level of confidence for all confidence intervals in output:
95.00
NOTE: Some cases were deleted due to missing data. The number of such cases was:
NOTE: Kappa-squared is disabled from output as of version 2.16.
------ END MATRIX -----
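The bootstrap confidence intervals in the output above come from repeatedly resampling the
respondents and re-estimating the indirect effect. The Python sketch below illustrates the
idea on synthetic data. It is not the PROCESS procedure: it uses a plain percentile interval
rather than the bias-corrected interval PROCESS reports, 1,000 rather than 5,000 resamples,
and simulated values instead of the survey data. All function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope of the simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def partial_slope(m, x, y):
    """Coefficient of m when y is regressed on m and x together."""
    n = len(y)
    mm, mx, my = sum(m) / n, sum(x) / n, sum(y) / n
    smm = sum((u - mm) ** 2 for u in m)
    sxx = sum((v - mx) ** 2 for v in x)
    smx = sum((u - mm) * (v - mx) for u, v in zip(m, x))
    smy = sum((u - mm) * (w - my) for u, w in zip(m, y))
    sxy = sum((v - mx) * (w - my) for v, w in zip(x, y))
    return (smy * sxx - smx * sxy) / (smm * sxx - smx ** 2)

def indirect_effect(x, m, y):
    """Product of the X -> M path and the M -> Y path (controlling for X)."""
    return ols_slope(x, m) * partial_slope(m, x, y)

def bootstrap_ci(x, m, y, n_boot=1000, seed=1):
    """95% percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Synthetic data with a built-in indirect effect of 0.4 * 0.6 (illustrative only).
data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(171)]
m = [0.4 * xi + data_rng.gauss(0, 1) for xi in x]
y = [0.6 * mi + 0.1 * xi + data_rng.gauss(0, 1) for mi, xi in zip(m, x)]

low, high = bootstrap_ci(x, m, y)
print(indirect_effect(x, m, y), low, high)
```

An indirect effect is read as significant when the resulting interval does not contain zero,
which is how the BootLLCI and BootULCI columns are interpreted in this thesis.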
Appendix 7: PROCESS model 6 and output SPSS
Figure 17: Model 6 template for PROCESS for SPSS and SAS by Andrew F. Hayes and The Guilford Press
In the following SPSS output, based on Model 6 in Figure 17, the indirect effect (Ind2)
of Web Skills on Use Intention through Perceived Ease of Use and Perceived Usefulness is
presented in the row labelled Ind2 under “Indirect effect(s) of X on Y”.
Run MATRIX procedure:
************* PROCESS Procedure for SPSS Release 2.16.3 ******************
Written by Andrew F. Hayes, Ph.D. www.afhayes.com
**************************************************************************
Model = 6
Y = UI_TOT
X = WS_TOT
M1 = PEU_TOT
M2 = PU_TOT
Sample size
171
**************************************************************************
Outcome: PEU_TOT
Model Summary
.4170 .1739 .4695 35.5695 1.0000 169.0000 .0000
Model
coeff se t p LLCI ULCI
constant 3.5387 .3638 9.7272 .0000 2.8205 4.2568
WS_TOT .3706 .0621 5.9640 .0000 .2479 .4933
**************************************************************************
Outcome: PU_TOT
Model Summary
.3086 .0952 1.5658 8.8409 2.0000 168.0000 .0002
Model
constant 1.8736 .8298 2.2580 .0252 .2355 3.5117
PEU_TOT .5725 .1405 4.0752 .0001 .2951 .8498
WS_TOT -.0945 .1249 -.7571 .4500 -.3410 .1520
**************************************************************************
Outcome: UI_TOT
Model Summary
.6309 .3980 1.5137 36.8059 3.0000 167.0000 .0000
Model
constant .4151 .8281 .5013 .6168 -1.2198 2.0501
PEU_TOT .1068 .1448 .7379 .4616 -.1790 .3927
PU_TOT .7323 .0759 9.6534 .0000 .5825 .8820
WS_TOT .0584 .1230 .4749 .6355 -.1844 .3012
******************** DIRECT AND INDIRECT EFFECTS ***********************
**
Direct effect of X on Y
Effect SE t p LLCI ULCI
.0584 .1230 .4749 .6355 -.1844 .3012
Indirect effect(s) of X on Y
Total: .1257 .1151 -.0947 .3547
Ind1 : .0396 .0558 -.0598 .1632
Ind2 : .1554 .0544 .0705 .2857
Ind3 : -.0692 .0988 -.2721 .1141
Indirect effect key
Ind1 : WS_TOT -> PEU_TOT -> UI_TOT
Ind2 : WS_TOT -> PEU_TOT -> PU_TOT -> UI_TOT
Ind3 : WS_TOT -> PU_TOT -> UI_TOT
******************** ANALYSIS NOTES AND WARNINGS *********************
****
Number of bootstrap samples for bias corrected bootstrap confidence intervals:
5000
Level of confidence for all confidence intervals in output:
95.00
NOTE: Some cases were deleted due to missing data. The number of such cases was:
------ END MATRIX -----