
EXPLORING AVATAR &

CONSUMER BEHAVIOR
“Marketing Research Project”

Presented by:
Group 7
Aastha Gaur (H2021001)
Adarsh Srivastava (2021188)
Aditya Dhandhania (2021127)
Anurag Kumar (2021071)
Chandradeep Dessai (2021206)
Priyansh Verma (2021035)
Sarthak Sadani (2021175)
Presented to:

Dr. Anubhav Mishra


Table of Contents
INTRODUCTION
LITERATURE REVIEW
METHODOLOGY
RESEARCH OBJECTIVE
QUALITATIVE RESEARCH
Model and Interview Schedule for Qualitative Interview
Major Themes Generated from Qualitative In-Depth Interviews
QUANTITATIVE RESEARCH
DATA ANALYSIS
Demographic details of the respondents
Exploratory factor analysis for grouping variables
Reliability Tests
Tests of Normality
HYPOTHESIS TESTING
Independent T-Tests
One-way Analysis of Variance (ANOVA)
Two-way ANOVA
One-way Multivariate Analysis of Variance (MANOVA)
Two-way Multivariate Analysis of Variance (MANOVA)
SIMPLE LINEAR REGRESSION
MULTIPLE LINEAR REGRESSION
DISCRIMINANT ANALYSIS
Cluster Analysis
K-Means Clustering
FINDINGS AND CONCLUSION
APPENDIX 1: QUALITATIVE RESEARCH QUESTIONNAIRE
Interview Schedule
APPENDIX 2: QUANTITATIVE QUESTIONNAIRE OF THE STUDY
References
“Avatar and Consumer Behavior”
INTRODUCTION
An avatar is a pictorial representation in a chat environment (Martin Holzwarth, 2006).
The use of these virtual agents to represent a company, in forms such as chatbots,
gaming characters, and virtual influencers, has grown considerably over the past decade
or so. Companies use them to advertise, to provide customer service, and to establish a
relationship between brand and user. Avatars create an immersive experience for customers
and may influence their buying behavior. The goal of this study is to determine the
relationship between avatars and consumer behavior, and to establish how the appearance,
gender, reliability, and similar attributes of an avatar affect consumers.
For the purpose of this study, the focus is on chatbots (avatars on brand websites) and
on the customer satisfaction aspect of marketing.

LITERATURE REVIEW
• Hemp, P., 2006. Avatar-based marketing. Harvard business review, 84(6),
pp.48-57.
Three-dimensional, avatar-based virtual worlds like Second Life, the most well-
known and fastest-growing one, present a promising corporate communication
channel for interactive advertising, brand marketing, and advergaming. This study,
which draws on presence literature, investigates the effects of spokes-avatars'
presence (versus absence) and consumers' multimodal interactions with these
spokes-avatars on changes in the consumers' involvement with the product, attitude
toward the product, and enjoyment of the online shopping experience. Additionally,
this study looks into how consumers' assessments of spokes-avatars' physical
attractiveness and the informational value of the commercial message are affected
by the spokes-avatars' human versus non-human physical characteristics. According to a
path analysis, the spokes-avatars' physical attractiveness plays a mediating role.
• Miao, F., Kozlenkova, I.V., Wang, H., Xie, T. and Palmatier, R.W., 2022. An
emerging theory of avatar marketing. Journal of Marketing, 86(1), pp.67-90.
In modern marketing methods, avatars are becoming more and more common, yet
in actual use, their effectiveness for reaching performance outcomes (like purchase
likelihood) differs greatly. The related scholarly literature is disjointed and lacks
conceptual coherence as well as definitional consistency. The three key
contributions this article brings to managerial theory and practise are as follows.
This study first defines and critically assesses the term's essential conceptual
components, presents a definition based on this analysis, then provides a typology
of avatar design components to address the ambiguity surrounding its meaning. Second,
according to the proposed 2 × 2 avatar taxonomy, the alignment of an avatar's form
realism and behavioural realism, across diverse circumstances, provides a parsimonious
explanation for avatar efficacy. Third, the authors combine existing
research, business practises, and insights from essential avatar components to create
an emergent theory of avatar marketing. This framework incorporates significant
managerial implications, research hypotheses, and fundamental theoretical insights
for this developing field of marketing strategy. Finally, the authors present a study
plan that will be used to test the hypotheses and insights and further current research.
• BARIŞ, A., 2020. A New Business Marketing Tool: Chatbot. GSI Journals
Serie B: Advancements in Business and Economics, 3(1), pp.31-46.
To meet the wants and needs of consumers, marketing tactics have changed along
with the rapidly evolving and increasing technologies. The global use of the internet
and messaging platforms, as well as recent developments in AI and machine
learning, have motivated businesses to concentrate on chatbots. A chatbot is an AI-based
computer programme that can actively communicate and converse with people. Chatbots are
also known as virtual assistants that are aware of human potential. According to the
paper, recent statistics show that chatbots are vital for brands' long-term survival.
The main goal of the investigation is to find out how organisations should use chatbots
for customer communication and how chatbots can help their marketing efforts.
L'Oréal Paris' 2017

chatbot, Beauty Gifter, is chosen as the case study. The study's findings indicate
that chatbots can be an excellent method of customer engagement, but firms should
pay close attention to how consumers think and use AI more effectively when
developing chatbots.
• Feine, J., Gnewuch, U., Morana, S. and Maedche, A., 2019, November. Gender
bias in chatbot design. In International Workshop on Chatbot Research and
Design (pp. 79-93). Springer, Cham.
According to a recent UNESCO survey, female voice-based conversational bots are
the most popular. It also describes the possible negative impacts this could have on
society. However, the survey largely focuses on voice-based conversational agents, and
chatbots (i.e., text-based conversational agents) were not included in its analysis.
Because chatbots might also be gendered in their design, the researchers used an
automated gender analysis approach to examine three gender-specific design cues in the
1,375 chatbots listed on the website chatbots.org. The gender
of the name was determined using two gender APIs, the gender of the avatar was
determined using a facial recognition API, and the gender-specific pronouns used
in the chatbot's description were examined using a text mining technique. The
findings imply that gender-specific cues are frequently incorporated into chatbot
design, and that the majority of chatbots are either expressly or implicitly built to
communicate a particular gender. More specifically, the majority of the chatbots have
female names and avatars and are described as female. This is especially clear in three
application domains (i.e., branded conversations, customer service, and sales). The
authors thus find evidence of a tendency to favour one gender (i.e., female) over the
other (i.e., male), which shows that chatbots in the wild are designed with a gender
bias. The researchers develop ideas as a
starting point for future conversations and research to lessen the gender bias in
chatbot design based on these findings.

METHODOLOGY
• Step 1: Before starting the project, we thoroughly reviewed many papers found through
EBSCO, ResearchGate, and Google Scholar and studied the subject in detail.
• Step 2: We then prepared a questionnaire of open-ended questions for in-depth interviews
(IDIs) to gather as much information as possible. Each group member conducted 5 IDIs,
giving 35 IDIs with respondents of different backgrounds, locations, ages, and genders.
• Step 3: We transcribed the interviews verbatim and analyzed them for codes, themes, and
categories. A word cloud analysis and a manual analysis of the themes emerging from the
data helped us identify the most common factors affecting consumer behavior.
• Step 4: We then built our theoretical model for carrying out the rest of the research.
Based on the model, we prepared research objectives and hypotheses for the quantitative
part.
• Step 5: We identified scales for measuring our constructs based on our research in
various marketing journals.
• Step 6: Data were collected by circulating a Google Form, giving a sample of n = 207
across various age groups, professions, locations, and genders. We collated the responses
in Excel format for analysis in SPSS.
• Step 7: The data were cleaned, and a correlation analysis was conducted, followed by
Cronbach's alpha test of reliability to check the internal consistency of the factors
formed.
• Step 8: We then ran t-tests, ANOVA, MANOVA, regression, and cluster analysis in SPSS
to test the hypotheses.
• Step 9: From the data, a model was derived with the help of Structural Equation
Modelling in AMOS. Confirmatory factor analysis and path analysis were conducted to
evaluate the model.
• Step 10: Based on the significance of the tests above, we were able to comment on the
model and draw conclusions about the hypotheses.
In a nutshell, we followed these steps:
1. Literature Review
2. In-depth Interview
3. Coding and Theme
4. Hypothesis
5. Survey
6. Hypothesis testing
7. Findings
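
The word cloud analysis mentioned in Step 3 rests on simple word-frequency counts. The following is a minimal sketch in Python, assuming the transcripts are available as plain strings; the sample sentences and the stop-word list are hypothetical placeholders, not data from the study.

```python
import re
from collections import Counter

# Hypothetical stop words; a real analysis would use a fuller list.
STOP_WORDS = {"the", "a", "an", "is", "it", "to", "and", "of", "i", "with", "when"}

def word_frequencies(transcripts):
    """Count how often each non-stop word appears across all transcripts."""
    counts = Counter()
    for text in transcripts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Hypothetical interview excerpts, for illustration only.
sample = [
    "I trust the chatbot when the avatar looks human.",
    "The chatbot is quick, and the avatar is friendly.",
]
freq = word_frequencies(sample)
print(freq.most_common(2))
```

The most frequent words are the ones a word cloud draws largest, so they serve as candidate codes for the manual theme analysis.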

RESEARCH OBJECTIVE
1. To understand the concept of a chatbot and its avatar
2. To identify the effects of a chatbot's avatar
3. To examine the effect of a chatbot's avatar on consumer behavior
4. To analyze variation in consumer behavior regarding avatars based on
demographic factors such as age, gender, and education level

FACTORS RELATED TO AN AVATAR


1. Physical Appearance:
a) Gender – An avatar can take a male or a female form. Some organizations that cater
to a particular gender prefer an avatar of that gender only. For example, ZARA may
prefer a female avatar since it caters to the female population, while Beardo may
prefer a male avatar.

b) Form – Some organizations prefer a humanized avatar to give a human touch, while
others prefer an animated form.

c) Age – An avatar can take the form of a child, a middle-aged person, or an old person,
depending on the consumer base it targets.

d) Color – The avatar can be light-toned or dark-toned, reflecting the skin tone of most
of the target population. Some organizations may prefer a black avatar to promote
inclusivity.

e) Representation – A chatbot may be represented by a name only, a picture only, or
both. The choice of name depends on the population the organization serves. A picture
can accompany the name or suffice without one. Some organizations prefer both a name
and a picture.

f) Representation of culture – An organization can represent a culture through its
avatar. The avatar can cater to a specific population or represent no culture at all.

2. User Experience:
a) Dimension – An avatar can have a 1-D, 2-D, or 3-D form. Over time, many
organizations have opted for a 3-D avatar.

b) Movement – A chatbot can be static, or it can be movable to any part of the screen.

c) Professionalism – A chatbot can be professional or can have a personal touch, for
example, Alexa.

d) Task-oriented – Generally, chatbots are known to be task-oriented, but certain bots
have a mind of their own.

3. Ease of Use:
a) Mode of communication – A chatbot can write, speak, or do both. Organizations have
started to incorporate both the written and the spoken mode in their bots.

b) Language – Some chatbots communicate in English (as a universal language). Others
use a regional language in order to feel more familiar.

4. Security/Privacy:
a) Trust – Since chatbots are a fairly new innovation, many people struggle to trust the
bot and are apprehensive about sharing confidential information.

b) Provider – The chatbot service can be provided by Google or by an independent
organization.

c) Control – A chatbot can be software-controlled, with pre-fed responses, or
human-controlled, where a human addresses the query.

QUANTITATIVE RESEARCH

Conceptual Model of Study

DATA ANALYSIS
Information about the respondents' demographics
The survey gathered the respondents' demographic data along with the questions assessing
avatars and consumer behavior. This helps in learning how different types of respondents
view interacting with a chatbot, which in turn can help marketers develop effective
tactics. The demographic profile is shown in the table below.
Demographic details of the respondents

Demographic   Group             Frequency   Percentage
Gender        Male              127         61.4
              Female            80          38.6
              Total             207         100
Age           Below 18          10          4.8
              18-25             120         58.0
              25-35             68          32.9
              35-45             3           1.4
              45-60             3           1.4
              Above 60          3           1.4
              Total             207         100
Education     12th and below    9           4.3
              Graduation        119         57.5
              Post Graduation   79          38.2
              Total             207         100
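
The percentages in the demographic table can be reproduced from the frequencies. A minimal sketch in Python, using only the counts reported in the table; the function name is illustrative.

```python
# Frequencies copied from the demographic table above.
demographics = {
    "Gender": {"Male": 127, "Female": 80},
    "Age": {"Below 18": 10, "18-25": 120, "25-35": 68,
            "35-45": 3, "45-60": 3, "Above 60": 3},
    "Education": {"12th and below": 9, "Graduation": 119, "Post Graduation": 79},
}

def percentages(groups):
    """Return each group's share of the total, in percent, rounded to 1 decimal."""
    total = sum(groups.values())
    return {name: round(100 * n / total, 1) for name, n in groups.items()}

for variable, groups in demographics.items():
    print(variable, sum(groups.values()), percentages(groups))
```

Each variable sums to the full sample of 207, which confirms that no respondent is missing from any breakdown.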

Exploratory factor analysis using KMO and Bartlett's Test:


An exploratory factor analysis was performed to group the survey statements into meaningful
variables. Based on the correlations between the statements, the analysis reduced the
number of statements and grouped them into relevant categories.

KMO and Bartlett's Test


Kaiser-Meyer-Olkin Measure of Sampling Adequacy          .822
Bartlett's Test of Sphericity     Approx. Chi-Square     1459.097
                                  df                     190
                                  Sig.                   .000
The KMO Measure of Sampling Adequacy examines whether the data are suitable for factor
analysis. A KMO score of .822 is considered good.

Null Hypothesis for Bartlett’s Test: The correlation matrix is an identity matrix.

Bartlett's test is significant: SPSS reports Sig. = .000, i.e. p < 0.001, which is below
0.05. The null hypothesis is therefore rejected, indicating that the correlation matrix
is not an identity matrix and that the variables are correlated. As a result, the data
are suitable for factor analysis.
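
For reference, Bartlett's statistic is commonly computed as chi-square = -[(n - 1) - (2p + 5)/6] · ln|R| with p(p - 1)/2 degrees of freedom, where R is the p × p correlation matrix and n the sample size. With p = 20 items this gives 20 · 19 / 2 = 190 degrees of freedom, matching the table. The sketch below implements that formula in plain Python; the 3 × 3 matrix is a hypothetical example, not the study's correlation matrix, and the function name is illustrative.

```python
import math

def bartlett_sphericity(corr, n):
    """Bartlett's test statistic and degrees of freedom for a correlation matrix.

    corr: square correlation matrix as a list of lists; n: sample size.
    """
    p = len(corr)
    df = p * (p - 1) // 2
    # Determinant via Gaussian elimination with partial pivoting.
    m = [row[:] for row in corr]
    det = 1.0
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(m[r][i]))
        if m[pivot][i] == 0:
            return float("inf"), df  # singular matrix: statistic is unbounded
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, p):
            factor = m[r][i] / m[i][i]
            for c in range(i, p):
                m[r][c] -= factor * m[i][c]
    chi_square = -((n - 1) - (2 * p + 5) / 6) * math.log(det)
    return chi_square, df

# Hypothetical correlation matrix for three items, with the study's n = 207.
toy = [[1.0, 0.5, 0.3],
       [0.5, 1.0, 0.4],
       [0.3, 0.4, 1.0]]
print(bartlett_sphericity(toy, 207))
```

An identity matrix has determinant 1 and therefore a statistic of 0; the stronger the correlations, the smaller the determinant and the larger the chi-square value.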

Total Variance Explained


Component   Initial Eigenvalues: Total, % of Variance, Cumulative %   Extraction Sums of Squared Loadings: Total, % of Variance, Cumulative %
1 5.667 28.333 28.333 5.667 28.333 28.333
2 2.109 10.547 38.881 2.109 10.547 38.881
3 1.612 8.059 46.940 1.612 8.059 46.940
4 1.430 7.151 54.091 1.430 7.151 54.091
5 1.066 5.328 59.419 1.066 5.328 59.419
6 1.008 5.038 64.456 1.008 5.038 64.456
7 .901 4.504 68.960
8 .808 4.042 73.002
9 .784 3.918 76.919
10 .632 3.160 80.080
11 .606 3.030 83.109
12 .521 2.605 85.715
13 .483 2.416 88.131
14 .447 2.236 90.367
15 .422 2.110 92.477
16 .399 1.996 94.473
17 .339 1.695 96.169
18 .307 1.534 97.703
19 .279 1.395 99.098
20 .180 .902 100.000
Extraction Method: Principal Component Analysis.

Eigenvalues show how much of the total variance each principal component accounts for.
The eigenvalues of a correlation matrix cannot be negative, and here all of them are
greater than zero. The table lists 20 components in total, one per item. Only the six
components with eigenvalues greater than 1 appear in the Extraction Sums of Squared
Loadings columns, and together they explain 64.456% of the total variance.
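
The percentage columns of the Total Variance Explained table follow directly from the eigenvalues: each of the 20 standardized items contributes one unit of variance, so a component's share is its eigenvalue divided by 20. The sketch below recomputes this in Python. That the six extracted components were chosen by the eigenvalue-greater-than-1 rule (the Kaiser criterion) is an assumption inferred from the table, not something the report states.

```python
# Initial eigenvalues copied from the Total Variance Explained table.
eigenvalues = [5.667, 2.109, 1.612, 1.430, 1.066, 1.008, 0.901, 0.808, 0.784,
               0.632, 0.606, 0.521, 0.483, 0.447, 0.422, 0.399, 0.339, 0.307,
               0.279, 0.180]

n_items = len(eigenvalues)  # each standardized item contributes one unit of variance

# Share of total variance per component, in percent.
pct_variance = [100 * ev / n_items for ev in eigenvalues]

# Assumed rule (Kaiser criterion): keep components whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1]
cumulative_retained = sum(pct_variance[:len(retained)])

print(len(retained), round(cumulative_retained, 2))
```

Small differences from the table in the third decimal place come from the eigenvalues being rounded to three decimals.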

Communalities
Initial Extraction
S1 1.000 .776
S2 1.000 .816
S3 1.000 .567
C1 1.000 .650
C2 1.000 .599
C3 1.000 .621
A1 1.000 .697
A2 1.000 .682
A3 1.000 .546
P1 1.000 .707
P2 1.000 .641
P3 1.000 .670
P4 1.000 .367
R1 1.000 .604
R2 1.000 .601
R3 1.000 .610
CS1 1.000 .716
CS2 1.000 .753
CS3 1.000 .587
CS4 1.000 .682
Extraction Method: Principal Component Analysis.

Small values indicate variables that do not fit well with the factor solution and should possibly
be dropped from the analysis. The lowest communality is .367 (P4) and all others are above .5,
so none of the items is dropped.

Factor Analysis of Factors Affecting Consumer Behavior


Item code   Factor loading   Item
S1          .848             "I am fairly comfortable with sharing my personal information while talking to a chatbot."
S2          .846             On a scale of 1-5, how comfortable will you be in discussing any sensitive topic with a chatbot?
S3          .320             How likely are you to prefer your regional language over the English language (of the chatbot)?
C1          .756             How much does the mode of communication with the chatbot (Written or Verbal) matters to you as a customer?
C2          .683             How much does a machine replacing a human to address your queries affect you as a customer?
C3          .690             How strongly will you prefer a human form avatar over a nonhuman one?
A1          .820             What impact does the gender of a chatbot has while interacting with one?
A2          .806             "Name is not necessarily needed when conversing with a chatbot. A picture as a depiction suffices". How strongly do you agree with this statement?
A3          .693             "Soft-colored chatbot is preferred while interacting with one."
P1          .821             Rate the response quality you experienced the last time you interacted with a chatbot.
P2          .344             "Chatbots provide concise and precise solutions."
P3          .541             Experience in terms of the speed of responses coming from the chatbot?
P4          .464             "I consider the solutions provided by the chatbot."
R1          .746             Would you want the chatbot to stick to professionalism over a personal approach?
R2          .716             How much do you prefer a dynamic (in terms of expressions) chatbot over a fixed one?
R3          .342             How would you rate your experience the last time you interacted with a chatbot?
CS1         .821             Were your queries addressed precisely the last time you interacted with a chatbot?
CS2         .841             "Chatbots are known for effectively addressing customer issues". How strongly do you agree with this statement?
CS3         .737             How quickly were your queries resolved the last time you interacted with a chatbot?
CS4         .808             Do you prefer an animated form of the chatbot over a human form?

The factor analysis was conducted using the varimax rotation method. Items such as S3, P2,
and R3 (loadings of .320, .344, and .342) did not load well on their assumed factors.
Given that the constructs were very similar in nature, the research is carried forward
without removing any items.
Reliability Test:
The reliability test examines the internal consistency of the variables. The most widely
used measure of reliability is Cronbach's alpha. Alpha typically ranges from 0 to 1;
values greater than 0.7 are generally regarded as good, and in this study any value
above 0.5 is treated as acceptable.
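
Cronbach's alpha for a scale of k items is alpha = k/(k - 1) · (1 - sum of item variances / variance of the total score). The following is a minimal Python sketch of that formula; the six responses per item are hypothetical and only illustrate the calculation, since the study itself computed alpha in SPSS.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item,
    all of equal length, one score per respondent)."""
    k = len(items)
    item_variances = sum(variance(scores) for scores in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum score
    return (k / (k - 1)) * (1 - item_variances / variance(totals))

# Hypothetical 1-5 responses from six respondents to a three-item scale.
example = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 3, 4, 3],
    [5, 3, 4, 2, 5, 2],
]
print(round(cronbach_alpha(example), 3))
```

Alpha rises when the items move together across respondents, because the variance of the total score then exceeds the sum of the individual item variances by a wide margin.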
Reliability Test of The Factors Affecting Customer Satisfaction
Item Code   Item
S1          "I am fairly comfortable with sharing my personal information while talking to a chatbot."
S2          On a scale of 1-5, how comfortable will you be in discussing any sensitive topic with a chatbot?
S3          "I consider the solutions provided by the chatbot."
            No. of Items: 3   Cronbach's Alpha: 0.666
C1          How likely are you to prefer your regional language over the English language (of the chatbot)?
C2          How much does the mode of communication with the chatbot (Written or Verbal) matter to you as a customer?
C3          Would you want the chatbot to stick to professionalism over a personal approach?
            No. of Items: 3   Cronbach's Alpha: 0.612
A1          How much does a machine replacing a human to address your queries affect you as a customer?
A2          How strongly will you prefer a human form avatar over a non-human one?
A3          How much do you prefer a dynamic (in terms of expressions) chatbot over a fixed one?
            No. of Items: 3   Cronbach's Alpha: 0.702
P1          What impact does the gender of a chatbot has while interacting with one?
P2          "Name is not necessarily needed when conversing with a chatbot. A picture as a depiction suffices". How strongly do you agree with this statement?
P3          "Soft-colored chatbot is preferred while interacting with one."
P4          Do you prefer an animated form of the chatbot over a human form?
            No. of Items: 4   Cronbach's Alpha: 0.568
R1          Rate the response quality you experienced the last time you interacted with a chatbot.
R2          "Chatbots provide concise and precise solutions."
R3          Experience in terms of the speed of responses coming from the chatbot?
            No. of Items: 3   Cronbach's Alpha: 0.602
CS1         How would you rate your experience the last time you interacted with a chatbot?
CS2         Were your queries addressed precisely the last time you interacted with a chatbot?
CS3         "Chatbots are known for effectively addressing customer issues". How strongly do you agree with this statement?
CS4         How quickly were your queries resolved the last time you interacted with a chatbot?
            No. of Items: 4   Cronbach's Alpha: 0.866

Normality Test:
Normality tests determine whether the data conform to a bell-shaped normal distribution.
Many statistical procedures assume that the data are normal. Normality can be checked
visually, with a histogram or other graphical methods, or formally, with the
Kolmogorov-Smirnov (KS) or Shapiro-Wilk test. Shapiro-Wilk is the most popular. The null
hypothesis of these tests is that the data meet the normality assumption, so a p-value
greater than 0.05 means that the null hypothesis is not rejected and the data are
treated as normal.
In the table below, the Shapiro-Wilk and KS p-values are all less than 0.05, so the null
hypothesis that the data are normal is rejected. We conclude that our data do not meet
the normality assumption. Non-parametric tests are normally used instead of parametric
tests when the data are not normal. However, normality can be difficult to achieve in
social science studies. The sample size may also be one of the causes: we consider the
study's sample size, n = 207, inadequate and regard 600 or more as a reasonable size for
testing normality. Even though the data violate the normality assumption, we nonetheless
run parametric tests on them.

Tests of Normality
            Kolmogorov-Smirnov             Shapiro-Wilk
            Statistic   df    Sig.         Statistic   df    Sig.
S_Mean      .127        207   .000         .968        207   .000
C_Mean      .092        207   .000         .980        207   .004
A_Mean      .126        207   .000         .968        207   .000
P_Mean      .080        207   .003         .986        207   .041
R_Mean      .088        207   .000         .977        207   .002
CS_Mean     .096        207   .000         .975        207   .001

HYPOTHESIS TESTING
1. T-test:

Group Statistics
           Gender   N     Mean     Std. Deviation   Std. Error Mean
Mean_CS    male     127   3.1280   .92307           .08191
           female   80    3.3188   .95630           .10692

In the group statistics, gender is coded 1.0 for male and 2.0 for female; the sample
contains 127 male and 80 female respondents. The mean is higher for the female group.

Moving to the independent samples test itself, there are two rows to read from: one for
equal variances assumed and one for equal variances not assumed.
Which row to use depends on Levene's test of equality of variances, which tests the null
hypothesis that the population variances are equal.
If we fail to reject this null hypothesis, we use the equal-variances-assumed row. In
our case, the significance of Levene's test is 0.776, which is above the alpha value of
0.05, so we fail to reject the null hypothesis. The variances are therefore homogeneous,
a property called homoscedasticity.
The two-tailed significance of the t-test is 0.155.
Null hypothesis: There is no significant difference between the group means.
Alternate hypothesis: There is a significant difference between the group means.
Since 0.155 > 0.05, we fail to reject the null hypothesis, which means there is no
significant difference between the two groups.
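
The equal-variances-assumed t statistic can be recomputed from the Group Statistics table alone. The sketch below is a Python illustration, not the SPSS procedure itself. The p-value uses a normal approximation to the t distribution, which is an assumption made here to stay within the standard library; with 205 degrees of freedom it lands close to the reported 0.155 but is not exact.

```python
import math
from statistics import NormalDist

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) independent-samples t statistic and its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Values from the Group Statistics table (male, then female).
t_stat, df = pooled_t(3.1280, 0.92307, 127, 3.3188, 0.95630, 80)

# Assumption: with df this large, the normal distribution approximates the
# t distribution, so this p-value is only close to the exact one.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(round(t_stat, 3), df, round(p_approx, 3))
```

The negative sign of the statistic reflects only the order of subtraction (male minus female); the two-tailed test ignores it.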

2. One-way ANOVA:

The values 1.0, 2.0, 3.0, 4.0, 5.0, and 6.0 denote the age groups, and N represents the
number of respondents in each group; for example, the 18-25 age group has 120 respondents.

Test of Homogeneity of Variances:

Mean_CS
Levene Statistic df1 df2 Sig.
.430 5 201 .828

Null hypothesis: There is homogeneity of variances.

Alternate hypothesis: There is no homogeneity of variances.
Levene's test (the test of homogeneity of variances) gives an insignificant result, i.e.
p-value > alpha value (0.828 > 0.05).
We can therefore say that the variances are homogeneous.
Now for the ANOVA table,

Mean_CS
                 Sum of Squares   df    Mean Square   F      Sig.
Between Groups   2.804            5     .561          .631   .676
Within Groups    178.588          201   .888
Total            181.392          206
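
The F value in the ANOVA table is the ratio of the two mean squares, each of which is a sum of squares divided by its degrees of freedom. A short Python check using only the numbers from the table:

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_between, df_between = 2.804, 5
ss_within, df_within = 178.588, 201

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_ratio = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
```

An F ratio below 1 means the age groups differ from each other less than respondents differ within a group, which is consistent with the non-significant result.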

Null hypothesis: There is no statistically significant difference in overall satisfaction
levels between age groups.
Alternate hypothesis: There is a statistically significant difference in overall
satisfaction levels between age groups.
Since the p-value is greater than the alpha value (0.676 > 0.05), we fail to reject the
null hypothesis: there is no statistically significant difference in overall satisfaction
between age groups.
Consequently, there is no need to evaluate the multiple comparisons table, since the
pairwise differences between the levels are not of interest once the null hypothesis is
retained.

Multiple Comparisons
Dependent Variable: Mean_CS
(I) Age   (J) Age   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
Tukey HSD 1.0 2.0 -.04583 .31025 1.000 -.9385 .8469
3.0 .02426 .31924 1.000 -.8943 .9429
4.0 .42500 .62050 .983 -1.3604 2.2104
5.0 -.15833 .62050 1.000 -1.9438 1.6271
6.0 -.82500 .62050 .768 -2.6104 .9604
2.0 1.0 .04583 .31025 1.000 -.8469 .9385
3.0 .07010 .14307 .996 -.3416 .4818
4.0 .47083 .55097 .957 -1.1145 2.0562
5.0 -.11250 .55097 1.000 -1.6979 1.4729
6.0 -.77917 .55097 .718 -2.3645 .8062
3.0 1.0 -.02426 .31924 1.000 -.9429 .8943
2.0 -.07010 .14307 .996 -.4818 .3416
4.0 .40074 .55609 .979 -1.1994 2.0008
5.0 -.18260 .55609 .999 -1.7827 1.4175
6.0 -.84926 .55609 .647 -2.4494 .7508
4.0 1.0 -.42500 .62050 .983 -2.2104 1.3604
2.0 -.47083 .55097 .957 -2.0562 1.1145
3.0 -.40074 .55609 .979 -2.0008 1.1994
5.0 -.58333 .76963 .974 -2.7979 1.6312
6.0 -1.25000 .76963 .584 -3.4645 .9645
5.0 1.0 .15833 .62050 1.000 -1.6271 1.9438
2.0 .11250 .55097 1.000 -1.4729 1.6979
3.0 .18260 .55609 .999 -1.4175 1.7827
4.0 .58333 .76963 .974 -1.6312 2.7979
6.0 -.66667 .76963 .954 -2.8812 1.5479
6.0 1.0 .82500 .62050 .768 -.9604 2.6104
2.0 .77917 .55097 .718 -.8062 2.3645
3.0 .84926 .55609 .647 -.7508 2.4494
4.0 1.25000 .76963 .584 -.9645 3.4645
5.0 .66667 .76963 .954 -1.5479 2.8812
Games-Howell 1.0 2.0 -.04583 .37479 1.000 -1.3482 1.2566
3.0 .02426 .38310 1.000 -1.2849 1.3334
4.0 .42500 .44261 .920 -1.1198 1.9698
5.0 -.15833 .68824 1.000 -3.5421 3.2255
6.0 -.82500 .72749 .846 -4.5590 2.9090
2.0 1.0 .04583 .37479 1.000 -1.2566 1.3482
3.0 .07010 .14294 .996 -.3431 .4833
4.0 .47083 .26375 .584 -1.2832 2.2248
5.0 -.11250 .58936 1.000 -4.7790 4.5540
6.0 -.77917 .63474 .806 -5.8372 4.2789
3.0 1.0 -.02426 .38310 1.000 -1.3334 1.2849
2.0 -.07010 .14294 .996 -.4833 .3431
4.0 .40074 .27544 .710 -1.1871 1.9886
5.0 -.18260 .59468 .999 -4.7105 4.3453
6.0 -.84926 .63969 .767 -5.7758 4.0773
4.0 1.0 -.42500 .44261 .920 -1.9698 1.1198
2.0 -.47083 .26375 .584 -2.2248 1.2832
3.0 -.40074 .27544 .710 -1.9886 1.1871
5.0 -.58333 .63465 .917 -4.4865 3.3198
6.0 -1.25000 .67700 .557 -5.5398 3.0398
5.0 1.0 .15833 .68824 1.000 -3.2255 3.5421
2.0 .11250 .58936 1.000 -4.5540 4.7790
3.0 .18260 .59468 .999 -4.3453 4.7105
4.0 .58333 .63465 .917 -3.3198 4.4865
6.0 -.66667 .85797 .958 -4.7477 3.4144
6.0 1.0 .82500 .72749 .846 -2.9090 4.5590
2.0 .77917 .63474 .806 -4.2789 5.8372
3.0 .84926 .63969 .767 -4.0773 5.7758
4.0 1.25000 .67700 .557 -3.0398 5.5398
5.0 .66667 .85797 .958 -3.4144 4.7477

3. Two-way ANOVA
The Between-Subjects Factors table lists the two factors, Gender with two levels and
Education Qualification with three levels, along with the sample size at each level.

Between-Subjects Factors
                          Value   Label                N
Gender                    1.0     Male                 127
                          2.0     Female               80
Education Qualification   2.0     class 12 or below    9
                          3.0     Bachelors degree     119
                          4.0     post grad or above   79

The next table shows the descriptive statistics: the mean satisfaction score for males
and females at each level of educational qualification.

Descriptive Statistics:

Dependent Variable: Mean_CS


Gender Education Qualification Mean Std. Deviation N
Male class 12 or below 2.7000 .54199 5
bachelors degree 3.1646 .93016 79
post grad or above 3.1105 .94696 43
Total 3.1280 .92307 127
Female class 12 or below 3.5625 1.08733 4
bachelors degree 3.1312 1.05458 40
post grad or above 3.5000 .80178 36
Total 3.3188 .95630 80
Total class 12 or below 3.0833 .89268 9
bachelors degree 3.1534 .96940 119
post grad or above 3.2880 .89969 79
Total 3.2017 .93837 207

Moving down to Leven’s test of equality of error variance,

Levene's Test of Equality of Error Variancesa

Dependent Variable: Mean_CS


F df1 df2 Sig.
.908 5 201 .477
Tests the null hypothesis that the error variance of the dependent variable is equal across
groups.
a.Design:
Intercept+Gender + EducationQualification + Gender * Education Qualification

Null Hypothesis: The error variances are homogeneous across groups.
Alternate Hypothesis: The error variances are not homogeneous across groups.
As the p-value of 0.477 is greater than the alpha value of 0.05, we fail to reject the null
hypothesis here.
The assumption of homogeneity of variances is therefore met: the error variance can be
treated as equal across all groups.

Now, analysing the table of tests of between-subjects effects, no factor, whether gender or
education qualification, shows a statistically significant effect on the dependent variable,
overall satisfaction (Mean_CS).

Gender
Dependent Variable: Mean_CS
Std. 95% Confidence Interval
Gender Mean Error Lower Bound Upper Bound
male 2.992 .151 2.693 3.290
female 3.398 .172 3.060 3.736

Then we have the grand mean of the overall satisfaction. Finally, the multiple comparisons
show no significant difference between any pair of education qualification levels, as every
p-value is greater than the alpha value.
4.One-Way MANOVA
MANOVA, which is an extension of ANOVA, is used to analyse how two or more
continuous dependent variables are affected by a categorical independent variable. It
handles several metric DVs at once, compared to just one metric DV in an ANOVA.
Conditions and Assumptions required:
1) Independent observations
2) Normality of DVs
3) No outliers
4) DV is metric while IV is categorical
5) DVs must be correlated but not highly correlated
6) Linear relation between DVs and IVs
Education level’s relation with scores in customer satisfaction and response quality
parameters

Tests of Normality
Kolmogorov-Smirnova Shapiro-Wilk
Statistic df Sig. Statistic df Sig.
Mean_CS .096 207 .000 .975 207 .001
Mean_R .087 207 .001 .978 207 .002
a. Lilliefors Significance Correction

As can be seen from the Shapiro-Wilk normality test, the p-values for both parameters
(0.001 for Mean_CS and 0.002 for Mean_R) are less than 0.05, so neither variable follows a
normal distribution; the Q-Q plot points the same way. This is attributed to the presence of
outliers. The outliers need not be removed in this case, because the current study is based
on a Likert scale, where extreme responses are inevitable.

Correlations
                                Mean_CS   Mean_R
Mean_CS   Pearson Correlation   1         .720**
          Sig. (2-tailed)                 .000
          N                     207       207
Mean_R    Pearson Correlation   .720**    1
          Sig. (2-tailed)       .000
          N                     207       207
**. Correlation is significant at the 0.01 level (2-tailed).

The correlation test confirms that the DVs are positively correlated (0.720) but not too
highly correlated. This means that the DVs are not multicollinear. Hence the data are taken
to satisfy the required conditions, and the research can proceed to One-Way MANOVA for
hypothesis testing.

Main Null Hypothesis 1: There is no significant difference between the means of the linear
combination of customer satisfaction and response quality parameters with respect to
educational qualifications.

Null hypothesis 2: There is no statistically significant difference between the means of the
education qualification groups with respect to the customer satisfaction parameter.

Null hypothesis 3: There is no statistically significant difference between the means of the
education qualification groups with respect to the response quality parameter.

Descriptive Statistics
          EducationQualification   Mean     Std. Deviation   N
Mean_CS   class 12 or below        3.0833   .89268           9
          bachelors degree         3.1534   .96940           119
          post grad or above       3.2880   .89969           79
          Total                    3.2017   .93837           207
Mean_R    class 12 or below        3.6296   .84071           9
          bachelors degree         3.2913   .81251           119
          post grad or above       3.4599   .72659           79
          Total                    3.3704   .78430           207

The descriptive statistics summarise the variables in terms of the number of observations,
mean and SD. The group means differ slightly, as can be seen from the table; whether these
differences are statistically significant is tested in the analysis that follows.

Box's Test of Equality of Covariance Matricesa


Box's M 4.799
F .757
df1 6
df2 2993.936
Sig. .603
Tests the null hypothesis that the observed covariance matrices of the dependent variables
are equal across groups.
a. Design: Intercept + EducationQualification

Box’s test is a test of multivariate homogeneity: it examines whether the covariance
matrices of the DVs are equal across groups.
Here, since the p-value (0.603) is greater than the 0.001 threshold used for this test, we
fail to reject the null hypothesis, i.e., there is no significant difference between the
covariance matrices of the groups. Thus, we can assume equal covariances.

Multivariate Testsa
Effect                                        Value   F          Hypothesis df   Error df   Sig.   Partial Eta Squared
Intercept                Pillai's Trace       .868    665.693b   2.000           203.000    .000   .868
                         Wilks' Lambda        .132    665.693b   2.000           203.000    .000   .868
                         Hotelling's Trace    6.559   665.693b   2.000           203.000    .000   .868
                         Roy's Largest Root   6.559   665.693b   2.000           203.000    .000   .868
Education Qualification  Pillai's Trace       .028    1.464      4.000           408.000    .212   .014
                         Wilks' Lambda        .972    1.461b     4.000           406.000    .213   .014
                         Hotelling's Trace    .029    1.458      4.000           404.000    .214   .014
                         Roy's Largest Root   .024    2.403c     2.000           204.000    .093   .023
a. Design: Intercept + EducationQualification
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.

Testing for main null hypothesis 1: There is no significant difference between the means of
the linear combination of customer satisfaction and response quality parameters with respect
to educational qualifications.

The multivariate test table shows the result of the One-way MANOVA. The Wilks’ Lambda row
for EducationQualification is the one to be considered. The Wilks’ Lambda value of 0.972 with
a p-value of 0.213, at the 5% level of significance, means that we fail to reject the null
hypothesis. The partial eta squared value of 0.014 means that only 1.4% of the variance in
the DVs is explained by education qualification. Thus, there is no significant difference
between the mean values of the linear combination of customer satisfaction and response
quality parameters with respect to educational qualifications.
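
The partial eta squared printed next to Wilks’ Lambda can be recovered from Lambda itself using the commonly quoted relation 1 - Lambda^(1/s). The sketch below assumes that s is the smaller of the number of DVs and the effect degrees of freedom; that choice, like the function name, is an assumption made for illustration and is not stated in the SPSS output.

```python
def partial_eta_squared_from_wilks(wilks_lambda, n_dvs, effect_df):
    """Effect size implied by Wilks' Lambda: 1 - Lambda ** (1 / s).

    s is assumed here to be min(number of DVs, effect degrees of freedom).
    """
    s = min(n_dvs, effect_df)
    return 1.0 - wilks_lambda ** (1.0 / s)


# Education qualification: 3 groups -> 2 effect df, 2 DVs, Lambda = .972
eta_education = partial_eta_squared_from_wilks(0.972, n_dvs=2, effect_df=2)
# Intercept: 1 effect df, 2 DVs, Lambda = .132
eta_intercept = partial_eta_squared_from_wilks(0.132, n_dvs=2, effect_df=1)

print(f"education: {eta_education:.3f}, intercept: {eta_intercept:.3f}")
```

Under these assumptions the two values match the Partial Eta Squared column of the Multivariate Tests table (.014 and .868).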

Levene's Test of Equality of Error Variancesa


F df1 df2 Sig.
Mean_CS 1.055 2 204 .350
Mean_R .409 2 204 .665
Tests the null hypothesis that the error variance of the dependent variable is equal across
groups.
a. Design: Intercept + EducationQualification

Hypotheses for Levene’s test
H1 (null): the variance of customer satisfaction is the same for all groups of education
qualification.
H2 (null): the variance of the response quality parameter is the same for all groups of
education qualification.
Levene’s test results show the significance values for both DVs. The values for the customer
satisfaction parameter (0.350) and the response quality parameter (0.665) are both greater
than the alpha of 0.05; hence, we fail to reject the null hypotheses and conclude that the
error variances of the DVs are equal across groups.

Tests of Between-Subjects Effects
Source                   Dependent Variable   Type III Sum of Squares   df    Mean Square   F          Sig.   Partial Eta Squared
Corrected Model          Mean_CS              .992a                     2     .496          .561       .572   .005
                         Mean_R               1.982b                    2     .991          1.621      .200   .016
Intercept                Mean_CS              686.370                   1     686.370       776.162    .000   .792
                         Mean_R               815.314                   1     815.314       1333.431   .000   .867
EducationQualification   Mean_CS              .992                      2     .496          .561       .572   .005
                         Mean_R               1.982                     2     .991          1.621      .200   .016
Error                    Mean_CS              180.400                   204   .884
                         Mean_R               124.734                   204   .611
Total                    Mean_CS              2303.313                  207
                         Mean_R               2478.111                  207
Corrected Total          Mean_CS              181.392                   206
                         Mean_R               126.716                   206
a. R Squared = .005 (Adjusted R Squared = -.004)
b. R Squared = .016 (Adjusted R Squared = .006)

Testing of hypothesis 2: There is no statistically significant difference between the means
of the education qualification groups with respect to customer satisfaction.
Testing of hypothesis 3: There is no statistically significant difference between the means
of the education qualification groups with respect to response quality.
It can be seen from the above table that the sig. values for customer satisfaction (0.572)
and response quality (0.200) are greater than 0.05 (p > 0.05); thus we fail to reject both
null hypotheses, i.e., education qualification does not significantly affect the scores of
customer satisfaction and response quality.
It can be concluded that there is no statistically significant difference between the group
means of education qualification with respect to customer satisfaction and response quality.
Since there is no significant difference, we do not interpret the post hoc tests further.
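
The F ratios and partial eta squared values in the Tests of Between-Subjects Effects table follow directly from the sums of squares and degrees of freedom in the same table. A small sketch, with a hypothetical helper name, that reproduces the EducationQualification rows:

```python
def univariate_effect(ss_effect, df_effect, ss_error, df_error):
    """Return (F ratio, partial eta squared) for one effect on one DV."""
    ms_effect = ss_effect / df_effect   # mean square of the effect
    ms_error = ss_error / df_error      # mean square error
    f_ratio = ms_effect / ms_error
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return f_ratio, partial_eta_sq


# EducationQualification and Error rows of the table above
f_cs, eta_cs = univariate_effect(0.992, 2, 180.400, 204)   # Mean_CS
f_r, eta_r = univariate_effect(1.982, 2, 124.734, 204)     # Mean_R

print(f"Mean_CS: F={f_cs:.3f}, partial eta squared={eta_cs:.3f}")
print(f"Mean_R:  F={f_r:.3f}, partial eta squared={eta_r:.3f}")
```

The results should agree with the printed values (F = .561 and 1.621; partial eta squared = .005 and .016) up to rounding.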

Mean_CS
               EducationQualification   N     Subset 1
Tukey HSDa,b   class 12 or below        9     3.0833
               bachelors degree         119   3.1534
               post grad or above       79    3.2880
               Sig.                           .744
Scheffea,b     class 12 or below        9     3.0833
               bachelors degree         119   3.1534
               post grad or above       79    3.2880
               Sig.                           .765
Means for groups in homogeneous subsets are displayed.
Based on observed means.
The error term is Mean Square(Error) = .884.
a. Uses Harmonic Mean Sample Size = 22.698.
b. Alpha = .05.

Mean_R
               Education Qualification   N     Subset 1
Tukey HSDa,b   bachelors degree          119   3.2913
               post grad or above        79    3.4599
               class 12 or below         9     3.6296
               Sig.                            .314
Scheffea,b     bachelors degree          119   3.2913
               post grad or above        79    3.4599
               class 12 or below         9     3.6296
               Sig.                            .348
Means for groups in homogeneous subsets are displayed.
Based on observed means.
The error term is Mean Square(Error) = .611.
a. Uses Harmonic Mean Sample Size = 22.698.
b. Alpha = .05.

Since there is no significant difference in the groups, all groups have been grouped in one
subset.
6.Two-way Multivariate Analysis of Variance (MANOVA)
Two-way MANOVA is an extension of One-way MANOVA that tests the effect of two
categorical independent variables on multiple dependent variables.
Conditions and assumptions required:
✓ DV is on metric scale and IV is categorical scale
✓ Normality of Data
✓ No multicollinearity
✓ No outliers
Since the same DVs are used as in the One-way MANOVA, these conditions have already been
tested; thus, normality is assumed and there is no multicollinearity.

Customer satisfaction and response quality parameters tested on education qualification
and gender
Null Hypothesis 1: There is no significant difference between the means of customer
satisfaction and response quality parameters for all gender-related groups.
Null Hypothesis 2: There is no significant difference between the means of customer
satisfaction and response quality parameters for all education qualification levels.
Null Hypothesis 3: There is no significant difference between the means of customer
satisfaction and response quality parameters for the joint effect of all the gender groups and
education qualification levels.

Descriptive Statistics
          Education Qualification   Gender   Mean     Std. Deviation   N
Mean_CS   class 12 or below         male     2.7000   .54199           5
                                    female   3.5625   1.08733          4
                                    Total    3.0833   .89268           9
          bachelors degree          male     3.1646   .93016           79
                                    female   3.1313   1.05458          40
                                    Total    3.1534   .96940           119
          post grad or above        male     3.1105   .94696           43
                                    female   3.5000   .80178           36
                                    Total    3.2880   .89969           79
          Total                     male     3.1280   .92307           127
                                    female   3.3188   .95630           80
                                    Total    3.2017   .93837           207
Mean_R    class 12 or below         male     3.5333   .90062           5
                                    female   3.7500   .87665           4
                                    Total    3.6296   .84071           9
          bachelors degree          male     3.3460   .72686           79
                                    female   3.1833   .96062           40
                                    Total    3.2913   .81251           119
          post grad or above        male     3.3566   .76772           43
                                    female   3.5833   .66368           36
                                    Total    3.4599   .72659           79
          Total                     male     3.3570   .74201           127
                                    female   3.3917   .85153           80
                                    Total    3.3704   .78430           207

The descriptive statistics describe the variables through their means, numbers of
observations and standard deviations. The mean values differ a little, as seen in the table;
whether these differences are statistically significant is tested in the following tables.

Box's Test of Equality of Covariance Matricesa

Box's M 17.570
F 1.044
df1 15
df2 1206.080
Sig. .407
Box’s M test is used to check whether the covariance matrices of the dependent variables
are equal across groups. Here, the sig. value (p-value) is 0.407; against an alpha of 0.001,
we fail to reject the null hypothesis. Hence, it can be concluded that the covariance
matrices are equal across groups.

Multivariate Testsa
Effect                                                 Value   F          Hypothesis df   Error df   Sig.   Partial Eta Squared
Intercept                         Pillai's Trace       .867    653.860b   2.000           200.000    .000   .867
                                  Wilks' Lambda        .133    653.860b   2.000           200.000    .000   .867
                                  Hotelling's Trace    6.539   653.860b   2.000           200.000    .000   .867
                                  Roy's Largest Root   6.539   653.860b   2.000           200.000    .000   .867
EducationQualification            Pillai's Trace       .032    1.631      4.000           402.000    .166   .016
                                  Wilks' Lambda        .968    1.629b     4.000           400.000    .166   .016
                                  Hotelling's Trace    .033    1.627      4.000           398.000    .167   .016
                                  Roy's Largest Root   .027    2.752c     2.000           201.000    .066   .027
Gender                            Pillai's Trace       .022    2.223b     2.000           200.000    .111   .022
                                  Wilks' Lambda        .978    2.223b     2.000           200.000    .111   .022
                                  Hotelling's Trace    .022    2.223b     2.000           200.000    .111   .022
                                  Roy's Largest Root   .022    2.223b     2.000           200.000    .111   .022
EducationQualification * Gender   Pillai's Trace       .022    1.142      4.000           402.000    .336   .011
                                  Wilks' Lambda        .978    1.139b     4.000           400.000    .338   .011
                                  Hotelling's Trace    .023    1.136      4.000           398.000    .339   .011
                                  Roy's Largest Root   .018    1.844c     2.000           201.000    .161   .018
a. Design: Intercept + EducationQualification + Gender + EducationQualification *
Gender
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.

This table is used to examine the null hypotheses of our research:
Null hypothesis 1: There is no significant difference between the means of customer
satisfaction and response quality parameters for all gender-related groups.
Inference: In the Gender rows, the Wilks’ lambda sig. value is 0.111 (p > 0.05). Hence, we
fail to reject the null hypothesis. It can be concluded that there is no significant difference
between the mean values of customer satisfaction and response quality parameters for the two
gender groups.
Null Hypothesis 2: There is no significant difference between the means of customer
satisfaction and response quality parameters for all education qualification levels.
Inference: In the Wilks’ lambda row for EducationQualification, the sig. (p) value is 0.166
(p > 0.05). Hence, we fail to reject the null hypothesis. Thus, it can be concluded that there
is no significant difference between the means of customer satisfaction and response quality
parameters for all the education qualification levels.
Null hypothesis 3: There is no significant difference between the means of customer
satisfaction and response quality parameters for the joint effect of all the gender groups and
education qualification levels.
Inference: In the Wilks’ lambda row for EducationQualification * Gender, the p-value is
0.338 (p > 0.05). It means that we fail to reject the null hypothesis. Thus, it can be
concluded that there is no significant difference between the mean values of customer
satisfaction and response quality parameters for the joint effect of gender groups and
education qualification levels.

Levene's Test of Equality of Error Variancesa


F df1 df2 Sig.
Mean_CS .908 5 201 .477
Mean_R 1.244 5 201 .290

Tests the null hypothesis that the error variance of the dependent variable is equal across
groups.
a. Design: Intercept + EducationQualification + Gender + EducationQualification * Gender

Levene’s test checks whether the DVs have homogeneous error variances. The p-value for the
customer satisfaction parameter is 0.477 and for the response quality parameter is 0.290,
both of which are greater than the alpha of 0.05. Hence, we fail to reject the null
hypothesis and conclude that both DVs have equal error variances across groups.

Tests of Between-Subjects Effects


Source                            Dependent Variable   Type III Sum of Squares   df    Mean Square   F          Sig.   Partial Eta Squared
Corrected Model                   Mean_CS              5.648a                    5     1.130         1.292      .269   .031
                                  Mean_R               3.796b                    5     .759          1.242      .291   .030
Intercept                         Mean_CS              682.100                   1     682.100       780.125    .000   .795
                                  Mean_R               799.473                   1     799.473       1307.310   .000   .867
EducationQualification            Mean_CS              1.175                     2     .588          .672       .512   .007
                                  Mean_R               2.613                     2     1.307         2.137      .121   .021
Gender                            Mean_CS              2.757                     1     2.757         3.153      .077   .015
                                  Mean_R               .146                      1     .146          .239       .625   .001
EducationQualification * Gender   Mean_CS              3.104                     2     1.552         1.775      .172   .017
                                  Mean_R               1.807                     2     .903          1.477      .231   .014
Error                             Mean_CS              175.744                   201   .874
                                  Mean_R               122.920                   201   .612
Total                             Mean_CS              2303.313                  207
                                  Mean_R               2478.111                  207
Corrected Total                   Mean_CS              181.392                   206
                                  Mean_R               126.716                   206
a. R Squared = .031 (Adjusted R Squared = .007)
b. R Squared = .030 (Adjusted R Squared = .006)

The above table examines each DV against each IV individually. Since all the p-values are
greater than 0.05, it can be concluded that there is no significant difference in the means
of either DV with respect to either IV, nor for the joint effect of both IVs.

Multiple Comparisons
Dependent                                                                                                                  95% Confidence Interval
Variable  Test       (I) EducationQualification   (J) EducationQualification   Mean Difference (I-J)   Std. Error   Sig.   Lower Bound   Upper Bound
Mean_CS   Tukey HSD  class 12 or below            bachelors degree             -.0700                  .32326       .974   -.8333        .6933
                                                  post grad or above           -.2046                  .32896       .808   -.9814        .5721
                     bachelors degree             class 12 or below            .0700                   .32326       .974   -.6933        .8333
                                                  post grad or above           -.1346                  .13570       .583   -.4550        .1858
                     post grad or above           class 12 or below            .2046                   .32896       .808   -.5721        .9814
                                                  bachelors degree             .1346                   .13570       .583   -.1858        .4550
          Scheffe    class 12 or below            bachelors degree             -.0700                  .32326       .977   -.8672        .7272
                                                  post grad or above           -.2046                  .32896       .824   -1.0159       .6066
                     bachelors degree             class 12 or below            .0700                   .32326       .977   -.7272        .8672
                                                  post grad or above           -.1346                  .13570       .612   -.4693        .2000
                     post grad or above           class 12 or below            .2046                   .32896       .824   -.6066        1.0159
                                                  bachelors degree             .1346                   .13570       .612   -.2000        .4693
Mean_R    Tukey HSD  class 12 or below            bachelors degree             .3383                   .27035       .424   -.3000        .9767
                                                  post grad or above           .1697                   .27512       .811   -.4799        .8193
                     bachelors degree             class 12 or below            -.3383                  .27035       .424   -.9767        .3000
                                                  post grad or above           -.1686                  .11349       .300   -.4366        .0994
                     post grad or above           class 12 or below            -.1697                  .27512       .811   -.8193        .4799
                                                  bachelors degree             .1686                   .11349       .300   -.0994        .4366
          Scheffe    class 12 or below            bachelors degree             .3383                   .27035       .458   -.3284        1.0050
                                                  post grad or above           .1697                   .27512       .827   -.5088        .8482
                     bachelors degree             class 12 or below            -.3383                  .27035       .458   -1.0050       .3284
                                                  post grad or above           -.1686                  .11349       .334   -.4485        .1113
                     post grad or above           class 12 or below            -.1697                  .27512       .827   -.8482        .5088
                                                  bachelors degree             .1686                   .11349       .334   -.1113        .4485
Based on observed means.
The error term is Mean Square(Error) = .612.

The above post hoc results also confirm that there is no significant difference between the
group means, with all p-values greater than 0.05.

7.SIMPLE LINEAR REGRESSION


Simple linear regression is conducted to examine the relationship between one continuous
dependent variable and one continuous independent variable.
Conditions:
1) Both DV and IV should be on metric scale
2) Normality of data
3) There shouldn’t be any significant outliers
4) Homoscedasticity
5) Linear relationship between DV and IV
Formula for regression: Y = B0 + B1X + e
Where, Y = Dependent Variable, B0 = intercept, B1 = regression coefficient, X =
Independent variable, e = error
Null Hypothesis: There is no significant impact of security factor on response quality
Assumptions: All assumptions for simple linear regression are satisfied.

Descriptive Statistics
Mean Std. Deviation N
Mean_R 3.3704 .78430 207
Mean_S 2.9243 .87280 207

Correlations
                               Mean_R   Mean_S
Pearson Correlation   Mean_R   1.000    .420
                      Mean_S   .420     1.000
Sig. (1-tailed)       Mean_R   .        .000
                      Mean_S   .000     .
N                     Mean_R   207      207
                      Mean_S   207      207

The correlation between the variables is 0.420 which signifies a positive relationship
between variables.

Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .420a   .177       .173                .71345                       2.021
a. Predictors: (Constant), Mean_S
b. Dependent Variable: Mean_R

The model summary shows the R and R-squared values. The R-value of (0.420) shows that
there is a positive correlation between the two variables. R-square indicates how much of
the total variation in response quality factor is explained by the security factor. The R-
square value of 0.177 means that 17.7% of the variation in response quality can be
explained by security parameters.

ANOVAa
Model Sum of df Mean Square F Sig.
Squares
Regression 22.369 1 22.369 43.945 .000b
1 Residual 104.347 205 .509
Total 126.716 206
a. Dependent Variable: Mean_R
b. Predictors: (Constant), Mean_S

The ANOVA table shows whether the regression equation predicts the dependent variable
significantly. The sig. value (reported as .000, i.e., p < 0.001) indicates that the
regression model is statistically significant.

Coefficientsa
                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1   (Constant)   2.266    .174                                             13.042   .000
    Mean_S       .378     .057                 .420                        6.629    .000
a. Dependent Variable: Mean_R

The coefficients table provides the information needed to build the prediction model for
response quality from the security parameter, and to examine whether the security parameter
contributes significantly to that model. The p-value of the security parameter (.000) is
less than 0.05; thus, we can say that the security factor significantly affects response
quality. The unstandardised B-value of .378 means that for every unit change in the
security parameter, response quality changes by .378. Collinearity statistics are not
required since only one IV is being used.
Regression equation: Response quality = 2.266 + 0.378 (security parameter)
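
With a single predictor, the slope, intercept, R-square and F ratio can all be recomputed from the summary figures already reported (the correlation, the two means and standard deviations, and the ANOVA sums of squares). The sketch below does this in Python; the variable and function names are illustrative only.

```python
# Summary figures copied from the Descriptive Statistics, Correlations and ANOVA tables
r = 0.420
mean_r, sd_r = 3.3704, 0.78430   # Mean_R: response quality (dependent variable)
mean_s, sd_s = 2.9243, 0.87280   # Mean_S: security (independent variable)
ss_regression, ss_residual, ss_total = 22.369, 104.347, 126.716
df_regression, df_residual = 1, 205

slope = r * sd_r / sd_s                    # B1 = r * (SD of Y / SD of X)
intercept = mean_r - slope * mean_s        # B0: the line passes through both means
r_squared = ss_regression / ss_total
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)


def predict_response_quality(security_score):
    """Predicted Mean_R for a given Mean_S using the fitted line."""
    return intercept + slope * security_score


print(f"B1={slope:.3f} B0={intercept:.3f} R2={r_squared:.3f} F={f_ratio:.3f}")
```

Because the inputs are rounded, the recomputed values agree with the SPSS output (B = .378, constant = 2.266, R Square = .177, F = 43.945) only to about two or three decimal places.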
8.Discriminant Analysis:

Discriminant analysis is a classification technique: it classifies observations into
groups. It is similar to ANOVA; however, in ANOVA we have a categorical independent
variable and a metric dependent variable, whereas in discriminant analysis we use metric
independent variables and a categorical dependent variable. The analysis aims to develop a
linear combination of the independent variables that can discriminate between the categories
of the dependent variable. This discriminant function helps in understanding the significant
differences between the groups.

Assumptions:
1. Linearity and Normality of the data
2. Equal variance amongst groups
3. No multicollinearity
4. Group membership must be mutually exclusive
5. Independence of observations

Analysis Case Processing Summary


Unweighted Cases                                                N     Percent
Valid                                                           207   100.0
Excluded   Missing or out-of-range group codes                  0     .0
           At least one missing discriminating variable         0     .0
           Both missing or out-of-range group codes and
           at least one missing discriminating variable         0     .0
           Total                                                0     .0
Total                                                           207   100.0

The above table provides information on the number of observations and any missing data.
All 207 observations collected are valid and are included in the analysis. There are no
missing or out-of-range group codes in the data collected.

Group Statistics
                                                          Valid N (listwise)
Gender   Variable                Mean     Std. Deviation   Unweighted   Weighted
1.0      Comfortability          2.9239   .83274           127          127.000
         Ease_of_Communication   3.4199   .75734           127          127.000
         Avatar_Preference       3.6824   .74282           127          127.000
         Past_Experience         3.1280   .92307           127          127.000
2.0      Comfortability          2.9250   .93829           80           80.000
         Ease_of_Communication   3.4333   .89285           80           80.000
         Avatar_Preference       3.6583   .82502           80           80.000
         Past_Experience         3.3188   .95630           80           80.000
Total    Comfortability          2.9243   .87280           207          207.000
         Ease_of_Communication   3.4251   .81030           207          207.000
         Avatar_Preference       3.6731   .77373           207          207.000
         Past_Experience         3.2017   .93837           207          207.000

The above table provides the descriptive details of the data: mean, SD and number of
observations. The group statistics compare the two gender groups on the four parameters. In
this output, the means of the independent variables differ only slightly between the gender
groups, with the largest gap on Past_Experience (3.1280 versus 3.3188). The larger such
differences are, the better the predictors can distinguish observations between the two
groups. This table shows how far apart the groups are; however, the statistical significance
will be tested in further analysis. In the weighted column, the default weight is 1 for each
observation, so the weighted and unweighted counts are equal.

Tests of Equality of Group Means


                        Wilks' Lambda   F       df1   df2   Sig.
Comfortability          1.000           .000    1     205   .993
Ease_of_Communication   1.000           .013    1     205   .908
Avatar_Preference       1.000           .047    1     205   .828
Past_Experience         .990            2.039   1     205   .155

The tests of equality of group means measure each independent variable’s potential before
the model is created. Each test displays the result of a One-Way ANOVA for that independent
variable, using the grouping variable as the factor. If the significance value is greater
than 0.10, the variable probably does not contribute to the model. Here all four
significance values exceed 0.10, so none of the variables is likely to contribute much on
its own.

Pooled Within-Groups Matrices


                                      Comfortability   Ease_of_Communication   Avatar_Preference   Past_Experience
Correlation   Comfortability          1.000            .409                    .125                .510
              Ease_of_Communication   .409             1.000                   .231                .349
              Avatar_Preference       .125             .231                    1.000               .092
              Past_Experience         .510             .349                    .092                1.000

The above table shows the correlation between the predictor variables.

Box's Test of Equality of Covariance Matrices


Log Determinants
Gender Rank Log Determinant
1.0 4 -2.112
2.0 4 -1.674
Pooled within-groups 4 -1.897
The ranks and natural logarithms of determinants printed are those of the group covariance
matrices.

Box's M         9.479
F     Approx.   .926
      df1       10
      df2       132167.997
      Sig.      .507
Tests null hypothesis of equal population covariance matrices.

Box’s M tests the assumption of equality of covariance matrices across groups. Log
determinants are a measure of the variability of the groups: larger log determinants
correspond to more variable groups. The rank column represents the number of independent
variables in the study, which has 4 IVs. Since the p-value (0.507) is greater than 0.05, we
fail to reject the null hypothesis of equal population covariance matrices. Our groups do
not differ in their covariance matrices.

Summary of Canonical Discriminant Functions

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          .014a        100.0           100.0          .117
a. First 1 canonical discriminant functions were used in the analysis.

The eigenvalues table provides information about the relative efficacy of each discriminant
function. The larger the eigenvalue, the more variance in the dependent variable the
function explains; it is a measure of goodness of fit. There are two categories in the
dependent variable, so there is one discriminant function. The canonical correlation value
in the table is 0.117. Squaring this value gives the proportion of variance (about 0.014)
shared between the discriminant function and the dependent variable groups. Expressed as a
percentage, about 1.4% of the variance between the two groups of the dependent variable,
gender, is explained. This value indicates the magnitude of discrimination between the
groups, which is very low here.
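
For a single discriminant function, the eigenvalue, the canonical correlation and Wilks’ lambda are tied together by standard identities, so the printed figures can be cross-checked against one another. The sketch below starts from the rounded eigenvalue of .014, so its results are approximate; the variable names are illustrative.

```python
import math

eigenvalue = 0.014   # rounded value from the Eigenvalues table

# Canonical correlation: sqrt(eigenvalue / (1 + eigenvalue))
canonical_r = math.sqrt(eigenvalue / (1.0 + eigenvalue))
# Share of variance in group membership explained by the function
explained_share = canonical_r ** 2
# With one function, Wilks' lambda is 1 / (1 + eigenvalue)
wilks_lambda = 1.0 / (1.0 + eigenvalue)

print(f"canonical r ~ {canonical_r:.3f}")
print(f"explained share ~ {explained_share:.1%}")
print(f"Wilks' lambda ~ {wilks_lambda:.3f}")
```

The recomputed canonical correlation is close to the reported .117, the explained share is about 1.4%, and Wilks’ lambda is close to the .986 reported in the Wilks' Lambda table.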

Wilks' Lambda
Test of Function(s) Wilks' Chi- df Sig.
Lambda square
1 .986 2.784 4 .595

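Wilks’ lambda is a measure of how well each function separates cases into groups. Smaller
values of Wilks’ lambda indicate greater discriminatory ability of the function. Here
Wilks’ lambda is 0.986 with a p-value of 0.595 (greater than 0.05), so the function does not
discriminate significantly between the two gender groups.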

Standardized Canonical Discriminant Function Coefficients


Function
1
Comfortability -.536
Ease_of_Communication -.085
Avatar_Preference -.150
Past_Experience 1.166

The standardised coefficients are similar to standardised regression coefficients: they
indicate the relative importance of each variable in predicting the dependent variable
groups. The cut-off value is + or - 0.3. The larger the absolute coefficient, the greater
the discriminating ability.

Canonical Discriminant Function Coefficients


Function
1
Comfortability          -.613
Ease_of_Communication   -.105
Avatar_Preference       -.193
Past_Experience         1.245
(Constant)              -1.129
Unstandardized coefficients

The above unstandardized canonical discriminant function coefficients are similar to the
unstandardized coefficients in multiple regression. They are used to construct the
discriminant prediction equation.

Discriminant Function equation:

D (Gender) = -1.129 + 1.245 * Past_Experience - .193 * Avatar_Preference - .105 *
Ease_of_Communication - .613 * Comfortability
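
To illustrate how the equation is applied, the sketch below scores a respondent from the four predictor values. As a sanity check it evaluates the function at the group means from the Group Statistics table, which should land near the group centroids SPSS reports for the two gender groups (-.093 and .147). The dictionary layout and function name are assumptions made for the example.

```python
# Unstandardized canonical discriminant function coefficients from the table above
COEFFICIENTS = {
    "Past_Experience": 1.245,
    "Avatar_Preference": -0.193,
    "Ease_of_Communication": -0.105,
    "Comfortability": -0.613,
}
CONSTANT = -1.129


def discriminant_score(respondent):
    """Return D for a dict mapping each predictor name to its score."""
    return CONSTANT + sum(
        weight * respondent[name] for name, weight in COEFFICIENTS.items()
    )


# Group means from the Group Statistics table (Gender 1.0 and 2.0)
male_means = {
    "Comfortability": 2.9239, "Ease_of_Communication": 3.4199,
    "Avatar_Preference": 3.6824, "Past_Experience": 3.1280,
}
female_means = {
    "Comfortability": 2.9250, "Ease_of_Communication": 3.4333,
    "Avatar_Preference": 3.6583, "Past_Experience": 3.3188,
}

male_centroid = discriminant_score(male_means)
female_centroid = discriminant_score(female_means)
print(f"male centroid ~ {male_centroid:.3f}, female centroid ~ {female_centroid:.3f}")
```

Small discrepancies from the reported centroids are expected, because the coefficients are printed to three decimals only.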

Functions at Group Centroids


Gender Function
1
1.0 -.093
2.0 .147
Unstandardized canonical discriminant
functions evaluated at group means

This table gives the group centroids, from which the cutting point for classifying cases is
derived. Scores at or below -.093 point to Male and scores at or above 0.147 point to Female.
For scores between these two values, the average of the two centroids is taken as the
cut-off: anything above the average is classified as female and anything below it as male
(only when group sizes are equal). If the group sizes are unequal, a weighted Z cut-off
score is used for discrimination.
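
The two cut-off rules can be sketched as follows. The weighted version uses a formula often given in multivariate analysis textbooks, in which each centroid is weighted by the size of the other group; the document does not state which formula was applied, so treat it, and the helper names, as assumptions.

```python
def midpoint_cutoff(centroid_a, centroid_b):
    """Cut-off for equal group sizes: the average of the two centroids."""
    return (centroid_a + centroid_b) / 2.0


def weighted_cutoff(centroid_a, n_a, centroid_b, n_b):
    """Cut-off for unequal group sizes (assumed textbook formula)."""
    return (n_a * centroid_b + n_b * centroid_a) / (n_a + n_b)


def classify(score, cutoff):
    """Scores above the cut-off go to the group with the higher centroid."""
    return "female" if score > cutoff else "male"


MALE_CENTROID, FEMALE_CENTROID = -0.093, 0.147
N_MALE, N_FEMALE = 127, 80

equal_cut = midpoint_cutoff(MALE_CENTROID, FEMALE_CENTROID)
unequal_cut = weighted_cutoff(MALE_CENTROID, N_MALE, FEMALE_CENTROID, N_FEMALE)

print(f"midpoint cut-off = {equal_cut:.3f}, weighted cut-off = {unequal_cut:.3f}")
print(classify(0.10, unequal_cut), classify(0.00, unequal_cut))
```

With 127 males and 80 females the weighted cut-off sits closer to the female centroid than the simple midpoint does, so a score has to be somewhat higher before a case is classified as female.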

Classification Statistics

Prior Probabilities for Groups


                  Cases Used in Analysis
Gender   Prior    Unweighted   Weighted
1.0      .500     127          127.000
2.0      .500     80           80.000
Total    1.000    207          207.000

Classification Function Coefficients


Gender
1.0 2.0
Comfortability 1.242 1.095
Ease_of_Communication 2.815 2.790
Avatar_Preference 5.073 5.027
Past_Experience 1.740 2.039
(Constant) -19.384 -19.662
Fisher's linear discriminant functions

Classification Resultsa,c

                                     Predicted Group Membership
                            Gender   1.0       2.0       Total
Original           Count    1.0      63        64        127
                            2.0      36        44        80
                   %        1.0      49.6      50.4      100.0
                            2.0      45.0      55.0      100.0
Cross-validatedb   Count    1.0      62        65        127
                            2.0      42        38        80
                   %        1.0      48.8      51.2      100.0
                            2.0      52.5      47.5      100.0
a. 51.7% of original grouped cases were correctly classified.
b. Cross validation is done only for those cases in the analysis. In cross-validation,
each case is classified by the functions derived from all cases other than that case.
c. 48.3% of cross-validated grouped cases were correctly classified.

The accuracy of the model can be checked using the above classification table, which shows
how well the discriminant function works. Of the 127 male respondents, 63 (49.6%) are
correctly classified as male, and of the 80 female respondents, 44 (55.0%) are correctly
classified as female. The overall hit ratio is 51.7% for the original cases and 48.3% under
cross-validation. With equal prior probabilities of 0.5 for the two groups, this is close to
what would be expected by chance. Hence it can be concluded that the discriminant function
discriminates between the gender groups only weakly.
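
The hit ratios quoted in the table footnotes can be reproduced from the classification counts. A minimal sketch, with the counts entered as nested dictionaries keyed by actual and then predicted group (this layout and the function names are illustrative):

```python
# Counts from the Classification Results table: table[actual][predicted]
original = {
    "male": {"male": 63, "female": 64},
    "female": {"male": 36, "female": 44},
}
cross_validated = {
    "male": {"male": 62, "female": 65},
    "female": {"male": 42, "female": 38},
}


def hit_ratio(table):
    """Share of all cases whose predicted group equals the actual group."""
    correct = sum(table[group][group] for group in table)
    total = sum(sum(row.values()) for row in table.values())
    return correct / total


def group_accuracy(table, group):
    """Share of one actual group that was classified correctly."""
    return table[group][group] / sum(table[group].values())


original_hit = hit_ratio(original)
cross_validated_hit = hit_ratio(cross_validated)
print(f"original: {original_hit:.1%}, cross-validated: {cross_validated_hit:.1%}")
print(f"male: {group_accuracy(original, 'male'):.1%}, "
      f"female: {group_accuracy(original, 'female'):.1%}")
```

The two overall figures correspond to footnotes a and c of the table (51.7% and 48.3%).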
9.Multiple Regression

Multiple regression is a statistical technique for examining the relationship between a single
dependent variable and a number of independent variables. The goal of multiple regression
analysis is to use known independent variables to predict the value of a single dependent
variable.
In our model the dependent variable is Customer Experience and the independent variables
are Precision of Responses, Effectiveness of addressing query and Openness in sharing
personal information.

CS1: Customer Experience
R2: Precision of Responses
CS3: Effectiveness of addressing query
S1: Openness in sharing personal information

The following tables depict the output obtained

Descriptive Statistics

      Mean    Std. Deviation   N
CS1   3.261   1.0792           207
R2    3.068   1.1680           207
CS3   3.164   1.0937           207
S1    2.758   1.1233           207

The table above shows the mean and standard deviation of all the variables in the model.

Correlations
CS1 R2 CS3 S1
Pearson CS1 1.000 .506 .597 .272
Correlation R2 .506 1.000 .531 .153
CS3 .597 .531 1.000 .206
S1 .272 .153 .206 1.000

All the Pearson correlations between the variables are below 0.7, so there is no sign of
multicollinearity among the independent variables. Each independent variable is positively
correlated with the dependent variable (0.272 to 0.597), which makes them suitable for the
model.

Model Summaryb

                                                 Std. Error of   Change Statistics
Model   R       R Square   Adjusted R Square     the Estimate    R Square Change   F Change   df1   df2   Sig. F Change   Durbin-Watson
1       .653a   .426       .417                  .8237           .426              50.201     3     203   .000            2.129

a. Predictors: (Constant), S1, R2, CS3

b. Dependent Variable: CS1

The multiple correlation coefficient obtained is 0.653. The model has an R square value of
0.426, which suggests that 42.6% of the variance in the dependent variable is explained by
the independent variables. This can be considered a good result for the model.

ANOVAa
Model Sum of df Mean F Sig.
Squares Square
Regression 102.181 3 34.060 50.201 .000b
1 Residual 137.732 203 .678
Total 239.913 206
a. Dependent Variable: CS1
b. Predictors: (Constant), S1, R2, CS3

The ANOVA results give the F ratio and its significance value, which test the overall fit of the
regression model. A significance value of less than 0.05 enables us to reject the null hypothesis,
which assumes that all the regression coefficients are 0. Hence, we can say that at least one of
the coefficients is not equal to 0 at a confidence level of 95%.
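The figures in the Model Summary and ANOVA tables are linked by simple arithmetic. As a sanity check (not part of the original SPSS output), the sketch below recomputes R Square, Adjusted R Square, the F ratio and the standard error of the estimate from the sums of squares and degrees of freedom reported above.

```python
import math

# Values copied from the ANOVA table above.
ss_regression = 102.181
ss_residual = 137.732
df_regression = 3
df_residual = 203

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# Share of total variation explained by the regression.
r_squared = ss_regression / ss_total
# Penalise R^2 for the number of predictors used.
adjusted_r_squared = 1 - (1 - r_squared) * df_total / df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"R Square           = {r_squared:.3f}")
print(f"Adjusted R Square  = {adjusted_r_squared:.3f}")
print(f"F                  = {f_ratio:.3f}")
print(f"Std. Error of Est. = {std_error_estimate:.4f}")
```

The recomputed values agree with the reported ones up to rounding of the table entries.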

Coefficients (a)

                 Unstandardized        Standardized                     95.0% Confidence
                 Coefficients          Coefficients                     Interval for B
Model            B       Std. Error    Beta            t       Sig.     Lower Bound   Upper Bound
1 (Constant)     .808    .218                          3.713   .000     .379          1.237
  R2             .235    .058          .254            4.048   .000     .121          .349
  CS3            .426    .063          .432            6.809   .000     .303          .550
  S1             .139    .052          .144            2.653   .009     .036          .242

a. Dependent Variable: CS1

The unstandardised coefficients (B) in the above table show how much the dependent variable
changes for a one-unit change in an individual independent variable when the rest of the
variables are kept constant. All the independent variables are statistically significant (each
has a Sig. value of less than 0.05). On the basis of these values, we can conclude that the
coefficients are statistically significantly different from 0. The constant obtained in this
regression model is 0.808.
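The t values and confidence intervals in the Coefficients table follow from B and its standard error. The sketch below, which is not part of the original output, reproduces them approximately. The critical value 1.972 is an assumption: it is the two-sided 95% point of Student's t with 203 degrees of freedom, rounded to three decimals. Because B and Std. Error are themselves rounded in the table, small differences from the reported t values are expected.

```python
# (B, Std. Error) pairs copied from the Coefficients table above.
coefficients = {
    "(Constant)": (0.808, 0.218),
    "R2": (0.235, 0.058),
    "CS3": (0.426, 0.063),
    "S1": (0.139, 0.052),
}

# Assumption: two-sided 95% critical value of Student's t with 203 residual
# degrees of freedom, rounded to three decimals.
T_CRITICAL = 1.972


def t_and_interval(b, std_error, t_critical=T_CRITICAL):
    """Return (t value, lower bound, upper bound) for one coefficient."""
    t_value = b / std_error
    margin = t_critical * std_error
    return t_value, b - margin, b + margin


results = {name: t_and_interval(b, se) for name, (b, se) in coefficients.items()}
for name, (t_value, lower, upper) in results.items():
    print(f"{name:<11} t = {t_value:6.3f}   95% CI = [{lower:.3f}, {upper:.3f}]")
```

None of the intervals contains 0, which matches the conclusion that every coefficient is significantly different from 0.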

Therefore, our regression equation can be written down as:

CS1 = 0.235*R2 + 0.426*CS3 + 0.139*S1 + 0.808

Customer Experience = 0.235 * Precision of Responses
+ 0.426 * Effectiveness of addressing query
+ 0.139 * Openness in sharing personal information
+ 0.808
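To illustrate how the fitted equation is used, the following sketch wraps it in a small function. The function name `predict_customer_experience` is illustrative. As a check, evaluating the equation at the sample means of R2, CS3 and S1 from the Descriptive Statistics table should give approximately the sample mean of CS1 (3.261), since a least-squares regression line passes through the point of means.

```python
# Coefficients copied from the regression equation above.
INTERCEPT = 0.808
WEIGHTS = {"R2": 0.235, "CS3": 0.426, "S1": 0.139}


def predict_customer_experience(r2, cs3, s1):
    """Predicted CS1 for given scores on R2, CS3 and S1."""
    return (
        INTERCEPT
        + WEIGHTS["R2"] * r2
        + WEIGHTS["CS3"] * cs3
        + WEIGHTS["S1"] * s1
    )


# Sample means of the predictors from the Descriptive Statistics table.
at_means = predict_customer_experience(r2=3.068, cs3=3.164, s1=2.758)
print(f"Predicted CS1 at the predictor means: {at_means:.3f}")
```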

Cluster Analysis

Cluster analysis is one of the most fundamental and widely used methods for grouping
objects into similar groups based on their characteristics. The procedure employs a
variety of algorithms and methods to create clusters of similar objects. It is also used
in statistical analysis as part of data management.

We group a set of objects that have similar attributes; these groups are referred to as
clusters. Since it is difficult to learn the properties of each individual object or
participant, we instead categorise them into a smaller number of groups, each with a
common set of properties that its members share.

A few assumptions associated with cluster analysis are:

1) The variables chosen are a comprehensive representation of the underlying
construct
2) The sample is a good representative of the population

There are 3 types of clustering techniques:

1) Hierarchical Clustering is used when there is no information on the number
of clusters that need to be formed
2) K-Means Clustering is used when the number of clusters to be formed is
already known
3) Two Step Clustering is used to explore the natural clustering ability of the
data set and validate the results

Hierarchical Clustering

In this method, we chose Customer Experience, Effectiveness of addressing queries,
Precision of Solution, and Response Speed as the input parameters for forming clusters.

Agglomeration Schedule

Stage   Cluster Combined        Coefficients   Stage Cluster First Appears   Next Stage
        Cluster 1   Cluster 2                  Cluster 1   Cluster 2
1 183 206 .000 0 0 15
2 200 205 .000 0 0 5
3 181 202 .000 0 0 17
4 197 201 .000 0 0 7
5 6 200 .000 0 2 34
6 162 198 .000 0 0 34
7 3 197 .000 0 4 38
8 180 192 .000 0 0 18
9 170 189 .000 0 0 157
10 147 188 .000 0 0 49
11 171 187 .000 0 0 26
12 184 186 .000 0 0 14
13 174 185 .000 0 0 23
14 78 184 .000 0 12 104
15 11 183 .000 0 1 53
16 143 182 .000 0 0 53
17 15 181 .000 0 3 29
18 23 180 .000 0 8 48
19 172 179 .000 0 0 25
20 158 178 .000 0 0 38
21 139 176 .000 0 0 57
22 167 175 .000 0 0 29
23 36 174 .000 0 13 32
24 85 173 .000 0 0 104
25 31 172 .000 0 19 43
26 50 171 .000 0 11 41
27 156 169 .000 0 0 40
28 164 168 .000 0 0 32
29 15 167 .000 17 22 85
30 155 166 .000 0 0 41
31 148 165 .000 0 0 48
32 36 164 .000 23 28 47
33 113 163 .000 0 0 81
34 6 162 .000 5 6 75
35 117 161 .000 0 0 78
36 25 160 .000 0 0 179
37 125 159 .000 0 0 70
38 3 158 .000 7 20 137
39 129 157 .000 0 0 67
40 26 156 .000 0 27 118
41 50 155 .000 26 30 63

42 149 154 .000 0 0 47
43 31 153 .000 25 0 172
44 79 152 .000 0 0 107
45 145 151 .000 0 0 51
46 133 150 .000 0 0 63
47 36 149 .000 32 42 108
48 23 148 .000 18 31 142
49 2 147 .000 0 10 71
50 120 146 .000 0 0 75
51 13 145 .000 0 45 93
52 119 144 .000 0 0 76
53 11 143 .000 15 16 58
54 5 142 .000 0 0 156
55 138 141 .000 0 0 58
56 136 140 .000 0 0 60
57 68 139 .000 0 21 167
58 11 138 .000 53 55 66
59 60 137 .000 0 0 121
60 16 136 .000 0 56 91
61 57 135 .000 0 0 175
62 124 134 .000 0 0 71
63 50 133 .000 41 46 125
64 54 132 .000 0 0 168
65 130 131 .000 0 0 66
66 11 130 .000 58 65 106
67 4 129 .000 0 39 73
68 122 128 .000 0 0 73
69 76 127 .000 0 0 108
70 9 125 .000 0 37 92
71 2 124 .000 49 62 95
72 102 123 .000 0 0 91
73 4 122 .000 67 68 83
74 95 121 .000 0 0 97
75 6 120 .000 34 50 101
76 39 119 .000 0 52 123
77 101 118 .000 0 0 92
78 98 117 .000 0 35 159
79 111 115 .000 0 0 83
80 22 114 .000 0 0 172
81 28 113 .000 0 33 160
82 97 112 .000 0 0 95
83 4 111 .000 73 79 120
84 108 109 .000 0 0 85

85 15 108 .000 29 84 138
86 61 107 .000 0 0 120
87 100 106 .000 0 0 93
88 48 105 .000 0 0 177
89 80 104 .000 0 0 106
90 46 103 .000 0 0 159
91 16 102 .000 60 72 119
92 9 101 .000 70 77 126
93 13 100 .000 51 87 141
94 63 99 .000 0 0 118
95 2 97 .000 71 82 116
96 91 96 .000 0 0 101
97 1 95 .000 0 74 109
98 65 94 .000 0 0 116
99 64 93 .000 0 0 117
100 87 92 .000 0 0 161
101 6 91 .000 75 96 145
102 37 88 .000 0 0 161
103 75 86 .000 0 0 109
104 78 85 .000 14 24 164
105 49 82 .000 0 0 166
106 11 80 .000 66 89 144
107 7 79 .000 0 44 162
108 36 76 .000 47 69 171
109 1 75 .000 97 103 143
110 58 72 .000 0 0 123
111 55 71 .000 0 0 125
112 21 70 .000 0 0 144
113 35 69 .000 0 0 137
114 53 67 .000 0 0 126
115 62 66 .000 0 0 119
116 2 65 .000 95 98 139
117 27 64 .000 0 99 129
118 26 63 .000 40 94 167
119 16 62 .000 91 115 124
120 4 61 .000 83 86 130
121 14 60 .000 0 59 136
122 29 59 .000 0 0 142
123 39 58 .000 76 110 170
124 16 56 .000 119 0 168
125 50 55 .000 63 111 128
126 9 53 .000 92 114 127
127 9 52 .000 126 0 164

128 50 51 .000 125 0 169
129 27 47 .000 117 0 171
130 4 45 .000 120 0 170
131 33 44 .000 0 0 139
132 38 43 .000 0 0 136
133 20 42 .000 0 0 145
134 34 41 .000 0 0 138
135 30 40 .000 0 0 141
136 14 38 .000 121 132 165
137 3 35 .000 38 113 140
138 15 34 .000 85 134 152
139 2 33 .000 116 131 148
140 3 32 .000 137 0 155
141 13 30 .000 93 135 146
142 23 29 .000 48 122 173
143 1 24 .000 109 0 157
144 11 21 .000 106 112 151
145 6 20 .000 101 133 173
146 13 18 .000 141 0 169
147 12 17 .000 0 0 148
148 2 12 .000 139 147 150
149 8 10 .000 0 0 150
150 2 8 .000 148 149 174
151 11 207 1.000 144 0 174
152 15 203 1.000 138 0 175
153 190 199 1.000 0 0 199
154 110 195 1.000 0 0 190
155 3 193 1.000 140 0 176
156 5 177 1.000 54 0 191
157 1 170 1.000 143 9 182
158 77 126 1.000 0 0 192
159 46 98 1.000 90 78 182
160 28 90 1.000 81 0 178
161 37 87 1.000 102 100 190
162 7 84 1.000 107 0 183
163 81 83 1.000 0 0 201
164 9 78 1.000 127 104 187
165 14 74 1.000 136 0 177
166 49 73 1.000 105 0 185
167 26 68 1.000 118 57 181
168 16 54 1.000 124 64 180
169 13 50 1.000 146 128 184
170 4 39 1.000 130 123 181

171 27 36 1.000 129 108 187
172 22 31 1.000 80 43 179
173 6 23 1.000 145 142 180
174 2 11 1.071 150 151 186
175 15 57 1.100 152 61 184
176 3 89 1.111 155 0 183
177 14 48 1.167 165 88 195
178 19 28 1.250 0 160 191
179 22 25 1.333 172 36 185
180 6 16 1.433 173 168 186
181 4 26 1.458 170 167 193
182 1 46 1.650 157 159 192
183 3 7 1.650 176 162 194
184 13 15 1.769 169 175 194
185 22 49 1.833 179 166 197
186 2 6 1.873 174 180 196
187 9 27 1.923 164 171 195
188 116 196 2.000 0 0 201
189 191 194 2.000 0 0 199
190 37 110 2.000 161 154 202
191 5 19 2.067 156 178 193
192 1 77 2.192 182 158 200
193 4 5 2.299 181 191 197
194 3 13 2.300 183 184 196
195 9 14 2.433 187 177 198
196 2 3 2.555 186 194 198
197 4 22 2.604 193 185 203
198 2 9 2.833 196 195 200
199 190 191 3.000 153 189 204
200 1 2 3.874 192 198 202
201 81 116 4.000 163 188 204
202 1 37 5.631 200 190 203
203 1 4 6.478 202 197 205
204 81 190 7.500 201 199 206
205 1 204 9.020 203 0 206
206 1 81 12.705 205 204 0

The Agglomeration Schedule shows the stepwise manner in which the clustering process
is carried out. It lists which clusters are combined at every stage, together with the
coefficient (the error) at which they are merged. To identify the optimal number of
clusters, we look for a large jump in this coefficient, which indicates that two
dissimilar clusters have been combined.
Apart from the final merge, the largest jump in the coefficient is from 4.000 to 5.631,
at stage 202. Taking that stage as the cut-off, the optimal number of clusters that can
be formed is 207 - 202 = 5.
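The search for the jump can be made explicit. The sketch below is not part of the original analysis; it takes the coefficients of the last stages of the schedule, computes the increase at each stage, and picks the largest increase while setting aside the final merge, which would otherwise reduce the solution to two clusters. The rule that converts a stage into a cluster count (number of cases minus the stage number) follows the convention used in the text.

```python
# (stage, coefficient) pairs copied from the last rows of the schedule above.
last_stages = [
    (198, 2.833), (199, 3.000), (200, 3.874), (201, 4.000),
    (202, 5.631), (203, 6.478), (204, 7.500), (205, 9.020), (206, 12.705),
]
N_CASES = 207

# Increase in the coefficient when moving into each stage.
jumps = [
    (stage, coef - prev_coef)
    for (_, prev_coef), (stage, coef) in zip(last_stages, last_stages[1:])
]

# Set aside the final merge (the last stage) before looking for the largest jump.
final_stage = last_stages[-1][0]
candidates = [(stage, jump) for stage, jump in jumps if stage != final_stage]
cut_stage, cut_jump = max(candidates, key=lambda item: item[1])

# Convention used in the text: clusters = number of cases - stage number.
n_clusters = N_CASES - cut_stage

for stage, jump in jumps:
    print(f"stage {stage}: jump = {jump:.3f}")
print(f"cut-off stage = {cut_stage}, clusters = {n_clusters}")
```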

Dendrogram using Average Linkage (Between Groups)

The Dendrogram is used to group the objects together and create clusters. The graph is
read from left to right and is used to visually identify the major clusters present in the
data set.

K-Means Clustering
K-Means clustering is used to confirm the number of clusters formed. Here we use the
same 4 parameters as in hierarchical clustering: Customer Experience, Effectiveness of
addressing queries, Precision of Solution, and Response Speed of the chatbot. The number
of clusters for which the confirmatory test is run is set to 5.
CS1: Customer Experience
CS3: Effectiveness of addressing queries
R2: Precision of Solution
R3: Response Speed

Final Cluster Centers

       Cluster
       1      2      3      4      5
CS1    3.3    3.1    3.7    1.8    4.2
CS3    2.8    2.8    4.0    1.8    4.4
R2     3.6    2.5    2.3    1.8    4.4
R3     3.9    2.7    4.2    4.0    4.3
The above table shows the final cluster centres of the solution obtained in the confirmatory
analysis. Each centre is the mean of the variable over the cases assigned to that cluster.
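The cluster centres also show how K-Means classifies a case: each respondent belongs to the cluster whose centre is nearest. The sketch below illustrates this assignment step with Euclidean distance, using the centres from the table above. The respondent scores are hypothetical and chosen only for illustration.

```python
import math

# Final cluster centres copied from the table above, in the order
# (CS1, CS3, R2, R3).
CENTRES = {
    1: (3.3, 2.8, 3.6, 3.9),
    2: (3.1, 2.8, 2.5, 2.7),
    3: (3.7, 4.0, 2.3, 4.2),
    4: (1.8, 1.8, 1.8, 4.0),
    5: (4.2, 4.4, 4.4, 4.3),
}


def nearest_cluster(response):
    """Return the cluster whose centre is closest to `response`."""
    return min(CENTRES, key=lambda c: math.dist(response, CENTRES[c]))


# Hypothetical respondents (CS1, CS3, R2, R3), not taken from the survey data.
satisfied = (4, 5, 4, 4)
dissatisfied_but_fast = (2, 2, 2, 4)

print(nearest_cluster(satisfied))
print(nearest_cluster(dissatisfied_but_fast))
```

Read this way, cluster 5 groups respondents who score high on all four variables, while cluster 4 groups those who rate everything low except response speed.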

ANOVA

       Cluster               Error
       Mean Square   df      Mean Square   df      F         Sig.
CS1    34.819        4       .498          202     69.889    .000
CS3    45.446        4       .320          202     142.034   .000
R2     49.615        4       .409          202     121.341   .000
R3     15.410        4       .532          202     28.973    .000
The F tests should be used only for descriptive purposes because the clusters have been
chosen to maximize the differences among cases in different clusters. The observed
significance levels are not corrected for this and thus cannot be interpreted as tests of
the hypothesis that the cluster means are equal.
All 4 parameters (Customer Experience, Effectiveness of addressing queries, Precision
of Solution, and Response Speed) have a significance value of less than 0.05. Bearing in
mind the caveat above that these F tests are descriptive only, this indicates that all 4
variables differ markedly across the clusters and contribute to separating them, with
Effectiveness of addressing queries (F = 142.034) and Precision of Solution (F = 121.341)
contributing the most.

Number of Cases in each Cluster

Cluster   1     48.000
          2     38.000
          3     32.000
          4     39.000
          5     50.000
Valid          207.000
Missing           .000

The above table shows the number of cases that belong to each cluster. Nearly 24% of
the sample belongs to cluster 5, the largest, while about 15% falls under cluster 3,
the smallest.
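The percentages quoted above follow directly from the cluster sizes, as this short sketch shows.

```python
# Cluster sizes copied from the table above.
cluster_sizes = {1: 48, 2: 38, 3: 32, 4: 39, 5: 50}

total = sum(cluster_sizes.values())
shares = {cluster: 100 * n / total for cluster, n in cluster_sizes.items()}

largest = max(shares, key=shares.get)
smallest = min(shares, key=shares.get)

for cluster, share in shares.items():
    print(f"cluster {cluster}: {share:.1f}% of {total} cases")
print(f"largest: cluster {largest}, smallest: cluster {smallest}")
```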
2 Step Clustering
In this method we carry out an exploratory analysis to identify any naturally occurring
clusters in the dataset. The variables chosen for clustering are Age, Education Level,
Gender and Customer Experience. The results obtained are as follows.
The tables below show, for each category of a variable, the frequency and percentage
distribution of respondents across the individual clusters.

Customer Experience

            1                     2                     3
Cluster     Frequency  Percent    Frequency  Percent    Frequency  Percent
1           10         66.70%     8          23.50%     10         16.40%
2           0          0.00%      0          0.00%      30         49.20%
3           0          0.00%      17         50.00%     0          0.00%
4           0          0.00%      0          0.00%      11         18.00%
5           5          33.30%     9          26.50%     10         16.40%
Combined    15         100.00%    34         100.00%    61         100.00%

            4                     5
Cluster     Frequency  Percent    Frequency  Percent
1           0          0.00%      15         71.40%
2           0          0.00%      0          0.00%
3           28         36.80%     0          0.00%
4           35         46.10%     0          0.00%
5           13         17.10%     6          28.60%
Combined    76         100.00%    21         100.00%

Age

            Below 18              18-25                 25-35
Cluster     Frequency  Percent    Frequency  Percent    Frequency  Percent
1           0          0.00%      20         16.70%     22         32.40%
2           5          50.00%     23         19.20%     0          0.00%
3           0          0.00%      25         20.80%     16         23.50%
4           1          10.00%     24         20.00%     20         29.40%
5           4          40.00%     28         23.30%     10         14.70%
Combined    10         100.00%    120        100.00%    68         100.00%

            35-45                 45-60                 Above 60
Cluster     Frequency  Percent    Frequency  Percent    Frequency  Percent
1           0          0.00%      1          33.30%     0          0.00%
2           2          66.70%     0          0.00%      0          0.00%
3           0          0.00%      2          66.70%     2          66.70%
4           1          33.30%     0          0.00%      0          0.00%
5           0          0.00%      0          0.00%      1          33.30%
Combined    3          100.00%    3          100.00%    3          100.00%

Gender

            Male                  Female
Cluster     Frequency  Percent    Frequency  Percent
1           36         28.3%      7          8.8%
2           29         22.8%      1          1.3%
3           45         35.4%      0          0.0%
4           17         13.4%      29         36.3%
5           0          0.0%       43         53.8%
Combined    127        100.0%     80         100.0%

Educational Qualification

            12th Grade or Less    Bachelor’s Degree     Post-Graduation or Beyond
Cluster     Frequency  Percent    Frequency  Percent    Frequency  Percent
1           0          0.0%       19         16.0%      24         30.4%
2           5          55.6%      16         13.4%      9          11.4%
3           1          11.1%      44         37.0%      0          0.0%
4           0          0.0%       0          0.0%       46         58.2%
5           3          33.3%      40         33.6%      0          0.0%
Combined    9          100.0%     119        100.0%     79         100.0%

The model summary shows that the total number of input variables was 4 and the number
of clusters obtained was 5. The cluster quality was deemed to be fair.

RESULTS & CONCLUSION

A t-test showed that there is no significant difference in customer satisfaction
between male and female respondents.
One-way ANOVA showed that there is no statistically significant difference in customer
satisfaction across age groups.
Two-way MANOVA showed that there is no significant mean difference in customer
satisfaction across levels of Educational Qualification and Gender.
One-way MANOVA results imply that there is no statistically significant difference between
the group means of different Educational Levels with respect to Customer Satisfaction and
Response Quality.
Using two-way MANOVA, we found that there is no statistically significant difference in
customer satisfaction and response quality across educational levels, across genders, or
across educational level and gender taken together.
Simple linear regression was applied to generate a regression model with Response as the
dependent variable and Security as the independent variable: Response = 2.266 +
0.378(Security).
Multiple linear regression was used to generate a regression model with Customer
Experience as the dependent variable and Precision of Responses, Effectiveness of
addressing query and Openness in sharing personal information as independent variables:
Customer Experience = 0.235(Precision of Responses) + 0.426(Effectiveness of
addressing query) + 0.139(Openness in sharing personal information) + 0.808.
By applying hierarchical cluster analysis to Customer Experience, Effectiveness of
addressing queries, Precision of Solution, and Response Speed, we were able to group the
respondents into 5 different clusters. The same was confirmed using K-Means clustering,
and it was found that all the variables contribute significantly to the classification.
Two Step Clustering was performed with Age, Educational Level, Gender and Customer
Experience as variables. The model gave a classification of 5 clusters, and cluster
quality was deemed to be fair.

APPENDIX 1: QUALITATIVE RESEARCH QUESTIONNAIRE

Interview Schedule
Objective: To examine the effect of the AVATAR of the chatbot on consumer behavior
Method: In-depth Interview (Virtual / Personal)
Interviewers: Group 7 A
Duration: Approximately 10-20 mins

Interview Questions
Taking Participant Consent: I introduce myself as_________, student at Goa Institute of
Management. I am currently conducting a study on the AVATAR of a chatbot for a Marketing
Research course project assignment. In this regard, your valuable inputs will be of great
help. This interview will take approximately 10 to 20 mins and will be recorded for my
further analysis. I promise that the recorded interview conversation, your name and
personal details will be kept confidential. Further, participation in the study is voluntary
and no monetary benefit or reward is given. There is no physical, psychological or social
risk involved in the study. However, if you feel burdened by the interview questions at
any moment during the interview process, you are free to stop and exit at any point.
There is no obligation to participate in or complete the interview process.
May I please know if you are willing to be a part of the study and to give your
voluntary consent to participate in this interview as a respondent for my research
study? (Voluntary consent will also be obtained through email.)

Basic Info: Age ______; Gender ______; Education ______

The questionnaire was divided into six parts, i.e. Security/Privacy, Form of Communication,
Anthropomorphic, Physical Appearance, Response Quality and Customer Satisfaction.

1. Security/Privacy
a) How strongly will you as a customer consider suggestions provided by a chatbot?
b) How comfortable are you sharing your personal information while interacting with
a chatbot?
c) On a scale of 1-5 how comfortable will you be in discussing any sensitive topic with
a chatbot?

2. Form of communication
a) How much does the mode of communication (Written or Verbal) matter to you
as a customer?
b) How much do you want the chatbot to stick to professionalism over personal
approach?

c) How likely are you to prefer your regional language over English language (of
the chatbot)?

3. Anthropomorphic
a) How much does a machine replacing human to address your queries affect you as a
customer?
b) How much do you prefer a movable (in terms of expressions) chatbot over a static
one?
c) How strongly will you prefer a human form avatar over a non-human one?

4. Physical Appearance
a) How much impact does gender of a chatbot have while interacting with it?
b) "A picture as a depiction of an Avatar suffices. It doesn't necessarily have to have a
name assigned to it". How strongly do you agree with this statement?
c) Do you have any preference towards animated form or human form of chatbot
avatar?
d) "Mild color chatbot is preferred while interacting with a chat bot" How strongly do
you agree with this statement?

5. Response Quality
a) Rate the response quality you experienced the last time you interacted with a chatbot
b) How concise was the response you received the last time you interacted with a
chatbot?
c) How was your experience in terms of speed of replies you received from the chat
bot?

6. Customer Satisfaction
a) How good was your experience when you had last interacted with a chatbot?
b) How quickly were your queries resolved the last time you interacted with a chatbot?
c) How precisely were your queries addressed the last time you interacted with a
chatbot?
d) "Chatbots are known for effectively addressing customer issues" How strongly do
you agree with this statement?

Thank you very much for your valuable time and inputs.

REFERENCES

1. Padhy, K. C. (2006). Book review: Harvard Business School Publishing Corporation. Asia
Pacific Business Review, 2(2), 110–110. https://doi.org/10.1177/097324700600200219

2. Miao, F., Kozlenkova, I. V., Wang, H., Xie, T., & Palmatier, R. W. (2021). An emerging
theory of avatar marketing. Journal of Marketing, 86(1), 67–90.
https://doi.org/10.1177/0022242921996646

3. Garnier, M., & Poncin, I. (2013). The avatar in marketing: Synthesis, integrative framework
and Perspectives. Recherche Et Applications En Marketing (English Edition), 28(1), 85–
115. https://doi.org/10.1177/2051570713478335

4. Rosenkrans, G. (2009). The creativeness and effectiveness of online interactive rich media
advertising. Journal of Interactive Advertising, 9(2), 18–31.
https://doi.org/10.1080/15252019.2009.10722152
