Group 7 - Section B - MR
CONSUMER BEHAVIOR
“Marketing Research Project”
Presented by:
Group 7
Aastha Gaur (H2021001)
Adarsh Srivastava (2021188)
Aditya Dhandhania (2021127)
Anurag Kumar (2021071)
Chandradeep Dessai (2021206)
Priyansh Verma (2021035)
Sarthak Sadani (2021175)
Presented to:
LITERATURE REVIEW
• Hemp, P., 2006. Avatar-based marketing. Harvard business review, 84(6),
pp.48-57.
Three-dimensional, avatar-based virtual worlds like Second Life, the most well-
known and fastest-growing one, present a promising corporate communication
channel for interactive advertising, brand marketing, and advergaming. This study,
which draws on presence literature, investigates the effects of spokes-avatars'
presence (versus absence) and consumers' multimodal interactions with these
spokes-avatars on changes in the consumers' involvement with the product, attitude
toward the product, and enjoyment of the online shopping experience. Additionally,
this study looks into how consumers' assessments of spokes-avatars' physical
attractiveness and the informational value of the commercial message are affected
by the spokes-avatars' human versus non-human physical characteristics. A path
analysis indicates that the spokes-avatars' physical attractiveness plays a mediating role.
• Miao, F., Kozlenkova, I.V., Wang, H., Xie, T. and Palmatier, R.W., 2022. An
emerging theory of avatar marketing. Journal of Marketing, 86(1), pp.67-90.
In modern marketing methods, avatars are becoming more and more common, yet
in actual use, their effectiveness for reaching performance outcomes (like purchase
likelihood) differs greatly. The related scholarly literature is disjointed and lacks
conceptual coherence as well as definitional consistency. This article makes three key
contributions to managerial theory and practice. First, to address the ambiguity
surrounding the term, the study identifies and critically assesses its essential conceptual
components, proposes a definition based on this analysis, and provides a typology
of avatar design components. Second, the proposed 2 × 2 avatar taxonomy suggests that the
alignment of an avatar's form realism and behavioural realism, across diverse
circumstances, provides a parsimonious explanation for avatar efficacy. Third, the authors
combine existing research, business practices, and insights from essential avatar components
to create an emergent theory of avatar marketing. This framework incorporates significant
managerial implications, research hypotheses, and fundamental theoretical insights
for this developing field of marketing strategy. Finally, the authors present a research
agenda for testing the hypotheses and insights and for advancing current research.
• BARIŞ, A., 2020. A New Business Marketing Tool: Chatbot. GSI Journals
Serie B: Advancements in Business and Economics, 3(1), pp.31-46.
Marketing tactics have changed along with rapidly evolving technologies in order to
meet the wants and needs of consumers. The global use of the internet
and messaging platforms, as well as recent developments in AI and machine
learning, have motivated businesses to concentrate on chatbots. A chatbot is an AI-based
computer programme that can actively communicate and converse with people. Chatbots
are also known as virtual assistants that are aware of human capabilities. Recent statistics
show that chatbots are vital for brands' long-term survival. The main aim of the study is to
find out how organisations should use chatbots in customer communication and how chatbots
can contribute to their marketing efforts. L'Oréal Paris' 2017 chatbot, Beauty Gifter, is
chosen as the case study. The study's findings indicate
that chatbots can be an excellent method of customer engagement, but firms should
pay close attention to how consumers think and use AI more effectively when
developing chatbots.
• Feine, J., Gnewuch, U., Morana, S. and Maedche, A., 2019, November. Gender
bias in chatbot design. In International Workshop on Chatbot Research and
Design (pp. 79-93). Springer, Cham.
According to a recent UNESCO report, most popular voice-based conversational agents are
designed to be female. The report also describes the possible negative impacts this could have on
society. However, the report largely focuses on voice-based conversational agents,
and chatbots (i.e., text-based conversational agents) were not included in the analysis.
Because chatbots can also be gendered in their design, the researchers used an automated
gender analysis approach to examine three gender-specific design cues in the 1,375 chatbots
listed on the website chatbots.org. The gender
of the name was determined using two gender APIs, the gender of the avatar was
determined using a facial recognition API, and the gender-specific pronouns used
in the chatbot's description were examined using a text mining technique. The
findings imply that gender-specific cues are frequently incorporated into chatbot
design, and that the majority of chatbots are either explicitly or implicitly built to
convey a particular gender. More specifically, the majority of the chatbots have
female names and female-looking avatars and are described as female chatbots. This is
particularly evident in three application domains (i.e., branded conversations, customer
service, and sales). The authors thus find evidence of a tendency to prefer one
gender (i.e., female) over another (i.e., male). As a result, they argue that chatbots
in the wild are designed with a gender bias. Based on these findings, the researchers
formulate propositions as a starting point for future discussion and research on reducing
gender bias in chatbot design.
METHODOLOGY
• Step 1: Before starting the project, we thoroughly reviewed many papers with the help
of EBSCO, ResearchGate, and Google Scholar and studied the subject in detail.
• Step 2: After this research, we prepared a questionnaire of open-ended questions for
in-depth interviews (IDIs) to gather as much information as possible. We conducted 5 IDIs per
group member, giving 35 IDIs with respondents chosen from different backgrounds,
locations, ages, and genders.
• Step 3: We then transcribed the interviews verbatim and analyzed them for
codes, themes, and categories. A word cloud analysis and a manual analysis of the themes
emerging from the data enabled us to identify the most common factors.
• Step 4: We then built our theoretical model for carrying the research further.
Based on the model, we prepared research objectives and hypotheses for the quantitative
part.
• Step 5: We identified scales for measuring our constructs based on our
review of various marketing journals.
• Step 6: Data were collected by circulating a Google Form, giving a sample of n = 207
across various age groups, professions, locations, and genders. We collated the responses
in Excel format to carry out the analysis in SPSS.
• Step 7: The data were cleaned, and correlation analysis was conducted, followed by
Cronbach's alpha test of reliability to check for internal consistency among the
factors formed.
• Step 8: The next step involved running the t-test, ANOVA, MANOVA, regression, and
cluster analysis in SPSS to test the hypotheses.
• Step 9: From the data, a model was derived with the help of Structural Equation
Modelling in AMOS. Confirmatory factor analysis and path analysis were conducted to
further comment on the model.
• Step 10: Based on the significance of the tests above, we were able
to comment on the model and evaluate the hypotheses.
In a nutshell, we followed these steps:
1. Literature Review
2. In-depth Interview
3. Coding and Theme
4. Hypothesis
5. Survey
6. Hypothesis testing
7. Findings
RESEARCH OBJECTIVE
1. To understand the concept of a chatbot and its avatar
2. To identify the effects of the chatbot's avatar
3. To examine the effect of the chatbot's avatar on consumer behavior
4. To analyze variation in consumer behavior regarding the avatar based on
demographic factors such as age, gender, and education level
c) Age – An avatar can take the form of a child, a middle-aged person, or an elderly
person, depending on the consumer base it targets.
2. User Experience:
a) Dimension – An avatar can have a 1-D, 2-D, or 3-D form. Over time,
many organizations have opted for a 3-D avatar.
b) Movement – A chatbot can be static or can be moved to any part of the screen.
d) Task-oriented – Generally, chatbots are known to be task-oriented, but there
are certain bots that have a mind of their own.
3. Ease of Use:
a) Mode of communication – A chatbot can write or speak or do both.
Organizations have started to incorporate both the written and spoken modes of
a bot nowadays.
4. Security/Privacy:
a) Trust – Since chatbots are a fairly recent innovation, many people struggle to trust the bot
and are apprehensive about sharing their confidential information.
QUANTITATIVE RESEARCH
DATA ANALYSIS
Information about the respondents' demographics
The survey gathered the respondents' demographic data along with the
questions assessing the avatar and consumer behavior. This helps in learning
how different types of respondents view interacting with a chatbot, which in turn helps
marketers develop effective tactics. The details of the demographic profile are shown in
the table below.
Demographic details of the respondents
Null Hypothesis for Bartlett's Test: The correlation matrix is an identity matrix.
Bartlett's test is significant (p = 0.001). Because the p-value is less than
0.05, the null hypothesis is rejected, indicating that the correlation matrix is not an identity
matrix and that there is some correlation between the variables. As a result, the data are
suitable for factor analysis.
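The statistic behind Bartlett's test can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used in the study: it assumes the standard formula chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom, where R is the p × p correlation matrix and n the sample size. The function names and the small 3 × 3 matrix are hypothetical; only n = 207 comes from the report.

```python
import math

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for a correlation matrix."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi_square, dof

# Hypothetical correlation matrix for three items (not the study's data).
example_corr = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]
chi_square, dof = bartlett_sphericity(example_corr, n_obs=207)
print(f"chi-square = {chi_square:.2f}, df = {dof}")
```

The larger the correlations, the smaller the determinant and the larger the statistic; an identity matrix has determinant 1 and gives a statistic of zero. The p-value would then come from a chi-square distribution with the returned degrees of freedom.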
Eigenvalues show how much of the total variance is accounted for by a particular principal
component. For a correlation matrix they cannot be negative, so each component explains a
non-negative share of the variance. All the eigenvalues in this study are above zero and are
therefore treated as acceptable here. The Total Variance Explained table
shows that a total of 20 components were used in the study.
Communalities
Item Initial Extraction
S1 1.000 .776
S2 1.000 .816
S3 1.000 .567
C1 1.000 .650
C2 1.000 .599
C3 1.000 .621
A1 1.000 .697
A2 1.000 .682
A3 1.000 .546
P1 1.000 .707
P2 1.000 .641
P3 1.000 .670
P4 1.000 .367
R1 1.000 .604
R2 1.000 .601
R3 1.000 .610
CS1 1.000 .716
CS2 1.000 .753
CS3 1.000 .587
CS4 1.000 .682
Extraction Method: Principal Component Analysis.
Small values indicate variables that do not fit well with the factor solution and should possibly
be dropped from the analysis. No value was considered low enough to warrant removal (the
lowest extraction value is .367, for P4), so no item was dropped.
Item | Loading | Statement
A2 | .806 | "Name is not necessarily needed when conversing with a chatbot. A picture as a depiction suffices". How strongly do you agree with this statement?
A3 | .693 | "Soft-colored chatbot is preferred while interacting with one."
P1 | .821 | Rate the response quality you experienced the last time you interacted with a chatbot.
P2 | .344 | "Chatbots provide concise and precise solutions."
P3 | .541 | Experience in terms of the speed of responses coming from the chatbot?
P4 | .464 | "I consider the solutions provided by the chatbot."
R1 | .746 | Would you want the chatbot to stick to professionalism over a personal approach?
R2 | .716 | How much do you prefer a dynamic (in terms of expressions) chatbot over a fixed one?
R3 | .342 | How would you rate your experience the last time you interacted with a chatbot?
CS1 | .821 | Were your queries addressed precisely the last time you interacted with a chatbot?
CS2 | .841 | "Chatbots are known for effectively addressing customer issues". How strongly do you agree with this statement?
CS3 | .737 | How quickly were your queries resolved the last time you interacted with a chatbot?
CS4 | .808 | Do you prefer an animated form of the chatbot over a human form?
The factor analysis was conducted using the varimax rotation method. Items such as S2, P2,
and R3 did not load correctly on their assumed factors. Because the constructs are very
similar in nature, the research was carried forward without removing any items.
Reliability Test:
The internal consistency of the variables is examined using the reliability test. The most
widely used measure of reliability is Cronbach's alpha. The alpha value typically ranges
from 0 to 1, and values greater than 0.7 are generally regarded as good. For this study,
any value above 0.5 is treated as acceptable.
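To make the reliability measure concrete, the sketch below computes Cronbach's alpha from raw item scores using the usual formula alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score). It is an illustration only: the function name and the six hypothetical respondents are assumptions, not the study's data or its SPSS output.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of responses per item, all of the same length
    (one entry per respondent).
    """
    k = len(items)
    # Total scale score for each respondent.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical 1-5 Likert responses from six respondents on a 3-item scale.
hypothetical_items = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 3, 4, 3],
    [5, 3, 4, 2, 5, 2],
]
alpha = cronbach_alpha(hypothetical_items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises when the items move together, because the variance of the total score then grows faster than the sum of the individual item variances.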
Reliability Test of the Factors Affecting Customer Satisfaction
Item Code | Item | No. of Items | Cronbach's Alpha
S1 | "I am fairly comfortable with sharing my personal information while talking to a chatbot." | |
S2 | On a scale of 1-5, how comfortable will you be in discussing any sensitive topic with a chatbot? | |
S3 | "I consider the solutions provided by the chatbot." | 3 | 0.666
C1 | How likely are you to prefer your regional language over the English language (of the chatbot)? | |
C2 | How much does the mode of communication with the chatbot (Written or Verbal) matter to you as a customer? | |
C3 | Would you want the chatbot to stick to professionalism over a personal approach? | 3 | 0.612
A1 | How much does a machine replacing a human to address your queries affect you as a customer? | |
A2 | How strongly will you prefer a human form avatar over a non-human one? | |
A3 | How much do you prefer a dynamic (in terms of expressions) chatbot over a fixed one? | 3 | 0.702
P1 | What impact does the gender of a chatbot have while interacting with one? | |
P2 | "Name is not necessarily needed when conversing with a chatbot. A picture as a depiction suffices". How strongly do you agree with this statement? | |
P3 | "Soft-colored chatbot is preferred while interacting with one." | |
P4 | Do you prefer an animated form of the chatbot over a human form? | 4 | 0.568
R1 | Rate the response quality you experienced the last time you interacted with a chatbot. | |
R2 | "Chatbots provide concise and precise solutions." | |
R3 | Experience in terms of the speed of responses coming from the chatbot? | 3 | 0.602
CS1 | How would you rate your experience the last time you interacted with a chatbot? | |
CS2 | Were your queries addressed precisely the last time you interacted with a chatbot? | |
CS3 | "Chatbots are known for effectively addressing customer issues". How strongly do you agree with this statement? | |
CS4 | How quickly were your queries resolved the last time you interacted with a chatbot? | 4 | 0.866
Normality Test:
Normality tests determine whether the data conform to a bell-shaped normal distribution.
Many statistical procedures require the data to meet the normality assumption. Normality
can be checked visually with a histogram or other graphical methods, or formally with the
Kolmogorov-Smirnov (KS) or Shapiro-Wilk test. The most popular is Shapiro-Wilk. The null
hypothesis in these tests is that the data are normally distributed; if the p-value is
greater than 0.05, we fail to reject it and treat the data as conforming to the
normality assumption.
In the table below, both the Shapiro-Wilk and KS p-values are less than 0.05 for every
variable. As a result, the null hypothesis is rejected, and we conclude that our data do
not meet the normality assumption. Non-parametric tests are normally used instead of
parametric tests when the data are not normal. However, normality can be hard to achieve
in social science studies, and the sample size may be one of the causes: we consider the
study's sample size, n = 207, too small, with 600 and above a reasonable size for
testing normality. Even though the data violate the assumption of
normality, we nonetheless run parametric tests on them.
Tests of Normality
          Kolmogorov-Smirnov        Shapiro-Wilk
          Statistic  df   Sig.      Statistic  df   Sig.
S_Mean    .127       207  .000      .968       207  .000
C_Mean    .092       207  .000      .980       207  .004
A_Mean    .126       207  .000      .968       207  .000
P_Mean    .080       207  .003      .986       207  .041
R_Mean    .088       207  .000      .977       207  .002
CS_Mean   .096       207  .000      .975       207  .001
HYPOTHESIS TESTING
1. T-test:
Group Statistics
         Gender  N    Mean    Std. Deviation  Std. Error Mean
Mean_CS  male    127  3.1280  .92307          .08191
         female  80   3.3188  .95630          .10692
In the group statistics, the male group has 127 respondents and the female group has 80
(coded 1.0 and 2.0 respectively in the data). The mean is higher for the female group.
The independent samples test output has two rows to read from: one for equal variances
assumed and one for equal variances not assumed.
Levene's test for equality of variances tests the null hypothesis that the population
variances are equal. If this null hypothesis is not rejected, the equal-variances-assumed
row is used. In our case, the significance of Levene's test is 0.776, which is above the
alpha of 0.05, so we fail to reject the null hypothesis. The variances are therefore
homogeneous, a condition known as homoscedasticity.
The two-tailed significance of the t-test is 0.155.
Null hypothesis: There is no significant difference between the group means.
Alternate hypothesis: There is a significant difference between the group means.
Since 0.155 > 0.05, we fail to reject the null hypothesis, which means there is no
significant difference between the two groups.
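The t statistic can be reproduced from the Group Statistics table alone. The sketch below is a minimal illustration of the pooled-variance (equal variances assumed) t-test computed from summary statistics; the function name is hypothetical, and the p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only because the degrees of freedom are large (205).

```python
import math
from statistics import NormalDist

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t statistic assuming equal variances."""
    dof = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / dof
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, dof

# Values taken from the Group Statistics table (male vs. female, Mean_CS).
t_stat, dof = pooled_t(127, 3.1280, 0.92307, 80, 3.3188, 0.95630)

# Two-tailed p-value via a normal approximation (assumption: large df).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
print(f"t = {t_stat:.3f}, df = {dof}, approximate p = {p_approx:.3f}")
```

The approximate p-value lands close to the two-tailed significance of 0.155 reported above; the small gap comes from using the normal rather than the exact t distribution.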
2. One-way ANOVA:
The codes 1.0, 2.0, 3.0, 4.0, 5.0, and 6.0 denote the age groups, and N is the number of
respondents in each group; the 18-25 age group has 120 respondents.
Mean_CS
Levene Statistic  df1  df2  Sig.
.430              5    201  .828
Mean_CS
                Sum of Squares  df   Mean Square  F     Sig.
Between Groups  2.804           5    .561         .631  .676
Within Groups   178.588         201  .888
Total           181.392         206
Null hypothesis: There is no statistically significant difference in overall satisfaction
across age groups.
Alternate hypothesis: There is a statistically significant difference in overall
satisfaction across age groups.
Since the p-value is greater than alpha (0.676 > 0.05), we fail to reject the null
hypothesis: there is no statistically significant difference in overall satisfaction
across age groups.
There is therefore no need to evaluate the multiple comparisons table, because no pairwise
differences between the age groups need to be located once the null hypothesis is retained.
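The F ratio in the ANOVA table follows directly from the sums of squares and degrees of freedom. The short sketch below redoes that arithmetic with the reported values; the function name is illustrative, and eta squared (the share of total variance lying between groups) is an extra descriptive figure that the original output does not report.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and degrees of freedom from the Mean_CS ANOVA table.
ms_between, ms_within, f_ratio = anova_f(2.804, 5, 178.588, 201)

# Eta squared: between-groups share of the total sum of squares.
eta_squared = 2.804 / (2.804 + 178.588)

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_ratio:.3f}, eta squared = {eta_squared:.3f}")
```

An F ratio below 1 means the variation between age groups is smaller than the variation within them, which is consistent with the non-significant result.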
Multiple Comparisons
Dependent Variable: Mean_CS
Test  (I) Age  (J) Age  Mean Difference (I-J)  Std. Error  Sig.  95% CI Lower Bound  95% CI Upper Bound
Tukey HSD 1.0 2.0 -.04583 .31025 1.000 -.9385 .8469
3.0 .02426 .31924 1.000 -.8943 .9429
4.0 .42500 .62050 .983 -1.3604 2.2104
5.0 -.15833 .62050 1.000 -1.9438 1.6271
6.0 -.82500 .62050 .768 -2.6104 .9604
2.0 1.0 .04583 .31025 1.000 -.8469 .9385
3.0 .07010 .14307 .996 -.3416 .4818
4.0 .47083 .55097 .957 -1.1145 2.0562
5.0 -.11250 .55097 1.000 -1.6979 1.4729
6.0 -.77917 .55097 .718 -2.3645 .8062
3.0 1.0 -.02426 .31924 1.000 -.9429 .8943
2.0 -.07010 .14307 .996 -.4818 .3416
4.0 .40074 .55609 .979 -1.1994 2.0008
5.0 -.18260 .55609 .999 -1.7827 1.4175
6.0 -.84926 .55609 .647 -2.4494 .7508
4.0 1.0 -.42500 .62050 .983 -2.2104 1.3604
2.0 -.47083 .55097 .957 -2.0562 1.1145
3.0 -.40074 .55609 .979 -2.0008 1.1994
5.0 -.58333 .76963 .974 -2.7979 1.6312
6.0 -1.25000 .76963 .584 -3.4645 .9645
5.0 1.0 .15833 .62050 1.000 -1.6271 1.9438
2.0 .11250 .55097 1.000 -1.4729 1.6979
3.0 .18260 .55609 .999 -1.4175 1.7827
4.0 .58333 .76963 .974 -1.6312 2.7979
6.0 -.66667 .76963 .954 -2.8812 1.5479
6.0 1.0 .82500 .62050 .768 -.9604 2.6104
2.0 .77917 .55097 .718 -.8062 2.3645
3.0 .84926 .55609 .647 -.7508 2.4494
4.0 1.25000 .76963 .584 -.9645 3.4645
5.0 .66667 .76963 .954 -1.5479 2.8812
Games-Howell 1.0 2.0 -.04583 .37479 1.000 -1.3482 1.2566
3.0 .02426 .38310 1.000 -1.2849 1.3334
4.0 .42500 .44261 .920 -1.1198 1.9698
5.0 -.15833 .68824 1.000 -3.5421 3.2255
6.0 -.82500 .72749 .846 -4.5590 2.9090
2.0 1.0 .04583 .37479 1.000 -1.2566 1.3482
3.0 .07010 .14294 .996 -.3431 .4833
4.0 .47083 .26375 .584 -1.2832 2.2248
5.0 -.11250 .58936 1.000 -4.7790 4.5540
6.0 -.77917 .63474 .806 -5.8372 4.2789
3.0 1.0 -.02426 .38310 1.000 -1.3334 1.2849
2.0 -.07010 .14294 .996 -.4833 .3431
4.0 .40074 .27544 .710 -1.1871 1.9886
5.0 -.18260 .59468 .999 -4.7105 4.3453
6.0 -.84926 .63969 .767 -5.7758 4.0773
4.0 1.0 -.42500 .44261 .920 -1.9698 1.1198
2.0 -.47083 .26375 .584 -2.2248 1.2832
3.0 -.40074 .27544 .710 -1.9886 1.1871
5.0 -.58333 .63465 .917 -4.4865 3.3198
6.0 -1.25000 .67700 .557 -5.5398 3.0398
5.0 1.0 .15833 .68824 1.000 -3.2255 3.5421
2.0 .11250 .58936 1.000 -4.5540 4.7790
3.0 .18260 .59468 .999 -4.3453 4.7105
4.0 .58333 .63465 .917 -3.3198 4.4865
6.0 -.66667 .85797 .958 -4.7477 3.4144
6.0 1.0 .82500 .72749 .846 -2.9090 4.5590
2.0 .77917 .63474 .806 -4.2789 5.8372
3.0 .84926 .63969 .767 -4.0773 5.7758
4.0 1.25000 .67700 .557 -3.0398 5.5398
5.0 .66667 .85797 .958 -3.4144 4.7477
3. Two-way ANOVA
The between-subjects factors are Gender, with two levels, and Education Qualification, with
three levels; the table below shows the sample size at each level.
Between-Subjects Factors
                         Value  Label               N
Gender                   1.0    Male                127
                         2.0    Female              80
Education Qualification  2.0    class 12 or below   9
                         3.0    Bachelors degree    119
                         4.0    post grad or above  79
The next table shows the descriptive statistics, giving the mean for males and
females at each level of educational qualification.
Descriptive Statistics:
In the Tests of Between-Subjects Effects table, neither of the independent variables,
gender and education qualification, has a statistically significant effect on the
dependent variable, overall satisfaction.
Gender
Dependent Variable: Mean_CS
Gender  Mean   Std. Error  95% CI Lower Bound  95% CI Upper Bound
male    2.992  .151        2.693               3.290
female  3.398  .172        3.060               3.736
Next comes the grand mean of overall satisfaction, followed by the multiple
comparisons. These show no significant difference between any pair of education
qualification levels, as the p-value is greater than alpha in every comparison.
4. One-Way MANOVA
MANOVA, an extension of ANOVA, is used to analyse how two or more continuous dependent
variables (DVs) are affected by a categorical independent variable (IV). Whereas an ANOVA
has just one metric DV, a MANOVA has several.
Conditions and Assumptions required:
1) Independent observations
2) Normality of DVs
3) No outliers
4) DV is metric while IV is categorical
5) DVs must be correlated but not highly correlated
6) Linear relationship between the DVs within each group of the IV
Education level’s relation with scores in customer satisfaction and response quality
parameters
Tests of Normality
          Kolmogorov-Smirnov (a)    Shapiro-Wilk
          Statistic  df   Sig.      Statistic  df   Sig.
Mean_CS   .096       207  .000      .975       207  .001
Mean_R    .087       207  .001      .978       207  .002
a. Lilliefors Significance Correction
As the Shapiro-Wilk normality test and the Q-Q plot show, the p-values for both
parameters are less than 0.05, so the data do not follow a normal distribution. We
attribute this to the presence of outliers. The outliers need not be removed in this case,
because the study is based on a Likert scale, where such values are inevitable.
Correlations
                              Mean_CS  Mean_R
Mean_CS  Pearson Correlation  1        .720**
         Sig. (2-tailed)               .000
         N                    207      207
Mean_R   Pearson Correlation  .720**   1
         Sig. (2-tailed)      .000
         N                    207      207
**. Correlation is significant at the 0.01 level (2-tailed).
The test for correlation confirms that the DVs are positively correlated (0.720) but not
too highly correlated. This means that the DVs are not multicollinear. Apart from
normality, which is discussed above, the data satisfy the required conditions, so the
research can proceed to One-Way MANOVA for
hypothesis testing.
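The correlation check on the two DVs can be sketched as follows. This is a minimal illustration: the Pearson formula is standard, but the helper names, the toy data, and the 0.9 upper cutoff for "too highly correlated" are assumptions made for the example (the report does not state the cutoff it used).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def dvs_suitable_for_manova(r, upper=0.9):
    """DVs should be correlated, but below an assumed multicollinearity cutoff."""
    return 0 < abs(r) < upper

# Hypothetical scores for two DVs (not the study's data).
toy_cs = [3.0, 3.5, 2.5, 4.0, 3.25, 2.0]
toy_r = [3.3, 3.0, 2.7, 4.3, 3.7, 2.3]
print(f"toy r = {pearson_r(toy_cs, toy_r):.3f}")

# The correlation reported in the Correlations table.
print(dvs_suitable_for_manova(0.720))
```

With the reported r = 0.720 the check passes under this assumed cutoff, which matches the conclusion drawn in the text.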
Main Null Hypothesis 1: There is no significant difference between the mean values of the
linear combination of customer satisfaction and response quality parameters with respect to
educational qualifications.
Descriptive Statistics
         EducationQualification  Mean    Std. Deviation  N
Mean_CS  class 12 or below       3.0833  .89268          9
         bachelors degree        3.1534  .96940          119
         post grad or above      3.2880  .89969          79
         Total                   3.2017  .93837          207
Mean_R   class 12 or below       3.6296  .84071          9
         bachelors degree        3.2913  .81251          119
         post grad or above      3.4599  .72659          79
         Total                   3.3704  .78430          207
The descriptive statistics give the number of observations, mean, and standard deviation
for each variable. The table shows small differences between the group means; whether
these differences are statistically significant is tested
in the analysis that follows.
Multivariate Tests (a)
Effect                  Statistic           Value  F            Hypothesis df  Error df  Sig.  Partial Eta Squared
Intercept               Pillai's Trace      .868   665.693 (b)  2.000          203.000   .000  .868
Intercept               Wilks' Lambda       .132   665.693 (b)  2.000          203.000   .000  .868
Intercept               Hotelling's Trace   6.559  665.693 (b)  2.000          203.000   .000  .868
Intercept               Roy's Largest Root  6.559  665.693 (b)  2.000          203.000   .000  .868
EducationQualification  Pillai's Trace      .028   1.464        4.000          408.000   .212  .014
EducationQualification  Wilks' Lambda       .972   1.461 (b)    4.000          406.000   .213  .014
EducationQualification  Hotelling's Trace   .029   1.458        4.000          404.000   .214  .014
EducationQualification  Roy's Largest Root  .024   2.403 (c)    2.000          204.000   .093  .023
a. Design: Intercept + EducationQualification
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.
Testing main null hypothesis 1: There is no significant difference between the mean values
of the linear combination of customer satisfaction and response quality parameters with
respect to educational qualifications.
The multivariate tests table shows the result of the One-way MANOVA. The Wilks' Lambda row
under EducationQualification is the one to consider. The Wilks'
Lambda value of 0.972 with a p-value of 0.213 at the 5% level of significance means that
we fail to reject the null hypothesis. The partial eta squared value of 0.014 means that
about 1.4% of the variance in the combined DVs is explained by education qualification.
Thus, there is no significant difference between the mean values of the linear
combination of customer satisfaction and response quality parameters with respect to
educational qualifications.
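The partial eta squared shown beside Wilks' Lambda can be recovered from Lambda itself. The sketch below assumes the commonly used relation partial eta squared = 1 - Lambda^(1/s), where s is the smaller of the number of DVs and the effect's degrees of freedom; the function name is illustrative and not part of the original analysis. Here there are two DVs, and education qualification has three levels and therefore two degrees of freedom.

```python
def wilks_partial_eta_squared(wilks_lambda, n_dvs, df_effect):
    """Partial eta squared implied by Wilks' Lambda (assumed relation)."""
    s = min(n_dvs, df_effect)
    return 1 - wilks_lambda ** (1 / s)

# Education qualification effect from the one-way MANOVA: Lambda = .972.
education_effect = wilks_partial_eta_squared(0.972, n_dvs=2, df_effect=2)
print(f"partial eta squared = {education_effect:.3f}")
```

A Lambda close to 1 means the groups barely differ on the combined DVs, so the implied effect size is close to zero, in line with the .014 in the table.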
Tests of Between-Subjects Effects
Source                  Dependent Variable  Type III Sum of Squares  df   Mean Square  F         Sig.  Partial Eta Squared
Corrected Model         Mean_CS             .992 (a)                 2    .496         .561      .572  .005
Corrected Model         Mean_R              1.982 (b)                2    .991         1.621     .200  .016
Intercept               Mean_CS             686.370                  1    686.370      776.162   .000  .792
Intercept               Mean_R              815.314                  1    815.314      1333.431  .000  .867
EducationQualification  Mean_CS             .992                     2    .496         .561      .572  .005
EducationQualification  Mean_R              1.982                    2    .991         1.621     .200  .016
Error                   Mean_CS             180.400                  204  .884
Error                   Mean_R              124.734                  204  .611
Total                   Mean_CS             2303.313                 207
Total                   Mean_R              2478.111                 207
Corrected Total         Mean_CS             181.392                  206
Corrected Total         Mean_R              126.716                  206
a. R Squared = .005 (Adjusted R Squared = -.004)
b. R Squared = .016 (Adjusted R Squared = .006)
Mean_CS
Test             EducationQualification  N    Subset 1
Tukey HSD (a,b)  class 12 or below       9    3.0833
                 bachelors degree        119  3.1534
                 post grad or above      79   3.2880
                 Sig.                         .744
Scheffe (a,b)    class 12 or below       9    3.0833
                 bachelors degree        119  3.1534
                 post grad or above      79   3.2880
                 Sig.                         .765
Means for groups in homogeneous subsets are displayed.
Based on observed means.
The error term is Mean Square(Error) = .884.
a. Uses Harmonic Mean Sample Size = 22.698.
b. Alpha = .05.
Mean_R
Test             EducationQualification  N    Subset 1
Tukey HSD (a,b)  bachelors degree        119  3.2913
                 post grad or above      79   3.4599
                 class 12 or below       9    3.6296
                 Sig.                         .314
Scheffe (a,b)    bachelors degree        119  3.2913
                 post grad or above      79   3.4599
                 class 12 or below       9    3.6296
                 Sig.                         .348
Means for groups in homogeneous subsets are displayed.
Based on observed means.
The error term is Mean Square(Error) = .611.
a. Uses Harmonic Mean Sample Size = 22.698.
b. Alpha = .05.
Since there is no significant difference in the groups, all groups have been grouped in one
subset.
6. Two-way Multivariate Analysis of Variance (MANOVA)
Two-way MANOVA is an extension of One-way MANOVA that tests the effect of two
categorical independent variables on multiple dependent variables.
Conditions and assumptions required:
✓ DV is on metric scale and IV is categorical scale
✓ Normality of Data
✓ No multicollinearity
✓ No outliers
Since same DVs are taken as in One-way MANOVA testing, all the conditions are tested
previously and thus, normality is assumed and there is no multicollinearity.
Physical parameters and non-physical parameters tested on education qualification
and gender
Null Hypothesis 1: There is no significant difference between the mean of customer
satisfaction and response quality parameter for all gender-related groups
Null Hypothesis 2: There is no significant difference between the means of customer
satisfaction and response quality parameters for all education qualification levels
Null hypothesis 3: There is no significant difference between the means of customer
satisfaction and response quality parameters for the joint effect of all the gender groups and
education qualification levels.
Descriptive Statistics
         Education Qualification  Gender  Mean    Std. Deviation  N
Mean_CS  class 12 or below        male    2.7000  .54199          5
                                  female  3.5625  1.08733         4
                                  Total   3.0833  .89268          9
         bachelors degree         male    3.1646  .93016          79
                                  female  3.1313  1.05458         40
                                  Total   3.1534  .96940          119
         post grad or above       male    3.1105  .94696          43
                                  female  3.5000  .80178          36
                                  Total   3.2880  .89969          79
         Total                    male    3.1280  .92307          127
                                  female  3.3188  .95630          80
                                  Total   3.2017  .93837          207
Mean_R   class 12 or below        male    3.5333  .90062          5
                                  female  3.7500  .87665          4
                                  Total   3.6296  .84071          9
         bachelors degree         male    3.3460  .72686          79
                                  female  3.1833  .96062          40
                                  Total   3.2913  .81251          119
         post grad or above       male    3.3566  .76772          43
                                  female  3.5833  .66368          36
                                  Total   3.4599  .72659          79
         Total                    male    3.3570  .74201          127
                                  female  3.3917  .85153          80
                                  Total   3.3704  .78430          207
The descriptive statistics describe the variables through their means, numbers of
observations, and standard deviations. The mean values differ slightly, as seen in the
table, but statistical significance is tested in the tables that follow.
Box's M 17.570
F 1.044
df1 15
df2 1206.080
Sig. .407
Box's M test is used to check whether the covariance matrices of the dependent variables
are equal across the groups. Here, the sig. value (p-value) is 0.407; with an alpha of
0.001, we fail to reject the null hypothesis. Hence, it can be concluded that the
covariance matrices are equal across groups.
Multivariate Tests (a)
Effect                           Statistic           Value  F            Hypothesis df  Error df  Sig.  Partial Eta Squared
Intercept                        Pillai's Trace      .867   653.860 (b)  2.000          200.000   .000  .867
Intercept                        Wilks' Lambda       .133   653.860 (b)  2.000          200.000   .000  .867
Intercept                        Hotelling's Trace   6.539  653.860 (b)  2.000          200.000   .000  .867
Intercept                        Roy's Largest Root  6.539  653.860 (b)  2.000          200.000   .000  .867
EducationQualification           Pillai's Trace      .032   1.631        4.000          402.000   .166  .016
EducationQualification           Wilks' Lambda       .968   1.629 (b)    4.000          400.000   .166  .016
EducationQualification           Hotelling's Trace   .033   1.627        4.000          398.000   .167  .016
EducationQualification           Roy's Largest Root  .027   2.752 (c)    2.000          201.000   .066  .027
Gender                           Pillai's Trace      .022   2.223 (b)    2.000          200.000   .111  .022
Gender                           Wilks' Lambda       .978   2.223 (b)    2.000          200.000   .111  .022
Gender                           Hotelling's Trace   .022   2.223 (b)    2.000          200.000   .111  .022
Gender                           Roy's Largest Root  .022   2.223 (b)    2.000          200.000   .111  .022
EducationQualification * Gender  Pillai's Trace      .022   1.142        4.000          402.000   .336  .011
EducationQualification * Gender  Wilks' Lambda       .978   1.139 (b)    4.000          400.000   .338  .011
EducationQualification * Gender  Hotelling's Trace   .023   1.136        4.000          398.000   .339  .011
EducationQualification * Gender  Roy's Largest Root  .018   1.844 (c)    2.000          201.000   .161  .018
a. Design: Intercept + EducationQualification + Gender + EducationQualification * Gender
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.
This table is used to examine the null hypotheses of our research:
Null hypothesis 1: There is no significant difference between the means of the customer
satisfaction and response quality parameters across gender groups.
Inference: In the Gender rows of the table, the Wilks' lambda sig. value is 0.111 (p > 0.05).
Hence, we fail to reject the null hypothesis. It can be concluded that there is no significant
difference between the mean values of the customer satisfaction and response quality
parameters for the two gender groups.
Null hypothesis 2: There is no significant difference between the means of the customer
satisfaction and response quality parameters across education qualification levels.
Inference: In the EducationQualification rows of the table, the Wilks' lambda sig. (p) value
is 0.166 (p > 0.05). Hence, we fail to reject the null hypothesis. Thus, it can be concluded
that there is no significant difference between the means of the customer satisfaction and
response quality parameters across the education qualification levels.
Null hypothesis 3: There is no significant difference between the means of the customer
satisfaction and response quality parameters for the joint effect of gender groups and
education qualification levels.
Inference: In the EducationQualification * Gender rows, the Wilks' lambda p-value is
0.338 (p > 0.05), so we fail to reject the null hypothesis. Thus, it can be concluded that
there is no significant difference between the mean values of the customer satisfaction and
response quality parameters for the joint effect of gender groups and education
qualification levels.
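As a cross-check on the Multivariate Tests table, the partial eta squared values in the Wilks' lambda rows can be reproduced from lambda itself. The sketch below assumes the usual relation partial eta squared = 1 - lambda^(1/s), where s is the smaller of the number of dependent variables (2 here) and the degrees of freedom of the effect. The effect degrees of freedom (1 for Intercept and Gender, 2 for the three-level education factor and the interaction) and all function and variable names are assumptions made for illustration, not part of the SPSS output.

```python
ALPHA = 0.05
N_DEPENDENT_VARIABLES = 2  # customer satisfaction and response quality

# Wilks' lambda and Sig. values are read from the Multivariate Tests table.
# "effect_df" is an assumption about the design, not a value printed in the table.
effects = {
    "Intercept": {"wilks": 0.133, "sig": 0.000, "effect_df": 1},
    "EducationQualification": {"wilks": 0.968, "sig": 0.166, "effect_df": 2},
    "Gender": {"wilks": 0.978, "sig": 0.111, "effect_df": 1},
    "EducationQualification * Gender": {"wilks": 0.978, "sig": 0.338, "effect_df": 2},
}


def partial_eta_squared(wilks_lambda, effect_df, n_dvs=N_DEPENDENT_VARIABLES):
    """Partial eta squared implied by Wilks' lambda: 1 - lambda ** (1 / s)."""
    s = min(n_dvs, effect_df)
    return 1.0 - wilks_lambda ** (1.0 / s)


def decision(sig, alpha=ALPHA):
    """Apply the decision rule used in the inferences above."""
    return "reject H0" if sig < alpha else "fail to reject H0"


for name, row in effects.items():
    eta = partial_eta_squared(row["wilks"], row["effect_df"])
    print(f"{name}: partial eta squared = {eta:.3f}, {decision(row['sig'])}")
```

Under these assumptions the computed values agree with the Partial Eta Squared column, and the decision rule returns "fail to reject H0" for the three hypotheses tested above.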
Tests the null hypothesis that the error variance of the dependent variable is equal across
groups.
a. Design: Intercept + EducationQualification + Gender + EducationQualification * Gender
With Levene's test, we can check whether the DVs have homogeneous error variances. The
p-value for the customer satisfaction parameter is 0.477 and for the response quality
parameter is 0.290, both of which are greater than the alpha of 0.05. Hence, we fail to reject
the null hypothesis and conclude that both DVs have equal error variances across groups.
The above table examines each DV individually against each IV. Since all the p-values are
greater than 0.05, it can be concluded that there is no significant difference in the means of
either DV with respect to either IV, or with respect to the joint effect of both IVs.
Multiple Comparisons
Dependent  Test       (I) Education       (J) Education       Mean Difference  Std.    Sig.  95% Confidence Interval
Variable              Qualification       Qualification       (I-J)            Error         Lower Bound  Upper Bound
Mean_CS    Tukey HSD  class 12 or below   bachelors degree    -.0700           .32326  .974  -.8333       .6933
                      class 12 or below   post grad or above  -.2046           .32896  .808  -.9814       .5721
                      bachelors degree    class 12 or below   .0700            .32326  .974  -.6933       .8333
                      bachelors degree    post grad or above  -.1346           .13570  .583  -.4550       .1858
                      post grad or above  class 12 or below   .2046            .32896  .808  -.5721       .9814
                      post grad or above  bachelors degree    .1346            .13570  .583  -.1858       .4550
           Scheffe    class 12 or below   bachelors degree    -.0700           .32326  .977  -.8672       .7272
                      class 12 or below   post grad or above  -.2046           .32896  .824  -1.0159      .6066
                      bachelors degree    class 12 or below   .0700            .32326  .977  -.7272       .8672
                      bachelors degree    post grad or above  -.1346           .13570  .612  -.4693       .2000
                      post grad or above  class 12 or below   .2046            .32896  .824  -.6066       1.0159
                      post grad or above  bachelors degree    .1346            .13570  .612  -.2000       .4693
Mean_R     Tukey HSD  class 12 or below   bachelors degree    .3383            .27035  .424  -.3000       .9767
                      class 12 or below   post grad or above  .1697            .27512  .811  -.4799       .8193
                      bachelors degree    class 12 or below   -.3383           .27035  .424  -.9767       .3000
                      bachelors degree    post grad or above  -.1686           .11349  .300  -.4366       .0994
                      post grad or above  class 12 or below   -.1697           .27512  .811  -.8193       .4799
                      post grad or above  bachelors degree    .1686            .11349  .300  -.0994       .4366
           Scheffe    class 12 or below   bachelors degree    .3383            .27035  .458  -.3284       1.0050
                      class 12 or below   post grad or above  .1697            .27512  .827  -.5088       .8482
                      bachelors degree    class 12 or below   -.3383           .27035  .458  -1.0050      .3284
                      bachelors degree    post grad or above  -.1686           .11349  .334  -.4485       .1113
                      post grad or above  class 12 or below   -.1697           .27512  .827  -.8482       .5088
                      post grad or above  bachelors degree    .1686            .11349  .334  -.1113       .4485
Based on observed means.
The error term is Mean Square(Error) = .612.
The above post hoc results also confirm that there is no significant difference between the
group means, as all p-values are greater than 0.05.
Descriptive Statistics
Mean Std. Deviation N
Mean_R 3.3704 .78430 207
Mean_S 2.9243 .87280 207
Correlations
                             Mean_R  Mean_S
Pearson Correlation  Mean_R  1.000   .420
                     Mean_S  .420    1.000
Sig. (1-tailed)      Mean_R  .       .000
                     Mean_S  .000    .
N                    Mean_R  207     207
                     Mean_S  207     207
The correlation between the variables is 0.420 which signifies a positive relationship
between variables.
Model Summary (b)
Model  R         R Square  Adjusted R Square  Std. Error of the Estimate  Durbin-Watson
1      .420 (a)  .177      .173               .71345                      2.021
a. Predictors: (Constant), Mean_S
b. Dependent Variable: Mean_R
The model summary shows the R and R-squared values. The R-value of (0.420) shows that
there is a positive correlation between the two variables. R-square indicates how much of
the total variation in response quality factor is explained by the security factor. The R-
square value of 0.177 means that 17.7% of the variation in response quality can be
explained by security parameters.
ANOVAa
Model Sum of df Mean Square F Sig.
Squares
Regression 22.369 1 22.369 43.945 .000b
1 Residual 104.347 205 .509
Total 126.716 206
a. Dependent Variable: Mean_R
b. Predictors: (Constant), Mean_S
The ANOVA table shows whether the regression equation predicts the dependent variable
significantly. The sig. (p) value of 0.000 (p < 0.05) indicates that the regression model is
statistically significant.
Coefficients (a)
Model          Unstandardized Coefficients  Standardized Coefficients  t       Sig.
               B       Std. Error           Beta
1  (Constant)  2.266   .174                                            13.042  .000
   Mean_S      .378    .057                 .420                       6.629   .000
a. Dependent Variable: Mean_R
The coefficients table provides the information needed to build the prediction model for
response quality from the security parameter and to examine whether the security parameter
contributes significantly to the model. The p-value of the security parameter (0.000) is
less than 0.05; thus, we can say that the security parameter significantly affects response
quality. The unstandardised B-value of .378 means that for every one-unit change in the
security parameter, response quality changes by .378. Collinearity statistics are not required
since only one IV is used.
Regression equation: Response quality (Mean_R) = 2.266 + 0.378 * Security (Mean_S)
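The fitted line can be applied directly to a security score. The following is a minimal sketch that uses only the B and Std. Error values from the Coefficients table (dependent variable Mean_R, predictor Mean_S); the function name and constants are illustrative. It also recomputes the t statistic as B divided by its standard error, which differs slightly from the printed value because the table entries are rounded.

```python
INTERCEPT = 2.266      # (Constant) row, column B
SLOPE = 0.378          # Mean_S row, column B
SLOPE_STD_ERROR = 0.057


def predict_mean_r(mean_s):
    """Predicted response quality score (Mean_R) for a given security score (Mean_S)."""
    return INTERCEPT + SLOPE * mean_s


# A least-squares line passes through the point of means, so plugging in the
# sample mean of Mean_S (2.9243) should land close to the sample mean of Mean_R (3.3704).
prediction_at_mean = predict_mean_r(2.9243)
t_statistic = SLOPE / SLOPE_STD_ERROR

print(f"Predicted Mean_R at Mean_S = 2.9243: {prediction_at_mean:.3f}")
print(f"t for Mean_S from rounded values: {t_statistic:.2f}")
```

The prediction at the mean of Mean_S comes out near the reported mean of Mean_R, which is a quick consistency check on the coefficients.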
7. Discriminant Analysis:
Assumptions:
1. Linearity and Normality of the data
2. Equal variance amongst groups
3. No multicollinearity
4. Group membership must be mutually exclusive
5. Independence of observations
Both missing or out-of-range group codes and at least one missing discriminating variable 0 .0
Total 0 .0
Total 207 100.0
The above table provides information on the number of observations and any missing data.
All 207 observations that we collected are included in the analysis. There are no missing or
out-of-range group codes in the data collected.
Group Statistics
Gender  Variable               Mean    Std. Deviation  Valid N (listwise)
                                                       Unweighted  Weighted
1.0     Comfortability         2.9239  .83274          127         127.000
        Ease_of_Communication  3.4199  .75734          127         127.000
        Avatar_Preference      3.6824  .74282          127         127.000
        Past_Experience        3.1280  .92307          127         127.000
2.0     Comfortability         2.9250  .93829          80          80.000
        Ease_of_Communication  3.4333  .89285          80          80.000
        Avatar_Preference      3.6583  .82502          80          80.000
        Past_Experience        3.3188  .95630          80          80.000
Total   Comfortability         2.9243  .87280          207         207.000
        Ease_of_Communication  3.4251  .81030          207         207.000
        Avatar_Preference      3.6731  .77373          207         207.000
        Past_Experience        3.2017  .93837          207         207.000
The above table provides the descriptive details of the data: mean, standard deviation and
number of observations. The group statistics examine the differences between the two gender
groups on the four parameters. In this output, the group means are very close for
Comfortability, Ease_of_Communication and Avatar_Preference; only Past_Experience shows
a visible gap (3.1280 versus 3.3188). Differences of this kind are what allow the predictors
to distinguish observations between the two groups. This table shows how far the groups
vary; however, statistical significance will be tested in further analysis. For the weighted
values, the default weight is 1 for each observation, so the weighted and unweighted counts
are equal.
The tests of equality of group means measure each independent variable's potential before
the model is created. Each test displays the result of a one-way ANOVA for an independent
variable using the grouping variable as the factor. If the significance value is greater than
0.10, the variable probably does not contribute to the model.
The above table shows the correlation between the predictor variables.
Box's M      9.479
F   Approx.  .926
    df1      10
    df2      132167.997
    Sig.     .507
Tests null hypothesis of equal population covariance matrices.
Box's M tests the assumption of equality of covariance matrices across groups. Log
determinants are a measure of the variability of the groups; larger log determinants
correspond to more variable groups. The rank column represents the number of independent
variables in the study, which has 4 IVs. Since the sig. (p) value of .507 is greater than 0.05,
we fail to reject the null hypothesis of equal population covariance matrices. The groups do
not differ in their covariance matrices.
Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation
1         .014 (a)    100.0          100.0         .117
a. First 1 canonical discriminant functions were used in the analysis.
The eigenvalues table provides information about the relative efficacy of each discriminant
function. The larger the eigenvalue, the more variance in the dependent variable the function
explains; it is a measure of goodness of fit. There are two categories in the dependent
variable, so there is one discriminant function. The canonical correlation in the table is
0.117. Squaring this value gives the proportion of variance (approximately .014) shared
between the discriminant function and the dependent variable groups. Expressed as a
percentage, about 1.4% of the variance between the two groups of the dependent variable
gender is explained. This value gives the magnitude of the discrimination between the
groups, which here is very low.
Wilks' Lambda
Test of Function(s)  Wilks' Lambda  Chi-square  df  Sig.
1                    .986           2.784       4   .595
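Wilks' lambda is a measure of how well each function separates cases into groups. Smaller
values of Wilks' lambda indicate greater discriminatory ability of the function. Here, Wilks'
lambda is .986 with a sig. value of .595 (p > 0.05), so the function does not discriminate
significantly between the two gender groups.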
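For a single discriminant function, the eigenvalue, the canonical correlation and Wilks' lambda are tied together by standard identities, which gives a quick consistency check across the two tables. The sketch below starts from the rounded eigenvalue of .014; the variable names are illustrative, and small differences from the printed values are expected because the inputs are rounded to three decimals.

```python
import math

eigenvalue = 0.014  # Eigenvalues table, function 1 (rounded)

# With one discriminant function:
#   canonical correlation = sqrt(eigenvalue / (1 + eigenvalue))
#   Wilks' lambda         = 1 / (1 + eigenvalue)
canonical_correlation = math.sqrt(eigenvalue / (1.0 + eigenvalue))
wilks_lambda = 1.0 / (1.0 + eigenvalue)
explained_share = canonical_correlation ** 2

print(f"Canonical correlation: {canonical_correlation:.4f}")
print(f"Wilks' lambda: {wilks_lambda:.3f}")
print(f"Share of variance explained: {explained_share:.1%}")
```

Both derived values sit close to the reported .117 and .986, and the squared canonical correlation confirms that the function accounts for only a little over one percent of the variance.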
D (Gender) = -1.129 + 1.245 * Past_Experience - .193 * Avatar_Preference - .105 *
Ease_of_Communication - .613 * Comfortability
This table provides the cutting points for classifying cases. Any score below -.093 is
classified as male and any score above .147 is classified as female. For scores between these
two values, take the average of the two values: anything above the average is classified as
female and anything below it as male (this applies only when the group sizes are equal). If
the group sizes are unequal, a Z cutoff score is used for discrimination.
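The two cutoff rules described above can be sketched as follows. This is an illustration under stated assumptions: the values -.093 and .147 are treated as the group centroids for males and females, the group sizes (127 and 80) are taken from the classification table, and the weighted formula is the one commonly given in multivariate analysis textbooks for unequal groups, in which each centroid is weighted by the size of the other group. The function names are hypothetical.

```python
MALE_CENTROID, MALE_N = -0.093, 127
FEMALE_CENTROID, FEMALE_N = 0.147, 80


def equal_size_cutoff(z_a, z_b):
    """Midpoint of the two centroids; appropriate only for equal group sizes."""
    return (z_a + z_b) / 2.0


def weighted_cutoff(z_a, n_a, z_b, n_b):
    """Z cutoff for unequal groups: each centroid is weighted by the other group's size."""
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)


def classify(score, cutoff):
    """Scores below the cutoff fall on the male side, the rest on the female side."""
    return "male" if score < cutoff else "female"


midpoint = equal_size_cutoff(MALE_CENTROID, FEMALE_CENTROID)
z_cutoff = weighted_cutoff(MALE_CENTROID, MALE_N, FEMALE_CENTROID, FEMALE_N)

print(f"Equal-size cutoff: {midpoint:.3f}")
print(f"Weighted Z cutoff: {z_cutoff:.3f}")
print(classify(0.04, midpoint), classify(0.04, z_cutoff))
```

Because the male group is larger, the weighted cutoff shifts towards the female centroid, so a borderline score such as 0.04 is assigned differently by the two rules.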
Classification Statistics
Classification Results (a, c)
                             Gender  Predicted Group Membership  Total
                                     1.0      2.0
Original             Count   1.0     63       64                 127
                             2.0     36       44                 80
                     %       1.0     49.6     50.4               100.0
                             2.0     45.0     55.0               100.0
Cross-validated (b)  Count   1.0     62       65                 127
                             2.0     42       38                 80
                     %       1.0     48.8     51.2               100.0
                             2.0     52.5     47.5               100.0
a. 51.7% of original grouped cases were correctly classified.
b. Cross validation is done only for those cases in the analysis. In cross-validation,
each case is classified by the functions derived from all cases other than that case.
c. 48.3% of cross-validated grouped cases were correctly classified.
The accuracy of the model can be checked using the above classification table, which shows
how well the discriminant function works. Of the respondents originally in the male group
(1.0), 49.6% are correctly classified as male by the function, and of those originally in the
female group (2.0), 55.0% are correctly classified as female. The overall hit ratio of about
52% (51.7%) means that only around half of the cases are properly classified, and under
cross-validation the figure falls to 48.3%. Hence it can be concluded that the discriminant
function discriminates between the gender groups at a level of only about 52%.
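The hit ratios in the table footnotes can be recomputed from the counts. The sketch below also computes the proportional chance criterion, a benchmark that is not part of the original analysis and is added here only as a reference point: it is the accuracy expected from assigning cases at random in proportion to the group sizes. The dictionary layout and function names are illustrative.

```python
# Rows are actual groups, inner keys are predicted groups (counts from the table).
original = {"1.0": {"1.0": 63, "2.0": 64}, "2.0": {"1.0": 36, "2.0": 44}}
cross_validated = {"1.0": {"1.0": 62, "2.0": 65}, "2.0": {"1.0": 42, "2.0": 38}}


def total_cases(table):
    return sum(sum(row.values()) for row in table.values())


def hit_ratio(table):
    """Share of cases whose predicted group equals their actual group."""
    correct = sum(table[group][group] for group in table)
    return correct / total_cases(table)


def proportional_chance(table):
    """Sum of squared group proportions (added benchmark, not in the SPSS output)."""
    n = total_cases(table)
    return sum((sum(row.values()) / n) ** 2 for row in table.values())


print(f"Original hit ratio: {hit_ratio(original):.1%}")
print(f"Cross-validated hit ratio: {hit_ratio(cross_validated):.1%}")
print(f"Proportional chance criterion: {proportional_chance(original):.1%}")
```

The two hit ratios match footnotes a and c of the classification table, and comparing them with the chance benchmark gives a sense of how little the function adds over random assignment.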
9. Multiple Regression
Multiple regression is a statistical technique for examining the relationship between a single
dependent variable and a number of independent variables. The goal of multiple regression
analysis is to use known independent variables to predict the value of a single dependent
variable.
In our model the dependent variable is Customer Experience (CS1) and the independent
variables are Precision of Responses (R2), Effectiveness of addressing queries (CS3) and
Openness in sharing personal information (S1).
Descriptive Statistics
     Mean   Std. Deviation  N
CS1  3.261  1.0792          207
R2   3.068  1.1680          207
CS3  3.164  1.0937          207
S1   2.758  1.1233          207
The above table shows the mean and standard deviation of all the variables in the model.
Correlations
                          CS1    R2     CS3    S1
Pearson Correlation  CS1  1.000  .506   .597   .272
                     R2   .506   1.000  .531   .153
                     CS3  .597   .531   1.000  .206
                     S1   .272   .153   .206   1.000
All pairwise Pearson correlations are below 0.7, so there is no sign of problematic
multicollinearity between the independent variables. The dependent variable CS1 correlates
moderately with R2 (.506) and CS3 (.597) and weakly with S1 (.272), which makes these
variables suitable for the model.
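The 0.7 screening rule used above can be written as a short check over the correlations between the independent variables. This is a minimal sketch of that rule only; the threshold comes from the text, the correlations from the table, and the names are illustrative.

```python
THRESHOLD = 0.7

# Pairwise Pearson correlations between the independent variables (from the table).
predictor_correlations = {
    ("R2", "CS3"): 0.531,
    ("R2", "S1"): 0.153,
    ("CS3", "S1"): 0.206,
}

# Any pair at or above the threshold would be flagged as a collinearity concern.
flagged_pairs = {
    pair: r for pair, r in predictor_correlations.items() if abs(r) >= THRESHOLD
}
strongest_pair = max(predictor_correlations, key=lambda pair: abs(predictor_correlations[pair]))

print("Flagged pairs:", flagged_pairs)
print("Strongest predictor correlation:", strongest_pair, predictor_correlations[strongest_pair])
```

No pair is flagged; the strongest correlation between predictors is the one between R2 and CS3.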
Model Summaryb
The multiple correlation coefficient obtained is 0.653. The model has an R square value of 0.426,
which suggests that 42.6% of the variance in the dependent variable is explained by the
independent variables. Explaining 42.6% of the variance can be considered a good result.
ANOVAa
Model Sum of df Mean F Sig.
Squares Square
Regression 102.181 3 34.060 50.201 .000b
1 Residual 137.732 203 .678
Total 239.913 206
a. Dependent Variable: CS1
b. Predictors: (Constant), S1, R2, CS3
The ANOVA results give an F ratio and significance value to test the overall fit of the
regression model. A significance value of less than 0.05 enables us to reject the null
hypothesis, which assumes that all the regression coefficients are 0. Hence, we can say that at
least one of the coefficients is not equal to 0 at a confidence level of 95%.
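The model summary figures quoted earlier (R = 0.653, R square = 0.426) and the F ratio can be recovered from the sums of squares in the ANOVA table. The sketch below uses the standard definitions; the variable names are illustrative, and the adjusted R square it prints is a derived value, not one quoted in the text.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 102.181, 3
ss_residual, df_residual = 137.732, 203
ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)
adjusted_r_squared = 1.0 - (ss_residual / df_residual) / (ss_total / df_total)
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R = {multiple_r:.3f}, R square = {r_squared:.3f}")
print(f"Adjusted R square = {adjusted_r_squared:.3f}")
print(f"F = {f_ratio:.2f}")
```

The recomputed R, R square and F agree with the reported values to the precision shown in the tables.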
Coefficientsa
The unstandardised coefficients (B) in the above table show how much the dependent variable
varies with each independent variable when the rest of the variables are held constant.
All the independent variables are statistically significant (have a sig. value of < 0.05). On
the basis of this value, we can conclude that the coefficients are significantly different
from 0. The constant obtained in this regression model is 0.808.
Therefore, our regression equation can be written down as:
Cluster Analysis
Cluster analysis is one of the most fundamental, simple and frequently used methods for
grouping objects into similar groups based on their characteristics. The procedure employs
a variety of algorithms and methods to create clusters of a similar type. It is also used in
statistical analysis as part of data management. We try to group objects that have similar
attributes; these groups are referred to as clusters. Since it is relatively difficult to learn the
properties of each individual object or participant, we instead categorise them into simpler
groups of similar objects that share a common structure of properties.
Hierarchical Clustering
Agglomeration Schedule
Stage  Cluster Combined      Coefficients  Stage Cluster First Appears  Next Stage
       Cluster 1  Cluster 2                Cluster 1  Cluster 2
1 183 206 .000 0 0 15
2 200 205 .000 0 0 5
3 181 202 .000 0 0 17
4 197 201 .000 0 0 7
5 6 200 .000 0 2 34
6 162 198 .000 0 0 34
7 3 197 .000 0 4 38
8 180 192 .000 0 0 18
9 170 189 .000 0 0 157
10 147 188 .000 0 0 49
11 171 187 .000 0 0 26
12 184 186 .000 0 0 14
13 174 185 .000 0 0 23
14 78 184 .000 0 12 104
15 11 183 .000 0 1 53
16 143 182 .000 0 0 53
17 15 181 .000 0 3 29
18 23 180 .000 0 8 48
19 172 179 .000 0 0 25
20 158 178 .000 0 0 38
21 139 176 .000 0 0 57
22 167 175 .000 0 0 29
23 36 174 .000 0 13 32
24 85 173 .000 0 0 104
25 31 172 .000 0 19 43
26 50 171 .000 0 11 41
27 156 169 .000 0 0 40
28 164 168 .000 0 0 32
29 15 167 .000 17 22 85
30 155 166 .000 0 0 41
31 148 165 .000 0 0 48
32 36 164 .000 23 28 47
33 113 163 .000 0 0 81
34 6 162 .000 5 6 75
35 117 161 .000 0 0 78
36 25 160 .000 0 0 179
37 125 159 .000 0 0 70
38 3 158 .000 7 20 137
39 129 157 .000 0 0 67
40 26 156 .000 0 27 118
41 50 155 .000 26 30 63
42 149 154 .000 0 0 47
43 31 153 .000 25 0 172
44 79 152 .000 0 0 107
45 145 151 .000 0 0 51
46 133 150 .000 0 0 63
47 36 149 .000 32 42 108
48 23 148 .000 18 31 142
49 2 147 .000 0 10 71
50 120 146 .000 0 0 75
51 13 145 .000 0 45 93
52 119 144 .000 0 0 76
53 11 143 .000 15 16 58
54 5 142 .000 0 0 156
55 138 141 .000 0 0 58
56 136 140 .000 0 0 60
57 68 139 .000 0 21 167
58 11 138 .000 53 55 66
59 60 137 .000 0 0 121
60 16 136 .000 0 56 91
61 57 135 .000 0 0 175
62 124 134 .000 0 0 71
63 50 133 .000 41 46 125
64 54 132 .000 0 0 168
65 130 131 .000 0 0 66
66 11 130 .000 58 65 106
67 4 129 .000 0 39 73
68 122 128 .000 0 0 73
69 76 127 .000 0 0 108
70 9 125 .000 0 37 92
71 2 124 .000 49 62 95
72 102 123 .000 0 0 91
73 4 122 .000 67 68 83
74 95 121 .000 0 0 97
75 6 120 .000 34 50 101
76 39 119 .000 0 52 123
77 101 118 .000 0 0 92
78 98 117 .000 0 35 159
79 111 115 .000 0 0 83
80 22 114 .000 0 0 172
81 28 113 .000 0 33 160
82 97 112 .000 0 0 95
83 4 111 .000 73 79 120
84 108 109 .000 0 0 85
85 15 108 .000 29 84 138
86 61 107 .000 0 0 120
87 100 106 .000 0 0 93
88 48 105 .000 0 0 177
89 80 104 .000 0 0 106
90 46 103 .000 0 0 159
91 16 102 .000 60 72 119
92 9 101 .000 70 77 126
93 13 100 .000 51 87 141
94 63 99 .000 0 0 118
95 2 97 .000 71 82 116
96 91 96 .000 0 0 101
97 1 95 .000 0 74 109
98 65 94 .000 0 0 116
99 64 93 .000 0 0 117
100 87 92 .000 0 0 161
101 6 91 .000 75 96 145
102 37 88 .000 0 0 161
103 75 86 .000 0 0 109
104 78 85 .000 14 24 164
105 49 82 .000 0 0 166
106 11 80 .000 66 89 144
107 7 79 .000 0 44 162
108 36 76 .000 47 69 171
109 1 75 .000 97 103 143
110 58 72 .000 0 0 123
111 55 71 .000 0 0 125
112 21 70 .000 0 0 144
113 35 69 .000 0 0 137
114 53 67 .000 0 0 126
115 62 66 .000 0 0 119
116 2 65 .000 95 98 139
117 27 64 .000 0 99 129
118 26 63 .000 40 94 167
119 16 62 .000 91 115 124
120 4 61 .000 83 86 130
121 14 60 .000 0 59 136
122 29 59 .000 0 0 142
123 39 58 .000 76 110 170
124 16 56 .000 119 0 168
125 50 55 .000 63 111 128
126 9 53 .000 92 114 127
127 9 52 .000 126 0 164
128 50 51 .000 125 0 169
129 27 47 .000 117 0 171
130 4 45 .000 120 0 170
131 33 44 .000 0 0 139
132 38 43 .000 0 0 136
133 20 42 .000 0 0 145
134 34 41 .000 0 0 138
135 30 40 .000 0 0 141
136 14 38 .000 121 132 165
137 3 35 .000 38 113 140
138 15 34 .000 85 134 152
139 2 33 .000 116 131 148
140 3 32 .000 137 0 155
141 13 30 .000 93 135 146
142 23 29 .000 48 122 173
143 1 24 .000 109 0 157
144 11 21 .000 106 112 151
145 6 20 .000 101 133 173
146 13 18 .000 141 0 169
147 12 17 .000 0 0 148
148 2 12 .000 139 147 150
149 8 10 .000 0 0 150
150 2 8 .000 148 149 174
151 11 207 1.000 144 0 174
152 15 203 1.000 138 0 175
153 190 199 1.000 0 0 199
154 110 195 1.000 0 0 190
155 3 193 1.000 140 0 176
156 5 177 1.000 54 0 191
157 1 170 1.000 143 9 182
158 77 126 1.000 0 0 192
159 46 98 1.000 90 78 182
160 28 90 1.000 81 0 178
161 37 87 1.000 102 100 190
162 7 84 1.000 107 0 183
163 81 83 1.000 0 0 201
164 9 78 1.000 127 104 187
165 14 74 1.000 136 0 177
166 49 73 1.000 105 0 185
167 26 68 1.000 118 57 181
168 16 54 1.000 124 64 180
169 13 50 1.000 146 128 184
170 4 39 1.000 130 123 181
171 27 36 1.000 129 108 187
172 22 31 1.000 80 43 179
173 6 23 1.000 145 142 180
174 2 11 1.071 150 151 186
175 15 57 1.100 152 61 184
176 3 89 1.111 155 0 183
177 14 48 1.167 165 88 195
178 19 28 1.250 0 160 191
179 22 25 1.333 172 36 185
180 6 16 1.433 173 168 186
181 4 26 1.458 170 167 193
182 1 46 1.650 157 159 192
183 3 7 1.650 176 162 194
184 13 15 1.769 169 175 194
185 22 49 1.833 179 166 197
186 2 6 1.873 174 180 196
187 9 27 1.923 164 171 195
188 116 196 2.000 0 0 201
189 191 194 2.000 0 0 199
190 37 110 2.000 161 154 202
191 5 19 2.067 156 178 193
192 1 77 2.192 182 158 200
193 4 5 2.299 181 191 197
194 3 13 2.300 183 184 196
195 9 14 2.433 187 177 198
196 2 3 2.555 186 194 198
197 4 22 2.604 193 185 203
198 2 9 2.833 196 195 200
199 190 191 3.000 153 189 204
200 1 2 3.874 192 198 202
201 81 116 4.000 163 188 204
202 1 37 5.631 200 190 203
203 1 4 6.478 202 197 205
204 81 190 7.500 201 199 206
205 1 204 9.020 203 0 206
206 1 81 12.705 205 204 0
The Agglomeration Schedule shows the stepwise manner in which the clustering process
is carried out. It shows which clusters are combined at every stage and the coefficient (total
error) resulting from the solution. To identify the optimal number of clusters, we look for a
significant jump in the coefficient, which indicates that two dissimilar clusters have been
combined.
A pronounced jump in the coefficient is seen between stage 201 (4.000) and stage 202
(5.631). Therefore, the optimal number of clusters that can be formed is 207 - 202 = 5.
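The jump-reading step can be made explicit by differencing the coefficients of the final stages. The sketch below copies the coefficients for stages 198 to 206 from the schedule and applies the rule used in the text (number of cases minus stage number); the function names are illustrative. Reading the elbow remains a judgement call, since the very last merges also show large increases.

```python
N_CASES = 207

# Coefficients for the last stages of the agglomeration schedule.
last_stages = {
    198: 2.833, 199: 3.000, 200: 3.874, 201: 4.000, 202: 5.631,
    203: 6.478, 204: 7.500, 205: 9.020, 206: 12.705,
}


def coefficient_jumps(schedule):
    """Increase in the coefficient on entering each stage, keyed by that stage."""
    stages = sorted(schedule)
    return {
        later: round(schedule[later] - schedule[earlier], 3)
        for earlier, later in zip(stages, stages[1:])
    }


def clusters_for_stage(stage, n_cases=N_CASES):
    """Rule used in the text: number of cases minus the stage number."""
    return n_cases - stage


jumps = coefficient_jumps(last_stages)
for stage, jump in jumps.items():
    print(f"stage {stage}: jump = {jump:.3f}, clusters by the rule = {clusters_for_stage(stage)}")
```

The entry for stage 202 shows the increase from 4.000 to 5.631 discussed above, and the rule returns 5 clusters for that stage.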
The Dendrogram is used to group the objects together and create clusters. The graph is
read from left to right and is used to visually identify the major clusters present in the data
set.
K-Means Clustering
K-Means clustering analysis is used to confirm the number of clusters formed.
Here we choose the same 4 parameters as used in hierarchical clustering. The parameters
on the basis of which the clusters are formed are Customer Experience, Effectiveness of
addressing queries, Precision of Solution, and Response Speed of the chatbot. The number
of clusters for which the confirmatory test is run is 5.
CS1: Customer Experience
CS3: Effectiveness of addressing queries
R2: Precision of Solution
R3: Response Speed
Cluster
1 2 3 4 5
CS1 3.3 3.1 3.7 1.8 4.2
CS3 2.8 2.8 4.0 1.8 4.4
R2 3.6 2.5 2.3 1.8 4.4
R3 3.9 2.7 4.2 4.0 4.3
The above table shows the final cluster centres of the solution obtained in the confirmatory
analysis. Each centre is computed as the mean of each variable within the individual
cluster.
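Once the final centres are known, K-means places a case in the cluster whose centre is nearest. The sketch below illustrates that assignment step with the centres from the table and squared Euclidean distance; the two example respondents are hypothetical and are not taken from the survey data.

```python
# Final cluster centres, in the order (CS1, CS3, R2, R3), from the table above.
centres = {
    1: (3.3, 2.8, 3.6, 3.9),
    2: (3.1, 2.8, 2.5, 2.7),
    3: (3.7, 4.0, 2.3, 4.2),
    4: (1.8, 1.8, 1.8, 4.0),
    5: (4.2, 4.4, 4.4, 4.3),
}


def squared_distance(point, centre):
    return sum((p - c) ** 2 for p, c in zip(point, centre))


def nearest_cluster(point):
    """Return the label of the cluster whose centre is closest to the point."""
    return min(centres, key=lambda label: squared_distance(point, centres[label]))


# Hypothetical respondents: one rating everything 4, one rating only speed highly.
satisfied_respondent = (4, 4, 4, 4)
speed_only_respondent = (2, 2, 2, 4)

print(nearest_cluster(satisfied_respondent))
print(nearest_cluster(speed_only_respondent))
```

The first respondent falls in cluster 5, whose centre is high on all four variables, and the second in cluster 4, which is low on everything except response speed.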
ANOVA
Number of Cases in each Cluster
Cluster  1  48.000
         2  38.000
         3  32.000
         4  39.000
         5  50.000
Valid       207.000
Missing     .000
The above table shows the number of cases that belong to each cluster. Nearly 24% of
the sample belongs to cluster 5, the largest, while about 15% falls under cluster 3, the
smallest.
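The percentages quoted above follow directly from the cluster counts. A minimal sketch of the arithmetic, with illustrative variable names:

```python
cluster_sizes = {1: 48, 2: 38, 3: 32, 4: 39, 5: 50}

total = sum(cluster_sizes.values())
shares = {cluster: 100.0 * n / total for cluster, n in cluster_sizes.items()}
largest = max(cluster_sizes, key=cluster_sizes.get)
smallest = min(cluster_sizes, key=cluster_sizes.get)

for cluster, share in shares.items():
    print(f"Cluster {cluster}: {share:.1f}% of {total} cases")
print(f"Largest: cluster {largest}, smallest: cluster {smallest}")
```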
2 Step Clustering
In this method we do exploratory analysis to identify any naturally existing clusters in the
dataset. For our analysis, the variables we chose for clustering are Age, Education Level,
Gender and Customer Experience. The results we obtained are as follows.
The tables below show the frequency and percentage distribution of the data in each
cluster for each variable.
Customer Experience
                 1                   2                   3
          Frequency  Percent  Frequency  Percent  Frequency  Percent
Cluster 1     10     66.70%       8      23.50%      10      16.40%
        2      0      0.00%       0       0.00%      30      49.20%
        3      0      0.00%      17      50.00%       0       0.00%
        4      0      0.00%       0       0.00%      11      18.00%
        5      5     33.30%       9      26.50%      10      16.40%
Combined      15    100.00%      34     100.00%      61     100.00%

                 4                   5
          Frequency  Percent  Frequency  Percent
Cluster 1      0      0.00%      15      71.40%
        2      0      0.00%       0       0.00%
        3     28     36.80%       0       0.00%
        4     35     46.10%       0       0.00%
        5     13     17.10%       6      28.60%
Combined      76    100.00%      21     100.00%
Age
Gender
                Male               Female
          Frequency  Percent  Frequency  Percent
Cluster 1     36      28.3%       7        8.8%
        2     29      22.8%       1        1.3%
        3     45      35.4%       0        0.0%
        4     17      13.4%      29       36.3%
        5      0       0.0%      43       53.8%
Combined     127     100.0%      80      100.0%
Educational Qualification
The model summary shows that the total number of input variables was 4 and the number of
clusters obtained was 5. The cluster quality was deemed fair.
RESULTS & CONCLUSION
APPENDIX 1: QUALITATIVE RESEARCH QUESTIONNAIRE
Interview Schedule Objective: To examine the effect of AVATAR of the chatbot on the
consumer behavior
Method: In-depth Interview (Virtual / Personal)
Interviewers: Group 7 A
Duration: Approximately 10-20 mins
Interview Questions
Taking Participant Consent: I introduce myself as _________, a student at Goa Institute of
Management. I am currently conducting a study on the AVATAR of a chatbot for a Marketing
Research course project assignment. In this regard, your valuable inputs will be of great
help. This interview will take approximately 10 to 20 mins and will be recorded for my
further analysis. I promise that the recorded interview conversation, your name and
personal details will be kept confidential. Further, participation in the study is voluntary
and no monetary benefit or reward is given. There is no physical, psychological or social
risk involved in the study. However, if you feel burdened by the interview questions at
any moment during the interview process, you are free to stop and exit at any point.
There is no obligation to participate in or complete the interview process.
So, may I please know if you are willing to be a part of the study and provide your
voluntary consent to participate in this interview and be a respondent for my research
study? (Voluntary consent will also be obtained through email)
Basic Info: Age ______; Gender ______; Education ______
The questionnaire was divided into four major parts i.e. Ease of Use, User Experience,
Security and Physical Appearance.
1. Security/Privacy
a) How strongly will you as a customer consider suggestions provided by a chatbot?
b) How comfortable are you sharing your personal information while interacting with
a chatbot?
c) On a scale of 1-5 how comfortable will you be in discussing any sensitive topic with
a chatbot?
2. Form of communication
a) How much does the mode of communication (Written or Verbal) matter to you
as a customer?
b) How much do you want the chatbot to stick to professionalism over personal
approach?
c) How likely are you to prefer your regional language over English language (of
the chatbot)?
3. Anthropomorphic
a) How much does a machine replacing human to address your queries affect you as a
customer?
b) How much do you prefer a movable (in terms of expressions) chatbot over a static
one?
c) How strongly will you prefer a human form avatar over a non-human one?
4. Physical Appearance
a) How much impact does gender of a chatbot have while interacting with it?
b) "A picture as a depiction of an Avatar suffices. It doesn't necessarily have to have a
name assigned to it". How strongly do you agree with this statement?
c) Do you have any preference towards animated form or human form of chatbot
avatar?
d) "Mild color chatbot is preferred while interacting with a chat bot" How strongly do
you agree with this statement?
5. Response Quality
a) Rate the response quality you experienced the last time you interacted with a chatbot
b) How concise was the response you received the last time you interacted with a
chatbot?
c) How was your experience in terms of speed of replies you received from the chat
bot?
6. Customer Satisfaction
a) How good was your experience when you had last interacted with a chatbot?
b) How quickly were your queries resolved the last time you interacted with a chatbot?
c) How precisely were your queries addressed the last time you interacted with a
chatbot?
d) "Chatbots are known for effectively addressing customer issues" How strongly do
you agree with this statement?
Thank you very much for your valuable time and inputs.
REFERENCES
1. Padhy, K. C. (2006). Book review: Harvard Business School Publishing Corporation. Asia
Pacific Business Review, 2(2), 110–110. https://doi.org/10.1177/097324700600200219
2. Miao, F., Kozlenkova, I. V., Wang, H., Xie, T., & Palmatier, R. W. (2021). An emerging
theory of avatar marketing. Journal of Marketing, 86(1), 67–90.
https://doi.org/10.1177/0022242921996646
3. Garnier, M., & Poncin, I. (2013). The avatar in marketing: Synthesis, integrative framework
and Perspectives. Recherche Et Applications En Marketing (English Edition), 28(1), 85–
115. https://doi.org/10.1177/2051570713478335
4. Rosenkrans, G. (2009). The creativeness and effectiveness of online interactive rich media
advertising. Journal of Interactive Advertising, 9(2), 18–31.
https://doi.org/10.1080/15252019.2009.10722152