Wsdm2015 Submission 286

Predicting Online Shopping Behavior Using a Fine-Grained
Model of Personality
Yoram Bachrach, Microsoft Research, yobach@microsoft.com
Sofia Ceppi, Microsoft Research, soceppi@microsoft.com
Ian Kash, Microsoft Research, iankash@microsoft.com
Peter Key, Microsoft Research, peter.key@microsoft.com
ABSTRACT
Targeting users is a crucial challenge in online advertising.
Recent techniques allow inferring a user's personality from
various online sources, such as social network profiles and
web-browsing history. We investigate whether personality
traits can be used to improve user targeting, by predicting
which users tend to buy products online. We use a fine-grained
personality model, including not only the Big Five personality
traits but also 30 personality facets (from the Revised NEO
Personality Inventory), and correlate them with the propensity
to shop online for products of various categories.
We find that both extroversion and openness to experience
are correlated with online purchasing behavior. However,
we show that a fine-grained personality model achieves a
much higher predictive performance than a coarse-grained
Big Five personality model. For example, while agreeableness
is not significantly correlated with the propensity to buy
products online, some of its facets are significantly correlated
with this behavior. Further, we show that different sets of
personality traits should be used to predict purchasing behavior
for different product categories. This allows advertisers to
focus on different user segments, improving welfare for all of them.
1. INTRODUCTION
Targeting users is a crucial challenge in online advertising.
The selection of the most appropriate ad for a given user
brings benefit to all the actors in the market. Users are less
annoyed when advertisements are useful or relevant to them.
Further, the probability that ads are effective increases and
so does the welfare of advertisers. When both users and
advertisers are more satisfied with the service, more page
views are generated by users and advertisers are willing to
pay more, both of which increase revenue for the publishers
showing the ads. Of course, the benefits of targeted services
are not restricted to advertising scenarios; they also exist in
other domains, such as personalized search.
Early attempts to target users were based on demographic information, e.g., geographical location, that could be
inferred from the IP address and the interaction of the user
with the computer [14]. Then, information known by the
publisher, i.e., the web history and the interaction of the
user with the publisher's webpages, began to be considered for the selection of ads [50, 10]. Today, efforts to improve how users are targeted have brought us to the point where
a rich ecosystem has grown to track users across the web
and gather a wealth of data about visited websites, search
queries, specific product webpages seen by a user, and online social interactions, which can then be taken into account
in the ad selection process, e.g., through real-time bidding
by advertisers for the opportunity to show their ad to a specific user [6, 35].
Such direct signals of consumer intent can rapidly become
stale. For example, a user may plan a holiday using online
tools or may research a major appliance purchase online;
that user may then spend several months seeing advertisements for hotels in places they have already been or for
appliances they have just bought. However, the possibility
of easily inferring a user's demographic and personality profile from various online sources, e.g., search history, browsing
history, and online social network data [49, 17, 5, 31], opens
the opportunity to further enhance the selection of targeted
ads using information that tends to be more stable over time.
In the literature, there are studies that correlate the Big
Five personality traits with the intention to shop online (see
Section 5). However, these Big Five traits are very broad dimensions of human personality that can be more accurately
described with a fine-grained model of personality.
The hypothesis we aim to verify is that a greater level of
detail can improve the understanding of which personality
profiles shop online and what they prefer to buy. In particular, in addition to the higher-level Big Five traits, we
evaluate a fine-grained approach that refines them into the
thirty facets of the Revised NEO Personality Inventory,
and in addition to the total propensity to shop online, we
analyze how it changes with respect to different categories
of products that can be bought online. We provide a more
detailed discussion of the thirty facets in Section 2. Specifically, our hypotheses are:

Hypothesis 1. A fine-grained model of personality better
describes the propensity to shop online than the Big Five
personality traits.
Hypothesis 2. An analysis per product category with the
fine-grained model increases welfare by better targeting users.
Our empirical evaluation is based on information collected
through a questionnaire composed of three parts. The first
part asks questions about demographic information, the second part gathers a fine-grained personality profile and the
third investigates attitudes toward online shopping for different product categories. The dataset we use for our analysis is composed of the answers of 600 participants sourced
from Amazon's Mechanical Turk. Our analysis of this data
is mainly based on multiple linear regression models whose
predictor variables are basic demographic information plus
the Big Five personality traits or the 30 facets of the NEO-PI-R, and whose target variable is the propensity to shop
online for different product categories. A more detailed discussion of our methodology is given in Section 3.
The formulated hypotheses are confirmed by our analysis,
which we present in Section 4. For the model derived from
the Big Five personality traits, we find that two of the five
(openness and extroversion) are more informative than the
others, with openness providing the bulk of the predictive
power. However, when we use the finer-grained facet model,
we find that the informative facets are not always associated with these traits. Looking across categories, we find
examples where facets of non-informative traits are informative, and where informative traits have no (significantly)
informative facets. However, the model based on facets is
consistently a better predictor than the one based on the
Big Five, confirming Hypothesis 1.
There are at least two different ways personality information could be used to improve predictions. First, some people may simply be more interested in online shopping than
others. Identifying these people is certainly useful for advertisers determining how much they should be willing to
pay to show their ad to a specific person, but this increase is
common to all advertisers and does not provide any reason
to show one ad instead of another. Furthermore, although
this would lead to publishers receiving higher prices for ads
shown to users likely to shop online, it would also lead to
them receiving lower prices for ads shown to users less likely
to shop online. Thus, this type of improvement would provide only a limited opportunity to improve the efficiency of
the advertising marketplace.
On the other hand, personality could help better identify
the specific categories that are interesting to a person. This
would be the case if different facets of personality tended
to be relevant for interest in different categories. We find
that this is the case, and show that, even with our simple linear models, allocating advertisement categories based
on the facets model or on the Big Five model would lead
to 82.43% and 80.14% of the maximum possible welfare, respectively, confirming Hypothesis 2. Furthermore, the facets
model selects a somewhat more diverse set of categories in
which to show ads.
Finally, we discuss a finding that contradicts previous
work from the literature. Hasan [21] found that men are
more inclined to shop online than women, while we find the
opposite. One possible explanation for this is that societal
attitudes could have shifted in the past several years. Another possibility is that this change could be driven by new
categories of products becoming commonly available online,
since our results show that which gender has a higher propensity
to shop online varies by category.
2. BACKGROUND
2.1 Personality Traits: the Big Five model and
NEO-PI-R Facets
Personality differences between individuals are one of the
key topics studied in psychology. Personality allows a better understanding of many domains, and is correlated with
many aspects of our social lives. It can help predict the
success of individuals in their work [27], satisfaction and
happiness [22] and even marital satisfaction [29].
We investigate the relation between personality and online
shopping behavior, using both the coarse Big Five personality model, and the fine-grained NEO-PI-R facet model.
The Five Factor Model is arguably the most prominent and
widely used model of personality [11, 18, 44, 48]. This model
is considered a representation of a basic structure, which is
predictive of human preferences and behaviors across many
situations and domains. The model represents human personality using five key traits, called the Big Five, which originate as the main dimensions in a latent factor analysis of
human opinions, choices and behaviors [11, 18, 44].
We briefly describe the five personality traits (an in-depth
discussion of these traits can be found in the psychology
literature [19, 26]):
Openness to experience measures a person's imagination, the desire to seek new experiences and curiosity regarding a wide range of interests in culture, ideas, and aesthetics.
This is related to tolerance, political liberalism and sensitivity to emotion. People who are high in Openness tend to
appreciate unusual and creative ideas, art and adventure.
Those who are low on Openness tend to be more conventional and traditional, less creative, and to be conservative
and avoid changes.
Conscientiousness relates to a methodic and organized
approach to life rather than a spontaneous one. People high
on Conscientiousness tend to be orderly, organized and consistent, and to pursue long term goals by planning ahead.
Those low in Conscientiousness focus less on rules and plans,
and tend to be more spontaneous and easy-going.
Extroversion relates to the tendency to seek external stimulation in the company of others. Extroverted individuals are
typically socially active, friendly, energetic and adventurous.
In contrast, introverts typically prefer their own company,
are more reserved and seek environments with less external
stimulation.
Agreeableness measures the tendency to try to maintain positive social relations. Those who are high in Agreeableness tend to be compassionate, cooperative, sympathetic
and trusting. Such individuals tend to adapt to the needs of
others, but find it difficult to disagree with others and argue
their own opinion.
Neuroticism, sometimes called emotional instability, relates to the tendency to experience rapid mood changes and
negative emotions such as anger, depression or anxiety. Neurotic individuals are more likely to be stressed and nervous,
while those who are not Neurotic (or emotionally stable) are
typically calmer and self-confident.
The Big Five traits are the main high-level dimensions of
personality. However, deeper models have used an extension
of the Big Five dimensions, examining more fine-grained behaviors. Such models break down the clusters represented
by the major dimensions into their component behaviors. In
our work we consider Facets, the lower-level dimensions identified in the Revised NEO Personality Inventory (NEO-PI-R) [12]. For each Big Five personality trait, the NEO-PI-R
proposes 6 facets which together compose that trait. Table 1
shows the Big Five personality traits and the corresponding
IPIP-NEO facets [20], a slight variant that we use in our
analysis to take advantage of a shorter questionnaire (see footnote 1).
The study of a fine-grained model of personality traits is
driven by the fact that the Big Five traits offer very broad
dimensions of personality, but can be less powerful than a
fine-grained model in predicting human behavior. The Big
Five traits may not be sufficient to capture and represent
the variety of characteristics that describe the behavioral
tendencies of an individual, resulting in a loss in predictive
power for some domains.

Big Five            IPIP-NEO facets
agreeableness       a1: altruism, a2: cooperation, a3: modesty, a4: morality, a5: sympathy, a6: trust
conscientiousness   c1: achievement-striving, c2: cautiousness, c3: dutifulness, c4: orderliness, c5: self-discipline, c6: self-efficacy
extraversion        e1: activity level, e2: assertiveness, e3: cheerfulness, e4: excitement-seeking, e5: friendliness, e6: gregariousness
openness            o1: adventurousness, o2: artistic interest, o3: emotionality, o4: imagination, o5: intellect, o6: liberalism
neuroticism         n1: anger, n2: anxiety, n3: depression, n4: immoderation, n5: self-consciousness, n6: vulnerability

Table 1: The Big Five personality traits and the
corresponding facets of the IPIP-NEO.
Footnote 1: Due to space limitations, we do not provide a more detailed
discussion of each of the facets. Such a discussion can be
found in the work describing the NEO-PI-R model and questionnaires [12].
2.2 Product categories
This work aims at a fine-grained analysis of both the personality profile of online shoppers and the type of shopping
they do online. To this end we analyzed the way products are categorized on retail websites, e.g., Amazon.com,
and we identified 13 product categories that can be bought
online. These categories are: paper books, entertainment
(e.g., video games, toys and games, computer games, DVDs,
CDs, blu-ray), travel related items, household goods (e.g.,
pet supplies, decor, craft, home appliances, storage, garden tools, snow removal, generators), consumer electronics
(e.g., Televisions, CD Players, game consoles, camera, laptop), sport and outdoor, clothes and shoes, software, food,
home and garden furniture, health and personal care (e.g.,
medicine, cosmetics, perfume, after shave, baby products),
digital goods (e.g., music, movies, e-books), and office supplies.
3. THE METHODOLOGY
We describe the methodology we used to collect and
analyze data.
3.1 The data
Our data is based on an online shopping behavior questionnaire given to participants. All participants were located in
the United States, and sourced from Amazon's Mechanical
Turk (AMT), a prominent crowdsourcing platform. AMT is
a marketplace where requesters can post tasks, which can
then be completed by workers for a fee. Our questionnaire
is comprised of three parts.
The first part is a simple demographic information questionnaire. It includes questions about the participant's gender,
age, education, income, relationship status, occupation, and
state of residence.
The second part gathers information about the participant's
personality profile. It is composed of the 120 questions of the
short form of the International Personality Item Pool Representation of the NEO PI-R (IPIP-NEO) [20, 1]. Each item
in IPIP-NEO is a statement, and the participant is asked to
quantify the degree to which she agrees with the statement,
on the following scale: disagree, moderately disagree, neither
agree nor disagree, moderately agree, or agree. The items
of the short form IPIP-NEO were presented in a random
order to the participants.
Each personality facet is associated with 4 of the items.
For example, to determine the score for a participant's adventurousness, the statements are: "I prefer to stick with
things that I know", "I prefer variety to routine", "I dislike
changes", and "I am attached to conventional ways".
The facet score a participant receives is determined by giving
a score to each item: -2 for "disagree", -1 for "moderately
disagree", and so on up to +2 for "agree". Each item is
either associated with a facet in a positive direction or a
negative direction. The facet score is simply the sum of the
scores of the associated questions, where the scores of items
associated in the negative direction are multiplied by -1 (i.e.,
the scoring is reversed for these items, so that "disagree" has
a score of +2, and "agree" has a score of -2).
Each of the Big Five personality traits is composed of 6
facets, and the score for a Big Five trait is computed by
summing the scores of the associated facets.
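The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the +1/-1 encoding of item direction, and the keying chosen for the four adventurousness items are hypothetical and are not taken from the IPIP-NEO scoring key.

```python
# Item scores for the five response options, as described in the text.
RESPONSE_SCORES = {
    "disagree": -2,
    "moderately disagree": -1,
    "neither agree nor disagree": 0,
    "moderately agree": 1,
    "agree": 2,
}


def facet_score(responses, directions):
    """Sum the item scores of one facet, reversing negatively keyed items.

    responses:  one response string per item of the facet
    directions: +1 (positive direction) or -1 (negative direction) per item;
                this encoding is an assumption made for the sketch
    """
    return sum(d * RESPONSE_SCORES[r] for r, d in zip(responses, directions))


def trait_score(facet_scores):
    """A Big Five trait score is the sum of the scores of its six facets."""
    return sum(facet_scores)


# Hypothetical keying for the four adventurousness statements quoted above:
# only "I prefer variety to routine" is treated as positively keyed.
adventurousness_keys = [-1, +1, -1, -1]
answers = ["disagree", "agree", "moderately disagree", "neither agree nor disagree"]
adventurousness = facet_score(answers, adventurousness_keys)
```

With four items per facet, a facet score lies between -8 and +8, and a trait score (six facets) between -48 and +48.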
The third part of our questionnaire gathers information about
the participant's propensity to shop online for various product categories. In particular, we ask participants how often
they buy online products of a given category, on the following scale: "never", "rarely", "sometimes", "frequently", and
"all the time". The scores we associate with these options
range from 1 ("never") to 5 ("all the time"). The score assigned to the overall propensity to shop online, regardless
of the product category, is the sum of the scores of all the
product categories.
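As a small illustration of this scoring, the sketch below maps the five frequency options to scores and sums them into the overall propensity. The function name and the dictionary layout are assumptions made for the example; with thirteen categories the total ranges from 13 to 65.

```python
# Scores for the frequency options, from 1 ("never") to 5 ("all the time").
FREQUENCY_SCORES = {
    "never": 1,
    "rarely": 2,
    "sometimes": 3,
    "frequently": 4,
    "all the time": 5,
}


def pso_scores(answers):
    """Return (per-category scores, total propensity to shop online).

    answers: dict mapping a category name to the chosen frequency option
    """
    per_category = {cat: FREQUENCY_SCORES[opt] for cat, opt in answers.items()}
    return per_category, sum(per_category.values())


# Toy participant with only three of the thirteen categories filled in.
example_answers = {"books": "frequently", "software": "never", "food": "rarely"}
example_scores, example_total = pso_scores(example_answers)
```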
Our questionnaire also contained some quality control questions, whose aim was to identify participants quickly filling
in the questionnaire or providing random answers (see footnote 2). We
have excluded data collected from participants failing the
quality control items from our analysis. Such filtering is a
standard part of good study design on AMT [37]. Of the
original 750 participants, 600 did not fail any of the quality
control items. The demographic profile of the sample is
reported in Table 5 at the end of the paper.
Footnote 2: An example of such a quality control item is a statement of the
form "Please choose Moderately Disagree".
3.2 Analysis
Our methodology is based on multiple linear regression models, designed to predict the score of the propensity to shop
online (PSO) in different product categories. We used two
sets of predictor variables: the set of Big Five personality
trait scores, and the set of the 30 facet scores. In both cases
we have also added demographic features of gender and age.
Thus, for each product category we have built two linear
regression models: one that predicts the PSO for that category using the Big Five personality features, and one that
predicts the PSO for that category using the 30 facet scores.
We first tested for the statistical significance of the entire
regression models, and then for the statistical significance
of the coefficients of each of the regressor variables. This
allows us to identify the personality traits or facets that
correlate with the PSO score for that product category in a
statistically significant manner.
Once we determined the set of personality traits or facets
that are predictive of the PSO, we quantify the relative importance of each of the statistically significant factors. Several methods have been proposed in the literature to measure
the relative importance of factors in linear regression models
(see [47, 13] for a discussion of such methods). We have used
the coefficient of multiple determination (CMD), where the
score of a feature x_i is the change in R^2 value between the
regression model which uses all predictors (including x_i) and
the regression model which uses all predictors except x_i. For
space reasons, we only report these where relevant to our
results.
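The drop-one importance score described above (the change in R^2 when a predictor x_i is removed) can be sketched as follows. This is a minimal pure-Python illustration, not the code used for the study: the helper names and the toy data are hypothetical, and ordinary least squares is fitted through the normal equations, which is adequate only for small, well-conditioned examples.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(rows, y):
    """Fit OLS with an intercept (normal equations) and return R^2."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def drop_one_importance(rows, y):
    """For each predictor i: R^2(all predictors) - R^2(all predictors except i)."""
    full = r_squared(rows, y)
    scores = []
    for i in range(len(rows[0])):
        reduced = [list(r[:i]) + list(r[i + 1:]) for r in rows]
        scores.append(full - r_squared(reduced, y))
    return scores


# Toy data (hypothetical): the target depends almost entirely on predictor 0.
toy_rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
toy_importance = drop_one_importance(toy_rows, toy_y)
```

Because the reduced model is nested in the full one, each score is non-negative up to floating-point error; in the toy data the first predictor receives by far the larger score.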
In our analysis we examine twenty-eight distinct multiple
linear regression models (two sets of features predicting for
thirteen categories and the total propensity to shop online).
Thus the p-values we report are subject to the multiple comparisons problem. We note, however, that many of our tests
are significant even at the 0.001 level, and all the results we
discuss in the next section are robust to the possibility of a small
number of false positives.
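To give a sense of scale for the multiple comparisons issue, the sketch below applies a Bonferroni correction over the 28 model-level tests. The paper does not apply such a correction; Bonferroni is used here only as a simple, conservative illustration, and counting only the 28 model-level tests (rather than every coefficient test) is an assumption of the example.

```python
NUM_MODELS = 28  # two feature sets x (13 categories + total propensity)
ALPHA = 0.05     # nominal significance level


def bonferroni_threshold(alpha, num_tests):
    """Per-test threshold that keeps the family-wise error rate below alpha."""
    return alpha / num_tests


def survives_correction(p_values, alpha=ALPHA, num_tests=NUM_MODELS):
    """Flag which p-values remain significant after the Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, num_tests)
    return [p <= threshold for p in p_values]


# A p-value of 0.001 survives (0.05 / 28 is roughly 0.0018); 0.019 does not.
flags = survives_correction([0.001, 0.019])
```

Under these assumptions, tests that are significant at the 0.001 level remain significant after correction, which is in line with the robustness remark above.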
4. RESULTS
We begin with three observations related to Hypothesis 1
(the second of which confirms it), then discuss an observation that confirms Hypothesis 2 and an observation that
contradicts previous work. For brevity, we refer to the multiple linear regression model whose features are the Big Five
personality traits, age, and gender as the Big Five model,
and to the multiple linear regression model whose features
are the 30 facets of the NEO-PI-R, age, and gender as the
Facets model. Table 4, reported at the end of the paper, indicates the significant features for each model and category,
as well as the significance of each model. While most categories yielded strongly significant models (with p < 0.001
in most cases), personality seems to provide limited information at best about the food category and the home and
garden category.
Observation 1. Two of the Big Five personality traits (extroversion and openness) are more informative than the others for both the total propensity to shop online and the
propensity to shop online for most of the categories.
The p-values of the Big Five model reveal that the more
informative features are extroversion (p = 0.019), openness
(p = 1.36e-6), conscientiousness (p = 0.083), and gender
(p = 0.083).
We analyze a decomposition of the explained variance for the
significant predictors of the Big Five model to show the magnitude of how informative they are (i.e., whether they are
economically significant as well as statistically significant).
The results show that extroversion contributes 11.17% of
the explained variance, openness 69.03%, conscientiousness
6.15%, and gender 13.66%.
Observation 2. The Facets model is a better predictor
than the Big Five model for the total propensity to shop
online as well as most of the categories.
The analysis of the R^2 and adjusted R^2 values shows that
the Facets model better explains the variability of the data
than the Big Five model. When the total propensity to shop
online is the variable that the model should predict, for the
Big Five model R^2 = 0.075 and adjusted R^2 = 0.065, while
for the Facets model R^2 = 0.123 and adjusted R^2 = 0.073.
The same behavior can be observed when the propensity to
shop online for categories is considered. Only the household
goods category and the health and personal care category
have a lower adjusted R^2 under the Facets model.
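The gap between R^2 and adjusted R^2 reflects the number of predictors in each model. As a rough check, the sketch below applies the standard adjusted R^2 formula to the reported values, assuming n = 600 participants and that gender and age each enter as a single predictor (so 7 predictors for the Big Five model and 32 for the Facets model). Both assumptions are ours and are not stated explicitly in the text.

```python
def adjusted_r2(r2, n, p):
    """Standard adjusted R^2 for a model with p predictors fitted on n samples."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


N_PARTICIPANTS = 600    # participants who passed the quality control items
P_BIG_FIVE = 5 + 2      # five traits + gender + age (assumed one column each)
P_FACETS = 30 + 2       # thirty facets + gender + age (assumed one column each)

adj_big_five = adjusted_r2(0.075, N_PARTICIPANTS, P_BIG_FIVE)
adj_facets = adjusted_r2(0.123, N_PARTICIPANTS, P_FACETS)
```

Under these assumptions the formula gives roughly 0.064 and 0.074, close to the reported 0.065 and 0.073; small differences are expected because the reported R^2 values are themselves rounded.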
Observation 3. The most informative Facets are not always Facets related to the most informative Big Five traits.
Several examples illustrating the observation can be found
in Table 4 where the most informative predictors (i.e., predictors whose p-value is lower than or equal to 0.05) for each
category and for the total propensity to shop online are reported. Perhaps the best illustration of this is the total
propensity, where agreeableness as a whole is not particularly informative, but its facet morality is.
Similar results are seen when we investigate individual categories. Given how much of the variance was explained by
openness, it is unsurprising that its facets appear frequently,
but facets of agreeableness and neuroticism appear almost
as often. While none of the Big Five is informative for the
office supplies category, the facets altruism, sympathy, self-efficacy, and intellect are. Similarly, while openness is the
informative Big Five trait for the consumer electronics product category, none of its facets is informative, while altruism
and cooperation (facets of agreeableness) and anger (facet
of neuroticism) are. While a few of these results may be
false positives due to the p-values being uncorrected for
multiple comparisons, particularly where p-values are only
significant at the 0.05 level, this observation is robust to
that.
Observation 4. Using the Facets model to target users
would improve both welfare and diversity over the Big Five
model.
If our results were directly used to decide what type of advertising to show to a particular user, a natural goal would
be to maximize the propensity to shop online of each user for
the category of advertisement he is shown. For each person
in our data set, we use the category regressions to predict
the product category he is most likely to buy in with both
the Big Five models and the Facets models (i.e., the category with the highest predicted propensity to shop online).
Then for each class of models, we sum the propensity to shop
online that was actually reported by each person for the predicted category. These sums can be seen as an estimate of
the welfare of the users and advertisers when a given regression model is used, similar to the way the number of clicks generated is used as a measure of welfare in both organic search
(e.g., NDCG) and search advertising [32]. More specifically,
we are assuming that the reported propensity to shop online correlates with other metrics such as probability of a
click on an advertisement from that category or probability of making a purchase after seeing an advertisement from
that category, so that maximizing the total propensity to
shop online is a reasonable proxy for maximizing these. An
improvement by this metric indicates that improved predictions actually lead to better matchings between people and
categories, not just better determination of people's overall
attitude toward online shopping.
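The welfare estimate described above can be sketched as follows. The function name and data layout are hypothetical, and the "maximum possible welfare" is assumed here to be the sum, over users, of each user's highest reported propensity, which is our reading rather than a definition given in the text.

```python
def welfare_fraction(predicted, reported):
    """Fraction of the maximum possible welfare achieved by a set of predictions.

    predicted: per user, a dict mapping category -> predicted propensity
    reported:  per user, a dict mapping category -> reported propensity (1..5)
    """
    achieved = 0
    best = 0
    for pred, rep in zip(predicted, reported):
        chosen = max(pred, key=pred.get)  # category with the highest prediction
        achieved += rep[chosen]           # welfare actually obtained for it
        best += max(rep.values())         # assumed upper bound for this user
    return achieved / best


# Toy example with two users and two categories (hypothetical numbers).
toy_predicted = [
    {"books": 3.1, "software": 2.0},
    {"books": 3.0, "software": 2.5},
]
toy_reported = [
    {"books": 4, "software": 2},
    {"books": 2, "software": 5},
]
toy_fraction = welfare_fraction(toy_predicted, toy_reported)
```

In the toy example the first user is matched to their best category and the second is not, so the fraction is 6/9.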
Our empirical analysis shows that the Facets model achieves
82.43% of the maximum possible welfare, while the Big
Five model achieves 80.14%. This shows that a model
using Facets is not just better at identifying which users
tend to shop online, but is also able to exploit heterogeneity
of user interests to find better matches between users and
advertisers. In Table 2, for both the linear regression models
and for each product category, we report the number of people for whom that category has the highest predicted and
true propensity to shop online (see footnote 3). We can see from this that
the Big Five model assigns users to a small number of categories which have the highest average propensity to shop
online. In contrast, the Facets model generates a somewhat
more diverse set of recommendations.
Category               Big Five   Facets   True
book                   15         33       73.87
entertainment          88         112      77.79
travel                 0          1        25.97
household goods        0          0        25.99
electronics            179        136      74.24
sport/outdoor          0          0        11.70
clothes/shoes          228        215      93.95
software               0          5        36.03
food                   0          0        9.96
home/garden            0          0        1.63
health/personal care   0          10       51.90
digital goods          30         87       93.03
office supplies        0          1        23.93

Table 2: For each product category, number of Turkers for whom that category has the highest predicted
propensity to shop online. The second column shows
predictions of the Big Five model, the third column
predictions of the Facets model, and the last column
how the actual highest propensity to shop online is
distributed among the categories.
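The fractional values in the True column come from splitting each person equally among all the categories for which they reported their highest propensity. A minimal sketch of that normalization, with a hypothetical function name and toy data, could look like this:

```python
from collections import defaultdict


def true_top_category_counts(reported):
    """Count top categories, dividing each person equally among tied categories.

    reported: per user, a dict mapping category -> reported propensity (1..5)
    """
    counts = defaultdict(float)
    for rep in reported:
        top = max(rep.values())
        winners = [cat for cat, score in rep.items() if score == top]
        for cat in winners:
            counts[cat] += 1.0 / len(winners)  # each person contributes 1 in total
    return dict(counts)


# Toy data (hypothetical): the second user ties between two categories.
toy_users = [
    {"books": 5, "software": 2, "food": 1},
    {"books": 4, "software": 4, "food": 1},
]
toy_counts = true_top_category_counts(toy_users)
```

Since every person contributes a total weight of one, the counts always sum to the number of people.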
Footnote 3: Several users reported the highest propensity to buy online
for more than one category. Thus, in order to compare the
actual propensity to buy online for a category and the predicted one, the former is normalized by dividing each person
equally among all their highest propensity categories.
Observation 5. Our finding about how gender affects the
propensity to shop online is not in line with the result reported by Hasan [21].
In his work published in 2010, Hasan [21] finds that gender
differences are significant for online shopping behavior and
that men are more favorably disposed toward online shopping than women. We obtain the opposite result. One possibility is simply that overall attitudes have shifted in this
time. Another is that he studied a sample of 80 students enrolled in an electronic commerce course at a Midwestern university whose average age was 22.54 years, while our study of
workers on a crowdsourcing platform may represent an older
population with different attitudes. Or, perhaps we can find
an explanation by looking at the per-category results and taking into account how online shopping has evolved in the
last 4 years, e.g., changes in which products can be easily
bought online now that were not broadly available 4 years
ago. For each category, the result of the Mann-Whitney U test [39] showing whether men or women are more inclined to shop
online for a product of that category is reported in Table 3.
Men > Women        p-value     Women > Men                 p-value
entertainment      0.0015      books                       6.34e-4
electronics        0.0011      health and personal care    4.84e-6
sport/outdoor      3.18e-16    clothes and shoes           6.92e-16
software           2.81e-5     food                        4.63e-21
home and garden    1.40e-31    digital goods               8.63e-5
office supplies    0.01

Non-significant categories (p-value > 0.05): household goods, travel related items

Table 3: For each category, if the result is statistically
significant we report the gender that is more likely
to buy it and the p-value of the Mann-Whitney U
test.
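For readers unfamiliar with the test, the sketch below computes the Mann-Whitney U statistic and a two-sided p-value using the normal approximation with a tie correction, which matters here because propensity scores take only five values. It is an illustrative implementation, not the one used for Table 3: it applies no continuity correction, the function name is our own, and the approximation is only reasonable for samples that are not very small.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value via the normal approximation).

    U counts the pairs (a in x, b in y) with a > b, with ties counted as 1/2.
    """
    n1, n2 = len(x), len(y)
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Toy propensity scores (hypothetical) for two groups of participants.
group_a = [1, 2, 2, 3, 1, 2]
group_b = [4, 5, 3, 4, 5, 4]
u_a, p_a = mann_whitney_u(group_a, group_b)
```

A small U for the first sample, as in the toy data, indicates that its values tend to be lower than those of the second sample.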
5. RELATED WORK
Several studies have been done to understand the factors
that affect attitudes toward online shopping. One of the
first studies examines the attitude toward online shopping
of people of different ages given their online behavior, e.g.,
prepurchase search [45]. Another factor that has been considered to explain different online shopping attitudes is gender. Hasan [21] studies the components of this attitude, i.e.,
cognition, affect, and behavior, and tests the gender differences on them. Results of his analysis show that gender
differences across the three attitudinal components are significant and that men demonstrate a more favorable attitude
toward online shopping than women. Privacy and security
concerns have also been studied [38] as factors that affect
attitudes towards online purchasing. In particular, these
concerns arise due to the uncertainty of the online environment. This is why McCole et al. [38] investigate the importance of trust in the internet, trust in the vendor, and
trust in third parties such as institutions and guarantors
that provide certificates of integrity. Other factors that have
been analyzed [25] in relation with online shopping attitude
and behavior include perceived risks, return policies, subjective norms, and domain specific innovations. Some work
in the literature investigates the motivations behind online
shopping and defines typologies of shoppers based on what
motivates them to buy online [43, 28]. A comparison between online and non-online shoppers has been proposed by
Lokken et al. [36].
Works mentioned so far consider the attitude towards online
shopping but do not differentiate it for different products
that e-shoppers can buy. Other papers in the literature
study online shopping acceptance in the context of different products [33, 41, 34, 4]. As determinants of the online
shopping attitude, factors like consumer characteristics,
personal perceived value and risk, website design, and the
product itself are considered in these papers. A fine-grained
analysis like ours that investigates correlations between personality traits and propensity to shop online for different
product categories has not been previously undertaken.
Personality traits have previously been considered as a way
to describe and motivate different online shopping behavior.
In several works [7, 8, 9], the hierarchical approach to personality developed by Mowen [40] is applied to the online
setting. In Mowen's model, in addition to the Big Five personality traits that are identified as elemental traits, compound traits (need for cognition, need to evaluate, need
for arousal, and need for material resources) and situational
traits (affective and cognitive involvement) are also considered to predict the intention to shop online. Results of these
studies suggest that Mowen's model needs to be modified to
capture online shopping behavior. Moreover, Bosnjak et
al. [7] observe that, according to Ajzen [2], there are some
factors not included in the model that are significant because the coefficient of determination rises when users' past
behavior is included in the analysis.
Personality traits have also been used in other contexts.
Feng and Qian [15] used personality and interests to improve recommendation systems. Jansen and Solomon [24]
investigate demographic targeting in sponsored search.
Work on learning demographics and personality from online
behavior and data includes work by Weber and Jaimes [49],
who use search data to infer demographics. Goel et al. [17]
learn demographics from web browsing behavior. Bi et al. [5]
infer the demographics of searchers from social data. Kosinski et al. [31] explore different approaches to personality inference.
In previous work using Amazon Mechanical Turk (AMT) as a research platform, Kittur et al. [30] explore using it for user
studies. Alonso and Mizzaro [3] use it for relevance assessment. Franklin et al. [16] use it for question answering services. Suri et al. [46] study the honesty of people on AMT.
Horton et al. [23] and Paolacci et al. [42] replicate classic behavioral economics experiments on AMT and obtain results
matching the original experiments. Mason and Suri [37] provide a survey, including results showing that AMT studies
can be reliable and identifying best practices for them.
6. CONCLUSION
Our results indicate that a fine-grained model of personality gives better predictions of shopping propensity
than the Big Five personality traits alone.
In our study, using a linear model, the R² value
improved from 7.5% to 12.5% when using the Facets model
rather than the Big Five model, and the adjusted R² improved from 6.5% to 7.3%. These are modest improvements
in absolute terms (which indicates the need to explore other
latent variables to explain shopping propensity, as well as
more sophisticated models), but worthwhile in relative terms. The story is the same for predictions
applied to 11 of our 13 individual shopping categories, with
Facets giving better predictions; the two exceptions were
household goods and health and personal care.
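The gap between the raw and adjusted figures comes from the penalty that the adjusted coefficient of determination places on additional predictors. The following minimal sketch applies the standard correction, 1 - (1 - R²)(n - 1)/(n - p - 1), to the raw values quoted above. The function name `adjusted_r2` is ours; the sample size of 600 follows from Table 5 (309 + 291 participants), while the predictor counts (5 traits or 30 facets, each plus age and gender) are assumptions made for illustration. The output therefore need not match the reported adjusted values exactly.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Raw R^2 values as quoted in the text; n = 600 follows Table 5.
# ASSUMPTION (illustration only): predictor counts of 7 and 32,
# i.e. 5 traits + age + gender and 30 facets + age + gender.
N = 600
big_five = adjusted_r2(0.075, N, n_predictors=7)
facets = adjusted_r2(0.125, N, n_predictors=32)

raw_gain = 0.125 - 0.075
adjusted_gain = facets - big_five

print(f"Big Five adjusted R^2: {big_five:.3f}")
print(f"Facets adjusted R^2:   {facets:.3f}")
print(f"raw gain {raw_gain:.3f} vs adjusted gain {adjusted_gain:.3f}")
```

Under these assumptions the adjusted gain is smaller than the raw gain, because the larger model pays a larger penalty for its extra predictors.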
For user targeting and increasing social welfare, showing
user-relevant content is important, and here the use of Facets
helps by better predicting the most likely shopping category,
and hence matching users to categories better. Indeed, we
found that the predicted welfare increases from 80.14%
to 82.43% when we move from the Big Five model to
the Facets model, and that the diversity of recommendations also improves (the Big Five model is biased towards just predicting
the most popular categories).
When looking at the variance that can be explained by using personality traits, our results show that openness is the
best explanatory personality feature, together with facets of
neuroticism and agreeableness. In our study, gender was not
significant for the total propensity to shop online, but was
for certain shopping categories (such as clothes and shoes,
software, and health). We also identified categories where
women had higher shopping propensity than men and vice versa.
The relative improvements of using the Facets model over
the Big Five model for predicting online shopping attitude
are very promising when viewed in context. A question for
further research is to what extent the gains are affected when
personality features are combined with other features, such
as historical data and contextual and temporal browsing behavior.
7. REFERENCES
[1] International personality item pool: A scientific
collaboratory for the development of advanced
measures of personality traits and other individual
differences. http://ipip.ori.org/.
[2] I. Ajzen. Residual effects of past on later behavior:
habituation and reasoned action perspectives.
Personality and Social Psychology Review, 6(2):107–122, 2002.
[3] O. Alonso and S. Mizzaro. Can we get rid of trec
assessors? using mechanical turk for relevance
assessment. In Proceedings of the SIGIR 2009
Workshop on the Future of IR Evaluation, volume 15,
page 16, 2009.
[4] F. Belanger, J. S. Hiller, and W. J. Smith.
Trustworthiness in electronic commerce: the role of
privacy, security, and site attributes. The Journal of
Strategic Information Systems, 11(3-4):245–270, 2002.
[5] B. Bi, M. Shokouhi, M. Kosinski, and T. Graepel.
Inferring the demographics of search users: social data
meets search queries. In Proceedings of the 22nd
international conference on World Wide Web, pages
131–140. International World Wide Web Conferences
Steering Committee, 2013.
[6] M. Bilenko and M. Richardson. Predictive client-side
profiles for personalized advertising. In Proceedings of
the 17th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pages
413–421. ACM, 2011.
[7] M. Bosnjak, M. Galesic, and T. Tuten. Personality
determinants of online shopping: Explaining online
purchase intentions using a hierarchical approach.
Journal of Business Research, 60(6):597–605, June 2007.
[8] T. Chen. Personality traits hierarchy of online
shoppers. International Journal of Marketing Studies,
3(4):23–39, November 2011.
[9] T. Chen and M.-C. Lee. Personality antecedents of
online buying impulsiveness. Journal of Economics,
Business and Management, 3(4):425–429, April 2015.
[10] Y. Chen, D. Pavlov, and J. F. Canny. Large-scale
behavioral targeting. In Proceedings of the 15th ACM
SIGKDD International Conference on Knowledge
Discovery and Data Mining, pages 209–218. ACM, 2009.
[11] P. Costa Jr and R. McCrae. Neo personality
inventory-revised (neo-pi-r) and neo five-factor
inventory (neo-ffi) professional manual. Odessa, FL:
Psychological Assessment Resources, 1992.
[12] P. T. Costa Jr. and R. R. McCrae. Domains and
facets: Hierarchical personality assessment using the
revised neo personality inventory. Journal of
Personality Assessment, 64(1):21–50, 1995.
[13] P. D. Ellis. The essential guide to effect sizes:
Statistical power, meta-analysis, and the interpretation
of research results. Cambridge University Press, 2010.
[14] D. S. Evans. The online advertising industry:
Economics, evolution, and privacy. The Journal of
Economic Perspectives, 23(3):37–60, 2009.
[15] H. Feng and X. Qian. Recommendation via user's
personality and social contextual. In Proceedings of
the 22nd ACM international conference on Conference
on information & knowledge management, pages
1521–1524. ACM, 2013.
[16] M. J. Franklin, D. Kossmann, T. Kraska, S. Ramesh,
and R. Xin. Crowddb: answering queries with
crowdsourcing. In Proceedings of the 2011 ACM
SIGMOD International Conference on Management of
data, pages 61–72. ACM, 2011.
[17] S. Goel, J. M. Hofman, and M. I. Sirer. Who does
what on the web: A large-scale study of browsing
behavior. In ICWSM, 2012.
[18] L. Goldberg. The structure of phenotypic personality
traits. American psychologist, 48(1):26, 1993.
[19] L. R. Goldberg. The structure of phenotypic
personality traits. The American psychologist,
48(1):26–34, Jan. 1993.
[20] L. R. Goldberg, J. A. Johnson, H. W. Eber, R. Hogan,
M. C. Ashton, C. R. Cloninger, and H. G. Gough. The
international personality item pool and the future of
public-domain personality measures. Journal of
Research in Personality, 40(1):84–96, 2006.
Proceedings of the 2005 Meeting of the Association of
Research in Personality.
[21] B. Hasan. Exploring gender differences in online
shopping attitude. Comput. Hum. Behav.,
26(4):597–601, July 2010.
[22] B. Headey and A. Wearing. Personality, life events,
and subjective well-being: toward a dynamic
equilibrium model. Journal of Personality and Social
Psychology, 57(4):731, 1989.
[23] J. J. Horton, D. G. Rand, and R. J. Zeckhauser. The
online laboratory: Conducting experiments in a real
labor market. Experimental Economics, 14:399–425, 2011.
[24] B. J. Jansen and L. Solomon. Gender demographic
targeting in sponsored search. In Proceedings of the
SIGCHI Conference on Human Factors in Computing
Systems, pages 831–840. ACM, 2010.
[25] M. H. M. Javadi, H. R. Dolatabadi, M. Nourbakhsh,
A. Poursaeedi, and A. R. Asadollahi. An analysis of
factors affecting on online shopping behavior of
consumers. International Journal of Marketing
Studies, 3(5):81–98, September 2012.
[26] J. A. Johnson and F. Ostendorf. Clarification of the
five-factor model with the abridged big five
dimensional circumplex. Journal of Personality and
Social Psychology, 65(3):563, 1993.
[27] T. Judge, C. Higgins, C. Thoresen, and M. Barrick.
The big five personality traits, general mental ability,
and career success across the life span. Personnel
psychology, 52(3):621–652, 1999.
[28] A. K. Kau, Y. E. Tang, and S. Ghose. Typology of
online shoppers. Journal of Consumer Marketing,
20(2):139–156, 2003.
[29] E. Kelly and J. Conley. Personality and compatibility:
A prospective analysis of marital stability and marital
satisfaction. Journal of Personality and Social
Psychology, 52(1):27, 1987.
[30] A. Kittur, E. H. Chi, and B. Suh. Crowdsourcing user
studies with mechanical turk. In Proceedings of the
SIGCHI conference on human factors in computing
systems, pages 453–456. ACM, 2008.
[31] M. Kosinski, Y. Bachrach, P. Kohli, D. Stillwell, and
T. Graepel. Manifestations of user personality in
website choice and behaviour on online social
networks. Machine Learning, 95(3):357–380, 2014.
[32] S. Lahaie and D. M. Pennock. Revenue analysis of a
family of ranking rules for keyword auctions. In
Proceedings of the 8th ACM Conference on Electronic
Commerce, pages 50–56. ACM, 2007.
[33] J.-W. Lian and T.-M. Lin. Effects of consumer
characteristics on their acceptance of online shopping:
Comparisons among different product types. Comput.
Hum. Behav., 24(1):48–65, Jan. 2008.
[34] Z. Liao and M. T. Cheung. Internet-based e-shopping
and consumer attitudes: an empirical study.
Information and Management, 38(5):299–306, 2001.
[35] K. Liu and L. Tang. Large-scale behavioral targeting
with a social twist. In Proceedings of the 20th ACM
International Conference on Information and
Knowledge Management, pages 1815–1824. ACM, 2011.
[36] S. L. Lokken, G. W. Cross, L. K. Halbert, G. Lindsey,
C. Derby, and C. Stanford. Comparing online and
non-online shoppers. International Journal of
Consumer Studies, 27(2):126–133, 2003.
[37] W. Mason and S. Suri. Conducting research on
amazon's mechanical turk. Behavior Research Methods,
44:1–23, 2012.
[38] P. McCole, E. Ramsey, and J. Williams. Trust
considerations on attitudes towards online purchasing:
The moderating effect of privacy and security
concerns. Journal of Business Research,
63(9-10):1018–1024, 2010.
[39] P. E. McKnight and J. Najab. Mann-Whitney U Test.
John Wiley and Sons, Inc., 2010.
[40] J. Mowen. The 3M Model of Motivation and
Personality: Theory and Empirical Applications to
Consumer Behavior. Springer, 1999.
[41] A. O'Cass and T. Fenech. Web retailing adoption:
Exploring the future of internet users' web retailing
behavior. Journal of Retailing and Consumer Services,
10(2):81–94, 2003.
[42] G. Paolacci, J. Chandler, and P. Ipeirotis. Running
experiments on amazon mechanical turk. Judgment
and Decision Making, 5(5):411–419, 2010.
[43] A. J. Rohm and V. Swaminathan. A typology of
online shoppers based on shopping motivations.
Journal of Business Research, 57(7):748–757, 2004.
[44] M. Russell and D. Karol. The 16PF fifth edition
administrator's manual. Institute for Personality and
Ability Testing, Champaign, IL, 1994.
[45] P. Sorce, V. Perotti, and S. Widrick. Attitude and age
differences in online buying. International Journal of
Retail & Distribution Management, 33(2):122–132, 2005.
[46] S. Suri, D. G. Goldstein, and W. A. Mason. Honesty
in an online labor market. In Proceedings of the 3rd
Workshop on Human Computation, 2011.
[47] B. G. Tabachnick, L. S. Fidell, and S. J. Osterlind.
Using multivariate statistics. 2001.
[48] E. Tupes and R. Christal. Recurrent personality
factors based on trait ratings. Journal of Personality,
60(2):225–251, 1992.
[49] I. Weber and A. Jaimes. Demographic information
flows. In Proceedings of the 19th ACM international
conference on Information and knowledge
management, pages 1521–1524. ACM, 2010.
[50] J. Yan, N. Liu, G. Wang, W. Zhang, Y. Jiang, and
Z. Chen. How much can behavioral targeting help
online advertising? In Proceedings of the 18th
International Conference on World Wide Web, pages
261–270. ACM, 2009.
Columns: office supplies, digital goods, health/personal care, home/garden, food, software, clothes/shoes, sport, electronics, household goods, travel, entertainment, book, total.
Rows (Big Five block): agreeableness, conscientiousness, extraversion, openness, neuroticism, age, gender, Big Five model.
Rows (Facets block): a1: altruism, a2: cooperation, a4: morality, a5: sympathy, a6: trust, c4: orderliness, c6: self-efficacy, e1: activity level, e3: cheerfulness, e4: excitement-seeking, e5: friendliness, o1: adventurousness, o2: artistic interest, o4: imagination, o5: intellect, o6: liberalism, n1: anger, n2: anxiety, n3: depression, age, gender, Facets model.
Table 4: Big Five personality traits and facets that are most informative (i.e., p-value < 0.05) for the total
propensity to shop online and for the propensity to shop online in each product category. The
table also reports the statistical significance of the entire Big Five model and Facets model. We report * when
0.01 < p-value ≤ 0.05, ** when 0.001 < p-value ≤ 0.01, and *** when p-value ≤ 0.001.
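The star notation of Table 4 can be expressed as a small helper. This is a minimal sketch with each upper bound treated as inclusive; the function name `significance_stars` is ours and is not part of the analysis pipeline used in the study.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to the star notation used in Table 4."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return ""  # not significant at the 0.05 level


# Hypothetical p-values, for illustration only.
for p in (0.0004, 0.008, 0.03, 0.2):
    print(f"p = {p}: '{significance_stars(p)}'")
```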
Category          Options                         Number  Percentage %
Gender            Female                          309     51.5
                  Male                            291     48.5
Age               ≤ 19                            8       1.33
                  20–30                           238     39.67
                  31–40                           192     32
                  41–50                           83      13.83
                  51+                             79      13.17
Education         Less than High School           21      3.5
                  High School                     11      1.83
                  Bachelor's Degree               235     39.17
                  Master's Degree                 281     46.83
                  Doctorate                       41      6.83
                  other                           11      1.83
Income (NT $)     under 20,000                    155     25.83
                  20,000–30,000                   137     22.83
                  30,000–40,000                   106     17.67
                  40,000–50,000                   62      10.33
                  50,000–60,000                   38      6.33
                  60,000–70,000                   37      6.17
                  70,000–80,000                   23      3.83
                  80,000–90,000                   9       1.5
                  90,000–100,000                  8       1.33
                  over 100,000                    15      2.5
                  prefer not to answer            10      1.67
Status            Single                          164     27.33
                  In a relationship               112     18.67
                  Engaged                         24      4
                  Married                         215     35.83
                  Widowed                         7       1.17
                  Separated                       7       1.17
                  Divorced                        27      4.5
                  Cohabitant                      42      7
                  Civil union                     1       0.17
                  Prefer not to answer            1       0.17
Occupation        I go to school                  29      4.83
                  I work                          442     73.67
                  I am retired                    18      3
                  I go to school and work         37      6.17
                  I do not go to school or work   74      12.33
Table 5: Demographic profile of the sample.
