
Electronic Commerce Research and Applications 34 (2019) 100836


Will this session end with a purchase? Inferring current purchase intent of anonymous visitors

Osnat Mokryn a,⁎, Veronika Bogina b, Tsvi Kuflik b

a Department of Information and Knowledge Management, University of Haifa, Israel
b Department of Information Systems, University of Haifa, Israel

Keywords: Purchase intent; Anonymous visitors; Session dynamics; Products trendiness; Temporal session information

Abstract: Understanding the online behavior and intent of online visitors is the subject of a long line of research, as are mechanisms to understand the purchase intent of visitors in order to increase the number of visits that end with a purchase. Compared to profiled returning customers, whose history is known, anonymous visitors garner less attention. The lack of a known shopping history or interests makes it hard to learn from their behavior or to infer their shopping intent. Here, we suggest the use of products' popularity trends and the visit's temporal information to infer the purchase intention of anonymous visitors. We model these dynamics and utilize our model to infer the purchase intent of visitors of two large real e-commerce retailer sites. Our model identifies online signals for purchase intent that can be used for online purchase prediction.

1. Introduction

E-commerce has become a prevalent method for shopping, with over 79% of Americans visiting e-commerce sites, as reported by the Pew Research Center (2016). Yet, only a tiny fraction of these visits ends with a purchase, in the range of 2 to 5 percent (Pew Research Center, 2016; Center for Retail Research, 2017; McDowell et al., 2016). This fraction is called the site's purchase conversion rate. Considering that current online retail is estimated at more than 460 billion dollars, according to Forrester Research (2017), even a small improvement in a site's purchase conversion rate will yield a significant increase in revenue (Sismeiro and Bucklin, 2004).

Understanding online behavior is the subject of a long line of research in marketing, aimed at gaining insight into shoppers' decision process and at applying this understanding to marketing, improving the visitors' shopping experience, and increasing sales. Most of the previous research explored the online behavior of returning shoppers. The online behavior and purchase intent of anonymous visitors has been researched to a lesser extent, as they cannot be profiled (Suh et al., 2004; Schäfer and Kummer, 2013; Baumann et al., 2018). Sites, on their side, employ a variety of tools to improve the purchase conversion rate, among which are recommender systems and personal automated shoppers, aimed at converting non-shoppers to shoppers (Kim and Yum, 2011). Gaining insights on anonymous visitors and their navigation and decision process is crucial for applying marketing strategies, recommender systems, and personal automated shoppers (Kim and Yum, 2011). To that end, understanding the current purchase intent of shoppers is essential. Moreover, being able to identify whether a session will end without a purchase can help sellers take actions to better address shoppers' needs and preferences and to increase the conversion rate (Kim et al., 2003; McDowell et al., 2016).

Lately, as a way to enrich the limited information about online shoppers, Pinterest Research published a large-scale study investigating how the cross-platform activity of users may build up over time into a purchase intent (Lo et al., 2016). Still, learning from previous visits can help only with predicting the purchase intent of returning customers, who are identified. It cannot help in the case of anonymous returning shoppers, first-time visitors, or occasional anonymous shoppers. Occasional online shopping is known to be prevalent and accounts for almost half of online purchases (Pew Research Center, 2016). Understanding and correctly classifying the current purchase intent of online visitors who have no prior history is essential for a site's ability to target these visitors with accurate online help. Anonymous visitors, even if they are returning, might be turned into shoppers with the right recommendations or aid. Classifying the purchase intent of anonymous visitors might also help in understanding the factors that lead to an impulse purchase by visitors who do not have a history of acquisitions at the site (Cobb and Hoyer, 1986; Chan et al., 2017).

⁎ Corresponding author.
E-mail address: omokryn@univ.haifa.ac.il (O. Mokryn).
https://doi.org/10.1016/j.elerap.2019.100836
Received 4 October 2018; Received in revised form 3 February 2019; Accepted 9 February 2019
Available online 26 February 2019
1567-4223/ © 2019 Elsevier B.V. All rights reserved.

We refer to an unidentified visitor, with no known shopping history at the site, as anonymous. Inferring the purchase intent of visitors with no historical information is challenging, as there is no data about the visitor from which to build a profile. When the purchase intent of anonymous visitors has been considered, studies have hitherto assumed a fixed baseline value (Sismeiro and Bucklin, 2004; Park and Park, 2016) or a per-domain baseline value of purchase intent (Suh et al., 2004; Moe and Fader, 2004; Ding et al., 2015; Panagiotelis et al., 2014). Anonymous visitors may be viewed as first-time shoppers in this context.

We investigate here the question of how to determine the purchase intent of anonymous visitors from their current session information, and how to utilize it to predict whether the session will end with a purchase. We suggest considering the clickstream of visitors and checking the recent popularity trend of the products they explore (i.e., click on) as a means of understanding their purchase intent. This trend of product popularity can be thought of as the site's local view of the overall perceived product popularity and success. Product popularity is known to affect product success (Hanson and Putler, 1996; Salganik et al., 2006; Cai et al., 2009; Tucker and Zhang, 2011), but has been found to be trending with time (Tsymbal, 2004), even on a scale of days (Srinivasan and Mekala, 2014). To conform to this locality in time, we check only the recent trajectory (i.e., trend) of the popularity of products ("product trendiness"). We hypothesize that the trendiness of the products that first-time visitors view during their session is a key predictor of their purchase intent. We are also interested in the temporal aspects of the visit, i.e., on which day of the week and in which period of the year the Web site is visited. The work of Lo et al. (2016), using a Pinterest dataset, found that these are predictive features for personal purchase intent. We extend this aspect to see whether the temporal aspects have a general effect on the shopping intent of anonymous visitors. We also take into account the length of the anonymous visitors' sessions (in the number of clicks) as a feature and discuss how to use it for predictions.

We build an elaborate purchase intent modeling algorithm for anonymous visitors, utilizing the visitor's session's temporal characteristics, number of clicks, and product trendiness. We present an analysis of two large datasets from real and active e-commerce sites, each containing tens of millions of sessions of visiting users collected during the period 2012–2014. We evaluate the performance of our inference algorithm using an ensemble of classifiers (Ricci et al., 2015) over each of the datasets, tailoring the temporal features per site, and receive a highly accurate purchase intent prediction. Given that the visitors' sessions' temporal information differs between the sites, the inference algorithm is adapted to account for this difference. We further show that our results are competitive against a state-of-the-art deep learning technique that mines recurrent patterns utilizing neural networks. A general framework that does not take into account site-specific temporal details is presented and evaluated as well. Our good inference results for the purchase intent of anonymous visitors over both datasets support our hypotheses.

In addition to purchase intent inference, the method is applicable to cold-start situations. Employing our method would enable websites to replace the initial fixed baseline value of purchase intent for anonymous visitors with a personalized purchase intent value, and to estimate a current initial purchase intent on top of the fixed baseline for returning customers. Our approach is also applicable for bootstrapping new e-commerce sites after a few days, where no prior information is available on any of the customers or products, giving a dynamically calculated intent baseline during the session.

2. Related work

A low conversion rate in online shopping is a widely recognized problem for e-commerce sites (Moe and Fader, 2004; Venkatesh and Agarwal, 2006; Panagiotelis et al., 2014). The online environment introduces inherent barriers to purchase, such as the shoppers' inclination to accept the involved technology, their perception of the site, and their ability to trust it (Deng and Poole, 2010; Gefen et al., 2003; Pavlou and Fygenson, 2006; Zhou et al., 2007; Amaro and Duarte, 2015); on the other hand, it offers a plethora of information, and thus shoppers may browse many sites before making a decision (Pavlou and Fygenson, 2006). Senecal et al. (2005) show that shoppers tend to search more online than they do offline, and get involved in rather complex browsing behavior. Further, browsing is done not only for goal-oriented reasons such as purchasing, but also for experiential reasons, e.g., the shopping experience itself (Wolfinbarger and Gilly, 2001; Scarpi et al., 2014).

Assessing visitors' purchase intentions correctly can help sites convert visitors to shoppers by showing them recommendations or suggesting assistance (Lu et al., 2015; Chen et al., 2017).

2.1. Predicting purchase intent utilizing clickstream data

Tracking and modeling behaviors over recurrent visits have been the focus of much research. Shoppers, in a search for information, browse and compare prices over different sites. Their final choice of a site to purchase from depends, among other things, on the stickiness of the site (Venkatesh and Agarwal, 2006; Wolfinbarger and Gilly, 2001; Panagiotelis et al., 2014). Popular (sticky) sites can then log their visitors' online browsing sequences (clickstream data) over time, as well as additional information, to gain knowledge of their online behavior and predict their purchase intent. Clickstream data gives sites a competitive edge and enables researchers to understand and model online actions and use them for predictions (Bucklin et al., 2002; Bucklin and Sismeiro, 2009; Olbrich and Holsing, 2011; Lo et al., 2016). Moe (2003) identifies different behavior types of online shoppers from their clickstream data, creating a typology of shopping strategies. Intent is built over time when shoppers browse or search for information while planning a future purchase. This type of search is either directed or exploratory. Immediate purchases occur when a shopper visits a site with the intent to make a purchase. Hedonic browsing, derived from the shopping experience itself, may also result in an immediate purchase.

Moe and Fader (2004) model individual conversion dynamics for visitors based on the above typology. They assume a baseline intent that is gamma distributed and model the personal development of the intent over time to predict a purchase. Montgomery et al. (2004) model navigation patterns of recurring visits as a Markov chain, and find that six prior visits are enough to learn and predict the personal intent of recurrent shoppers. Other models include modeling of completed tasks (Sismeiro and Bucklin, 2004; Su and Chen, 2015) and graph mining over navigation between tasks (Kalczynski et al., 2006).

Recent models consider dynamic behavioral patterns that we also take into account. Park and Park (2016) model the user's visit dynamics over time as clusters, where close-by visits are clustered and between clusters are periods of no visits. They find that conversion rates are higher at later visits in the clusters than at earlier ones. Modeling these temporal dynamics for visitors, they predict the purchase intent using a Bayesian learning model. Baumann et al. (2018) follow visitors' navigation paths, modeling each session as a navigation graph that is updated with every click. They use graph metrics to determine purchase intentions. Bhatnagar et al. (2016) find that the length of the first visit predicts the possibility of next visits, and model online behavior utilizing visit dynamics and navigation information. Like us, they also consider temporal information, such as the day of the week and the time of day.

They determine a time window in which the visitor is more likely to purchase; however, their model depends on the personalization of the visitors, and therefore cannot be applied to the case of anonymous shoppers.

More similar to our approach is the use of machine learning on clickstream data to predict individual intent. Common in this case is the use of the dynamics of visits, e.g., the frequency of visits, the time from the last visit, and in-session dynamics (like dwell time, the time spent viewing a page). Findings show that the rate of visits and dwell time increase when the visitor is close to the purchase. Each visitor is then characterized by these dynamics, as well as by their current session's dynamics and additional available information (such as their demographics, detailed clickstream and product information, purchase history, social interactions and influence, and more). Conversion prediction is then made for either the current or the next visit (Van den Poel and Buckinx, 2005; Bucklin and Sismeiro, 2009; Lukose et al., 2008; Su and Chen, 2015; Lo et al., 2016; Kooti et al., 2016; Raphaeli et al., 2017). We consider the case of anonymous visitors who have no prior history at the site, nor do we know their social network.

When no information on the visitor exists, as in the case of first-time or anonymous visitors, understanding their current, real-time intent is challenging. Polites et al. (2018) find a misalignment between online shoppers' initial intention and the outcome of their online visit: some of those stating they are in the browsing phase and do not intend to buy end the visit with a purchase, while others, starting with the intention to buy, do not. Purchasing without prior intent, or impulse purchase, accounts for many of the online transactions (Chan et al., 2017). While some of the visitors have a predisposition to purchase, others might be inclined or driven to impulse purchases. Research on website stimuli that may trigger an impulse purchase considers the site's visibility and the cues embedded in it, such as promotions, persuasive aids, etc. (Jeffrey and Hodge, 2007; Wells et al., 2011; Chan et al., 2017). In our work we do not consider the initial intent of the visitor, nor can we assume their predisposition to purchase or impulsive shopping.

2.2. Products' temporal dynamics and trends

Product popularity information is known to affect a product's success, and people tend to consume more products perceived as popular (Hanson and Putler, 1996; Salganik et al., 2006; Cai et al., 2009; Tucker and Zhang, 2011). The temporal nature of popularity, though, is unstable. In many scenarios in our lives (TV program consumption, product purchase, tweet topics, and so on) our interests change with time, in a process known as concept drift (Widmer and Kubat, 1996; Tsymbal, 2004; Krawczyk et al., 2017). Concept drift can be sudden or gradual, changing slowly with time. Systems for handling concept drift differ according to the type of change they handle (Tsymbal, 2004): (1) instance selection, where the goal is to select instances that are relevant to the current time window; (2) instance weighting, where instances are weighted based on their estimated relevance; and (3) ensemble learning, handling a family of predictors that are weighted per their relevance to the present time. Among the three, the first is the most relevant to our research.

Concept drift is naturally linked to temporal trends. Temporal trends have been shown to govern general interests (Mokryn et al., 2016), and are studied not only in recommender systems (Choi and Varian, 2012; Dias and Fonseca, 2013; Koren, 2010; Lathia et al., 2010; Srinivasan and Mekala, 2014) but also in general Web search. Google Trends1 uses the time series index of the volume of submitted queries. For example, the volume of queries on a particular brand of watch during the second week of May might be helpful in predicting June sales for that brand. Choi and Varian (2012) use Google Trends to demonstrate that Google queries help predict economic activity.

1 https://trends.google.com/trends/

Indeed, recent work shows that product popularity is trending in nature and changes with time: Srinivasan and Mekala (2014) find that the trendiness of products changes with time, week by week or even day by day, due to user interest change, demand shifts ignited by external events, or simply because a product is out of inventory.

Item popularity and user ratings over time are studied by Koren (2010) in a different context, showing that temporal dynamics can affect user preferences. We further examine whether these temporal dynamics, as well as the trendiness of the viewed products, are predictors of the user's current purchase intent.

2.3. Session-based recommenders

The study of session-based recommenders is a growing trend, especially in the music domain, as described next. Park et al. (2011) coined the term Session-based Collaborative Filtering (SSCF) and present a modified user-based CF that relies on session information capturing sequence patterns and repetitiveness in the users' listening process. Their goal is to predict which song will be played next, given past sessions. When a song is played, an event describing the user and the song (item), with the corresponding time stamp, is created. A session is defined as a sequence of per-user events within a specific continuous time frame. Then, session similarity is calculated using the cosine distance between each pair of sessions. The items' log data is used as implicit feedback. In their experimental results, using log data from Bugs Music (one of the biggest music services in Korea), they show that SSCF outperforms basic CF. Zheleva et al. (2010) develop a session-based hierarchical graphical model using Latent Dirichlet Allocation and show that their model can facilitate playlist completion based on previous listening sessions or songs that the user has just listened to. Using the Zune2 Social music community as a test bed, they model the song listening process with two graphical models with latent variables. The first, the taste model, is characterized by a set of tastes or media preferences of a specific community. The second, the session model, is where each song the user has listened to is defined as a finite combination of listening moods. They show that from the perplexity perspective there is a clear advantage in using a session-based model for characterizing user preferences in social media content. Dias and Fonseca (2013) improve music recommendations by adding temporal context and session diversity factors into the analysis of music sessions. Their purpose is to recommend to the user the next song to listen to. They represent each session using five features: time of day (users tend to listen to different songs at different periods of the day), weekday (users' song preferences differ between weekdays and weekends), day of month (users tend to listen to happier music at the beginning of the month and sadder music towards the end), month (users' preferences are not the same during different seasons), and song diversity. They show that the inclusion of temporal information, either explicitly or implicitly, increases the accuracy of the recommendations significantly when compared with traditional session-based CF. A fundamental difference from our work is that playlists are longer sequences compared with shopping sessions. Additionally, they did not use dwell time and did not consider the songs' popularity. Jannach et al. (2017) and Jannach et al. (2015) incorporate long-term preferences into next-music-track generation. They distinguish two types of preferences: short-term history (the current session) and long-term history (previous sessions), considering repeated tracks, co-occurrences of the tracks, favorite singers, and social friends' track preferences. They combine all of these into a multi-faceted scoring scheme to provide the best recommendation for the next track in the playlist.

2 https://en.wikipedia.org/wiki/Zune

Classification: In recommender systems, one common practice is to use ensembles of classifiers (Ricci et al., 2015). Any hybrid technique that combines the results of several classifiers can be seen as an ensemble method. The Netflix Prize winners (Bell et al., 2007) used a combination of many different methods in their study. The two most common ensemble techniques are Bagging and Boosting (Ricci et al., 2015). Bagging (Bootstrap Aggregation), initially proposed by Breiman (1996), combines outputs from several machine learning techniques to improve the performance and stability of prediction or classification. This technique is a special form of the averaging model (Hoeting et al., 1999). In our experiment, we use three classifiers: Bagging, NBTree, and Logistic Regression. WEKA (Hall et al., 2009) enables the use of different base-learner classifiers for Bagging.

3. Datasets description

The data in this study is comprised of two datasets of clickstream events from two e-commerce websites, each representing a different domain of goods, as detailed below. Both datasets are anonymized and contain clickstream log information for extended periods.

To predict the real-time intent of anonymous visitors we treat each session as a separate visit of an anonymous visitor and do not consider user information. Yet, in our datasets, there are possibly repeated visits by users. We explain here why treating each session as an anonymous session strengthens our results. Previous works found that in cases where shoppers are searching for information, a repeated process in which the time in-between visits decreases captures their behavior well (Moe and Fader, 2004; Kalczynski et al., 2006; Van den Poel and Buckinx, 2005; Bucklin and Sismeiro, 2009; Lukose et al., 2008; Su and Chen, 2015; Lo et al., 2016; Kooti et al., 2016; Raphaeli et al., 2017). However, recent work showed that shoppers might change their mind during their online visit (Polites et al., 2018). By treating each session as an anonymous, independent session, we make no prior assumptions on the purchase intent of the user, do not consider previous behavior and patterns, and learn solely from the session dynamics whether it will end with a purchase.

The datasets contain real visits of users, some anonymous, some repeat customers. We handle each session separately and do not consider personal information. This enables us to model every session as belonging to an unknown visitor, and hence we do not collect user information or previous visit dynamics.

3.1. YooChoose RecSys dataset

Our primary dataset is the YooChoose RecSys challenge dataset, representing six months of user activities in a large European e-commerce business that sells a variety of consumer goods, including garden tools, toys, clothes, electronics, and more (Ben-Shimon et al., 2015). The YooChoose dataset contains two log files: a click events log and a purchase events log. The click events log consists of a list of click events on items. Each such event is associated with the session id, a timestamp (the time when the click occurred), the item id, and the category of the item. The purchase events log consists of purchase events from sessions that appear in the click events log and end with a purchase. Each entry contains a session id, a timestamp (the time when the purchase occurred), and details on the purchased item: the item id, the price, and the quantity. The sessions vary in terms of length, the number of clicks, and the number of items clicked on. Sessions last from a few minutes to a few hours, and the number of clicks varies from one to a few hundred per session, depending on the user's activity.

Table 1 presents the main characteristics of the YooChoose clickstream events, i.e., the number of sessions that end with a purchase (buying sessions), the number of sessions that do not end with a purchase (non-buying sessions), the overall number of clicks, and the overall number of items. Only 5.5% of the sessions end with a purchase. The average number of clicks per session is roughly 2.8, with the majority of sessions ending within less than three clicks. However, the distribution is right-skewed, with sessions lasting over more than 40 clicks.

Table 1
YooChoose dataset general data statistics.

Name       | Clicks     | Buying sessions | Non-buying sessions | Items
YooChoose  | 33,003,944 | 509,696         | 8,740,032           | 52,739

Fig. 1 depicts the distribution of sessions' lengths, measured by the number of clicks, for sessions that ended with a purchase (termed buying sessions) and for those that did not (termed non-buying sessions). It is easy to see that non-buying sessions are much shorter than buying sessions. About 80% of these sessions contain between 1 and 4 clicks, while 40% of them are 2-click sessions. This is quite understandable, as users who are not happy with what they looked at abandon the search. Buying sessions, on the other hand, have a much longer tail. Unlike the non-buying sessions, the percentage of 1-click buying sessions is small (4%, compared with 14% for 1-click non-buying sessions), while 2-click buying sessions form the largest portion (about 22% of the buying sessions). The percentage then decreases gradually, but at a much more moderate rate than for the non-buying sessions. This behavior seems quite understandable, as users tend to examine more closely items they are about to purchase: they may want to learn more about them and possibly compare several options. We can see that in both cases most of the sessions are 2 clicks long: these account for almost 40% of non-buying sessions and 20% of buying sessions. Next come 3-click sessions, which account for 17.6% and 14.5%, respectively. Third place diverges between non-buying sessions (1-click sessions, 14.2%) and buying sessions (4-click sessions, 11.3%).

Fig. 1. Distribution of the number of clicks per session, YooChoose dataset.

3.2. Zalando dataset

The second dataset we use is an anonymized click log from Zalando3, a large European online fashion retailer, used previously for session-based recommendations (Tavakol and Brefeld, 2014). Every click is associated with a timestamp, the attributes of the viewed item, a user ID, and the clicked items. The dataset is richer in details, and more attributes are associated with each product than in the YooChoose dataset. However, to validate our results across these domains, we limit ourselves to the features used in the YooChoose dataset. Table 2 describes the total number of clicks, items, and sessions in the dataset. Here we see longer sessions, namely 8.11 clicks per session on average, and a larger percentage (5.9%) of sessions that end in a purchase.

3 www.zalando.com

Table 2
Zalando dataset general data statistics.

Name    | Clicks      | Buying sessions | Non-buying sessions | Items
Zalando | 224,175,394 | 1,647,738       | 25,976,224          | 350

Table 3
PS table example from the YooChoose dataset.

Product  | Day 91 | Day 92 | Day 93 | Trend
Product1 | 6      | 8      | 9      | Increasing
Product2 | 5      | 5      | 5      | Non-decreasing
Product3 | 7      | 5      | 4      | Decreasing

4. Modeling dynamics in E-commerce sessions

Modeling the purchase intent of an anonymous visitor can be thought of as modeling the purchase intent during their session, i.e., during an anonymous session. To understand the characteristics of anonymous sessions that end with a purchase, we quantify the dynamics of e-commerce sessions. Each session is considered a distinguishable visit of an anonymous visitor. We characterize the dynamics of each session by the trendiness of the viewed products, the clickstream, and the session's temporal characteristics, as detailed below. We define the recent trendiness of each product and consider a session to be as trendy as the trendiest product in it.
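The session-level dynamics named above (length in clicks and the temporal characteristics of the visit) can be extracted directly from a raw click log. The following is a minimal sketch; the event-tuple layout (session id, timestamp) is our assumption for illustration, not the papers' exact schema:

```python
from collections import defaultdict
from datetime import datetime

def session_features(click_log):
    """Aggregate per-session dynamics from (session_id, timestamp) click
    events: number of clicks and the day of week of the first click."""
    clicks = defaultdict(list)
    for session_id, ts in click_log:
        clicks[session_id].append(ts)
    features = {}
    for session_id, stamps in clicks.items():
        stamps.sort()
        features[session_id] = {
            "n_clicks": len(stamps),
            "weekday": stamps[0].strftime("%A"),  # day of week of the visit
        }
    return features

log = [
    (1, datetime(2014, 4, 7, 10, 0)),   # a Monday
    (1, datetime(2014, 4, 7, 10, 2)),
    (2, datetime(2014, 4, 12, 9, 30)),  # a Saturday
]
print(session_features(log)[1])  # {'n_clicks': 2, 'weekday': 'Monday'}
```

Each session is processed in isolation, mirroring the anonymous-session treatment: no user id or cross-session history enters the feature set.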
both space and time. We use the site’s local information about the
products, and refer to this view as the 'local popularity' of products. Similarly, we are interested in the recent trend of the products' popularity, rather than their historical popularity information. Hence, the modeled trend of popularity is the recent local trajectory of the popularity of each product within the site, for purchasing customers. A product is considered trendy if, in the last several days before the analyzed session, it had been viewed in a non-decreasing number of sessions that ended with a purchase.

To model the trendiness of products, let us define the following process: both the YooChoose and Zalando datasets are split into two random subsets, a learning subset and an experimental one. 80% of the sessions are used as the learning subset for the learning phase (Temporal Model Building Data), and the remaining 20% are used for experimentation (Experimental Data). The proportion of buying and non-buying sessions (5% vs. 95%) is similar in these two subsets of the datasets. We split the sessions in the learning subset, i.e., in the Temporal Model Building Data, into two corresponding lookup tables. The first, PS, is for sessions that ended with a purchase, and the NPS table is for sessions that did not end with a purchase. Each entry in the PS table (or NPS table) corresponds to a tuple <day, product_i>, depicting the number of sessions in which the product was viewed on that day that ended with a purchase (or did not end with a purchase, respectively).

Table 3 shows an example of the PS table for three consecutive days. The data was extracted from the YooChoose Buying Table, and the days are counted by their numerical offset from the date the dataset begins with. On these days, Product1 was viewed in 6, 8, and 9 sessions that ended with a purchase, Product2 was viewed in 5 such sessions on each of these days, and Product3 in 7, 5, and 4 such sessions. Let us assume we model the product trendiness over a period of three days.4 Defining trendy products as products participating in a non-decreasing number of sessions that ended with a purchase within a predefined time window, in this example Product1 is a trendy product with an increasing trend, Product2 is trendy with a stable non-decreasing trend, and Product3 has a decreasing trend and is, therefore, not a trendy product.

We further determine the average trendiness of products, and the corresponding trendiness of a session, as explained below.

4 The window size for determining the trendiness of products over sessions that end with a purchase is a parameter. Section 5.1 evaluates the results obtained for different time windows.

4.2. Modeling the trendiness of products and sessions

We define a session to be as trendy as the trendiest product in it. To that end, we determine the average trendiness of each product at time t, denoted in days. For a given time window size of n days, the average recent trendiness of a product i is the ratio of the number of sessions it was viewed in that ended with a purchase to the overall number of sessions it was viewed in. Let product i's overall performance in a time window n be determined as the overall number


Fig. 2. Evaluation Flow.

of sessions it was viewed in during the preceding n days, P_i^n. Then,

P_i^n(t) = Σ_{j = t−n−1}^{t−1} (PS(j, i) + NPS(j, i))    (1)

calculated as the sum of the number of buying sessions in which i was clicked on and the number of non-buying sessions in which it was clicked on during these days. The average trendiness TD of a product i at day t is then defined as follows:

TD_i^n(t) = Σ_{j = t−n−1}^{t−1} PS(j, i) / P_i^n(t)    (2)
          = (# of buying sessions product i was viewed in during the preceding n days) / (# of overall sessions product i was viewed in during the preceding n days)    (3)

We can now proceed to model the trendiness of current sessions (i.e., in our example, sessions occurring at day t), using the trendiness of the products viewed in them. If we define a session of length k, denoted by S_k, as a sequence of k views of products on a site (with possible repetitions), then a session S_k at day t will be as trendy as the trendiest product viewed in it:

TD_{S_k}(t) = max_{i ∈ S_k} TD_i^n(t)    (4)

4.3. Modeling temporal and clickstream characteristics of a session

Additional temporal characteristics were used for modeling a session in both datasets. However, the temporal characteristics differ between the YooChoose and Zalando datasets, denoted with Y and Z respectively:

Month^Y: Some months are more prone to purchases than others. For example, the YooChoose dataset spans seven months, from April to September. During this time, August was the month with the highest purchase conversion rate.

Day of the week^Y: People behave differently on different days of the week. For example, in the YooChoose dataset we found that people tend to purchase more on Sundays and Mondays than on other days.

Dwell time^Y: Dwell time, the time a customer spends viewing a particular page or product, has recently been linked to the interest the customer has in the product (Yi et al., 2014; Bogina and Kuflik, 2017). Here, we use the session's latency.

Day number from the beginning of the dataset^Z: As the Zalando dataset does not include the dates, we use the number of days offset from the beginning of the dataset.

Additionally, we use the number of clicks in a session. This feature has been previously found important in several studies, as described in Section 2.1.

Number of clicks in a session^{Y,Z}: defines the length of a session in number of clicks. Clearly, as the same product can be clicked on several times within a session, this value does not necessarily correlate with the number of viewed products. It is used as an additional feature for both datasets.

5. Evaluating purchase intention in an anonymized session

Our goal is to determine the purchase intent of anonymous visitors from their session's dynamic characteristics, as modeled above. To that end, we consider each session as a distinguishable visit of an anonymous visitor. We train an ensemble of classifiers, as well as the XGBoost classifier, and further examine the effect of the different dynamics modeled. We compare our results to a deep learning technique that mines recurrent patterns and utilizes neural networks (RNN).

For the classification task, each session is modeled with the following set of features: max product trendiness (TD_{S_k}(t), calculated over different time windows at time t); number of clicks; and the temporal parameters of the session. YooChoose and Zalando differ in the available temporal parameters, as described in Section 4.3. Therefore, the temporal parameters used for modeling the YooChoose dataset sessions are: Day of the week; Month the session took place in; and the session's Dwell time. The temporal parameter used for the Zalando dataset sessions is: Day number from the beginning of the dataset.

Fig. 2 depicts our design flow for trendiness modeling. We learn the global trendiness information over 80% of the data. We then take the remaining 20% of the data, termed the test set. In the Figure, this set is in the Session Generation. We then perform SMOTE over the test set, divide the test set into ten folds, learn from 90% of the test set, and evaluate our results over each of the remaining 10%.

Recall that each of the two datasets we have, YooChoose and Zalando, has been split into two subsets. The first, consisting of 80% of the data, is used for the modeling, and the second, consisting of 20% of the sessions, is used for experimentation (test part). Each of these sets keeps the original characteristics of the imbalanced data, as less than 6% of the sessions end with a purchase. Classifiers trained on imbalanced data perform well when classifying items belonging to the majority category, but poorly otherwise (Chawla et al., 2002). To overcome this imbalance, we use SMOTE (Chawla et al., 2002), a combination of over-sampling and under-sampling techniques. SMOTE combines informed over-sampling of the minority class with random under-sampling of the majority class. We conduct 10-fold cross-validation experiments on the test part of the dataset, which was not used for modeling.

We train an ensemble of classifiers (Ricci et al., 2015), namely Bagging, NBTree and Logistic Regression (Hall et al., 2009), and a state-of-the-art boosting machine learning method, XGBoost (Chen and Guestrin, 2016). The WEKA data mining software (Hall et al., 2009) enables the use of different base-learner classifiers. For Bagging we use the following: Reduced Error Pruning Tree ("REPTree") (Srinivasan and Mekala, 2014), which is a quick decision tree learner built upon information gain; NBTree (Kohavi, 1996), a hybrid of decision trees with Naive Bayes classifiers that learns from instances that reach the decision trees' leaves; and Logistic Regression (Hosmer et al., 2013), where the binary dependent variable is categorical, as is the case with our prediction of purchase. XGBoost5 is used with a max depth of seven and grid search.

Our experimental results show that using temporal and dynamic characteristics of the products and the sessions we are able to achieve good classification of whether a session ends with a purchase or not.

5 https://github.com/dmlc/xgboost.
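As a concrete illustration, the PS/NPS lookup tables and Eqs. (1)-(4) can be sketched in a few lines of Python. This is a minimal sketch under an assumed session-log layout of (day, clicked_products, ended_with_purchase) tuples; the function and variable names are ours for illustration, not the authors' implementation:

```python
from collections import defaultdict

def build_tables(sessions):
    """PS[(day, product)] / NPS[(day, product)]: number of sessions on a
    given day in which the product was viewed, split by whether the
    session ended with a purchase (assumed log layout, see lead-in)."""
    ps, nps = defaultdict(int), defaultdict(int)
    for day, products, bought in sessions:
        table = ps if bought else nps
        for product in set(products):  # count a product once per session
            table[(day, product)] += 1
    return ps, nps

def overall_views(ps, nps, i, t, n):
    """Eq. (1): P_i^n(t), all sessions in which product i was viewed
    on days t-n-1 .. t-1."""
    return sum(ps[(j, i)] + nps[(j, i)] for j in range(t - n - 1, t))

def product_trendiness(ps, nps, i, t, n):
    """Eqs. (2)-(3): share of product i's recent sessions that ended
    with a purchase."""
    total = overall_views(ps, nps, i, t, n)
    buying = sum(ps[(j, i)] for j in range(t - n - 1, t))
    return buying / total if total else 0.0

def session_trendiness(ps, nps, session, t, n):
    """Eq. (4): a session is as trendy as the trendiest product in it."""
    return max(product_trendiness(ps, nps, i, t, n) for i in session)
```

For example, a product viewed in four buying sessions and one non-buying session within the window gets a trendiness of 0.8, and any session containing it is at least that trendy.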


Fig. 3. YooChoose: Aggregated number of sessions with trending products over different time windows, compared to the number of sessions with non-trending products.
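To make the imbalance handling described in Section 5 concrete, the interpolation idea behind SMOTE can be sketched as follows. This is a toy, pure-Python illustration under assumed numeric feature vectors; the experiments use the original SMOTE of Chawla et al. (2002), combined with under-sampling, not this sketch:

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=7):
    """Generate n_new synthetic minority samples, each interpolated
    between a random minority sample and one of its k nearest minority
    neighbours (SMOTE-style sketch, not the original implementation)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((m for m in minority if m != x),
                            key=lambda m: sq_dist(x, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic
```

Because every synthetic point lies on a segment between two minority samples, the oversampled class stays inside the convex hull of the original minority class.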

Preliminary results calculated with ensemble methods over the YooChoose dataset, with a fixed time window of three days (Bogina et al., 2016), gave very good classification measures, with the Bagging classifier giving a precision of 0.937 and AUC = 0.939. Here, we deepen the understanding of the effect of different session dynamics as well as the time window used for learning, and compare our results over both the YooChoose and Zalando datasets, enabling a cross-domain and cross-site evaluation of our model.

5.1. Products' trendiness over time

To calculate products' trendiness, we first plot the number of sessions in which trending products were viewed during different time windows in the YooChoose dataset. Fig. 3 depicts the total number of sessions on consecutive days in which products were trending (a non-decreasing trend of clicks), for time windows that span from two to six days. We chose this range as the total number of sessions that contain the same product over seven consecutive days is negligible, with only 175 such sessions with trending products that end with a purchase. The calculation is applied to the entire dataset. As expected, there are more such sessions in shorter time periods than in longer ones, and the same proportion of sessions end with a purchase as in the global dataset (around 5%).

5.2. The effect of session temporal features and product trendiness over different window sizes

First, to deepen our understanding of the effect of the trendiness of the products viewed on the estimated purchase intent, we evaluate the quality of the estimation with and without trendiness, and over several time windows. We compare the prediction of each of the four classifiers to the other three.

Table 4 depicts the purchase intent classification performance for the YooChoose dataset over the different time windows, showing all classifiers' F1-measures. For each time window, we compare the performance of the model over the different classifiers with and without taking product trendiness into account. Product trendiness improves the purchase intent estimation significantly over all examined window sizes, with p-value < 0.03 for T-tests performed for all time windows. Interestingly, when the temporal features are taken over a larger time window, the ensemble classifiers' prediction quality decreases, in which case taking the product trendiness into account improves the prediction. However, when only the very recent history is taken into account, the temporal features give a prediction result that is almost as good as the one that includes recent product trendiness. Recency improves the effect of the temporal features on the quality of the prediction of the purchase intent when using ensemble methods. To further understand which features contribute most to the prediction, we applied WEKA's automatic feature selection algorithm to the YooChoose dataset for all features (after SMOTE). The algorithm selected trendiness and the number of clicks. Hence, the very recent click behavior in the site is indeed a good predictor. The use of product trendiness improves the prediction accuracy over all selected time windows, for all four classifiers. The effect of trendiness for the longer time windows is surprising, considering that the number of sessions with trending products over five or six days is rather small, and may indicate a higher predictive power for products trending over these longer periods. The effect of trendiness for longer time windows improves XGBoost's prediction quality, with the highest value (F1-measure of 0.9) over a time window of six days.

Table 4
YooChoose: Quality of prediction (F1) of purchase intent over different time windows.

Days    Features             Logistic  Bagging  NBTree  XGBoost
2 days  With Trendiness      0.739     0.904    0.916   0.7853
        Without Trendiness   0.733     0.886    0.888   0.765
3 days  With Trendiness      0.72      0.886    0.899   0.798
        Without Trendiness   0.71      0.854    0.855   0.76
4 days  With Trendiness      0.7       0.882    0.883   0.8236
        Without Trendiness   0.686     0.816    0.817   0.758
5 days  With Trendiness      0.68      0.889    0.867   0.8559
        Without Trendiness   0.664     0.786    0.786   0.75
6 days  With Trendiness      0.677     0.899    0.873   0.9
        Without Trendiness   0.65      0.796    0.795   0.7788

For Zalando, the available temporal information is the session's offset in days from the beginning of the dataset. We use this temporal information along with the number of clicks and product trendiness, which were found important for the YooChoose dataset. Table 5 presents


the quality of the prediction of purchase intent over different time windows, showing all classifiers' F1-measures. At each time window, we compare the results with and without trendiness. The trendiness of products improves the prediction beyond the number of clicks and the temporal information when recent history is considered, and was found significant using a T-test only for a time window of four days (p-value = 0.05). These results are somewhat different from those of YooChoose, where trendiness has a larger positive effect on prediction in longer time windows. This, however, can be attributed to the difference in the available temporal history between the two datasets. For XGBoost, as is the case with the YooChoose dataset, the effect of trendiness for longer time windows improves its prediction quality, with the highest value (F1-measure of 0.94) over a time window of six days.

Table 5
Zalando: Quality of prediction (F1) of purchase intent over different time windows.

Days    Features             Logistic  Bagging  NBTree  XGBoost
2 days  With Trendiness      0.702     0.859    0.905   0.7554
        Without Trendiness   0.701     0.745    0.731   0.7281
3 days  With Trendiness      0.702     0.892    0.892   0.7543
        Without Trendiness   0.701     0.761    0.735   0.7222
4 days  With Trendiness      0.705     0.859    0.806   0.7861
        Without Trendiness   0.7       0.807    0.762   0.7246
5 days  With Trendiness      0.725     0.876    0.848   0.8681
        Without Trendiness   0.718     0.87     0.825   0.7409
6 days  With Trendiness      0.761     0.893    0.883   0.9438
        Without Trendiness   0.755     0.896    0.875   0.786

Prediction with recurrent neural networks. We further compare our results with a deep learning technique that mines recurrent patterns utilizing neural networks (RNN). RNN is considered the state-of-the-art in e-commerce within-session next-click prediction recommendations (Hidasi et al., 2015; Hidasi et al., 2016; Quadrana et al., 2017), though, as far as we know, it has not been previously used for in-session intent prediction. Here, we represent each session as a sequence of clicked items. Due to the imbalanced nature of our datasets, we perform a downsampling of the data, such that the number of sessions that end with a purchase is equal to the number of sessions that do not. The classification task is then to predict whether a session ends with a purchase. RNN achieves an F1-measure of 0.84 for the YooChoose dataset, and an F1-measure of 0.80 for Zalando.

5.3. General modeling of purchase intention

In order to build a general model of purchase intent for anonymous users over the e-commerce domain, we build a representative feature set that performs well on both datasets. Considering that the two datasets are in the same e-commerce domain, we use the same features, number of clicks and trendiness, and do not consider any temporal information. To create a valid comparison, we run SMOTE only on these two particular features for both datasets. This way we ensure the same baseline for comparison between the results. The results are given in Table 6. Since temporal features are not explicitly taken into account in this part of the experiment, only implicitly in the trendiness calculation, the differences between different time windows are negligible. Interestingly, the quality of predicting purchase intention in both datasets is similar over different time spans (especially when using an ensemble technique, i.e., Bagging).

Table 6
Prediction results (F1) of the general comparative model.

Time-Window  Classifier  YooChoose  Zalando
2 days       Logistic    0.644      0.686
             Bagging     0.784      0.76
             NBTree      0.717      0.741
             XGBoost     0.6486     0.7335
3 days       Logistic    0.629      0.686
             Bagging     0.786      0.777
             NBTree      0.717      0.763
             XGBoost     0.65127    0.72201
4 days       Logistic    0.616      0.686
             Bagging     0.77       0.766
             NBTree      0.7        0.714
             XGBoost     0.65309    0.7187
5 days       Logistic    0.613      0.684
             Bagging     0.735      0.747
             NBTree      0.675      0.711
             XGBoost     0.6512     0.7214
6 days       Logistic    0.604      0.684
             Bagging     0.729      0.747
             NBTree      0.674      0.711
             XGBoost     0.6501     0.729

The general model does not take temporal parameters into account, but rather only the products' trendiness and the number of clicks. Tables 4 and 5 detail the classification results with all the model parameters, that is, products' trendiness, number of clicks, and the temporal features, for the YooChoose and Zalando datasets, respectively. Comparing these results to the results of the general model in Table 6, it is clear that the best results are achieved for each dataset when all the model parameters, including the available temporal information, are taken into account. When temporal information is not considered, as is the case with the general model, the prediction quality decreases.

6. Discussion

This study explores factors that can reveal the purchase intent of anonymous visitors to sites. Understanding the purchase intent of anonymous visitors also applies to occasional shoppers (who may also be unidentified returning shoppers), who are responsible for almost half of all online purchases. Previous works that consider only anonymous visitors mine the session's clickstream data for rules, looking for known patterns that are typical of recurrent visitors with known purchase intent. Our work is the first to try to quantify unique factors that help explain the shopping intent of anonymous and unknown occasional visitors.

We show that sites' visitors who view currently trending products are more likely to purchase at the end of their visit. In our case, we define this trendiness as not losing popularity in that site recently, in what can be viewed as a locality in time (recency) and space (site-centric). Interest in products is known to shift in a process described as concept drift, explained earlier in Section 2.2. To capture this temporal change of interest, we implement an instance time window and detect the local trend of popularity of products in the site. While others have also explored the strolling habits of online shoppers, the recent local popularity of products has not been considered before. Our results demonstrate that there is a connection between viewing trendy products and purchase intent. Temporal aspects of the session itself, namely the day of the week and the time of year the session took place in, were previously shown to be indicative of the purchase intent of returning visitors. We show here that they are also indicative of anonymous visitors' shopping intent.

Building on the above, we introduce a novel classification method that uses the temporal dynamics of products (which we term trendiness) together with the length and temporal features of a session to classify


whether the session ends with a purchase.

Our datasets come from the product and retail e-commerce industries. The temporal features denoting the time of the visit differ between the two datasets. The use of sessions' temporal features improves the classification performance, and, like trendiness, their removal is associated with a negative impact on the results. The temporal information available for the YooChoose dataset is rich. The Zalando dataset does not contain temporal information, and we only had the offset in days from the starting date of the given dataset. Nevertheless, adding this temporal information was sufficient to improve the inference over all time windows that we experimented with. When temporal information is not used, as is the case with our general model (described in Section 5.3), the inference quality decreases for both datasets. The difference in available temporal characteristics is a limitation of the study, as we do not compare the same session temporal information across datasets, and therefore cannot compare the results in these cases.

The datasets used are of real visits made by registered or returning customers, as well as anonymous visitors. Anonymous visitors might be first-time shoppers, or returning visitors and occasional shoppers who do not wish to register at the site. We treat each session as made by an anonymous visitor, and model the session's dynamics, as discussed before. While we think that this approach is more challenging for our model, it is also a limitation of this study.

Another limitation is the lack of use of the extensive product information available at sites. Previous works have found that shopping intention and acceptance are influenced by product characteristics (Pavlou and Fygenson, 2006).

Our study takes a machine learning approach to the classification of anonymous visitors' purchase intention. Both our datasets were imbalanced, with 5-5.9% of the sessions ending with a purchase. There are several known techniques to overcome imbalanced datasets. We have observed that using SMOTE on our datasets provides better results than using undersampling techniques. We further found that applying SMOTE to all features is more successful than applying SMOTE only to selected ones: the classifiers provided better results when SMOTE was applied to all features and some of the features were then removed, leaving only the selected ones. Additionally, in this study we define the temporal features as nominal, rather than applying the numeric values we used in our preliminary study (Bogina et al., 2016). Both these changes improve the classifiers' results compared to the initial ones presented in the preliminary paper. For the classification task we employ an ensemble of classifiers (Ricci et al., 2015), and XGBoost, a novel boosting method (Chen and Guestrin, 2016). We achieve classification with F1-measures of 0.9 and 0.94 for the two datasets. These results outperform a random baseline, which was used in recent empirical works (Lo et al., 2016; Kooti et al., 2016). We further achieve better quality compared to RNN, a within-session deep learning method, which classified with F1-measures of 0.84 and 0.8, respectively.

Generally, Bagging gives the best results for both datasets, regardless of the time window used. XGBoost seems to have a different tendency than the ensemble methods, producing better results over the longer time windows. This might be attributed to the smaller amount of data that is available for longer time windows, as demonstrated in Fig. 3. Similar to our findings, it has been reported that while XGBoost is a common choice in Kaggle challenges and KDD Cup competitions, depending on the dataset, ensemble methods may give better results (Bekkerman, 2015).

At the moment, we successfully classify sessions by whether they ended with a purchase or not. Our model identifies online signals for purchase intent that can be used for online purchase prediction of anonymous visitors while their session is ongoing. Almost half the sessions are long and involve three or more clicks. This gives rise to a predictive paradigm, in which our model is used for predicting the purchase intent of an unknown visitor after three or more clicks. Predicting early that a visitor does not intend to purchase improves the site's ability to introduce recommenders and personal aids to the visitor during their visit.

There are interesting managerial implications to our findings. We have considered the session's temporal parameters and products' recent trendiness. Temporal information was previously considered for returning customers. Our results indicate that sites can use sessions' temporal information for intent prediction not only for recurrent visitors, but also for first-time and anonymous visitors. The temporal information we have for the YooChoose and the Zalando datasets differs, yet in both cases it improves the prediction quality. Sites should, therefore, consider using the full temporal information that exists per session. Our findings on products' trendiness contribute to understanding the intent of these visitors. We show that removing the product trendiness feature negatively affects the classification accuracy for both datasets, and more so for long time windows. Following our preliminary results (Bogina et al., 2016), the use of products' recent popularity in the site was applied in an e-commerce recommender system for purchase intent prediction (Zhang et al., 2016). Trendiness, as defined and calculated in this study, and session length (in clicks) were found by the feature selection process to be significant for understanding the purchase intent for both datasets, and yielded good prediction compared to a baseline and to RNN, an in-session deep learning predictor.

We examined different time windows, and find that using very recent information for learning, in the time span of two to three days, is sufficient for achieving good results. The findings from our learning process of product trendiness over windows of consecutive days indicate a high recency in the products' attention span. Products are often viewed by visitors on up to three consecutive days, but less so in longer windows of time.

Interestingly, the number of sessions a product is viewed in during consecutive days, e.g., in a window of two to six consecutive days, decreases with each day. When we consider a time window of seven days in which we require that a product is viewed (at least once) on each consecutive day, we find that we are left with a negligible number of sessions, indicating a clear within-site concept drift for the vast majority of products. Sites can thus track this temporal trending interest in products and identify a per-product trend and concept drift; suggest products accordingly; look for patterns; and try to identify products with correlating trends, or complementary trends.

7. Conclusions and future work

We present a method for determining the shopping intent of anonymous visitors to a site. Our method uses only the visitor's session information, namely the session temporal information, session length, and the recent trendiness of products clicked on in that session. The trendiness offers a local temporal view of the products' recent popularity. To detect a recent trend, we draw from machine learning techniques for identifying a concept drift in popularity, and find strong locality in time, on a scale of days. We show, over two separate datasets from the retail industry, that our method achieves good classification for understanding anonymous and occasional visitors' purchase intent. The best intent inference is achieved when using temporal aspects together with the session's trendiness and the number of clicks.

The results of this work can be utilized for creating novel real-time recommender systems that integrate trendiness and session temporal information into the reasoning process of an online purchase intent classification mechanism in sites. This mechanism may guide an online recommender for improving the shopping experience to the benefit of both the buyer and the seller. One option is to use the inferred purchase intent to identify users in anonymized sessions with low purchase intent. These users can then be directed to recommender systems in the hope of converting them into shoppers. Setting the threshold for the number of cases participating in a trendiness analysis will also be an interesting option. Our work is a first step towards predicting the shopping intent of anonymous visitors in sites. We find here purchase intent only at the end of the session. An interesting future direction is to find how early in the session the classifier may have a good enough recommendation. Our method considers sessions


at varying lengths. In a future study we intend to learn how early in the visit our method is applicable, and to utilize these signals of intent that we identified for prediction. The above findings may also be applicable to impulse purchases and returning visitors. We intend to further explore this direction in future works. Another direction is modeling the dynamics of each product in the session, i.e., using a loop feature, when the same product was clicked a few times in sequence, or a cycle feature, when a product was clicked again after clicks on a few other products, or different ratios, such as loop/(length of the session).

Conflict of interest

None.

Acknowledgement

We would like to thank the Zalando team for providing their data for our research.

References

Amaro, S., Duarte, P., 2015. An integrative model of consumers' intentions to purchase travel online. Tourism Manage. 46, 64–79.
Baumann, A., Haupt, J., Gebert, F., Lessmann, S., 2018. Changing perspectives: using graph metrics to predict purchase probabilities. Expert Syst. Appl. 94, 137–148.
Bekkerman, R., 2015. The present and the future of the KDD Cup competition: an outsider's perspective. https://www.linkedin.com/pulse/present-future-kdd-cup-competition-outsiders-ron-bekkerman/.
Bell, R., Koren, Y., Volinsky, C., 2007. Modeling relationships at multiple scales to improve accuracy of large recommender systems. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp. 95–104.
Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., Hoerle, J., 2015. RecSys challenge 2015 and the YooChoose dataset. In: Proceedings of the 9th ACM Conference on Recommender Systems. ACM, pp. 357–358.
Bhatnagar, A., Sen, A., Sinha, A.P., 2016. Providing a window of opportunity for converting estore visitors. Inform. Syst. Res. 28 (1), 22–32.
Bogina, V., Kuflik, T., 2017. Incorporating dwell time in session-based recommendations with recurrent neural networks. In: First Workshop on Temporal Reasoning in Recommender Systems, Como, Italy.
Bogina, V., Kuflik, T., Mokryn, O., 2016. Learning item temporal dynamics for predicting buying sessions. In: Proceedings of the 21st International Conference on Intelligent User Interfaces. ACM, pp. 251–255.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.
Bucklin, R.E., Lattin, J.M., Ansari, A., Gupta, S., Bell, D., Coupey, E., Little, J.D., Mela, C., Montgomery, A., Steckel, J., 2002. Choice and the internet: from clickstream to research stream. Market. Lett. 13 (3), 245–258.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H., 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter 11 (1), 10–18.
Hanson, W.A., Putler, D.S., 1996. Hits and misses: herd behavior and online product popularity. Market. Lett. 7 (4), 297–305.
Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D., 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
Hidasi, B., Quadrana, M., Karatzoglou, A., Tikk, D., 2016. Parallel recurrent neural network architectures for feature-rich session-based recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems. ACM, pp. 241–248.
Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T., 1999. Bayesian model averaging: a tutorial. Stat. Sci. 382–401.
Hosmer Jr., D.W., Lemeshow, S., Sturdivant, R.X., 2013. Applied Logistic Regression, vol. 398. John Wiley & Sons.
Jannach, D., Kamehkhosh, I., Lerche, L., 2017. Leveraging multi-dimensional user models for personalized next-track music recommendation. In: Proceedings of the Symposium on Applied Computing. ACM, pp. 1635–1642.
Jannach, D., Lerche, L., Kamehkhosh, I., 2015. Beyond hitting the hits: generating coherent music playlist continuations with the right tracks. In: Proceedings of the 9th ACM Conference on Recommender Systems. ACM, pp. 187–194.
Jeffrey, S.A., Hodge, R., 2007. Factors influencing impulse buying during an online purchase. Electron. Commerce Res. 7 (3), 367–379.
Kalczynski, P.J., Senecal, S., Nantel, J., 2006. Predicting on-line task completion with clickstream complexity measures: a graph-based approach. Int. J. Electron. Commerce 10 (3), 121–141.
Kim, E., Kim, W., Lee, Y., 2003. Combination of multiple classifiers for the customer's purchase behavior prediction. Decis. Support Syst. 34 (2), 167–175.
Kim, Y.S., Yum, B.-J., 2011. Recommender system based on click stream data using association rule mining. Expert Syst. Appl. 38 (10), 13320–13327.
Kohavi, R., 1996. Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. KDD 96, 202–207.
Kooti, F., Lerman, K., Aiello, L.M., Grbovic, M., Djuric, N., Radosavljevic, V., 2016. Portrait of an online shopper: understanding and predicting consumer behavior. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. ACM, pp. 205–214.
Koren, Y., 2010. Collaborative filtering with temporal dynamics. Commun. ACM 53 (4), 89–97.
Krawczyk, B., Minku, L.L., Gama, J., Stefanowski, J., Woźniak, M., 2017. Ensemble learning for data stream analysis: a survey. Inform. Fusion 37, 132–156.
Lathia, N., Hailes, S., Capra, L., Amatriain, X., 2010. Temporal diversity in recommender systems. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, pp. 210–217.
Lo, C., Frankowski, D., Leskovec, J., 2016. Understanding behaviors that lead to purchasing: a case study of Pinterest. KDD, 531–540.
Lu, J., Wu, D., Mao, M., Wang, W., Zhang, G., 2015. Recommender system application developments: a survey. Decis. Support Syst. 74, 12–32.
Lukose, R., Li, J., Zhou, J., Penmetsa, S.R., 2008. Learning user purchase intent from user-centric data. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, pp. 673–680.
McDowell, W.C., Wilson, R.C., Kile Jr., C.O., 2016. An examination of retail website design and conversion rate. J. Business Res. 69 (11), 4837–4842.
Moe, W.W., 2003. Buying, searching, or browsing: differentiating between online shoppers using in-store navigational clickstream. J. Consumer Psychol. 13 (1–2), 29–39.
Moe, W.W., Fader, P.S., 2004. Dynamic conversion behavior at E-commerce sites.
Bucklin, R.E., Sismeiro, C., 2009. Click here for internet insight: advances in clickstream
Manage. Sci. 50 (3), 326–335.
data analysis in marketing. J. Interactive Marketing 23 (1), 35–48.
Mokryn, O., Wagner, A., Blattner, M., Ruppin, E., Shavitt, Y., 2016. The role of temporal
Cai, H., Chen, Y., Fang, H., 2009. Observational learning: evidence from a randomized
trends in growing networks. PloS one 11 (8), e0156505 .
natural field experiment. Am. Econ. Rev. 99 (3), 864–882.
Montgomery, A.L., Li, S., Srinivasan, K., Liechty, J.C., 2004. Modeling online browsing
Center for Retail Research (2017). Online retailing: Britain, europe, us and canada 2017.
and path analysis using clickstream data. Marketing Sci. 23 (4), 579–595.
http://www.retailresearch.org/onlineretailing.php.
Olbrich, R., Holsing, C., 2011. Modeling consumer purchasing behavior in social shopping
Chan, T.K., Cheung, C.M., Lee, Z.W., 2017. The state of online impulse-buying research: a
communities with clickstream data. Int. J. Electron. Commerce 16 (2), 15–40.
literature analysis. Inform. Manage. 54 (2), 204–217.
Panagiotelis, A., Smith, M.S., Danaher, P.J., 2014. From amazon to apple: modeling on-
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. Smote: synthetic min-
line retail sales, purchase incidence, and visit behavior. J. Business Econ. Stat. 32 (1),
ority over-sampling technique. J. Artif. Intelligence Res. 16, 321–357.
14–29.
Chen, C., Hou, C., Xiao, J., Wen, Y., Yuan, X., 2017. Enhancing purchase behavior pre-
Park, C.H., Park, Y.-H., 2016. Investigating purchase conversion by uncovering online
diction with temporally popular items. IEICE Trans. Inform. Syst. 100 (9),
visit patterns. Marketing Sci. 35 (6), 894–914.
2237–2240.
Park, S.E., Lee, S., Lee, S.-G., 2011. Session-based collaborative filtering for predicting the
Chen, T., Guestrin, C., 2016. Xgboost: a scalable tree boosting system. In: Proceedings of
next song. In: Proceedings of the 2011 First ACIS/JNU International Conference on
the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data
Computers, Networks, Systems and Industrial Engineering (CNSI) IEEE, pp. 353–358.
Mining ACM, pp. 785–794.
Pavlou, P.A., Fygenson, M., 2006. Understanding and predicting electronic commerce
Choi, H., Varian, H., 2012. Predicting the present with google trends. Econ. Record 88
adoption: an extension of the theory of planned behavior. MIS Q. 115–143.
(s1), 2–9.
Pew Research Center (2016). Online shopping and e-commerce. http://www.
Cobb, C.J., Hoyer, W.D., 1986. Planned versus impulse purchase behavior. J. Retailing.
pewinternet.org/2016/12/19/online-shopping-and-e-commerce/.
Deng, L., Poole, M.S., 2010. Affect in web interfaces: a study of the impacts of web page
Polites, G.L., Karahanna, E., Seligman, L., 2018. Intention–behaviour misalignment at b2c
visual complexity and order. Mis. Q. 711–730.
websites: when the horse brings itself to water, will it drink? Eur. J. Inform. Syst. 27
Dias, R., Fonseca, M.J., 2013. Improving music recommendation in session-based colla-
(1), 22–45.
borative filtering by using temporal context. In: The Proceedings of the 2013 IEEE
Quadrana, M., Karatzoglou, A., Hidasi, B., Cremonesi, P., 2017. Personalizing session-
25th International Conference on Tools with Artificial Intelligence (ICTAI) IEEE, pp.
based recommendations with hierarchical recurrent neural networks. In: Proceedings
783–788.
of the Eleventh ACM Conference on Recommender Systems ACM, pp. 130–137.
Ding, A.W., Li, S., Chatterjee, P., 2015. Learning user real-time intent for optimal dynamic
Raphaeli, O., Goldstein, A., Fink, L., 2017. Analyzing online consumer behavior in mobile
web page transformation. Inform. Syst. Res. 26 (2), 339–359.
and pc devices: a novel web usage mining approach. Electron. Commer. Res. Appl.
Forrester Research, 2017. Forrester data: Online retail forecast, 2017 to 2022. https://
26, 1–12.
www.forrester.com/report/Forrester+Data+Online+Retail +Forecast+2017+To
Ricci, F., Rokach, L., Shapira, B., Kantor, P.B., 2015. Recommender Systems Handbook.
+2022+US/-/E-RES139271.
Springer.
Gefen, D., Karahanna, E., Straub, D.W., 2003. Trust and tam in online shopping: an in-
Salganik, M.J., Dodds, P.S., Watts, D.J., 2006. Experimental study of inequality and un-
tegrated model. MIS Q. 27 (1), 51–90.
predictability in an artificial cultural market. Science 311 (5762), 854–856.

10
O. Mokryn, et al. Electronic Commerce Research and Applications 34 (2019) 100836

Scarpi, D., Pizzi, G., Visentin, M., 2014. Shopping for fun or shopping to buy: is it different online and offline? J. Retailing Consumer Services 21 (3), 258–267.
Schäfer, K., Kummer, T.-F., 2013. Determining the performance of website-based relationship marketing. Expert Syst. Appl. 40 (18), 7571–7578.
Senecal, S., Kalczynski, P.J., Nantel, J., 2005. Consumers' decision-making process and their online shopping behavior: a clickstream analysis. J. Business Res. 58 (11), 1599–1608.
Sismeiro, C., Bucklin, R.E., 2004. Modeling purchase behavior at an e-commerce web site: a task-completion approach. J. Marketing Res. 41 (3), 306–323.
Srinivasan, D.B., Mekala, P., 2014. Mining social networking data for classification using REPTree. Int. J. Adv. Res. Comput. Sci. Manage. Stud. 2 (10).
Su, Q., Chen, L., 2015. A method for discovering clusters of e-commerce interest patterns using click-stream data. Electron. Commer. Res. Appl. 14 (1), 1–13.
Suh, E., Lim, S., Hwang, H., Kim, S., 2004. A prediction model for the purchase probability of anonymous customers to support real time web marketing: a case study. Expert Syst. Appl. 27 (2), 245–255.
Tavakol, M., Brefeld, U., 2014. Factored MDPs for detecting topics of user sessions. In: Proceedings of the 8th ACM Conference on Recommender Systems. ACM, pp. 33–40.
Tsymbal, A., 2004. The problem of concept drift: definitions and related work. Comput. Sci. Department, Trinity College Dublin 106 (2).
Tucker, C., Zhang, J., 2011. How does popularity information affect choices? A field experiment. Manage. Sci. 57 (5), 828–842.
Van den Poel, D., Buckinx, W., 2005. Predicting online-purchasing behaviour. Eur. J. Oper. Res. 166 (2), 557–575.
Venkatesh, V., Agarwal, R., 2006. Turning visitors into customers: a usability-centric perspective on purchase behavior in electronic channels. Manage. Sci. 52 (3), 367–382.
Wells, J.D., Parboteeah, V., Valacich, J.S., 2011. Online impulse buying: understanding the interplay between consumer impulsiveness and website quality. J. Assoc. Inform. Syst. 12 (1), 32.
Widmer, G., Kubat, M., 1996. Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23 (1), 69–101.
Wolfinbarger, M., Gilly, M.C., 2001. Shopping online for freedom, control, and fun. California Manage. Rev. 43 (2), 34–55.
Yi, X., Hong, L., Zhong, E., Liu, N.N., Rajan, S., 2014. Beyond clicks: dwell time for personalization. In: Proceedings of the 8th ACM Conference on Recommender Systems. ACM, pp. 113–120.
Zhang, H., Ni, W., Li, X., Yang, Y., 2016. Modeling the heterogeneous duration of user interest in time-dependent recommendation: a hidden semi-Markov approach. IEEE Trans. Syst., Man, Cybern.: Syst.
Zheleva, E., Guiver, J., Mendes Rodrigues, E., Milić-Frayling, N., 2010. Statistical models of music-listening sessions in social media. In: Proceedings of the 19th International Conference on World Wide Web. ACM, pp. 1019–1028.
Zhou, L., Dai, L., Zhang, D., 2007. Online shopping acceptance model: a critical survey of consumer factors in online shopping. J. Electron. Commerce Res. 8 (1), 41.