Professional Documents
Culture Documents
Systems
ABSTRACT of the probabilistic calculus for the calculation of P (xj |xi ), which
Recommender systems estimate the conditional probability is discussed in the next section. Section 3 illustrates the proposed
P (xj |xi ) of item xj being bought, given that a customer has methodology by reference to the simplest estimator of the condi-
already purchased item xi . While there are different ways of tional, the Naïve Bayes (NB) recommender system, and is followed
approximating this conditional probability, the expression is by a brief evaluation of the expected uptake of recommendations
generally taken to refer to the frequency of co-occurrence of estimated from historical data in a real-world application. For con-
items in the same basket, or other user-specific item lists, venience, the NB classifiers using basket-based and item-based es-
rather than being seen as the co-occurrence of xj with xi timates will be referred to NB(baskets) and NB(items), respectively,
as a proportion of all other items bought alongside xi . This throughout the rest of the paper.
paper proposes a probabilistic calculus for the calculation of
conditionals based on item rather than basket counts. The 2. AN ITEM-BASED PROBABILISTIC
proposed method has the consequence that items bough to- CALCULUS
gether as part of small baskets are more predictive of each
other than if they co-occur in large baskets. Empirical re- 2.1 Basket-based and Item-based Probability
sults suggests that this may result in better take-up of per- Estimation
sonalised recommendations.
A probability estimate represents the expected frequency of oc-
Categories and Subject Descriptors: H.3.3 [Information currence of an event of interest taken over a defined space of
Systems]: Information Search and Retrieval possibilities. In collaborative filtering there are two different in-
General Terms: Algorithms, Experimentation, Performance. terpretations of the base population from which inferences about
Keywords: shopping basket recommendation, naive bayes clas- item purchases are made, which may be the number of purchasing
sifier, performance evaluation, popularity-based ordering. episodes (i.e. baskets) or the totality of individual items purchased
[5, 7, 3, 6].
1. INTRODUCTION 1. Basket-based probability estimator. In this estimate, which
The estimation of class-conditional probabilities lies at the heart is the most commonly used in practice [5, 7, 6], the count-
of personalised recommender systems. It is normally the case that ing unit is a basket. Consequently, the prior probability
co-occurrence of item pairs is empirically estimated by the mean P (Ci = xi ) is empirically estimated to be the ratio be-
proportion of baskets in which the two items appear together. We tween the number of baskets containing xi (i.e. ND (xi )
shall refer to this as basket-based estimation. However, this loses and the total number of baskets; the conditional probabil-
track of the context of remaining items purchased, in particular ity P (Fj = xj |Ci = xi ) is similarly given by the number
whether the observed pairing is one of a few or very many item of baskets in which (xi , xj ) co-occur divided by the total
pairs in the relevant baskets in the user-item data matrix. This pa- number of baskets that contain xi .
per proposes an alternative modeling framework where each item ND (xi )
pairing is counted as a proportion of the total basket size, that is to PD (Ci = xi ) = (1)
|baskets|
say the frequency of paired occurrences for item xj with item xi
is normalised by the total number of items purchased alongside xi . n(xi , xj )
PD (Fj = xj |Ci = xi ) = (2)
This we term item-based estimation. It requires a re-formulation ND (xi )
129
item xj divided by the total number items pairs purchased in
the same baskets as xi . Table 1: Example of Shopping Baskets.
Basket ID F1 F2 F3 F4 F5 F6
NF (xi ) 1 d e f
PF (Ci = xi ) = (3)
|item pairs| 2 d e f
n(xi , xj ) 3 c d
PF (Fj = xj |Ci = xi ) = (4) 4 b c d
NF (xi )
5 a b c d
where NF (xi ) = n(xi , xj ) and |item pairs| = 6 a b c d
j:j=i
i NF (xi ) .
7 a b
8 a b
Note that the basket-based estimate does not discriminate be-
tween items that have co-occurred with different numbers of other
items. In contrast, the item-based approach captures the item co- ones1 . In a NB recommender system, the calculation of P (Ci =
occurrence information in the feature, or item, space. xi |F ) can be simplified as follow:
Furthermore, for a probabilistic calculus to be complete and con-
m
sistent, the following three conditions must be met: P (Ci = xi |F ) ≈ P (Ci ) P (Fj |Ci )
j:j=i
• i PF (xi ) = 1: normalisation of individual probabilities m
j:j=i n(xi , xj )
=
• j:j=i PF (xj |xi ) = 1: normalisation of conditional prob- N (xi )m−1
abilities The above equation implies that NB(baskets) and NB(items) will
generate the same recommendation lists if they rank N (xi ), i.e.
• j:j=i PF (xi , xj ) = P (Xi ): calculation of marginal prob- ND (xi ) for NB(baskets) and NF (xi ) for NB(items), in the same
abilities order. Otherwise, the two lists will be different.
In a preliminary experiment, we evaluate the impact of N (xi )
The first two identities follow trivially from the stated definitions. on the ranking position of item xi in the recommendation list. The
The third identity holds as result is shown in Figure1. As can be seen, among 210 active cat-
egories, there are only around 23 categories with the same ranking
n(xi , xj ) j:j=i n(xi , xj )
PF (xi , xj ) = = position in both cases.
j:j=i j:j=i
|item pairs| |item pairs|
25
NF (xi )
= = PF (Ci = xi ) = P (Xi )
|item pairs| 20
Frequency
15
2.2 An Illustrative Example
10
To illustrate the difference between the two probability estima-
tions, consider the example given in Table 1 in which there are eight 5
baskets (i.e. |baskets| = 8) and 23 basket items in total across all
the baskets (i.e. |item pairs| = i NF (xi ) = 48). With the
0
−20 −10 0 10 20 30 40 50
basket-based probability estimation: Ranking Difference
130
1
real-life recommender systems to model the correlations among a
22
0.9
subset of all the product categories, also called active categories.
NB(baskets) 20
0.8 NB(items) 18
In this section, we compare the performance difference among sev-
0.6 14 with increasing coverage = (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
0.5 12
1.0) of all the n% categories (i.e. the categories that account for
10
0.4 most of the total spend over the training data). Note that, a low (or
8
0.3
6 high) coverage of the active categories is indicative of a small (or
0.2 4
0.1
large) value of the average basket size, which is shown in Figure
2
0 0
2(b).
0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Here, we also implement the item-based collaborative filtering
Coverage of Active Category(%) Coverage of Active Category(%) algorithm (item CF) [3], for comparison. In item CF, the pairwise
item similarity is calculated as:
(a) Hit Rate (Binary) (b) Avg. Basket Length
∀q:Rq,j>0 Rxq ,xj
Figure 2: Performance comparison with increasing sim(xi , xj ) = (5)
coverage of the active categories. testset = LeShop. ND (xi ) × ND (xj )α
131
0.4
5. CONCLUSIONS
0.38 A probabilistic framework using individual item purchases was
NB(baskets) proposed and its performance for personalised recommender sys-
0.36
NB(items)
0.34 item CF tems was benchmarked against the standard basket-based estimates
HR(Binary)
132