All content following this page was uploaded by Alípio Mário Jorge on 21 May 2014.
test session one item is hidden. Testing is done by comparing the produced recommendations with the hidden items. The number of neighbors Neib used in CF was 5.

Measures: To compare the predictive abilities of the algorithms we have used the measures of Recall and Precision. Let hits be the set with the hits of the recommendation system; hidden be the set with the pairs <user, item> that are hidden; and recs the set with all the recommendations made by the recommender system. Recall = #hits / #hidden measures the ability to recommend everything that is relevant to the user. Precision = #hits / #recs measures the ability to recommend only what is relevant, leaving out what will probably not be seen. We also measure the relative Time, with respect to algorithm A1, needed to construct S in each method, the average wall time of recommendation per user, and the time to update S and other supporting data structures (per user).

Time: We show results (fig. 1) for the recommendation and update/construction time. We can see that the use of sparse matrices has a high cost in terms of recommendation time: A2 is 10 to 45 times slower than A1 when recommending. This is a necessary price to pay, due to the limitations of full matrices. From the comparison of the incremental (A3) and non-incremental (A2) item-based algorithms, we can see that the recommendation time is very similar. A2 tends to be worse, but for spurious reasons. More importantly, the update time of A3 is much lower than the construction time of A2, which means that it is much cheaper to update the model than to rebuild it from scratch every time we have new information. The exception is the ZP data, for the 0.4 split; the exact reasons for this exception are not known. A4 tends to be much slower when recommending, except for the dataset PE200, where the number of users is identical to the number of items (which is not realistic). Update times of A3 and A4 are similar when the number of users is not much larger than the number of items. However, when we have proportionally more users, the update time of A4 degrades considerably.

Predictive ability: Figure 2 shows the curves of Recall and Precision for increasing values of N (the maximum number of recommendations allowed to the algorithms). We do not show the results for the split 0.4 because they are very similar to those for 0.8. We can see that the user-based approach tends to have worse results. The item-based algorithms have similar performance in both measures. A1 and A2 give exactly the same results, since they differ only in the data structures used for storing the matrices.

5. Conclusions and Future Work

…mental user-based approach. The results on 3 datasets have shown that item-based incrementality pays off in terms of computational cost, with low model update times, and that recommendation times are identical to those of the non-incremental version. The incremental item-based approach had better results than its user-based counterpart, both in terms of time and predictive accuracy. The update of similarities between items is faster than between users because it does not need to revisit the usage database. We have also used sparse matrices to cope with scalability, and tried 3 different R packages. We conclude that the package spam yields faster R programs and that there is a relatively high price for using sparse matrices, as expected.

As future work we need to investigate the behavior of the algorithms with larger datasets and in a continuous setting. The incremental algorithm will be included in an adaptive web platform, and the impact of incrementality on the accuracy of CF will be studied in a live setting.

Acknowledgments: Fund. Ciência e Tecnologia, POSC/EIA/58367/2004/Site-o-Matic and PTDC/EIA/81178/2006 Rank! Projects co-financed by FEDER.
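To make the evaluation measures defined above concrete, here is a minimal Python sketch (ours; the paper's experiments were run in R, and all names below are illustrative) of computing Recall and Precision from the sets of hidden and recommended <user, item> pairs:

```python
# Illustrative sketch (not the authors' code) of the Measures paragraph:
# hits are the recommended <user, item> pairs that match hidden pairs.

def recall_precision(recs, hidden):
    """recs and hidden are sets of (user, item) pairs."""
    hits = recs & hidden
    recall = len(hits) / len(hidden) if hidden else 0.0
    precision = len(hits) / len(recs) if recs else 0.0
    return recall, precision

# Toy example: one hidden item per test user (all-but-one protocol).
hidden = {("u1", "i3"), ("u2", "i5")}
recs = {("u1", "i3"), ("u1", "i7"), ("u2", "i2"), ("u2", "i5")}
r, p = recall_precision(recs, hidden)
print(r, p)  # → 1.0 0.5
```

With N limiting the number of recommendations per user, growing N can only keep or raise Recall while typically lowering Precision, which is why Figure 2 plots both against N.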
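The cheap model update behind the incremental item-based approach can be illustrated as follows. This is a hedged Python sketch of the general idea only (binary usage data, cosine similarity derived from co-occurrence counts), not the authors' implementation; the class and all names are ours:

```python
from collections import defaultdict
from math import sqrt

# Sketch of incremental item-based CF: keep co-occurrence counts so that
# adding one <user, item> event only touches the items in that user's
# history, instead of rebuilding the whole item-item similarity matrix S.

class IncrementalItemCF:
    def __init__(self):
        self.item_count = defaultdict(int)   # n_i: users who used item i
        self.pair_count = defaultdict(int)   # n_ij: users who used both i and j
        self.user_items = defaultdict(set)   # per-user history

    def add_event(self, user, item):
        if item in self.user_items[user]:
            return  # already seen, nothing to update
        for other in self.user_items[user]:
            self.pair_count[frozenset((item, other))] += 1
        self.user_items[user].add(item)
        self.item_count[item] += 1

    def sim(self, i, j):
        # cosine similarity for binary usage data
        nij = self.pair_count[frozenset((i, j))]
        ni, nj = self.item_count[i], self.item_count[j]
        return nij / sqrt(ni * nj) if ni and nj else 0.0

cf = IncrementalItemCF()
for u, i in [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u2", "c")]:
    cf.add_event(u, i)
print(cf.sim("a", "b"))  # → 1.0 (a and b always co-occur)
```

Because add_event only visits the items in the active user's history, the update cost does not grow with the total number of users, which is consistent with the observation above that item-based updates need not revisit the usage database.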
[Figure 1. Recommendation and update time of each algorithm, relative to A1. Panels: (a) recommendation time, Split=0.4; (b) recommendation time, Split=0.8.]
[Figure 2. Recall and Precision vs. number of recommendations for each dataset, Split=0.8.]