
2016 IEEE/WIC/ACM International Conference on Web Intelligence

A Composite Recommendation System for Planning Tourist Visits
Idir Benouaret Dominique Lenne
Sorbonne Universités Sorbonne Universités
Université de technologie de Compiègne Université de technologie de Compiègne
CNRS UMR 7253 Heudiasyc CNRS UMR 7253 Heudiasyc
CS 60 319 - 60 203 Compiègne cedex CS 60 319 - 60 203 Compiègne cedex
Email: idir.benouaret@hds.utc.fr Email: dominique.lenne@hds.utc.fr

978-1-5090-4470-2/16 $31.00 © 2016 IEEE
DOI 10.1109/WI.2016.109

Abstract—Classical recommender systems provide users with ranked lists of recommendations that are relevant to their preferences. Each recommendation consists of a single item, e.g., a movie or a book. However, these ranked lists are not suitable for applications such as travel planning, which deal with heterogeneous items. In such applications, there is a need to recommend packages the user can choose from, each package being a set of Points of Interest (POIs), e.g., museums, parks, monuments, etc. In this paper, we focus on the problem of recommending a set of packages to the user, where each package consists of a set of POIs that may constitute a tour. Given a collection of POIs, where each POI has a cost and a time associated with it, and a user who specifies a maximum total value for both the cost and the time (budgets), our goal is to recommend the most interesting packages for the user, where each package satisfies the budget constraints. We formally define the problem and present a novel composite recommendation system, inspired by composite retrieval. We introduce a scoring function and propose a ranking algorithm that takes into account the preferences of the user, the diversity of the POIs included in the package, as well as the popularity of the POIs in the package. An experimental evaluation of the proposed system on a real dataset demonstrates its quality and its ability to improve both the diversity and the relevance of recommendations.

I. INTRODUCTION

Information overload is a phenomenon that has invaded every field of our lives. What movie to rent? What stock to buy? And, especially in our field of interest, what attractions to visit when traveling to a new city? The amount of tourism information available on the Web and the number of its users have increased enormously. All this information may be useful for users who plan to visit a new city. Information about travel destinations and their associated resources, such as accommodations, museums or events, among others, is commonly searched for by tourists in order to plan their trip. However, the list of possibilities offered by Web search engines may be overwhelming for users. Evaluating this long list of options is very complex and time consuming for tourists. During the last decades, recommender systems have found their way into the context of travel planning to help visitors find relevant Points of Interest (POIs) that might be interesting for them. Travel recommender systems aim to match the characteristics of tourism and leisure resources or attractions with the user's needs and interests.

Classical recommender systems provide the user with recommendations as ranked lists of single items, e.g., movies or books, i.e., the system always ranks the same type of item. In trip planning, a user is interested in suggestions for places to visit, or points of interest (POIs), that can be very heterogeneous, e.g., a museum, a park, a restaurant, etc. A tourism recommender system can benefit from a system capable of recommending items organized in packages (bundles) rather than ranked lists, which offers an improved exploratory experience for the visitor. There is thus a need to recommend to the user the packages that best match his preferences. Furthermore, there may be a cost and a time needed for visiting each point of interest, which the user may want to constrain with a budget. The budget can also simply be the number of items per package. Optionally, there may be a notion of compatibility among items in a package, which can be modeled as constraints that the user may specify, e.g., "no more than three museums in a package", "no more than two restaurants", "the total distance covered for visiting all POIs in a package should be less than 20 km", etc. Some so-called "third generation" travel planning web sites, such as Tripadvisor and YourTour, aim at assisting the user with suggestions of places to visit that integrate these kinds of constraints, but the suggestions are often based only on the most popular places and neglect the personalization aspect for the user. Thus, the usefulness of these web sites is very limited.

Our contribution in this paper is the design and implementation of a suggestion model that promotes diversity and is inspired by composite retrieval [1]. The approach we propose is to group suggestions into different packages, where each package consists of a set of diverse POIs. The set of all package recommendations covers a wide diversity of themes. Each POI has a time and a price associated with it, and the user specifies a maximum total value for price and time (budgets) for any recommended package of items. The POIs in each package are chosen using a scoring function that takes into account the preferences of the user, the diversity of the items included in the package, and the popularity of the items in the package. The evaluation of our proposed system using a real dataset with data crawled from the website Tripadvisor, shows
that our system is competitive: it can improve the diversity of recommendations without deteriorating their relevance.

The road map of the paper is as follows. After discussing related work in more detail (Section II), we present the architecture of our system and give a formalization of the problem (Section III). We then describe our model and define the quality criteria of packages (Section IV). In Section V, we describe our algorithm for computing the top-k packages. In Section VI, we subject our system to experimental analysis using a real dataset. We investigate the quality of the recommended packages in terms of both relevance and diversity. Finally, we discuss future work and conclude the paper in Section VII.

II. RELATED WORK

A. Diversity

Diversification is a problem that has been discussed in the area of information retrieval: diversifying the results associated with a query helps to cover different interpretations of the query [2]. Research in recommender systems has also shown that users tend to prefer diversified recommendations [3], [4], which best cover their interests and may allow them to discover new items. Many approaches taking this dimension into account have been designed. The approach in [4] defines an intra-list similarity which relies on mapping items to taxonomies to determine topics, or on item features. The method is based on a post-processing algorithm which operates on a top-N list to compute the top-K recommendations (N > K) that best satisfy the diversity of the results. Di Noia et al. [5] first establish a ranked list of recommendations using relevance, by estimating the rating a user will give to unseen items. Then, they reorder the list of recommendations using a score function integrating both diversity and the user's preferences. Vargas et al. [6] model every user using a set of sub-profiles representing a partition of the whole interests of the user. Then, they use collaborative filtering with each sub-profile to generate a set of recommendations, each set being recommended from one sub-profile. Finally, all recommendations stemming from each sub-profile are merged into one final list, ranked according to their score, which combines the relevance of the recommendations and the diversity of the sub-profiles represented in the final list.

B. Composite Recommendations

In [7], the authors are interested in finding the top-k tuples of entities. Examples of entities could be cities, hotels and airlines, while packages are tuples of entities. They query documents using keywords in order to determine entity scores. A package in their framework has a fixed size, e.g., one city, one hotel and one airline. Instead, in our work, we allow packages of variable size, subject to a budget constraint specified by the user. CARD [8] is a framework for finding top-K recommendations of composite products or services. A language similar to SQL is proposed to specify user requirements as well as how atomic costs are combined. However, as in [7], recommended packages have a fixed size, making the problem simpler. CourseRank [9] is a project motivated by a course recommendation for helping students plan their academic program at Stanford University. The recommended set of courses must also satisfy constraints (e.g., take two out of a set of five math courses). Similar to our work, each course is associated with a value that is calculated using an underlying recommender engine. Formally, they use the popularity of courses and the courses taken by similar students. Given a number of constraints, the system finds a minimal set of courses that satisfies the requirements of the user and has the highest score. The same authors in [10] extend it with prerequisite constraints, and propose several approximation algorithms that return high-quality course recommendations satisfying all prerequisites. As in our work, such suggestions of packages do not have a fixed size. However, [9], [10] do not consider constraints for items (i.e., courses), while we capture the cost and time of POIs, which the user constrains with budgets, and which are essential features in the trip planning application that we consider. Other closely related work is [11], where a framework is proposed to automatically recommend travel itineraries from online user-generated data, such as picture uploads on social websites like Flickr. They formulate the problem of recommending travel itineraries that might be interesting for users while the travel time is under a given time budget. However, in that work, the value (score) of each POI is only determined by the number of times it was mentioned by other users in the social network, whereas in our work, the importance of a POI is determined not only by the popularity of the POI but also by a personalized score depending on the user's preferences and his ratings for other POIs. Finally, the closest work to ours is [12], where the authors explore returning approximate solutions to composite recommendations. The focus of that work is on using a Fagin-style algorithm for variable size packages and proving its optimality. The same authors further develop the idea into a prototype of a recommender system for travel planning (CompRec). However, in that work the score of an item is just the predicted rating of a user, while we believe that also using the popularity of items improves the relevance of the recommended packages. Furthermore, none of these works accounts for the diversity in packages, which leads to a better satisfaction of the user.

III. SYSTEM ARCHITECTURE AND PROBLEM STATEMENT

A. System Architecture

As shown in Figure 1, our system is composed of two main components: the recommender engine and the composite recommendation system. The recommender engine captures users' item ratings based on their preferences, and these ratings are used to predict an appreciation score for POIs not yet rated by an active user. In the system, each candidate POI also has a popularity assigned to it, which can be calculated using the number of user reviews. The composite recommender component then receives information about the estimated appreciation of a POI and its overall popularity for a user. The goal of this component is to recommend the best packages for the user; it includes a constraint checker module, which checks
whether a package satisfies the budget constraints. The user interacts with the system by specifying a cost budget, a time budget and an integer k, which is the number of packages to recommend. The system finds the k best packages of POIs, each package having a total cost and a total time under the budgets specified by the user.

Fig. 1: System architecture

B. Problem Statement

Consider a set $I$ of POIs, a set $U$ of users, an active user $u \in U$ and a POI $i \in I$. We denote by $c(i)$ the cost of POI $i$ and by $t(i)$ the average time needed for visiting POI $i$. Given a set of POIs $P \subset I$, we define $Score(P)$, the score of a package $P$, which estimates the quality of a package (see more details in Section IV-B4), $c(P) = \sum_{i \in P} c(i)$, the cost of a package $P$, and $t(P) = \sum_{i \in P} t(i)$, the time to visit the POIs in package $P$. Given a cost budget $B_c$ and a time budget $B_t$, a package $P$ is said to be Valid iff $c(P) \le B_c$ and $t(P) \le B_t$.

Problem 1: Top-k Composite recommendations

Given a set $I$ of POIs, an active user $u$ with his preferences background, a cost budget $B_c$, a time budget $B_t$ and an integer $k$, a top-k composite recommender system has to determine the top-k packages $P_1, P_2, ..., P_k$ such that each $P_i$ has $c(P_i) \le B_c$, $t(P_i) \le B_t$, and among all Valid packages, $P_1, P_2, ..., P_k$ have the $k$ highest scores, i.e., $Score(P) \le Score(P_i)$ for all Valid packages $P \notin \{P_1, P_2, ..., P_k\}$.

IV. MODEL

A. Topical Distance and Similarity between Points of Interest

Our distance between POIs is based on a taxonomy of hierarchical topic categories organized in a tree structure. Formally, we used a domain ontology developed by [13] to represent these categories. Let $I$ be the set of all possible POIs for potential suggestions. Each POI in $I$ is associated with one category in the taxonomy, e.g., museum, park, building, etc. We define the topical distance $dist_t : I \times I \to \mathbb{N}$ between two POIs $i$ and $j$ as the length of the shortest path between the two categories of $i$ and $j$ in the taxonomy:

\[ dist_t(i, j) = sp(c_i, c_j) \tag{1} \]

where $c_i$, $c_j$ are the categories of POIs $i$, $j$ and $sp$ is the shortest path function.

The topical similarity evaluates to what extent two POIs $i$ and $j$ deal with similar themes or topics. The similarity depends on the topical distance between the two POIs $i$, $j$. The smaller the topical distance, the higher the similarity between $i$ and $j$:

\[ sim(i, j) = \frac{1}{1 + dist_t(i, j)} \tag{2} \]

B. Packages Quality Criteria

In order to create the top-k packages of POIs for a specific user, it is necessary to have a criterion estimating how "good" a package $P$ is for a user $u$. We call it the score of a package. To this end, we need to measure how interesting a new POI not yet visited by the user would be for him. The popularity is also an important factor in the general appreciation of a POI. In fact, the properties depending only on the popularity of a POI are often more important than the similarity between POIs liked by the user [14]. Moreover, we assume that the user is not only interested in visiting POIs he would like, but rather in visiting POIs that best cover his interests. The diversity of POIs in the same package is thus an important criterion for the quality of the package.

1) Overall Popularity:
The overall popularity measures the popularity of a POI $i$. It is defined by:

\[ opop(i) = \frac{pop(i)}{\max_{j \in I} pop(j)} \in [0, 1] \tag{3} \]

where $j$ designates the POIs of $I$ and $pop : I \to \mathbb{N}$ represents a popularity indicator of a POI. By extension, the overall popularity of a package $P$ is:

\[ opop(P) = \frac{\sum_{i \in P} opop(i)}{|P|} \in [0, 1] \tag{4} \]

2) Intra Package Diversity:
Most travel recommender systems focus on the modeling of user preferences and the representation of POIs in order to get a ranking of the most pertinent POIs for a user. However, the diversity of suggestions in travel planning applications has never been the focus point. Nevertheless, it has been suggested that the diversity of the recommendations has a large positive effect on the satisfaction of the user [4]. In order to take into account the diversity of POIs in a package, we adapt the intra-list diversity introduced by [4]. For a package of POIs $P$, we define the intra package diversity:

\[ ipd(P) = \frac{\sum_{i, j \in P} \left(1 - sim(i, j)\right)}{|P|^2} \tag{5} \]

3) Estimated Appreciation (Prediction):
The estimated appreciation evaluates to what extent a POI $i$ that a user has not yet rated or visited would be interesting for a user $u$. This estimation is based on the preferences of the user and is calculated using the ratings that he gave to a sample $S_i$ of similar POIs, weighted by the similarity between the POI to estimate and the POIs of the sample. The estimated appreciation of a user $u \in U$ for a POI $i \in I$ is defined by:

\[ eapp_u(i) = \frac{\sum_{j \in S_i} rating_u(j) \times sim(i, j)}{\sum_{j \in S_i} sim(i, j)} \tag{6} \]
where $j$ designates the POIs of the sample $S_i$, the set of similar POIs rated by the user $u$, and $rating_u : S_i \to [0, 1]$ associates with a POI the rating given by the user $u$, divided by the maximum rating.

Note that our algorithm, which we describe in Section V, does not depend on a specific recommendation algorithm. We used a simple memory-based item-item collaborative filtering approach to generate predicted ratings for each user. This method is chosen for its simplicity and low computational cost.

By extension, the estimated appreciation (prediction) of a user $u$ for a package $P$ is defined as the mean of the estimated appreciations of the POIs forming the package:

\[ eapp_u(P) = \frac{\sum_{i \in P} eapp_u(i)}{|P|} \in [0, 1] \tag{7} \]

where $i$ ranges over the POIs included in the package $P$.

4) Score of a package:
The score of a package evaluates the quality of the POIs that form a package for a user $u$ according to the overall popularity, the diversity and the estimated appreciation. The score of a package $P$ for a user $u$ is calculated by:

\[ Score_u(P) = C_{eapp} \times eapp_u(P) + C_{opop} \times opop(P) + C_{div} \times ipd(P) \tag{8} \]

where $C_{eapp}$, $C_{opop}$, $C_{div}$ are positive coefficients that modulate, respectively, the importance of the estimated appreciation, the overall popularity and the diversity in the score function.

V. CONSTRUCTION OF THE TOP-K PACKAGES

The construction of the top-k packages is done in two steps. First, a large set of valid packages is produced, with a cardinality $c \gg k$; packages are formed by aggregation around a pivot POI, taking into account the quality criteria. After that, the packages are ranked according to their respective score to recommend the top-k packages.

Algorithm 1: BOBO
Input: I, Bc, Bt, number of packages c
Output: a set of c packages
 1  Packages ← ∅
 2  Pivots ← Descending_sort(I, opop)
 3  while (Pivots ≠ ∅) and |Packages| < c do
 4      w ← Pivots[0]
 5      Pivots ← Pivots − {w}
 6      P ← Pick_bundle(w, I, Bc, Bt)
 7      Pivots ← Pivots − P
 8      Packages ← Packages ∪ {P}
 9  end
10  return Packages

Algorithm 2: PICK BUNDLE
Input: pivot w, I, Bc, Bt
Output: a package P
 1  S ← {w}
 2  active ← I − {w}
 3  cost ← c(w)
 4  time ← t(w)
 5  while (not finish) do
 6      i ← argmax_{i ∈ active} Score_u(S ∪ {i})
 7      if (cost + c(i) ≤ Bc) and (time + t(i) ≤ Bt) then
 8          S ← S ∪ {i}
 9          cost ← cost + c(i)
10          time ← time + t(i)
11      end
12      else
13          finish ← true
14      end
15      active ← active − {i}
16  end
17  return S
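
The scoring function of Eq. (8) and the two routines, BOBO and Pick bundle, can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the POI names, costs, times, `opop` and `eapp` values are invented toy data, `sim` is a simplified stand-in for Eq. (2) (taxonomy distance 0 inside a category, 2 between two different categories), and the extra stop condition when no candidate is left is an assumption that the pseudocode does not spell out.

```python
from itertools import product

# Toy data (hypothetical values, for illustration only).
category = {"louvre": "museum", "orsay": "museum", "park": "park", "tower": "monument"}
cost = {"louvre": 15, "orsay": 12, "park": 0, "tower": 25}
time = {"louvre": 180, "orsay": 120, "park": 60, "tower": 90}
opop = {"louvre": 1.0, "orsay": 0.6, "park": 0.3, "tower": 0.9}   # Eq. (3), precomputed
eapp = {"louvre": 0.8, "orsay": 0.9, "park": 0.5, "tower": 0.4}   # Eq. (6), precomputed
pois = list(category)

def sim(i, j):
    """Stand-in for Eq. (2): distance 0 within a category, 2 between categories."""
    dist = 0 if category[i] == category[j] else 2
    return 1.0 / (1.0 + dist)

def ipd(package):
    """Intra package diversity, Eq. (5), over all ordered pairs (i, j)."""
    n = len(package)
    return sum(1.0 - sim(i, j) for i, j in product(package, repeat=2)) / (n * n)

def score(package, coef=(1 / 3, 1 / 3, 1 / 3)):
    """Package score, Eq. (8); coef = (C_eapp, C_opop, C_div)."""
    c_eapp, c_opop, c_div = coef
    n = len(package)
    mean_eapp = sum(eapp[i] for i in package) / n   # Eq. (7)
    mean_opop = sum(opop[i] for i in package) / n   # Eq. (4)
    return c_eapp * mean_eapp + c_opop * mean_opop + c_div * ipd(package)

def pick_bundle(pivot, b_cost, b_time):
    """Algorithm 2: greedily grow a package around the pivot."""
    package = [pivot]
    active = [i for i in pois if i != pivot]
    total_cost, total_time = cost[pivot], time[pivot]
    while active:  # assumption: also stop when no candidate is left
        best = max(active, key=lambda i: score(package + [i]))
        if total_cost + cost[best] > b_cost or total_time + time[best] > b_time:
            break  # as in the pseudocode, stop at the first candidate that does not fit
        package.append(best)
        total_cost += cost[best]
        total_time += time[best]
        active.remove(best)
    return package

def bobo(b_cost, b_time, c):
    """Algorithm 1: build up to c packages, one pivot at a time."""
    packages = []
    pivots = sorted(pois, key=lambda i: opop[i], reverse=True)
    while pivots and len(packages) < c:
        w = pivots.pop(0)
        p = pick_bundle(w, b_cost, b_time)
        pivots = [i for i in pivots if i not in p]
        packages.append(p)
    return packages

def top_k(packages, k):
    """Second step: rank the candidate packages by score and keep the k best."""
    return sorted(packages, key=score, reverse=True)[:k]

packages = bobo(b_cost=60, b_time=300, c=3)
best = top_k(packages, k=2)
print(best)
```

With these toy values the first pivot is the most popular POI, and the greedy routine stops as soon as the best-scoring candidate would exceed one of the two budgets, even if a lower-scoring candidate would still fit.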
A. Creating Good Packages

Our approach for forming a set of good valid packages is inspired by the algorithm "BOBO" (Bundles One-By-One) introduced by Amer-Yahia et al. [1]. We adapted this algorithm to take into account the quality criteria of the packages (estimated appreciation, popularity and diversity) defined in Section IV-B. The goal of this algorithm is to create c valid packages that respect the budget constraints. It is inspired by k-nn clustering. At each step a POI is chosen as pivot, and a valid package with maximum score is built around that pivot. The pseudo code is described in Algorithm 1.

BOBO starts with an empty set of packages (line 1). Then, a list of candidate pivot POIs is built (line 2); this list ranks the potential suggestions (I) in decreasing order of their overall popularity (opop).

As long as the number of formed packages is less than the number of required packages c, at each iteration the first POI is taken from the set of Pivots (lines 4 and 5), and a package is built around it (line 6). This is done by the routine Pick bundle described in Algorithm 2. This routine greedily keeps picking the next POI that maximizes the score of the package formed around the pivot (line 6), as long as the budget constraints are satisfied (line 7). If the selected POI respects the budget constraints, it is added to the package (line 8), its cost is added to the cost of the package (line 9) and its time is added to the total time of the package (line 10). It is then discarded from the active POIs (line 15) so that it is not selected again. Note that, without loss of generality, we assume that all POIs have a smaller cost than the cost budget Bc and a smaller visiting time than the time budget Bt.

Let us go back to BOBO's main loop: once a candidate package is created, it is added to "Packages" (line 8) and its elements are removed from "Pivots" (line 7) so that they are no longer used as pivots.

B. Selection of Top-K packages

Once the required number of packages has been created, they are ranked following their respective scores (Section
IV-B4). Afterwards, we select the k packages having the best scores. The relevance and the diversity of the recommended top-k packages are evaluated in the next section.

VI. EVALUATION

A. Data Set

The goals of our experiments were: (1) to evaluate the relevance of the packages recommended by our algorithm, and (2) to evaluate their diversity as well. In order to have a set of POIs constituting potential recommendations, we crawled data from Tripadvisor. Each POI has a thematic category organized in a tree structure, which allows us to construct our similarity measure. In addition, Tripadvisor provides the number of user ratings for a POI; we used it as an indicator of its popularity, for estimating the function $pop : I \to \mathbb{N}$ defined in Section IV-B1. For our experiments, we crawled user ratings for POIs in the five most popular cities in France. We excluded POIs that have very few or no reviews. The dataset contains 40635 ratings for 1183 POIs by 18227 users, so the data is very sparse. We associate with each POI its cost and its average visiting time, also crawled from Tripadvisor. The average price of POIs was close to €7 and the average time for visiting a POI ranges from 30 minutes to 3 hours. Because of the large sparsity of the underlying user rating matrix, we selected the 20 most active users as our sample for testing the algorithms.

B. Evaluation Metrics and Experimental Protocol

1) Evaluation Metrics:
Precision: is calculated as the ratio of recommended POIs that are relevant to the total number of recommended POIs.

\[ Precision = \frac{|\text{relevant recommended POIs}|}{|\text{recommended POIs}|} \tag{9} \]

Diversity: we extend the intra-list diversity introduced by Ziegler et al. [4] to a set of k packages $\{P_1, ..., P_k\}$. The Mean Intralist Diversity (MILD) is defined as:

\[ MILD(\{P_1, ..., P_k\}) = \frac{\sum_{i=1}^{k} ILD(P_i)}{k} \tag{10} \]

We used a third metric in order to compare approaches with respect to the compromise between precision and diversity. $F_{PD}$: the F-measure is the harmonic mean of precision and diversity:

\[ F_{PD} = \frac{2 \times precision \times diversity}{precision + diversity} \tag{11} \]

2) Experimental Protocol:
Our goal was to test the impact of personalization (per), popularity (pop) and diversity (div) on the quality of recommendations. To this end, we compared several versions of our system, corresponding to different possible combinations of the coefficients $C_{eapp}$, $C_{opop}$, $C_{div}$. Each version corresponds to a different combination of the parameters. The versions we tested are summarized in Table II. The name of each version indicates which aspects are used when constituting the set of packages.

Versions of our system | Ceapp | Copop | Cdiv
per                    | 1     | 0     | 0
pop                    | 0     | 1     | 0
div                    | 0     | 0     | 1
per+pop                | 1/2   | 1/2   | 0
per+div                | 1/2   | 0     | 1/2
pop+div                | 0     | 1/2   | 1/2
per+pop+div            | 1/3   | 1/3   | 1/3

TABLE II: Different versions of our system

We tested our system while varying the number of returned packages k over four values: {5, 10, 15, 20}. The cost budget is fixed to €60 and the time budget to 300 minutes. We tested our algorithms under various cost and time budgets with very similar results, so other budgets are omitted for lack of space. Varying the cost and time budgets does not really affect the precision and diversity of recommendations, but rather affects the POIs that will be selected into each package. If the cost budget is very small, this basically limits the packages to free attractions (POIs), while with a very large cost budget the algorithm will add POIs to a package until the time budget is reached. Likewise, if the time budget is very small, the algorithm will tend to create packages which contain very few POIs, or even empty packages. This is why we fixed the cost budget to €60 and the time budget to 300 minutes, which are medium budgets given the distribution of costs and times over all POIs.

3) Baseline Approach:
To also evaluate the effectiveness of the proposed system, we compare our results with the package recommendation method proposed by Xie et al. [12], which is the closest work to ours. The authors also computed the estimated appreciation of POIs using an item-item collaborative filtering approach, without taking into consideration the popularity of POIs and the diversity aspect of recommendations.

C. Results and discussions

Results of our versions compared to the competitive approach according to precision, diversity and $F_{PD}$ are reported in Table I. In all our versions, we can notice a high influence of the popularity of POIs on the precision. It is important to underline that the popularity is a significant factor, as is the personalization. In fact, in most cases, the "pop" version leads to a better precision than the "per" version and the "pop+div" version to a better precision than the "per+div" version. These results are in accordance with [14], which highlights the importance of the popularity and its effect on the relevance of recommendations. The "div" version has the worst precision, since it ignores the personalization aspect and the popularity of POIs.

Varying the number of packages, the "per+pop" version always achieves the best precision and outperforms the algorithm of Xie et al., due to combining the personalization and the
popularity. Unsurprisingly, the "div" version is the one that achieves the best diversity compared to all others. However, it also has the worst values for precision.

Concerning the F-measure between precision and diversity, we notice that the "per+pop+div" version realizes the best compromise, and outperforms the competitive algorithm. This version tends to promote a large diversity, performs better than Xie et al., and is not significantly different in precision from the "per+pop" version. Thus, the "per+pop+div" version is the best approach when considering both precision and diversity.

Through this analysis, we discussed the quality of our recommended packages for the different settings. In particular, the results support our hypothesis on the importance of taking into account the popularity and the diversity in addition to the personalization aspect when constructing the top-k packages. Combining these three aspects for the task of composite recommendations leads to better results on the F-measure compared to the other approaches.

            |         k=5           |         k=10          |         k=15          |         k=20
Version     | P(%)  | D(%)  | FPD(%) | P(%)  | D(%)  | FPD(%) | P(%)  | D(%)  | FPD(%) | P(%)  | D(%)  | FPD(%)
per         | 49.73 | 40.02 | 44.35  | 50.19 | 41.12 | 45.20  | 51.13 | 42.74 | 46.56  | 48.88 | 43.75 | 46.17
pop         | 57.75 | 47.89 | 52.36  | 53.5  | 51.33 | 52.39  | 51.84 | 49.43 | 50.59  | 50.18 | 48.61 | 49.38
div         | 41.01 | 61.96 | 49.35  | 43.01 | 60.39 | 50.25  | 42.68 | 59.21 | 49.60  | 38.08 | 57.81 | 45.91
pop+div     | 53.7  | 55.73 | 54.69  | 53.36 | 58.04 | 55.60  | 51.92 | 56.14 | 53.94  | 49.67 | 55.81 | 52.56
per+div     | 47.85 | 55.03 | 51.19  | 49.88 | 57.84 | 53.56  | 48.1  | 57.18 | 52.24  | 48.22 | 53.46 | 50.70
per+pop     | 59.38 | 43.25 | 50.04  | 54.53 | 50.57 | 52.47  | 53.03 | 50.37 | 51.66  | 51.11 | 49.25 | 50.16
per+pop+div | 55.09 | 58.45 | 56.72  | 53.66 | 59.88 | 56.59  | 51.75 | 57.33 | 54.39  | 50.39 | 55.84 | 52.97
Xie et al   | 57.24 | 43.58 | 49.48  | 53.32 | 48.11 | 50.58  | 51.65 | 50.8  | 51.22  | 50.45 | 52.85 | 51.62

TABLE I: Comparison of our different versions with the competitive approach

VII. CONCLUSION AND FUTURE WORK

Motivated by applications of trip planning, we studied the problem of recommending packages consisting of sets of POIs. Each POI is associated with a category, where categories are organized in a hierarchical tree structure, which allowed us to define a semantic similarity measure between POIs. Our composite recommendation system ranks packages according to a score function, where the score of a package depends on the estimated appreciation, the overall popularity and the diversity of the POIs constituting the package. We formalized the problem of generating top-k package recommendations under cost and time budgets, where a cost and a visiting time are incurred by visiting each recommended POI and the budgets are user specified. We developed an algorithm for retrieving the top-k packages with the best scores. The evaluation of our system using a real world dataset crawled from the website Tripadvisor demonstrates its quality and its ability to improve both the relevance and the diversity of recommendations. We now plan to carry out a study of the proposed system with real users in a mobility situation, where the location context will play an important role in the recommendation process: the task will be to recommend the best packages for a given user provided that the recommended POIs are close to the position of the user. Furthermore, it will be interesting to compare a recommender system providing classical ranked lists with our composite recommender system.

ACKNOWLEDGMENT

This work is supported by the Regional Council of Picardie.

REFERENCES

[1] S. Amer-Yahia, F. Bonchi, C. Castillo, E. Feuerstein, I. Mendez-Diaz, and P. Zabala, "Composite retrieval of diverse and complementary bundles," TKDE, 2014.
[2] C. L. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, "Novelty and diversity in information retrieval evaluation," in ACM SIGIR conference on Research and development in information retrieval, 2008.
[3] K. Bradley and B. Smyth, "Improving recommendation diversity," in Proceedings of the Twelfth Irish Conference on Artificial Intelligence and Cognitive Science, Maynooth, Ireland. Citeseer, 2001, pp. 85–94.
[4] C.-N. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen, "Improving recommendation lists through topic diversification," in Proceedings of the 14th international conference on World Wide Web. ACM, 2005.
[5] T. Di Noia, V. C. Ostuni, J. Rosati, P. Tomeo, and E. Di Sciascio, "An analysis of users' propensity toward diversity in recommendations," in 8th ACM Conference on Recommender systems. ACM, 2014.
[6] S. Vargas and P. Castells, "Exploiting the diversity of user preferences for recommendation," in Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, 2013, pp. 129–136.
[7] A. Angel, S. Chaudhuri, G. Das, and N. Koudas, "Ranking objects based on relationships and fixed associations," in 12th International Conference on Extending Database Technology, ser. EDBT '09, 2009.
[8] A. Brodsky, S. Morgan Henshaw, and J. Whittle, "Card: A decision-guidance framework and application for recommending composite alternatives," ser. RecSys '08. ACM, 2008.
[9] A. Parameswaran, P. Venetis, and H. Garcia-Molina, "Recommendation systems with complex constraints: A course recommendation perspective," Technical report, 2009.
[10] A. G. Parameswaran and H. Garcia-Molina, "Recommendations with prerequisites," in Proceedings of the third ACM conference on Recommender systems. ACM, 2009, pp. 353–356.
[11] M. De Choudhury, M. Feldman, S. Amer-Yahia, N. Golbandi, R. Lempel, and C. Yu, "Automatic construction of travel itineraries using social breadcrumbs," in Proceedings of the 21st ACM conference on Hypertext and hypermedia. ACM, 2010, pp. 35–44.
[12] M. Xie, L. V. Lakshmanan, and P. T. Wood, "Breaking out of the box of recommendations: From items to packages," ser. RecSys '10. ACM, 2010, pp. 151–158.
[13] L. Castillo, E. Armengol, E. Onaindía, L. Sebastiá, J. González-Boticario, A. Rodríguez, S. Fernández, J. D. Arias, and D. Borrajo, "Samap: An user-oriented adaptive system for planning tourist visits," Expert Systems with Applications, vol. 34, no. 2, pp. 1318–1332, 2008.
[14] H. Steck, "Item popularity and recommendation accuracy," ser. RecSys '11. ACM, 2011, pp. 125–132.
