
Expert Systems With Applications 98 (2018) 153–165


Improving memory-based user collaborative filtering with evolutionary multi-objective optimization

Nour El Islem Karabadji a,b,∗, Samia Beldjoudi a,b, Hassina Seridi b, Sabeur Aridhi c, Wajdi Dhifli d

a High School of Industrial Technologies, P.O. Box 218, Annaba 23000, Algeria
b Electronic Document Management Laboratory (LabGED), Badji Mokhtar-Annaba University, P.O. Box 12, Annaba, Algeria
c University of Lorraine, LORIA, Campus Scientifique, BP 239, Vandoeuvre-lès-Nancy 54506, France
d University of Lille, EA2694, 3 rue du Professeur Laguesse, BP 83, Lille Cedex 59006, France

∗ Corresponding author at: LabGED, Badji Mokhtar University, P.O. Box 12, Annaba 23000, Algeria.
E-mail addresses: karabadji@labged.net (N.E.I. Karabadji), Beldjoudi@labged.net (S. Beldjoudi), seridi@labged.net (H. Seridi), sabeur.aridhi@loria.fr (S. Aridhi), wajdi.dhifli@univ-lille2.fr (W. Dhifli).

https://doi.org/10.1016/j.eswa.2018.01.015

Article info

Article history:
Received 23 September 2017
Revised 3 January 2018
Accepted 11 January 2018
Available online 12 January 2018

Keywords:
Recommender systems
Collaborative filtering
Genetic algorithms
Multi-objective optimization
Diversity

Abstract

The primary task of a memory-based collaborative filtering (CF) recommendation system is to select a group of nearest (similar) user neighbors for an active user. Traditional memory-based CF schemes tend to focus only on improving accuracy as much as possible by recommending familiar items (i.e., items that are popular within the group). Yet, this may reduce the number of items that can be recommended and consequently weakens the chances of recommending novel items. To address this problem, it is desirable to consider recommendation coverage when selecting the appropriate group, which helps in simultaneously making both accurate and diverse recommendations. In this paper, we focus on the large search space of users' profiles, which grows with the data, and use an evolutionary multi-objective optimization-based recommendation system to pull up a group of profiles that maximizes both similarity with the active user and diversity between its members. In this manner, the recommendation system provides high performance in terms of both accuracy and diversity. Experimental results on the Movielens benchmark and on a real-world insurance dataset show the efficiency of our approach in terms of accuracy and diversity compared to state-of-the-art competitors.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

The main aim of a recommender system (RS) is to propose personalized items to users (e.g., movies (Wang, Yu, Feng, & Wang, 2014), books (Crespo et al., 2011) and music (Lee, Cho, & Kim, 2010)) according to their historical preferences, and to save their searching time by extracting only useful data. With the massive amounts of information available in database systems, which lead to an information overload problem, RSs are facing unprecedented challenges. In this context, existing RSs have shown many limitations, among which we mainly cite sparseness, scalability, overspecialized recommendations and cold-start problems (Bobadilla, Ortega, Hernando, & Gutiérrez, 2013).

In general, RSs can be categorized according to their internal filtering method (Bobadilla, Ortega, Hernando, & Gutiérrez, 2013) into: (a) collaborative filtering, (b) demographic filtering, (c) content-based filtering and (d) hybrid filtering. The most widely used classification divides the filtering algorithms into collaborative filtering (CF) and content-based (CB) techniques (Lu, Wu, Mao, Wang, & Zhang, 2015). CF techniques are in turn categorized as model-based or memory-based approaches, and both families are widely used (Cacheda, Carneiro, Fernández, & Formoso, 2011). Memory-based methods act on the matrix of ratings. They use the ratings of users on items to compute similarities among users or items in order to select neighbors and recommend their items to the current user (Bobadilla, Hernando, Ortega, & Bernal, 2011). Memory-based methods usually provide good recommendation accuracy, but they suffer from a high computation time that increases with the number of users and items. In contrast, model-based methods build models that describe user behavior in order to predict the ratings of items. Thus, in comparison with memory-based methods, model-based methods are faster yet provide less accurate predictions.

Memory-based techniques are the most popular and they follow the way in which humans make decisions. For an active user u, a memory-based CF approach collects the ratings of items given by the neighbors of the target user u to recommend promising items.

The most serious problems suffered by memory-based CF methods are data particularity (i.e., complexity and dimensionality) (Wang et al., 2014) and the insufficient effectiveness of the similarity measures (Choi, Yoo, Kim, & Suh, 2012). Indeed, low-dimensional rating data may lead to a sparsity problem, while high-dimensional rating data makes extracting interesting users by similarity computation very costly. In this context, many similarity measures have been proposed, among which we cite PCC (Resnick, Iacovou, Suchak, Bergstrom, & Riedl, 1994), CPCC (Shardanand & Maes, 1995), SPCC (Jamali & Ester, 2009), COS (Adomavicius & Tuzhilin, 2005), ACOS and PIP (Ahn, 2008), MSD (Cacheda et al., 2011), Jaccard (Koutrika, Bercovitz, & Garcia-Molina, 2009), JMSD (Bobadilla, Hernando, Ortega, & Gutiérrez, 2012), and NHSM (Liu, Hu, Mian, Tian, & Zhu, 2014). Despite the existence of these various similarity measures, none of them is universally adequate for all varieties of problems. Moreover, following the traditional memory-based CF scheme of selecting the k most similar profiles for an active user u increases the accuracy in terms of rating prediction by fostering the recommendation of popular items. However, it decreases the diversity of the recommendations. Efficient personalized recommendations for an active user must also account for the variety of the users' needs and make both accurate and diverse recommendations (Gogna & Majumdar, 2017; Zuo, Gong, Zeng, Ma, & Jiao, 2015).

In the classification field, data reduction is often employed to overcome the problems of data particularity (Karabadji, Seridi, Bousetouane, Dhifli, & Aridhi, 2017; Karabadji, Seridi, Khelf, Azizi, & Boulkroune, 2014). RSs use clustering to address this problem (Gao, Xing, & Zhao, 2007a; Sarwar, Karypis, Konstan, & Riedl, 2000). Clustering aims at partitioning users into different groups which form "like-minded" neighbors, reducing the whole users' space to a smaller subset (i.e., a cluster) and thereby improving the efficiency and scalability of CF-based RSs (Gao et al., 2007a; Kim & Ahn, 2008; Wang et al., 2014). However, clustering approaches mainly suffer from (a) the need to determine the number of clusters, (b) the need to select an appropriate initial seed, (c) a significant degradation as the dimensionality of the space increases, and (d) frequent convergence to local optima. To alleviate these problems, a possible alternative consists of selecting an appropriate profiles subset for each active user. The selected group allows increasing the accuracy and reducing the effect of data particularity on the cost and effectiveness of the similarity measures. However, assuming that we have a dataset D with |D| profiles, there are 2^|D| − 1 non-empty possible subsets. The most reliable strategy for finding the optimal subset among the 2^|D| − 1 candidates is to evaluate them all. Unfortunately, computing the optimal subset by exhaustive search is infeasible as the number of candidate subsets grows exponentially. Evolutionary computation methods have proved to be efficient in a variety of applications where the optimal solution is hard to obtain (Dhifli, Da Costa, & Elati, 2017; Karabadji et al., 2017; Karabadji et al., 2014). They allow a fast estimation of near-optimal solutions by restraining the search space.

In this paper, we investigate the possibility of constructing an efficient RS by improving memory-based collaborative filtering through evolutionary computation and multi-objective optimization. We propose an approach that allows (a) ensuring accuracy and diversity of recommendations, and (b) reducing the required amount of computation by restraining the search to only a subset of selected profiles Px from the entire set of possibilities P (Px ⊆ P). The selected set of recommendation profiles simultaneously maximizes its similarity to the active user, which improves the prediction accuracy, and its intra-group diversity, which increases the number of possible items to be recommended. As similarity and diversity represent two conflicting performance metrics, we follow a flexible multi-objective optimization strategy that provides the best trade-off between both measures. Following these ideas requires an optimization in a very large search space that grows exponentially with the increasing size of the data. Therefore, we propose a new genetic algorithm (GA) to pull up an optimized subset of profiles that improves the construction of a memory-based collaborative filtering recommender system.

The rest of the paper is organized as follows: in Section 2, we present and discuss related works. Section 3 introduces the main notions used in this work. In Section 4, we describe our GA approach based on profiles reduction. Section 5 reports the experimental results. Finally, Section 6 concludes the study.

2. Related works

In the literature, many efforts have been dedicated to improving RSs performances and to dealing with their various drawbacks. These efforts have mainly been oriented towards generating accurate RSs (Bobadilla, Ortega, Hernando, & Gutiérrez, 2013), and many solutions have been proposed to achieve this objective. In general, contributions have attempted to improve CF accuracy by conducting data reduction and by enhancing the effectiveness of similarity measures.

Data reduction solutions such as clustering and dimensionality reduction techniques (e.g., principal component analysis (PCA) and singular-value decomposition (SVD)) have been used to improve CF performances (Gao, Xing, & Zhao, 2007b; Goldberg, Roeder, Gupta, & Perkins, 2001; Sarwar et al., 2000). Although many clustering-based CF approaches have shown a significant improvement in terms of accuracy, their accuracy is strongly related to both the used clustering technique and the dataset. Moreover, these techniques show strong effectiveness degradation with respect to: (a) the high-dimensional spaces phenomenon, (b) a wrong choice of the number of clusters (e.g., in k-means), and (c) being frequently trapped in local optima. Some evolutionary algorithms have been proposed to avoid these problems (Bakshi, Jagadev, Dehuri, & Wang, 2014; Liao & Lee, 2016; Rana & Jain, 2014; Wang et al., 2014).

Liao and Lee (2016) proposed a clustering-based approach which applies a self-constructing clustering algorithm to reduce the number of products and thus the dimensionality of the recommendation. In their approach, they group similar products in the same cluster and the recommendation is then performed with the resulting clusters, called product groups. In this manner, the processing time for making recommendations is reduced thanks to the reduction of the number of products. The authors have shown through experimental analysis the efficiency of their recommendation system and that their approach can improve the recommendation without compromising the quality of the results.

Rana and Jain (2014) addressed the problem of dynamic changes in user requirements when seeking information on the web over a period of time. They proposed an evolutionary clustering algorithm based on temporal features for dynamic recommender systems. The clustering is performed to detect similar users and evolves them to capture relevant user preferences over time. They showed through empirical tests that their algorithm brings considerable improvement in terms of quality of recommendations and computation time compared to standard RS approaches.

Bakshi et al. (2014) proposed a new estimation scheme based on the sparsity of data to enhance both the scalability and accuracy of RSs using unsupervised learning and particle swarm optimization. The computation time is reduced by matching the profile of the user of interest to its partitioned small training samples that are captured through clustering. The particle swarm optimization is then used to weight local and global neighbors for every user and for every prediction. The authors showed that their technique significantly increases the accuracy of prediction over traditional collaborative filtering methods.

Wang et al. (2014) introduced a hybrid model for the recommendation of movies. This hybrid model aims at getting around the problems of both high dimensionality and data sparsity. To achieve this objective, a principal component analysis transformation is first applied to the user profiles, which makes them denser profile vectors. Next, a clustering approach (k-means) based on a genetic algorithm is performed to select starting centroids for partitioning the user profiles. Finally, an active user is assigned to a cluster, and a top-N movie recommendation list is presented.

As mentioned previously, many works have been conducted to improve RS accuracy by enhancing the used similarity measure (Alhijawi & Kilani, 2016; Bobadilla, Ortega, Hernando, & Alcalá, 2011; Bobadilla, Ortega, Hernando, & Glez-de Rivera, 2013; Choi & Suh, 2013; Liu et al., 2014). Choi and Suh (2013) proposed a new similarity function that considers the similarity between items. The similarity value is calculated between a target item and each of the co-rated items and is then used as a weight for calculating the user similarity. The efficiency of this approach has been shown against traditional measures on the MovieLens and Netflix datasets, where it considerably enhanced the accuracy of recommendation.

In the same context, Liu et al. (2014) shared the same motivation concerning the ineffectiveness of traditional similarity measures. The authors presented an improved heuristic similarity measure, denoted NHSM, that combines the common ratings of each couple of users (the local context) with the global preference of each user's ratings. This heuristic similarity measure is calculated by composing proximity, significance and singularity factors with the mean and variance of the ratings and the Jaccard similarity (Koutrika et al., 2009). The NHSM similarity measure has been compared with many state-of-the-art similarity measures on three datasets, where the results show that NHSM improves the recommendation performance.

In order to achieve this latter objective, some works have adopted evolutionary schemes (Alhijawi & Kilani, 2016; Bobadilla, Ortega et al., 2011). In Bobadilla, Ortega et al. (2011), a metric that allows calculating the similarity between two users is presented. This metric consists of a linear function that combines a vector of rating differences with a vector of weights. While the values of the differences between rating vectors are calculated for each pair of users between which the similarity will be evaluated, the weights depend on the specific nature of the data (all users' ratings). Thus, a genetic algorithm is used to get an "optimal" sequence of weights for each dataset. Alhijawi and Kilani (2016) proposed a genetic encoding, denoted SimGen, that allows identifying similarity values between users without using a similarity measure. The idea consists of improving the results of CF as follows: first, an initial random similarity value between every two users is generated. Then, evolutionary steps are executed until stopping criteria are reached, where in every generation a fitness function is evaluated in order to optimize the prediction error on testing data. Finally, SimGen pulls up the optimal configuration that optimizes the similarity values between users. Results showed a significant improvement in prediction quality.

Recently, many efforts have been dedicated to generating efficient RSs by taking diversity into account in the recommendation (Di Noia, Rosati, Tomeo, & Di Sciascio, 2017; Gogna & Majumdar, 2017; Kunaver & Požrl, 2017; Wang, Ma, Cai, Jiao, & Gong, 2014; Zhou et al., 2010; Zuo et al., 2015). In Gogna and Majumdar (2017), the authors present a modified latent factor model to provide diverse recommendations. The proposed solution aims at predicting missing ratings and incorporates additional diversity-enhancing constraints in the matrix factorization model for collaborative filtering. In particular, the proposed model incorporates two conflicting concepts, one promoting accuracy and the other promoting diversity, in a unified optimization framework. Despite the importance of both criteria, they represent conflicting objectives, which makes considering them simultaneously challenging. Some works have followed evolutionary schemes to pull up a proper balance between accuracy and diversity.

Zuo et al. (2015) proposed a system denoted MOEA for personalized recommendation based on evolutionary multi-objective optimization. The MOEA method improves the list of top-N recommendations by keeping accuracy and diversity in balance. To this end, a clustering technique is first applied to partition users into several clusters. Then, an NSGA-II algorithm (Deb, Pratap, Agarwal, & Meyarivan, 2002) is invoked to optimize the two conflicting objectives simultaneously and to find recommendations that balance accuracy and diversity for the users. In the work of Wang et al. (2014), Wang and Gong propose a decomposition-based multi-objective evolutionary algorithm focused on optimizing accuracy and diversity simultaneously. According to the authors, the proposed algorithm is useful to recommend diverse and unpopular items.

3. Preliminaries

In this section, we present some basic notations and definitions before describing our approach. A search space is a collection of elements (i.e., combinations (subsets) in our case) with some order relations (i.e., at least one binary relation).

Definition 1. A binary relation R over a set E is a partial order if it is reflexive, transitive, and anti-symmetric:
• ∀x ∈ E, xRx (reflexivity),
• ∀(x, y) ∈ E × E, (xRy and yRx) ⇒ x = y (anti-symmetry),
• ∀(x, y, z) ∈ E × E × E, (xRy and yRz) ⇒ xRz (transitivity).

Definition 2. Let E be a finite set. The powerset, denoted P(E), is the partially ordered set composed of all the subsets of E.

The powerset system P(E) is a lattice, that is, a particular partially ordered set (poset) in which every pair of elements admits a least upper bound and a greatest lower bound. We denote by P = {p0, p1, ..., pn} the user profiles, and by μ and τ the minimum and maximum profile subset sizes, respectively. P denotes the powerset system of P, where each Ps ∈ P is a profile subset. P is a Boolean lattice such that every node in the lattice is a subset of profiles. The bottom node corresponds to the empty set and the top node corresponds to the set P itself. A node Pw is a child of the node Pv (respectively a parent) in the lattice if Pw ⊆ Pv and the pair (Pw, Pv) differs by exactly one profile. If Pw and Pv differ by more than one profile, we say that Pv is more general than Pw (respectively more specific). Moreover, the lattice P is composed of |P| ranges denoted Rs, where range 0 (i.e., R0) contains the empty subset, R1 is composed of the one-profile subsets, and Rk consists of the k-profile subsets. Each range Rk is composed of |P|! / (k! · (|P| − k)!) profile subsets of size k.

In addition to the ⊆ order, we define two linear orders, ⪯ and ⊑, to avoid a full exploration of the set system P. The ⪯ order is a linear order between profiles such that pi ⪯ pj iff i ≤ j. The ⊑ order is a linear lexicographic order between ordered profile sets of the same range (i.e., having equal sizes) that share at least the first profile p0. An ordered profile set Pi = {pa0, pa1, pa2, ..., pat, ..., pan} is lexicographically more general than an ordered profile set Pj = {pb0, pb1, pb2, ..., pbt, ..., pbn} if Pi and Pj share at least the first profile pa0 = pb0 and iff Eq. (1) holds:

    ∃t, 0 ⩽ t ⩽ n, pak = pbk for k < t, and pat ⪯ pbt    (1)

To give a clear idea of the usefulness of the proposed lattice arrangement, we can see the powerset system as a group of boxes with labeled content, arranged in a particular order of priority (i.e., by level). This facilitates the procedure of mining an appropriate box with only information about its place and the first element in it.
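To make this range and lexicographic arrangement concrete, the short Python sketch below enumerates the groups of a given size that share a given first profile, in the order induced by the profile indices; it reproduces the group addressing (size, first profile, position) used in the discussion of Fig. 1 below. The function name and the use of plain integers for profiles are illustrative only.

from itertools import combinations

def groups(profiles, size, first):
    # All subsets of `size` profiles whose smallest member is `first`,
    # listed in the lexicographic order induced by the profile indices.
    rest = [p for p in profiles if p > first]
    return [(first,) + tail for tail in combinations(rest, size - 1)]

profiles = [1, 2, 3, 4, 5, 6]        # P1..P6 as in Fig. 1
r4_at_p1 = groups(profiles, 4, 1)    # range-4 groups whose first profile is P1
print(len(r4_at_p1))                 # 10 such groups
print(r4_at_p1[3 - 1])               # position 3  -> (1, 2, 3, 6), i.e. {P1, P2, P3, P6}
print(r4_at_p1[10 - 1])              # position 10 -> (1, 4, 5, 6), i.e. {P1, P4, P5, P6}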

[Figure 1 appears here.]
Fig. 1. Profiles powerset system: the lattice of all subsets of {P1, P2, P3, P4, P5, P6}, arranged by range from R0 (the empty set) to R6 (the full set), with the subsets in each range ordered lexicographically.

In this work, each box is a group of profiles. Fig. 1 illustrates a powerset system representing all the subsets of the set {P1, P2, P3, P4, P5, P6}. In this example, we can see that: (a) groups/boxes are arranged by their size, from R0 to R6; (b) at each level, groups of profiles are created starting from their minimal profiles (i.e., sorted by profile index); in this case the groups underlined in blue start at P1 and the orange ones start at P3; (c) the iteratively grouped subsets form a linear order in which each group can be identified by its position. In R3, there are 10 groups starting at P1, which can be ordered from 1 to 10, where 1 ⇔ {P1, P2, P3}, 2 ⇔ {P1, P2, P4}, and 10 ⇔ {P1, P5, P6}. Therefore, using these three parameters, namely the group size, the first profile and the group position, we can reach the corresponding group. For instance, for the triple {group size = 4, first profile = P1, position = 3}, the respective group is {P1, P2, P3, P6}. If the position becomes equal to 10, the respective group will be {P1, P4, P5, P6}.

4. Meta-heuristic optimization for enhancing memory-based collaborative filtering

In this section, we present our approach, which leverages meta-heuristic optimization strategies to enhance memory-based CF and predict personalized recommendations. We propose a novel GA-based method which aims at finding the most appropriate subset of neighbor profiles for building an efficient RS. Fig. 2 illustrates the proposed GA scheme. This proposition tends to overcome the risk of combinatorial explosion when seeking an optimal subset of neighbor profiles that shows both a high similarity towards a given active user and a good intra-group diversity between its members. To meet this challenge, we present a binary encoding of the subset information. The population of our GA is a set of 3-gene vectors, where each individual consists of a different combination of indicators that represents a given profile subset. Computing successive populations tends to improve a fitness function, so as to seek an optimal subset of profiles that reaches our objective of improving and diversifying the recommendations.

4.1. Genetic encoding and decoding of profiles

Individuals are represented as arrays of 0s and 1s (i.e., binary strings). Each chromosome is composed of 3 genes. Gene1 encodes the range of the subset of profiles Ps (i.e., the size of the subset), Gene2 encodes the most general profile ps0 of the encoded subset Ps (i.e., the profile with minimal order w.r.t. ⪯ among the members of Ps), and Gene3 encodes the Ps \ ps0 subset of profiles. The genes encode different pieces of information; each gene gi uses Xi bits. Fig. 2 illustrates the chromosome representation.

Before describing the encoding process, we note that 1) a range indicator Ri can take at most a value equal to τ and at least a value equal to μ; 2) a ps0 indicator, noted id_ps0, ranges from 1 to Rs − (Ri − 1); and 3) the Ps \ ps0 indicator, noted id_Ps−p0, ranges from 0 to (Rs − id_ps0)! / ((Ri − 1)! · ((Rs − id_ps0) − (Ri − 1))!) − 1.

Gene1 represents the range of the subset of profiles Ps, given as the binary representation of an integer in [μ .. τ]. To encode a Ps range, X1 bits are used, where X1 is the minimal number of bits required to encode τ. Gene2 represents the most general profile ps0 in Ps; X2 bits are used, which is the minimal number of bits required to encode an indicator that can at most be equal to Rs − (μ − 1). Finally, to encode Gene3, we require X3 bits, corresponding to the greatest indicator id_Ps−p0 that can represent a subset of profiles Ps \ ps0. X3 is the minimal number of bits required to encode (Rs − 1)! / ((τ − 1)! · ((Rs − 1) − (τ − 1))!) − 1 (i.e., the highest value of id_Ps−p0, obtained for Ri = τ if τ ≤ n/2 and id_ps0 = 1).

Example 1. Assume that μ = 4 and τ = 15 (the minimal and maximal profile subset sizes) and |P| = 50. The required bits to encode a subset of profiles Pi are X1 = 4 bits. The Gene2 bit size X2 is 6 bits, allowing to represent Rs − (μ − 1) = 47 in binary. X3 = 40 bits. As a second example, assume μ = 20, τ = 100, and |P| = 1000. The number of bits required to encode the information is X1 = 7 bits, X2 = 10 bits, and X3 = 462 bits.

In this work, we propose to generate the initial population randomly in order to increase diversity. Accordingly, a set of N chromosomes is created. Fig. 3 illustrates two different chromosomes built by considering the following inputs: 1) a number of profiles of 40, μ = 10 and τ = 20, and 2) a number of profiles of 100, μ = 20 and τ = 40. The chromosomes ch(a) and ch(b) are strings of 47 and 106 bits, respectively. The gene sizes are defined as follows: for ch(a), |Gene1| = 5, |Gene2| = 5 and |Gene3| = 37; for ch(b), |Gene1| = 6, |Gene2| = 7 and |Gene3| = 93.
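The gene widths follow directly from μ, τ and the number of profiles Rs = |P|. The sketch below (assuming the widths are exactly the minimal numbers of bits needed for τ, for Rs − (μ − 1), and for the largest Gene3 indicator C(Rs − 1, τ − 1) − 1, as described above; the function name is illustrative) reproduces the values of Example 1 and of chromosome ch(a) in Fig. 3.

import math

def gene_sizes(mu, tau, num_profiles):
    # X1: bits for the range indicator (at most tau)
    # X2: bits for the ps0 indicator (at most Rs - (mu - 1))
    # X3: bits for the largest Ps \ ps0 indicator, C(Rs - 1, tau - 1) - 1
    rs = num_profiles
    x1 = tau.bit_length()
    x2 = (rs - (mu - 1)).bit_length()
    x3 = (math.comb(rs - 1, tau - 1) - 1).bit_length()
    return x1, x2, x3

print(gene_sizes(4, 15, 50))      # (4, 6, 40)   -> Example 1
print(gene_sizes(20, 100, 1000))  # (7, 10, 462) -> second case of Example 1
print(gene_sizes(10, 20, 40))     # (5, 5, 37)   -> chromosome ch(a) of Fig. 3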

Fig. 2. An overview of the proposed GA scheme.

Fig. 3. Examples of encoded chromosomes.

During the GA scheme, individuals are regularly evaluated in order to decide whether they survive or not. Accordingly, a decoding step is invoked for each individual ch.

4.1.1. Decoding a chromosome

The decoding process consists of recovering the binary code of the different genes encoded in a chromosome ch into the appropriate setting. Technically, this conversion consists of retrieving a profile subset Ps from a considered ch by decoding the information encoded in Gene1, Gene2, and Gene3 as follows. Gene1 encodes an integer Ri that indicates the size of the subset of profiles Ps ∈ P. Therefore, to get Ri, the binary code of Gene1 is converted to an integer x. In order to make sure that Ri ∈ [μ..τ], its value is converted as Ri = (x + μ) modulo τ. Gene2 encodes an integer id_ps0 that indicates the profile ps0. The latter is recovered by converting the binary code of Gene2 to an integer x. The ps0 index must be in [1..Rs − (μ − 1)], thus id_ps0 = (x + 1) modulo (Rs − (μ − 1)). Decoding the Gene3 information is the most costly operation in the decoding process, since exploring all the subsets of profiles of the system P is computationally expensive. Based on the lattice ranges and the two linear orders ⪯ and ⊑, we propose a traversal strategy that finds Ps by generating only a few profile subsets. First, we convert the binary information into an integer that indicates the index id_Ps−p0 of the subset of profiles Ps in the linearly ordered chain of profile subsets that share the same ps0. Then, from this order and the subset index id_Ps−p0, we generate the appropriate Ps. Now, an interesting question is "How to do this?".

Assume Ri ∈ [μ..τ], a ps0 index in [1..Rs − (μ − 1)], and an index id_Ps−p0 of the subset of profiles Ps. Ps can be generated as follows. First, we generate Ps with only the index of ps0 (i.e., Ps = {ps0, ?, ?, ..., ?}). Then, following Algorithm 1, we seek the indices of the profiles psi according to their position in Ps w.r.t. ⪯, the linear order between profiles. Step 5 of the algorithm shows that the index value at position i is obtained by increasing the index value at position (i − 1) by k. At this stage, the function get_k (see Algorithm 2) is used for calculating the k value, where POP is the size of the considered population and NPOS is the number of places in Ps that are considered for combinations.

Algorithm 1 Generating a subset of profiles Ps.
Input: id_ps0, id_Ps−p0, Ri, and Rs.
Output: a set of profiles Ps.
1: Ps ← {id_ps0};
2: idpos ← 0;
3: i ← 0;
4: while i < Ri − 2 do
5:    Ps ← Ps ∪ (Ps[i] + get_k(idpos, id_Ps−p0, (Rs − Ps[i]), (Ri − (i + 1))))
6:       ⊳ idpos is passed by reference
7:    i ← i + 1
8: end while
9: if idpos = id_Ps−p0 then
10:   Ps ← Ps ∪ (Ps[i] + 1)
11:   return Ps
12: end if
13: Ps ← Ps ∪ (Ps[Ri − 2] + (id_Ps−p0 − idpos))
14: return Ps
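Algorithm 1, together with get_k (Algorithm 2 below), recovers Ps from id_Ps−p0 by skipping whole blocks of the lexicographic chain. As an independent illustration of the same idea, the following standard combinatorial unranking sketch (not a transcription of Algorithms 1 and 2; it uses 0-based ranks, and profile identifiers are plain integers) recovers a subset directly from its position in the chain.

from math import comb

def unrank_subset(first, rank, size, num_profiles):
    # Profiles are the integers 1..num_profiles; the returned subset has the
    # given size, starts at `first`, and is the `rank`-th (0-based) subset in
    # the lexicographic order of all such subsets.
    subset = [first]
    prev = first
    remaining = size - 1
    while remaining > 0:
        for cand in range(prev + 1, num_profiles + 1):
            # number of subsets whose next element is `cand`
            block = comb(num_profiles - cand, remaining - 1)
            if rank < block:
                subset.append(cand)
                prev = cand
                remaining -= 1
                break
            rank -= block
    return subset

print(unrank_subset(3, 0, 6, 10))   # [3, 4, 5, 6, 7, 8]   (first subset of the chain)
print(unrank_subset(3, 16, 6, 10))  # [3, 5, 6, 7, 8, 10]
print(unrank_subset(3, 20, 6, 10))  # [3, 6, 7, 8, 9, 10]  (last of the 21 subsets)

With this 0-based convention, the subset {3, 5, 6, 7, 8, 10} discussed in Example 2 below sits at rank 16 of the 21-element chain that starts at profile 3.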

The k value is the maximal integer that satisfies the following inequality, according to a given position i > 0 and i ≥ (Ri − 2):

    Σ_k (Rs − (id_p(j−1) + k))! / ((Ri − j)! · ((Rs − (id_p(j−1) + k)) − (Ri − j))!) ≤ id_Ps−p0    (2)

where id_p(j−1) is the profile indicator discovered for the position j − 1.

Algorithm 2 The get_k function.
Input: idpos, id_Ps−p0, POP, and NPOS.
Output: an integer k.
1: Cidpos ← idpos;
2: k ← 1;
3: i ← 0;
4: while Cidpos ≤ id_Ps−p0 do
5:    idpos ← Cidpos;
6:    Cidpos ← Cidpos + (POP − k)! / ((NPOS − 1)! · ((POP − k) − (NPOS − 1))!);
7:    k ← k + 1;
8: end while
9: if k = 1 then
10:   return k;
11: end if
12: return k − 1

Fig. 4. The powerset system of P = {p1, p2, p3, ..., pn}.

Example 2. Assume the powerset system of Fig. 4 with Rs = 10, and suppose we want to generate the subset Ps1 whose range is Ri = 6. According to Ri = 6, id_ps0 = 3 and id_Ps−p0 = 17. For these arguments, there are 21 possible subsets of profiles that start at id_ps0 = 3. To reach id_Ps−p0 from Ps = {3, ?, ?, ?, ?, ?}, step 5 of Algorithm 1 invokes get_k in a loop for i values from 0 to 3. For these values, the passed arguments are:

i = 0: for (idpos = 0, id_Ps−p0 = 17, POP = 7, NPOS = 5) we get k = 2 and idpos = 15, thus Ps = {3, 5, ?, ?, ?, ?}.
i = 1: for (idpos = 15, id_Ps−p0 = 17, POP = 6, NPOS = 4) we get k = 1 and idpos = 15, thus Ps = {3, 5, 6, ?, ?, ?}.
i = 2: for (idpos = 15, id_Ps−p0 = 17, POP = 5, NPOS = 3) we get k = 1 and idpos = 15, thus Ps = {3, 5, 6, 7, ?, ?}.
i = 3: for (idpos = 15, id_Ps−p0 = 17, POP = 4, NPOS = 2) we get k = 1 and idpos = 15, thus Ps = {3, 5, 6, 7, 8, ?}.

At i = 4 the while loop stops and, at line 13 of Algorithm 1, the last profile identifier is set to the value of Ps[Ri − 2] increased by the difference id_Ps−p0 − idpos. Finally, the subset of profiles that has the identifier 17 is Ps = {3, 5, 6, 7, 8, 10}.

4.2. Selection of best neighbor profiles

The selection phase seeks to pull up the best chromosomes in a population based on the fitness function. The selected chromosomes are used to produce the next population. There exist many selection strategies (Srinivas & Patnaik, 1994). In this work, we use a roulette wheel selection technique that runs as follows: in every generation, half of the current population (noted Population) is selected using the roulette wheel technique (the probability is assigned based on the fitness value) to create new children and form a new population noted offspringPopulation. Then, we create a pool of chromosomes (Pool) by adding the chromosomes of Population to offspringPopulation. Hence, the created Pool consists of 150% of the size of the actual population (150 chromosomes for Npop = 100). Next, we apply the roulette wheel technique again to select up to Npop chromosomes from the 150% of chromosomes in the Pool. This operation is noted Elitist, and it allows selecting a good new population for the next iteration. Subsequently, the worst chromosomes are removed and replaced with new ones.

4.2.1. Fitness function

In RSs, the recommendations for each user are expected to match the user's profile as well as possible, in order to satisfy the user's expectations and gain his interest. Achieving such a goal requires an appropriate and personalized subset of reference users that needs to be carefully selected from the reference database. To this end, we define Similarity as a measure for computing the similarity between a query user u and the candidate solution of users Gu selected by our GA, which could thus be used as a reference subset for deriving the recommendations. Similarity is computed as follows:

    Similarity(u, Gu) = (1/|Gu|) · Σ_{g∈Gu} δ(u, g)    (3)

where g is a candidate user in Gu and δ is a distance measure (such as Euclidean distance, Pearson correlation, COS and ACOS) that takes values in [0, 1] and is computed over the ratings of the users u and g on the shared items.

We also define Diversity for measuring the intra-group diversity of the candidate solution Gu. Diversity is computed as follows:

    Diversity(Gu) = 1 − (1/|Gu|) · Σ_{g,v∈Gu, g≠v} γ(g, v)    (4)

where γ is a distance function (such as the Tanimoto coefficient and individual diversity) that takes values in [0, 1] and is computed over the sets of items present in the profiles of the users g and v. Note that for Similarity and Diversity, {u} ∩ Gu = ∅. One would expect the candidate solution of reference users Gu to be as close as possible to the query user's profile while at the same time guaranteeing intra-group diversity. We propose Group-fitness, a function that computes the fitness of a candidate solution Gu based on a combination of Similarity and Diversity such that the selected G∗u presents the best trade-off between both aspects with respect to the query user u. Formally, Group-fitness is defined as follows:

    Group-fitness(Gu) = α · Similarity + (1 − α) · Diversity    (5)

Note that the parameter α allows personalizing the solution G∗u by adding flexibility to the fitness function Group-fitness.
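As a minimal illustration of Eqs. (3)-(5), the sketch below assumes each user's ratings are available as a dictionary mapping items to ratings, takes δ as a [0, 1] closeness derived from the Euclidean distance over co-rated items and γ as the Tanimoto coefficient over rated-item sets (the measures reported in Section 5), and averages Eq. (4) over member pairs; all function names are illustrative.

import math

def similarity(active_ratings, group, all_ratings):
    # Eq. (3): mean closeness between the active user and the group members.
    def delta(r1, r2):
        shared = set(r1) & set(r2)
        if not shared:
            return 0.0
        dist = math.sqrt(sum((r1[i] - r2[i]) ** 2 for i in shared))
        return 1.0 / (1.0 + dist)           # map the distance into [0, 1]
    return sum(delta(active_ratings, all_ratings[g]) for g in group) / len(group)

def diversity(group, all_ratings):
    # Eq. (4): 1 minus the mean pairwise item-set overlap (Tanimoto coefficient).
    members = list(group)
    if len(members) < 2:
        return 1.0                           # degenerate group: maximal diversity
    def tanimoto(a, b):
        sa, sb = set(all_ratings[a]), set(all_ratings[b])
        return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0
    pairs = [(g, v) for i, g in enumerate(members) for v in members[i + 1:]]
    return 1.0 - sum(tanimoto(g, v) for g, v in pairs) / len(pairs)

def group_fitness(active_ratings, group, all_ratings, alpha=0.5):
    # Eq. (5): trade-off between similarity to the active user and intra-group diversity.
    return (alpha * similarity(active_ratings, group, all_ratings)
            + (1 - alpha) * diversity(group, all_ratings))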

The selected G∗u is the one that maximizes Group-fitness with respect to α:

    G∗u = argmax_{Gu} Group-fitness(Gu)    (6)

4.3. Genetic operators

Crossover: This operator is applied to two randomly chosen individuals of the population. As a result, it gives chromosomes formed from the characteristics of both parents; two children are generated for the next generation. The crossover percentage is set to 50%, and one-point crossover is applied.

Mutation: This operator is applied to an individual by modifying one or more of its genes. The modified genes are chosen randomly from the parent chromosome, forming one new child. The mutation percentage in our case is set to 1%. This ratio defines the probability of replacing a bit by another, randomly and without interaction with other chromosomes.

4.4. The stopping criteria

The algorithm stops at a predefined maximum number of iterations, which we set to 50 in this work.

5. Experimental evaluation

The proposed evolutionary approach is designed to provide an improved RS by giving the most appropriate k-nearest neighbors for a given active user. We note that our approach was implemented using the Mahout framework (Owen, Anil, Dunning, & Friedman, 2011). To demonstrate its efficiency, we validate it in two different scenarios. In the first one, we evaluate it on the benchmark movie database from Grouplens (the Movielens 100K and 1M datasets, http://www.grouplens.org/datasets/movielens/). The Movielens datasets are among the most widely used benchmarks for evaluating the performance of recommender systems. In the second scenario, we evaluate the proposed approach on a real-world application in the insurance field. In both scenarios, recommendations should be adequate for customers in terms of personalization, diversity and novelty.

For our approach, we set: (a) the minimum and maximum profile subset sizes (μ and τ) to 20 and 50, respectively; (b) the Euclidean distance as the similarity measure used to find similarities between the active user and the other profiles; and (c) the Tanimoto coefficient to calculate the dissimilarities between profiles. The experiments follow two main scenarios. In the first one, we compare the experimental results of our approach with those of simple RSs built with different numbers of neighbors K selected for the KNN-based filtering phase. Under conditions comparable to our approach (i.e., μ = 20 and τ = 50), the used K values are 10, 20, 30, 40 and 50, and the Euclidean distance is used to select the K most similar profiles.

Moreover, we divide this first evaluation into two sub-experiments. In the first one, we perform the evaluation on the Movielens 100K dataset, for which we consider two cases: (a) all users in the dataset, noted All100K, and (b) only a subset of the 100K dataset restricted to the users who have rated more than 100 movies, noted Partial100K. In the second sub-experiment, we demonstrate the effectiveness of the proposed GA method on a larger dataset, namely the Movielens 1M dataset, noted All1M. In the second scenario, we compare the performance of our evolutionary approach on the Partial100K dataset against DiABlO (Gogna & Majumdar, 2017), a recently proposed approach focusing on the accuracy-diversity balance in recommendations, which has shown good performances when users have rated more than 100 movies.

5.1. Datasets and evaluation criteria

The Movielens 100K dataset includes 100,000 ratings given by 943 users on 1682 movies, and the 1M dataset includes 1 million ratings from 6000 users on 4000 movies. Each user has rated at least 20 movies by assigning a discrete rating value from 1 to 5. For each active user, we conduct a five-fold cross validation (5-cv) following a stratified random split of the ratings into five sets, each composed of 20% of the data ratings. For each cv run, 4 sets are concatenated to form the training data and the 5th one is used for testing. We use the training data to select the k-nearest neighbors and the test data to measure the RS performances. Following the procedure described in Gogna and Majumdar (2017), for each test set we carried out 100 random simulations, and the results are the average values over all the runs. The average performances (i.e., Precision, Recall, F-measure, Individual Diversity, and Novelty) of a given user are the averages of the values measured during each cv.

We evaluate each approach in terms of Precision, Recall, F-measure, Individual Diversity, and Novelty. The testing methodology follows the experimental scheme for recommender systems introduced by Cremonesi, Koren, and Turrin (2010) and Park, Park, Jung, and Lee (2015). To evaluate Precision, Recall, and F-measure, testing items are categorized either as relevant to the active user (i.e., a rating equal to or higher than 4), which forms the test set, or as irrelevant, which forms the probe set. At this stage, the test set consists of all the four-star ratings or higher (out of five stars), and the others are in the probe set. Then, for each item hit in the test set (i.e., relevant to the active user), we randomly select 100 items, 80 from the probe set and 20 from the test set, such that probe set ∩ test set = ∅, and recommend the top-10 items from the 101 items (the 100 selected items in addition to the hit).

To compute the diversity of the recommendations, we follow the individual diversity formula (7) introduced by Zhang and Hurley (2009) and the novelty formula (9) introduced by Vargas (2014). Individual diversity (ID) measures the diversity of the items recommended to users (Gogna & Majumdar, 2017):

    Individual Diversity = (1/|Users|) · Σ_{u∈Users} [ Σ_{y∈RL(u)} Σ_{x∈RL(u)} (1 − sim(x, y)) / (|RL| · (|RL| − 1)) ]    (7)

where |Users| is the number of users in the data, RL(u) is the recommendation list of user u, |RL| is the length of the recommendation list, and sim(x, y) is the similarity between two items x and y that belong to the same RL(u). We measure sim(x, y) using the Pearson correlation coefficient between x and y. To have the similarity function bounded between 0 and 1, we calculate the angular distance as follows:

    distance(x, y) = cos⁻¹(Pearson(x, y)) / π
    sim(x, y) = 1 − distance(x, y)    (8)

Novelty is considered for testing the newness of the items recommended by the proposed approach. It is measured as:

    Novelty = (1/|Users|) · Σ_{u∈Users} [ Σ_{x∈RL(u)} (|Users| / #x) / |RL| ]    (9)

where #x denotes the number of ratings of the item x in the training data.
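A minimal sketch of Eqs. (7)-(9) follows; rec_lists, rating_counts and angular_sim are illustrative names, every recommendation list is assumed to contain at least two items, and Eq. (9) is implemented literally as printed (without any logarithmic damping of item popularity).

import math

def angular_sim(pearson_xy):
    # Eq. (8): map a Pearson correlation in [-1, 1] to a similarity in [0, 1].
    return 1.0 - math.acos(max(-1.0, min(1.0, pearson_xy))) / math.pi

def individual_diversity(rec_lists, sim):
    # Eq. (7): mean pairwise dissimilarity inside each user's recommendation list.
    # rec_lists maps a user to its list of recommended items; sim(x, y) is in [0, 1].
    total = 0.0
    for items in rec_lists.values():
        n = len(items)
        total += sum(1.0 - sim(x, y) for x in items for y in items if x != y) / (n * (n - 1))
    return total / len(rec_lists)

def novelty(rec_lists, rating_counts):
    # Eq. (9): mean inverse popularity of the recommended items;
    # rating_counts[x] is the number of ratings of item x in the training data.
    n_users = len(rec_lists)
    total = 0.0
    for items in rec_lists.values():
        total += sum(n_users / rating_counts[x] for x in items) / len(items)
    return total / n_users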

As previously mentioned, to validate the effectiveness of the proposed GA approach (first scenario), we considered a comparison against five configuration settings of a simple RS based on k-nearest neighbors (KNN) filtering, reported in Table 1. The statistics of the datasets considered in the experimental process are shown in Table 2. Second, we compare the performances against DiABlO by considering only the Partial100K data.

Table 1
The used classical RS configurations.

Configuration    # of considered nearest neighbors
conf1            10
conf2            20
conf3            30
conf4            40
conf5            50

Table 2
Dataset statistics.

Movielens 100K    # Users    # Items    # Ratings
All100K           943        1682       100,000
Partial100K       364        1668       74,522
All1M             6000       4000       1,000,000

5.2. Results and discussion

Tables 3-5 list the results obtained by our approach and by the five configurations of the simple RS (see Table 1), based on normalized prediction results. Table 3 lists Precision, Recall, F-measure, Individual Diversity and Novelty for the All100K dataset.

Table 3
Comparison of our GA approach against the simple RS on the All100K dataset.

         Precision    Recall    F-measure    ID        Novelty
GA       0.9126       0.4356    0.5896       0.4496    2.6897
conf1    0.2545       0.2229    0.2081       0.1878    0.7606
conf2    0.3526       0.2551    0.2867       0.2319    1.5048
conf3    0.4080       0.2817    0.3225       0.2426    1.8942
conf4    0.4194       0.2870    0.3298       0.2747    1.9456
conf5    0.5669       0.3574    0.4251       0.3333    2.8698

Additionally, we notice that the proposed GA approach selected neighbor sets composed of 20 to 37 users, with an average value of 22.9060. The Precision, Recall, and F-measure results show that collaborative filtering produces better results using the GA-selected user groups than using the classical RSs (i.e., KNN). Moreover, the GA results outperform those of the classical RSs in terms of Individual Diversity. Yet, the GA showed lower results in terms of Novelty. This could be due to the diversity of the GA neighbor sets and the large number of neighbors used by the RS in conf5 (k = 50). As illustrated in Fig. 5, the classical RS results show that the performances increase with the number of nearest neighbors for Precision, Recall, and F-measure. However, the GA approach achieves better performances using more restricted neighbor groups, which reflects the high quality of the neighbor groups selected by the GA approach.

In Fig. 6, we illustrate the Individual Diversity and Novelty results, which show a balanced behavior of the GA approach between the recommendation of popular and unpopular items.

Table 4 lists the results for the Partial100K dataset. The different approaches show a behavior similar to that observed on the All100K dataset. The GA approach shows good results and, as illustrated in Fig. 7, we see once again that our GA approach outperforms the classic RS approach in terms of Precision, Recall, and F-measure. Fig. 8 illustrates the Individual Diversity and Novelty results for the Partial100K dataset, where we can clearly see that the GA approach is more balanced than the classical RSs, which lean either towards recommending unpopular items (i.e., conf2-conf5) or towards recommending only popular items (i.e., conf1).

Table 4
Comparison of the GA approach against the simple RS on the Partial100K dataset.

         Precision    Recall    F-measure    ID        Novelty
GA       0.8740       0.4165    0.5641       0.4430    2.0608
conf1    0.1965       0.0935    0.1267       0.0393    1.0954
conf2    0.4690       0.2242    0.3034       0.3014    3.3701
conf3    0.5503       0.2622    0.3551       0.3019    3.5637
conf4    0.7068       0.3378    0.4571       0.4342    4.3605
conf5    0.8151       0.3889    0.5266       0.4385    4.4718

5.2.1. Results on the 1M Movielens dataset

In this section, we evaluate the performances of our approach against the classical RSs on an extended-coverage Movielens dataset (All1M). Table 5 lists the obtained results. The results clearly show that the GA approach outperforms the classical RSs in terms of Precision, Recall, F-measure, and Individual Diversity. However, in terms of Novelty, the classical RS (conf5) and our approach showed very close results. We notice that the proposed GA approach selected neighbor sets composed of 21 to 34 users, with an average value of 26.875, against 50 users for KNN (conf5). This shows again that, even for large datasets, our approach is able to select a high-quality group of profiles that allows a better trade-off between similarity and diversity compared to the classical RSs.

Table 5
Comparison of our GA approach against the simple RS on the All1M dataset.

         Precision    Recall    F-measure    ID        Novelty
GA       0.9423       0.4488    0.6082       0.4522    3.1005
conf1    0.0984       0.0464    0.0632       0.0128    0.6312
conf2    0.2216       0.1048    0.1416       0.0528    1.4040
conf3    0.2976       0.1416    0.1920       0.1488    2.6048
conf4    0.3816       0.1824    0.2464       0.2040    3.0584
conf5    0.43544      0.2096    0.2832       0.2296    3.2608

5.2.2. Results using different similarity measures

In order to provide a complete evaluation of our approach, we measure its performance using six similarity measures other than the Euclidean distance: City Block, Log Likelihood, Pearson Correlation, Spearman Correlation, Tanimoto Coefficient and Uncentered Cosine. We evaluate the performances of our approach against those of the classical RSs.

The obtained results are listed in Table 6. They clearly confirm the experimental results previously obtained with the Euclidean distance. These results demonstrate that, with any of these similarity metrics, the GA approach achieves the best performances in terms of Precision, Recall, F-measure, ID and Novelty compared to the five KNN configurations (conf1 to conf5). We remarked that KNN (conf5) outperforms our approach with Pearson Correlation, Log Likelihood and Tanimoto Coefficient only in terms of Novelty. Thus, once again, the proposed approach proves that it maximizes both precision and diversity even when changing the similarity metric.

5.2.3. Comparison with other evolutionary approaches

Before concluding the first scenario, we compare our results with those of DiABlO (Gogna & Majumdar, 2017) on the Partial100K dataset in terms of Precision, Individual Diversity and Novelty. DiABlO is a recently proposed recommendation algorithm that attempts to achieve a high balance between diversity and accuracy. We note that the DiABlO results are obtained with the default configuration, where we only modified the item diversity matrix; the similarity of each item pair is set according to the similarity formula defined in Eq. (8). In Table 7, we report the results of the GA approach and of DiABlO. For this experimentation, we considered three DiABlO results: the one that shows the best Precision, noted bestPr, the one that shows the best Individual Diversity, noted bestID, and the one that shows the average value.

Fig. 5. GA performances against classical RS in terms of Precision, Recall, and F-measure on the All100K dataset.

Fig. 6. GA performances against classical RS in terms of Individual Diversity and Novelty on the All100K dataset.

Fig. 7. GA performances against classical RS in terms of Precision, Recall, and F-measure on the Partial100K dataset.

Fig. 8. GA performances against classical RS in terms of Individual Diversity and Novelty on the Partial100K dataset.

Table 6
Results of GA and simple RS on All100K using different similarity measures.

Measure                 Approach    Precision    Recall    F-measure    ID        Novelty
City Block              conf1       0.1524       0.0735    0.0992       0.0519    0.9227
                        conf2       0.2412       0.1156    0.1562       0.0995    1.5135
                        conf3       0.3073       0.1468    0.1986       0.1239    1.9061
                        conf4       0.3420       0.1638    0.2214       0.1351    2.1333
                        conf5       0.3795       0.1815    0.2455       0.1463    2.2999
                        GA          0.9017       0.4304    0.5825       0.4513    2.6607
Log Likelihood          conf1       0.2930       0.1402    0.1896       0.1233    2.5599
                        conf2       0.4926       0.2352    0.3184       0.2555    3.9269
                        conf3       0.5910       0.2818    0.3816       0.3137    4.5055
                        conf4       0.6542       0.3122    0.4227       0.3395    4.7542
                        conf5       0.6908       0.3299    0.4465       0.3492    4.8505
                        GA          0.9179       0.4380    0.5930       0.4480    2.8094
Pearson Correlation     conf1       0.1004       0.0482    0.0651       0.0254    0.7627
                        conf2       0.2135       0.1016    0.1377       0.0751    1.5450
                        conf3       0.2977       0.1420    0.1923       0.1199    2.1117
                        conf4       0.3723       0.1775    0.2404       0.1593    2.6098
                        conf5       0.4408       0.2102    0.2847       0.1933    3.0071
                        GA          0.8427       0.4029    0.5451       0.4471    2.8622
Spearman Correlation    conf1       0.0950       0.0455    0.0615       0.0232    0.6870
                        conf2       0.1935       0.0922    0.1249       0.0699    1.4008
                        conf3       0.2747       0.1312    0.1776       0.1084    1.9514
                        conf4       0.3450       0.1643    0.2226       0.1384    2.3788
                        conf5       0.3979       0.1899    0.2571       0.1650    2.7177
                        GA          0.8137       0.3889    0.5262       0.4498    2.9120
Tanimoto Coefficient    conf1       0.2930       0.1402    0.1896       0.1233    2.5599
                        conf2       0.4926       0.2352    0.3184       0.2555    3.9269
                        conf3       0.5910       0.2818    0.3816       0.3137    4.5055
                        conf4       0.6542       0.3122    0.4227       0.3395    4.7542
                        conf5       0.6908       0.3299    0.4465       0.3492    4.8505
                        GA          0.8966       0.4284    0.5797       0.4486    2.7817
Uncentered Cosine       conf1       0.0926       0.0441    0.0598       0.0248    0.6707
                        conf2       0.1899       0.0908    0.1228       0.0673    1.3165
                        conf3       0.2662       0.1272    0.1721       0.1026    1.8185
                        conf4       0.3313       0.1582    0.2142       0.1300    2.2034
                        conf5       0.3834       0.1833    0.2481       0.1562    2.5434
                        GA          0.9097       0.4337    0.5873       0.4493    2.7231

Table 7
Comparison of the GA approach against DiABlO on the Partial100K dataset.

                    Precision    ID        Novelty
GA        Mean      0.8740       0.4430    2.0608
DiABlO    bestPr    0.7687       0.4376    1.5276
          bestID    0.7415       0.4636    1.7376
          Mean      0.7538       0.4542    1.6554

Compared to DiABlO, the GA approach gives a higher Precision (11% higher than bestPr and 12% higher than the averaged Precision). Its Individual Diversity is lower by a comparatively small margin (3% lower than bestID and 2% lower than the averaged Individual Diversity), whereas its Novelty is higher than that of bestID by 0.3 and higher than the averaged Novelty of DiABlO by 0.4. Thus, the GA proposed in this work shows the best behavior in terms of Precision and Novelty, and approximately equivalent results in terms of Individual Diversity compared to DiABlO.
Mean 0.7538 0.4542 1.6554 higher than averaged Novelty of DiABlO by 0.4. Thus, the proposed

GA in this work shows the best behavior in terms of Precision and


Novelty, and approximately equivalent results in terms of Novelty
as compared to DiABlO.

5.3. A real-world application for the insurance field

Insurance in Algeria has taken an important place in economic


life. Its relation is now well established with all the activities that
rely on it very often. In addition to the guarantees it offers, insur-
ance provides the economy with significant investments favorable
to its development. An insurance company is ready to do anything
to satisfy the slightest demand from its customers. As a result, it is
important for an insurer to know the desire of his insured clients.
Surveys are performed by societies to answer these questions. Of
course ordering a study comes at a cost but insurers are ready to
pay from the moment they can better know the necessities of their
customers. Fig. 9. GA performances against classical RS in terms of number of recommenda-
tions.
Introducing a new system that helps customers to find new
personalized products present a great asset in the insurance field
to satisfy the insurer’s aim by enhancing sales. While expert sys-
tems try to extract human expertise in a particular domain such
as medical diagnosis, recommender systems try to expect a fu-
ture result based on past experiences encapsulated in the dataset.
Furthermore, from a commercial standpoint, recommender systems
proved their capacity to improve cross-sell by suggesting additional
products for the customer to purchase. Thus, our aim is to leverage
the community experiences in the insurance field and propose a
new recommender system that learns from customers and recom-
mends products that they will find relevant among the available
choices. It is also important that the proposed recommendations
be adequate to customers in terms of personalization, diversity and
novelty.

5.3.1. Dataset
Fig. 10. Comparison between surprising and ordinary items in terms of acceptance
We have collected a dataset that includes 239 ratings by 100 and rejection.
customers gathered from 17 Algerian insurance companies. The
customers are invited to rate the products that are purchased from
the insurance companies by assigning a discrete rate value of 1 to to discover more new products and so to probably select more of
5. The number of products in the database is 26. Each customer them to enrich their profiles.
can be affected to one or various companies. To evaluate the performance of our approach in terms of nov-
• Products: Any risk that can be estimated can be insured. An elty and diversity, we asked each user to indicate his satisfaction
example of a type of products is vehicle insurance that in gen- (i.e., pleased or surprised) about the recommended products. The
eral covers both the property risk and the liability risk. A home feedbacks of our 100 customers are presented in Fig. 10.
insurance policy in general comprises coverage for damage to We can conclude that the precision of our GA outperforms 60%
the house and the owner’s belonging and so on. The number of which means that the participated customers have appreciated the
the collected products in this experiment is 26. recommended products.
• Participants (customers): We have asked 100 individuals to Through this experiment, we also discuss our algorithm in
participate in our experiment. All the participants are cus- terms of its ability to generate surprising recommendations. Ac-
tomers in one of the 17 Algerian insurance companies. All these cording to the state of the art, results validate the intuition that
members are asked to rate the risk which they have already in- classic recommender systems could generate accurate results but
sured in the past. In another term, we want to know the opin- unfortunately least surprising recommendations. Results presented
ion of these customers about their experience in the insurance in Fig. 10 show that 70% of the recommended items surprised the
field, especially about the insured risks. participant customers. This means that the proposed GA performs
5.3.2. Experimental results

To validate the efficiency of our approach in the insurance domain, we involve the customer community in the evaluation process, since computing the metrics used in the estimation requires knowledge of the customers' feedback about the recommended products. This allows us to compare what they prefer with the results provided by our recommender system. Thus, to estimate the efficiency of the proposed approach, we invited the customers again to judge the relevance of the recommended products. The results presented in Fig. 9 demonstrate that the number of recommendations obtained by our GA is higher than the numbers recommended by the five other configurations. This allows customers to discover more new products and thus, probably, to select more of them to enrich their profiles.

To evaluate the performance of our approach in terms of novelty and diversity, we asked each user to indicate his satisfaction (i.e., pleased or surprised) with the recommended products. The feedback of our 100 customers is presented in Fig. 10.

[Fig. 10. Comparison between surprising and ordinary items in terms of acceptance and rejection.]

We can conclude that the precision of our GA exceeds 60%, which means that the participating customers appreciated the recommended products.

Through this experiment, we also assess our algorithm in terms of its ability to generate surprising recommendations. In line with the state of the art, the results validate the intuition that classic recommender systems can generate accurate but rarely surprising recommendations. The results presented in Fig. 10 show that 70% of the recommended items surprised the participating customers, which means that the proposed GA performs well in terms of surprise and accuracy simultaneously. These are quite encouraging results, showing that our approach proposes recommendations that are adapted to the customers' profiles and that truly help them when searching for new, surprising products.

The results presented in Fig. 11 report the number of recommended products produced by our GA from four points of view: Accepted & surprising (denoted Ac&S), Accepted & ordinary (denoted Ac&O), Rejected & surprising (denoted Re&S), and Rejected & ordinary (denoted Re&O). By analyzing this figure, we conclude that our approach achieves a good result in terms of accuracy and surprise when seeking a compromise between precision and novelty in the recommendations. Note also that the number of ordinary recommendations that are rejected is the smallest.

[Fig. 11. The number of recommended products produced by the GA from four different points of view.]
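As a worked illustration of how the acceptance and surprise percentages discussed above can be derived from the four categories of Fig. 11, the short sketch below computes both rates from raw counts. The counts are placeholders chosen for the example, not the values reported in this study.

```python
# Placeholder counts for the four feedback categories of Fig. 11
# (illustrative values only, not the figures reported in this study).
feedback = {
    "Ac&S": 45,  # accepted and surprising
    "Ac&O": 20,  # accepted and ordinary
    "Re&S": 25,  # rejected and surprising
    "Re&O": 10,  # rejected and ordinary
}

total = sum(feedback.values())

# Acceptance rate: share of recommendations the customers accepted,
# i.e. the precision proxy used in this experiment.
acceptance_rate = (feedback["Ac&S"] + feedback["Ac&O"]) / total

# Surprise rate: share of recommendations judged surprising,
# whether accepted or rejected.
surprise_rate = (feedback["Ac&S"] + feedback["Re&S"]) / total

print(f"acceptance (precision proxy): {acceptance_rate:.0%}")
print(f"surprise: {surprise_rate:.0%}")
```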
5.3.3. General discussion

The results of our experiments are promising in terms of accuracy and surprise. This means that using a genetic algorithm to make recommendations in the insurance field has proven effective for enriching customers' profiles. The proposed approach also overcomes the limitation of content over-specialization, which prevents recommending relevant items that differ from the ones the user already knows. We demonstrated the efficiency of our approach in terms of enhancing diversity when recommending new products to customers in a real-world application from the insurance field.

6. Conclusion

The popularity of collaborative filtering in recommender systems is strongly related to its simplicity, ease of understanding, and close resemblance to human reasoning. However, choosing the appropriate parameters is not a straightforward process. The main difficulty lies in selecting, for each active user, the best neighbor profiles to be used to estimate new ratings. Applying existing similarity metrics over the whole set of users' profiles is not very efficient, and adopting clustering techniques requires attention to issues such as the number of desired partitions and the selection of the starting centroids. In this paper, we proposed to use evolutionary computation to select an optimal group of neighbor profiles to be used for recommendation to an active user. The proposed genetic algorithm offers two advantages: (a) it alleviates the setting problems related to clustering approaches or to selecting the N most similar profiles from the entire dataset, and (b) it selects a group whose individuals differ from one another, which guarantees diversity. These two advantages allow achieving personalized recommendations for an active user while preserving a balance between accuracy and diversity. Experimental results on Movielens 100k showed that the proposed approach suggests movie lists with very high precision, recall and F-measure by efficiently selecting the most appropriate group of neighbors. Moreover, the proposed GA scheme was applied to a real-world insurance application, and the results showed the great potential of recommender systems for improving the insurance field.
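For readers who want a concrete picture of the two objectives summarized above, the sketch below scores a candidate neighbor group by (i) its average similarity to the active user and (ii) its average pairwise dissimilarity. It is an illustrative sketch under assumed choices (cosine similarity over co-rated items, simple averaging), not the exact fitness formulation of the proposed GA.

```python
import math

def cosine_sim(u, v):
    """Cosine similarity between two sparse rating dicts, over co-rated items only."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def group_objectives(active_user, group, ratings):
    """Score a candidate neighbor group on the two objectives.

    Returns (similarity, diversity):
      similarity -- mean similarity between the active user and the group members;
      diversity  -- mean pairwise dissimilarity (1 - similarity) inside the group.
    """
    sims = [cosine_sim(ratings[active_user], ratings[m]) for m in group]
    similarity = sum(sims) / len(sims) if sims else 0.0
    pairs = [(a, b) for i, a in enumerate(group) for b in group[i + 1:]]
    diversity = (sum(1.0 - cosine_sim(ratings[a], ratings[b]) for a, b in pairs) / len(pairs)
                 if pairs else 0.0)
    return similarity, diversity

# Toy usage with three customers rating products on a 1-5 scale.
toy_ratings = {
    "u1": {"p1": 5, "p2": 3, "p3": 1},
    "u2": {"p1": 4, "p2": 1, "p3": 5},
    "u3": {"p1": 1, "p2": 5, "p3": 4},
}
print(group_objectives("u1", ["u2", "u3"], toy_ratings))
```

A multi-objective optimizer such as NSGA-II could then search for neighbor groups that trade off these two scores rather than collapsing them into a single weighted sum.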