You are on page 1of 20

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/343417775

PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE


ART REVIEW

Article · August 2020

CITATIONS READS

0 536

4 authors, including:

Lakshmi Chetana Syed Ibrahim


MIC College of Technology VIT University , Chennai Campus
12 PUBLICATIONS   2 CITATIONS    33 PUBLICATIONS   134 CITATIONS   

SEE PROFILE SEE PROFILE

Some of the authors of this publication are also working on these related projects:

Driver Behavior Recognition Based on Real-Time Driving Information View project

Recommendation System, Real time Classification, Real time closed itemset mining View project

All content following this page was uploaded by Lakshmi Chetana on 04 August 2020.

The user has requested enhancement of the downloaded file.


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

PERSONALIZATION USING RECOMMENDATION SYSTEMS:


STATE-OF-THE ART REVIEW

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree

Asst. Prof., DVR & Dr.HS MIC College of Technology, Kanchikacherla


Professor, School of ComputingScience and Engineering, VIT University-Chennai Campus,
Chennai
Asst. Prof., Computer Science Department, KakatiyaUniversity, Hanamkonda
Professor, Dept. of CSE, Shri Vishnu Engineering College for Women, Bhimavaram
*Corresponding author:mailtochetana@gmail.com

ABSTRACT:
In the era of internet, the proliferation of services will be a great challenge to the users in
selecting the best and optimal services that suit the user requirement. So there is a need to filter,
prioritize and efficiently deliver relevant information according to the user requirement.
Recommendation systems rescue the users by selecting the required content after searching
through large volumes of services. Recommendation system predicts user responses to options
using diverse prediction techniques like content-based system, collaborative filtering system
and hybrid system. This paper, in detail, explores different characteristics, approaches,
evaluation metrics and potentials of different prediction techniques in recommendation systems
that endorse the quality of services for the users.

Keywords: Content-based Recommendation system, Collaborative Recommendation system,


Hybrid Recommendation system, Evaluating Recommendation system, Evaluation metrics,
Similarity measures, Recommendation systems.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 56


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

[1] INTRODUCTION
Increase in the growth of available information online has created the problems of information
overload and paradox of choice. Most of the e-commerce sites uses personalization and
prioritization which provides content and services tailored to individuals based on knowledge
about their preferences and behavior. Recommendation systems are the important tools to solve
the challenges of information overload and paradox of choice.
The objective of the recommendation system is to create meaningful predictions to the
users for products or items of their interest. There are multitude ways to find the interests of
the users like tags, reviews, likes and number of visits for a page or items or blogs.It is the
process of finding what the customer needs, either data or an item [1].Some of the example
recommender systems are Ringo, Pandora, Last.fm for music , Netflix,Sceneami.com for
movies , YouTube, IMDb for videos, LinkedIn for jobs, Facebook, Twitter for friends, Amazon
for books, OKCUPID for dating and ACM, Springer, IEEE, ScienceDirect etc for published
papers.
The point that differentiates the recommender system with search engines is,
recommender systems try to present relevant content that the user did not necessarily search
for and which they have not even heard about. The goal of the recommendation system is to
make legitimate recommendations for the user which the user has not seen or to provide users
with item lists ranked in accordance with the estimated preferences. The recommendation
systems are mostly commonly used in two scenarios. First, when there are a very large number
of items available for products from Amazon, Movies from Netflix, Music from Pandora, news
items on Google news etc, it becomes difficult for the user to find what they want. Searching
is helpful when the users know what they are looking for.In this situation, the recommender
system recommends items that the user may not know about or hard-to-find or non-hit items.
The total sale of such items is called long-tail. [Figure-1] represents the long-tail phenomenon.

Figure: 1. Long-tail Phenomenon

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 57


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

Nomenclatures
𝑟𝑥,𝑖 Rating given by user x on item i
𝑟𝑦,𝑖 Rating given by user y on item i.
𝑟̅𝑥 Average ratings given by user x
𝑟̅𝑦 Average ratings given by user y
I Set of Set of all items in User-Item space
𝑟̅𝑖 Average ratings on item i given by all users
𝑟̅𝑗 Average ratings on item j given by all users
U Set of all users in User-Item space
𝑟𝑢,𝑖 Rating given by user u for item i
𝑟𝑢,𝑗 Rating given by user u for item j
𝑃𝑎,𝑖 Predicted rating for active user a on item i
𝑤𝑎,𝑥 Similarity coefficient between a and x
Abbreviations
CBRS Content-based Recommendation System
CF Collaborative Filtering
CFRS Collaborative Filtering based Recommendation System
SVD Singular Value Decomposition
MF Matrix Factorization
PCA Principle Component Analysis
LSI Latent Semantic Indexing
ALS Alternate Least Squares
SVM Support Vector Machines
MAE Mean Absolute Error
RMSE Rooted Mean Squared Error
NMAE Normalized Mean Absolute Error
ROC Receiver Operating Characteristic
The y-axis represents the popularity i.e., the number of times a product is chosen and the x-
axis represents products ordered according to their popularity. Online sites provide the entire
range of products that are most popular and the rest of the tail.Second, as personal taste plays
an important role in selecting an item, recommendation systems often utilizes the interests of a
group which helps in discovering items based on the similarities.
This paper reviews various techniques used in recommendation systems and is organized
as follows - The second section describes the steps involved in recommender systems like
information gathering, learning and recommendation. Section three explains about various
filtering techniques used to produce recommendations followed by various evaluation metrics
in section four.

[1.1] STEPS IN RECOMMENDER SYSTEMS


The steps that are involved in recommender systems are
1. Information gathering
2. Learning
3. Recommendations

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 58


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

[1.1.1] INFORMATION GATHERING STEP


In this phase, user preferences or interest is collected from different sources like user
profiles,reviews,tags,shares,likes,newsgroup and so on.Reddit.com is a website which collects
votes for an item or newsgroup.Recommender systems depend on different types of input
extracted through.
Explicit feedback :
The system generates an interface for the user to provide ratings or feedback for an item to
improvise itself. The accuracy of the recommender system relies on the quantity of the ratings
given by the user.But the main problem with this method is, the user may not be everytime
ready to give complete information and apt information.
Implicit feedback :
The system implicitly derives the user preferences or interests by monitoring the previous
actions of users likes purchase history, navigation history, a number of visits to a web page,
number of shares and likes in social networking sites, content of the email and so on.[2].It
reduces burden on the user but the problem with it is, it gives less accuracy
Hybrid feedback :
To improve the performance of the system the pros of both explicit and implicit feedback is
combined. Here the system takes explicit feedback only when he is interested in expressing.
[1.1.2] LEARNING STEP
It uses different machine learning algorithms to filter user interests collected in the first step to
predict interests of user's choice. The different learning algorithms that are used are Bayesian,
Decision Tree, Neighborhood-based , Neural Network, Rule learning, Ensemble, Gradient –
Descent based, Kernal methods, Clustering, Associative classification, Lazy learning,
Regularization methods, Topic Independent Scoring Algorithm[3].

[1.1.3] RECOMMENDATION STEP

It predicts what sort of items the user prefers. These predictions are done either by the system
observed activities or by analyzing the data collected in the information collection step. Any
of the learning algorithms may be applied to analyze the input data. [Figure- 2]exhibits the
steps involved in the recommendation process.

Information
Learning Recommendation /Prediction
Collection
Step Step
Step

Feedback

Figure: 2. Recommendation Process Steps

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 59


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

[2] APPROACHES FOR RECOMMENDATION SYSTEMS

Recommendation systems should be efficient in predicting the items as per the user's interests
accurately.The different techniques that are used in filtering the recommendations are shown
in [Figure-3]

Recommender systems

Content - based filtering Hybrid filtering[Content-


Collaborative filtering based+Collaborative filtering]

Model based Hybrid approach[Memory


Memory based approach based+Model based]
approach

User based Item based

Figure:3.Approaches for Recommender Systems

[2.1] CONTENT-BASED APPROACH


Content is the foundation for building recommender applications. These recommendations are
based on item description rather than similarity of other users.There are various contents
available, such as wikis, polls, articles, blogs, classification terms, photos and videos. Metadata
can be extracted by analyzing content. The procedure consists of creating tokens, normalizing
the tokens, approving them, stemming them, and detecting phrases. [1].The major idea of this
approach is to recommend items to customer X similar to the user profile or previous items
purchased or liked. [Figure-4] explains how the content-based recommendations are
generated. The recommender system generates the user and item profiles from the web, where
user profile consists of information regarding their ratings, likes etc and the item profile
consists of a complete description of the items like actors, music director, a genre of the movie
etc. The recommender systems take these as inputs and generate recommendations.
The advantages of content-based recommender systems are – information of other users
is not required, ready to predict items to users with unique tastes.Ready to recommend new and
hard to find items. Some of the disadvantages are - finding the appropriate features is hard,
over specialization, cold start problem for new users, never recommends items outside user's
content profile and scalability.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 60


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

User
profiles

Generates
Web CBRS Recommendations

Documents

Item
Profiles

Figure: 4. Architecture of Content based Recommendation System

[2.2] COLLABORATIVE FILTERING APPROACH


The word “Collaborative Filtering” was first coined by one of the developers of Tapestry[4].
In this approach, recommendations are drawn based on the interests of the crowd. The vital
assumption of collaborative filtering is that if two users A and B rate ‘n’ items similarly or have
similar behaviors then they rate or act on various items similarly[4].This technique uses a data
structure called “User-Item Matrix”.It consists of ratings given by the users for different items
where these ratings usually scale from 1 through 5.Table 1 shows a simple movie database
with some rows (called as records) that describes five movies and some columns (which are as
attributes, fields, characteristics or variables ) that describe the properties like Genre of the
movie and its year of release.

Table 1 Movie Database

ID Movie Name Genre Year of


release

1 Terminator Action, Science Fiction 1984

2 Matrix Action, Science Fiction 1999

3 Final Destination Horror, Thriller 2000

4 Lord of the Rings Adventure, Drama, Fantasy 2001

5 Avatar Action,Adventure,Fantasy 2009

User-Item Matrix which consists of ratings given by different users for the movies mentioned
in the movie database. Table 2 depicts the User-Item Matrix for the movie database.

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 61


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

Table 2 User-Item Matrix

Movie Terminator Lord of Rings Avatar Matrix


U1 3 2
U2 5 3 3
U3 3 ? 2
U4 2.5 3.5
U5 4.5 4 3.5
U6 3 2

Collaborative filtering is classified into two categories: Memory-based and Model-based.

[2.2.1] MEMORY-BASED COLLABORATIVE FILTERING


In this technique, complete or part of the user-item matrix is used to generate recommendations.
This technique is also called as Neighborhood- based CF.Memory based CF is further
classified into user-based CF and item-based CF.In user-based CF, the similarity is calculated
between the users by differentiating their ratings on the same item. In the item-based CF, the
similarity is calculated between items rather than users[2]. To generate recommendations, the
neighbors of the active users are found by calculating the similarity measure and a weighted
aggregate of those user’s ratings are computed. [Figure-5] describes the architecture of
memory-based collaborative filtering.

Figure : 5. Architecture for memory based Collaborative filtering

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 62


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

SIMILARITY MEASURES
There are many similarity computation methods like Jaccard coefficient, Kendall's coefficient,
Manhattan distance, Spearman’s correlation, Cosine, Adjusted Cosine, Euclidian distance,
constrained Correlation, Mean Squared Difference etc among which Pearson's correlation
coefficient and cosine –based similarity are most commonly used.
Pearson’s correlation coefficient is a statistical approach which is used to
measure to which extent the two variables are linearly related. For user-based CF, the Pearson’s
correlation coefficient between user x and user y is given as (1)
∑𝑖∈𝐼(𝑟𝑥,𝑖 −𝑟̅𝑥 )(𝑟𝑦,𝑖 −𝑟̅𝑦 )
𝑤𝑥,𝑦 = (1)
2 2
√∑𝑖∈𝐼(𝑟𝑥,𝑖 −𝑟̅𝑥 ) √∑𝑖∈𝐼(𝑟𝑦,𝑖 −𝑟̅𝑦 )

where 𝑟𝑥,𝑖 is the rating given by user x on item i . 𝑟𝑦,𝑖 is the rating given by user y on item i.
𝑟̅𝑥 is the average ratings given by user x. 𝑟̅𝑦 is the average ratings given by user y and I is the
set of all items in User-Item space.
For item – based CF, the Pearson’s correlation coefficient will be
∑𝑥∈𝑈(𝑟𝑥,𝑖 −𝑟̅𝑖 )(𝑟𝑥,𝑗 −𝑟̅𝑗 )
𝑤𝑖,𝑗 = (2)
√∑𝑥∈𝑈(𝑟𝑥,𝑖 −𝑟̅𝑖 )2 √∑𝑥∈𝑈(𝑟𝑥,𝑗 −𝑟̅𝑗 )2

where𝑟𝑥,𝑖 is the rating given by user x on item i . 𝑟𝑦,𝑖 is the rating given by user y on item i. 𝑟̅𝑖 is
the average ratings on item i given by all users. 𝑟̅𝑗 is the average ratings on item j given by all
users and U is the set of all users in User-Item space.
The other important similarity metric which is most commonly used is cosine
similarity.It is a linear algebraic technique widely used in the fields of text mining and
information retrieval. Documents are represented as vectors(i.e., instead of users or items,
documents are used and instead of ratings, a frequency of words in the document bare used) .
It measures the similarity based on the angle formed between the vectors. For User –based
CF,the similarity between users x and y is defined as
𝑥⃗ . 𝑦
⃗⃗ ∑𝑖∈𝐼 𝑟𝑥,𝑖 𝑟𝑦,𝑖
𝑤𝑥,𝑦 = cos(𝑥⃗, 𝑦⃗) = |𝑥⃗|∗|𝑦⃗⃗| = (3)
2 ∑ 2
√∑𝑖∈𝐼 𝑟𝑥,𝑖 √ 𝑖∈𝐼 𝑟𝑦,𝑖

where 𝑟𝑥,𝑖 is the rating given by user x for item i , 𝑟𝑦,𝑖 is the rating given by user y for item i
and I is the list of all items in the User-Item space.
For Item–based CF ,the similarity between users i and j is defined as
𝑖⃗ . 𝑗⃗ ∑𝑢∈𝑈 𝑟𝑢,𝑖 𝑟𝑢,𝑗
𝑤𝑖,𝑗 = cos(𝑖⃗, 𝑗⃗) = |𝑖⃗|∗|𝑗⃗| = (4)
2 ∑ 2
√∑𝑢∈𝑈 𝑟𝑢,𝑖 √ 𝑖∈𝐼 𝑟𝑢,𝑗

where𝑟𝑢,𝑖 is the rating given by user u for item i , 𝑟𝑢,𝑗 is the rating given by user u for item j
and U is the list of all users in the User-Item space.
The important step in collaborative filtering is to obtain recommendations.So
after calculating the similarity, nearest neighbors of the active user are chosen and a weighted
aggregate of their ratings is used to generate predictions [4].The weighted aggregate of other
ratings is given as follows.
∑𝑥𝜖𝑈(𝑟𝑥,𝑖 −𝑟
̅̅̅
𝑥 ).𝑤𝑎,𝑥
𝑃𝑎,𝑖 = 𝑟̅𝑎 + ∑𝑥𝜖𝑈|𝑤𝑎,𝑥 |
(5)

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 63


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

where 𝑃𝑎,𝑖 is the predicted rating for the active user an on item i , 𝑟𝑥,𝑖 is the rating given by x
on item i , 𝑤𝑎,𝑥 is the similarity coefficient between a and x ,𝑟̅𝑥 is the average of ratings given
by user x.
For example, consider the user-item matrix in Table 2 and we want to find whether
Avatar movie can be recommended to user 3.So first we calculate similarity between user 3
and other users and is as follows w(1,2)=0.403072, w(1,3)=0.323575, w(2,3)=-0.41523,
w(3,4)=-0.2325, w(3,5)=0.518545, w(3,6)=0.382336 etc.,From the above values,the highest
similarity is between user 3 and user 5 Now we calculate the predicted rating for user 3 using
weighted aggregate of other ratings and is given as follows
̅̅̅).𝑤
(𝑟2,3 −𝑟 2 ̅̅̅).𝑤
3,2 +(𝑟4,3 −𝑟 4 ̅̅̅).𝑤
3,4 +(𝑟5,3 −𝑟 5 3,5
P(3,3) = 𝑟̅3 +
|𝑤3,2 |+|𝑤3,4 |+|𝑤3,5 |

(3−3.66667).(−0.41′5213)+3.5−3).(−0.2325)+(4−3).(0.518545)
=2+ |−0.415213|+|−0.2325|+|0.518545|
0.679116
=2+ +6
1.166275

=2.058229
[2.2.2] MODEL BASED COLLABORATIVE FILTERING
The performance of the memory-based technique decreases when the user-item matrix
becomes extremely sparse, which is very common on e-commerce sites.In model-based
collaborative filtering, the model is generated from a sample of larger databases using machine
learning and data mining algorithms.This pre-computed model is used to generate
recommendations similar to memory-based CF.The model based collaborative filtering
resolves the major problem of recommender systems called sparsity by employing
dimensionality reduction techniques.Some of the dimensionality reduction techniques are
Singular Value Decomposition(SVD)[44-45],Matrix Factorization(MF)[46-48],Principal
Component Analysis(PCA) ,Latent Semantic Indexing(LSI),among which MF is rapidly
used.Dimensionality reduction helps in improving the performance of the recommender
systems.Widely used model-based techniques are Association rule mining[11],Bayesian
classifiers[5-13],Decision trees[9,11,12],Clustering[8][21],Latent factor models(MF using
SVD, ALS, Regularization)[27].Some of the model-based CF techniques are based on the bio-
inspired approaches like Neural networks[9,12,14-16], Fuzzy systems[17], Genetic
algorithms[18-20].Slope One Method deals with sparsity and cold start problems. It is used to
predict the vacant ratings wherever necessary[26].Classification algorithms like Association
rule mining, Decision trees, Naïve Bayesian classification, SVM etc are used if the user ratings
are categorical.If the user-item matrix is of multi-class data then it is converted to binary class
for simplicity. Regression models are used if the user ratings are numerical. [Figure-6]
describes the architecture of Model-based Collaborative filtering.

[2.2.3] HYBRID COLLABORATIVE FILTERING


This technique is used to improve the prediction accuracy by combining some of the advantages
of both neighborhood based and model based collaborative filtering techniques.
Neighbourhood based approach does not perform well with sparse data in the user-item matrix
and it also suffers from scalability i.e., if the user-item matrix consists of more number of items
or users, then the entire computation has to be done to generate a single recommendation. A

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 64


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

model-based approach is extremely complex to generate a model if the dataset consists of


multiple parameters. The construction of the model takes a large amount of time and machine
and user perspective are different, so if the assumptions of the model do not fit the data, results
in wrong predictions.To avoid these problems hybrid collaborative filtering generates
algorithms using models that are fast and easy to calculate[40].
The recommendations generated by this approach are more accurate compared to
memory based and model based collaborative filtering algorithms.ParitoshNagarnaik et al.[28]
Uses a hybrid collaborative filtering approach to generate web page recommendations using
pattern discover algorithms such as clustering and association rule mining.K. Yu, A.
Schwaighofer et al. [29]combined memory based and model based techniques to generate
predictions based on the user profiles and posterior distribution of user ratings.D. M. Pennock,
E. Horvitz et al.[30] introduced Personality Diagnosis which selects the active user uniformly
at random and calculates same personality type(neighbors for the active user) by adding
Gaussian noise to his or her ratings.

Figure:6.Architecture for Model-based Collaborative filtering

[2.3]
HYBRID RECOMMENDER SYSTEMS
To overcome the individual disadvantages of Content-based RS and Collaborative RS, both are
combined together to form a new Recommender system called Hybrid Recommender system.
It uses the advantages of both the systems to optimize the predictions. There are multiple ways
to build a hybrid recommender system. They are
1. Both the approaches are individually implemented and the results are combined
in generating the recommendations.
2. Content-based techniques are used in Collaborative recommender systems.
3. Collaborative techniques are used in Content –based Recommender systems.
4. Both are combined together as a model.
[Table 3] depicts different combinations of to build a hybrid model.

The different hybridization methods are


Weighted Hybrid
It integrates the results of each technique and generates new recommendations. It linearly
combines the predicted ratings. Initially, the predicted ratings are considered equal and
gradually weights are added during the process of generating Top-N recommendations. P-

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 65


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

Tango[31] is an example of weighted hybrid recommender system which combines Content-


based RS and Collaborative RS.For further examples refer (Pazzani1999), (Towle&Quinn
2000).The advantage of this techniques is that all the benefits of the recommendation systems
are used during the process of generating recommendations.
Cascade Hybrid
It follows an iterative approach in generating the recommendations.Initially, one
recommendation technique is used to generate a coarse list of predictions and again a second
recommendation technique is used on this coarse list of predictions to generate a finer
recommendations.The advantage of this technique is, it is optimal and tolerant to noise as low-
priority recommendations are filtered at the first iteration and in the second iteration, the only
high-level coarse list of predictions are processed to generate the final recommendations.Its
disadvantage is, it adds complexity. EntreeC[13]-a restaurant recommender system, Fab are the
examples of cascade hybridization.

Table 3 Different ways to build a hybrid model


SNo Approach

CBRS Recommendations CFRS


1

2 CBRS Recommendations
CFRS

CBRS
3 Recommendations
CFRS

MODEL
4 Recommendations
CBRS CFRS

Switching Hybrid
It uses the strategy of switching in between the recommendation techniques based on
some criteria. DailyLearner is an example of switching technique which uses content and
collaborative techniques where content-based recommendation technique is employed first and
collaborative recommendation technique is implemented later when the result drawn from
content based RS are with low confidence. The advantage of this approach is it uses all the
benefits of recommendation techniques used.The disadvantage of this technique is the criterion
has to be determined on which switching has to take place.It results in another level of
parameterization.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 66


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

Mixed Hybrid
It is a combination of more than one recommendation systems.The recommendations are drawn
individually by each technique and the results are combined to generate final recommendations.
Each item has multiple recommendations associated with it from different
recommendationtechniques[2].PTV[32] – A television show recommender system,
PickAFlick[34], ProfBuilder[33] are some of the examples of mixed hybrid recommendation
techniques.
Feature combination Hybrid
It is another hybrid approach which combines content and collaborative filters. The predicted
rankings generated by the collaborative technique is taken as an extra feature and on this
augmented dataset content based filter is employed to generate recommendations. Pipper is an
example of feature combination hybrid technique which uses ratings of collaborative filters as
a feature in the content-based system for movie recommendations[35].GroupLens is another
example. The feature that is combined with the data set may not be always collaborative filters.
Feature augmented Hybrid
Feature augmented hybrids are superior to feature combination hybrids. It employs any
technique to predict ratings or classify the dataset and this output is taken as input to the
recommendation process. Libra is an example of feature augmentation which generates book
recommendations using content-based system based on the data found in Amazon.com using a
naive Bayes text classifier[36].
Meta–level Hybrid
It is another type of combining the recommendation systems. The model generated by the first
system is taken as input to the other. The difference between the feature augmentation and meta
level is, in feature augmentation, the feature generated by the model is taken as input to the
second whereas, in metal level hybrid, the entire model is taken as input to the second. Fab,
LaboUr are the examples of meta-level hybrids.
[Table 4] describes the various approaches used by the recommendation systems.

[3] EVALUATION METRICS


To evaluate the performance of the recommendation systems and accuracy of the
recommendations generated, a myriad evaluation metrics are used. Herlockeret.al[37]., broadly
classified the evaluation metrics into predictive metrics and recommendation accuracy metrics.

[3.1] Predictive metrics


It measures the difference between the predicted ratings generated by the system and the real
ratings.The most popular prediction metrics are Mean Absolute Error(MAE), Rooted Mean
Squared Error(RMSE), Normalized Mean Absolute Error(NMAE) etc. Predictive metrics are
also called as statistical accuracy metrics.
MAE: It is the most common and popularly used metric. It is the absolute difference between
the predicted ratings and real ratings. It is defined as follows
∑𝑁
𝑖=1|𝑝𝑖 −𝑟𝑖 |
𝑀𝐴𝐸 = 𝑁
(6)

where N is the total number of ratings of all the users on the item i, 𝑝𝑖 is the predicted rating
and 𝑟𝑖 is the real rating. Lower the MAE indicates that the recommendation engine is accurately

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 67


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

predicting the user ratings.(𝑝𝑖 − 𝑟𝑖 )represents the error in prediction.


RMSE: It has become more popular after its usage in Netflix Prize Competition. Willmott et
al.[38] claimed thatRMSE is more appropriate to represent model performance than the MAE
when the error distribution is expected to be Gaussian[39].It is defined as follows
∑𝑁
𝑖=1(𝑝𝑖 −𝑟𝑖 )
2
𝑅𝑀𝑆𝐸 = √ 𝑁
(7)

Table 4 Classification of Recommendation Systems

Recommendation
Description Limitations
approaches
Recommendations are generated based ColdStart
Content based RS
on user and item features Problem
Over
Specialization
Recommendations are generated based

It uses statistical methods to find the


Memory-based similar users and items.It is also ColdStart
Collaborative filtering RS

Collaborative called as Neighborhood-based CF. Problem


Filtering It is further categorized into User- Sparsity
based CF and Item-based CF Gray Sheep
on the user past history

It uses machine learning and data Shilling Attacks


Model-based
mining algorithms to generate a Overfitting
Collaborative
model. This model is responsible for Scalability
Filtering
generating the recommendations. Synonymity
Hybrid It combines both memory based and Trust
Collaborative model based approaches.
Filtering
Recommendations are generated by merging

The results of several


recommendation techniques are
content based and collaborative filtering techniques

integrated to generate a final


Weighted
recommendation. Weights are
Hybridization
added linearly to individual results
of each RS to draw a single
recommendation.
The system switches between the
Switching
Complex to implement

recommendation techniques based


Hybridization
on some criteria.
It is an iterative procedure where
Cascade one recommendation technique
Hybrid RS

Expensive

Hybridization refines the output of another


recommendation technique.
Recommendations generated from
Mixed
individual techniques are mixed to
Hybridization
generate the final recommendation.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 68


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

Features drawn from different


Feature- algorithms are augmented and taken
Combination as input for a recommendation
algorithm.
It is superior to feature combination
hybrid. Features generated from one
Feature-Augmented
hybrid is taken as input to the
second.
Meta-level It is a combination of two hybrid
Hybridization methods.

where N is the total number of ratings of all the users on item i ,𝑝𝑖 is the predicted rating and
𝑟𝑖 is the real rating. This metric puts more emphasis on larger absolute error because,on a 5-
point scale, a 1-point error may not be perceptible by the user(items rated with either 4 or 5
points are good recommendations) whereas with a 4-point error, the algorithm could be
recommending bad items[40] and the lower the RMSE is, the better the recommendation
accuracy.
[3.2] Recommendation accuracy metrics
These are further classified into classification accuracy metrics and rank accuracy
metrics. These metrics are also called as decision support accuracy metrics.
[3.2.1] Classification accuracy metrics
Classification accuracy measures how well the recommendation system differentiates
relevant(good) items from non-relevant(bad) items.Some of the classification accuracy metrics
are precision, recall, F-measure.
Precision: It is the ratio of relevant recommended items to the total recommended items.It is
defined as follows.
𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡𝑅𝑒𝑐𝑜𝑚𝑚𝑒𝑛𝑑𝑎𝑡𝑖𝑜𝑛𝑠
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = 𝑇𝑜𝑡𝑎𝑙𝑅𝑒𝑐𝑜𝑚𝑚𝑒𝑛𝑑𝑎𝑡𝑖𝑜𝑛𝑠 (8)

Recall: It is the ratio of relevant recommended items to the total relevant items. It is defined as
follows
𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡𝑅𝑒𝑐𝑜𝑚𝑚𝑒𝑛𝑑𝑎𝑡𝑖𝑜𝑛𝑠
𝑅𝑒𝑐𝑎𝑙𝑙 = 𝑇𝑜𝑡𝑎𝑙𝑅𝑒𝑙𝑒𝑣𝑎𝑛𝑡𝐼𝑡𝑒𝑚𝑠
(9)

It is preferable to have high precision and recall, but both the metrics are inversely
proportional i.e., when precision increases, recall decreases and vice versa[40].So most of the
systems use another measure called f-measure.
F-measure: Itis a combination(harmonic mean) of precision and recall and is defined as
follows.
2∗𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑟𝑒𝑐𝑎𝑙𝑙
𝐹 − 𝑚𝑒𝑎𝑠𝑢𝑟𝑒 = (10)
𝑝𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑟𝑒𝑐𝑎𝑙𝑙

The best f-score is 1 and the worst is 0.


[3.2.2] Rank accuracy metrics
When the number of recommended items are large, then the users usually prefer the top-

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 69


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

N recommendations from the list. This creates more errors than to the items placed at the last
of the recommendation list. Rank accuracy metrics considers this situation and it measures the
ability of the recommendation algorithm to produce the recommendation list that matches with,
how the user would have ordered the same list of items. For example, if the top ten list of items
generated by the recommendation algorithm is relevant to the user and if the user does not feel
that the top item in the list is good(relevant), then the rank accuracy metric score for that
specific algorithm is low. Some of the rank accuracy metrics are ROC, half-life utility (HL)
and discounted cumulative gain(DCG).
ROC: Swets[41,42] introduced ROC metric to the information retrieval community, which was
very popular in signal detection theory. It is an alternative to precision and recall. It is a
graphical representation of what extent the recommendation algorithm discriminates
good(relevant) items with bad(irrelevant) items. It is also called as Relative Operating
Characteristic.The area below the ROCcurve is called as Swet’s A measure, which is used as a
single metric of the system's ability to discriminate between good and bad items, independent
of the search length[37].
Half – lifeUtility : It was proposed by Breese et al.[43]. It evaluates the utility of an ordered
list of recommendations by the user. It is based on the fact that, the utility of the user is greater
for the beginning list of items and it exponentially decreases going deeper into the list. It is
defined as follows
∑𝑖(𝑟𝑎𝑖 −𝑑,0)
𝐻𝐹 = 2(𝑖−1)(𝛼−1)
(11)

where𝑟𝑎𝑖 𝑖s the rating of the user a on item i of the recommended list, d is the default rating,
and α is the half-life. The half-life is the rank of the item on the list such that there is a 50%
chance that the user will view that item[37].
Discounted Cumulative gain(DCG): It is defined as
𝑟
𝐷𝐶𝐺 = 𝑟𝑎𝑖 + ∑𝑘𝑖=2 𝑎𝑖 (12)
log2 (𝑖)

where 𝑟𝑎𝑖 represents the rating of the user a on item i of the recommended list, d is the default
rating, and α is the half-life. The half-life is the rank of the item on the list such that there is a
50% chance that the user will view that item[37] and k is the rank of the evaluated item.

[4] CONCLUSION

The review of various approaches like content-based, collaborative and hybrid alleviates
the problem of information overload and opens new opportunities of retrieving personalized
information to the users. This paper discussed the steps involved in recommender systems, the
approaches for recommendation systems and highlighted their strengths and challenges with
diverse kind of hybridization strategies used to improve their performances. Various learning
algorithms used in generating recommendation models and evaluation metrics used in
measuring the quality and performance of recommendation algorithms were discussed. This
review will serve as a blue print to the researchers in improving the state of the art
recommendation techniques besides developing new approaches.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 70


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

REFERENCES
[1] G. Suganeshwari and S.P. Syed Ibrahim,”A Survey on Filtering collaborative Based
Recommendation System", March 2016 DOI: 10.1007/978-3-319-30348-2_42.
[2] F.O Isinkaye, Y.O.Folajimi, B.A.Ojokoh," Recommendation systems: Principles, methods, and
evaluation”, Egyptian Informatics Journal, pp. 261-273(2015)
[3] Ivens Portugal, PauloAlencar, DonaldCowan,"The use of Machine Learning Algorithms in
Recommender Systems: A Systematic Review”,
[4] Xiaoyuan Su and TaghiM.Khoshgoftaar," A survey of Collaborative Filtering Techniques",
Hindawi Publication Corporation-Advances in Artificial Intelligence, Vol 2009, Article ID
421425.
[5] S.B. Cho, J.H. Hong, M.H. Park, Location-based recommendation system using Bayesian user’s
preference model in mobile devices, Lecture Notes in Computer Science 4611 (2007) 1130–1139.
[6] Wang, Y., Chan, S. C. F., & Ngai, G. (2012, December). Applicability of demographic
recommender system to tourist attractions: a case study on trip advisor. In Proceedings of the The
2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent
Technology-Volume 03 (pp. 97-101). IEEE Computer Society.
[7] Seric, L., Jukic, M., &Braovic, M. (2013, May). Intelligent traffic recommender system. In
Information & Communication Technology Electronics & Microelectronics (MIPRO), 2013 36th
International Convention on (pp. 1064-1068). IEEE.
[8] Ericson, K., &Pallickara, S. (2013). On the performance of high dimensional data clustering and
classification algorithms. Future Generation Computer Systems,29(4), 1024-1034.
[9] Felden, C., &Chamoni, P. (2007, January). Recommender systems based on an active data
warehouse with text documents. In System Sciences, 2007. HICSS2007. 40th Annual Hawaii
International Conference on (pp. 168a-168a). IEEE.
[10] Gorodetsky, V., Samoylov, V., &Serebryakov, S. (2010, August). Ontology–based context
dependent personalization technology. In Web Intelligence andIntelligent Agent Technology (WI-
IAT), 2010 IEEE/WIC/ACM International Conference on (Vol. 3, pp. 278-283). IEEE.
[11] Lucas, J. P., Segrera, S., & Moreno, M. N. (2012). Making use of associative classifiers in order
to alleviate typical drawbacks in recommender systems. ExpertSystems with Applications, 39(1),
1273-1283.
[12] Marović, M., Mihoković, M., Mikša, M., Pribil, S., & Tus, A. (2011, May).Automatic movie
ratings prediction using machine learning. In MIPRO, 2011Proceedings of the 34th International
Convention (pp. 1640-1645). IEEE.
[13] Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adap.
Inter. 12(4), 331–370 (2002)
[14] Alvarez, S. A., Ruiz, C., Kawato, T., &Kogel, W. (2011). Neural expert networks for faster
combined collaborative and content-based recommendation. Journal ofComputational Methods
in Sciences and Engineering, 11(4), 161-172.
[15] Martineau, J. C., Cheng, D., &Finin, T. (2013). TISA: topic independence scoring algorithm. In
Machine Learning and Data Mining in Pattern Recognition(pp. 555-570). Springer Berlin
Heidelberg.

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 71


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

[16] Ingoo, H., Kyong, J.O., Tae, H.R.: The collaborative filtering recommendation based on
SOMcluster-indexing CBR. Expert Syst. Appl. 25, 413–423 (2003).
[17] R.R. Yager, Fuzzy logic methods in recommender systems, Fuzzy Sets andSystems 136 (2)
(2003) 133–149
[18] R. Garcı´ a, X. Amatriain, Weighted content based methods for recommending connections in
online social networks, in Proceedings of the 2010 ACMconference on Recommender Systems,
2010, pp. 68–71.
[19] L.Q. Gao, C. Li, Hybrid personalized recommended model based on genetic algorithm, in
International Conference on Wireless Communication, Networks and Mobile Computing, 2008,
pp. 9215–9218.
[20] Qian Wang,Xianhu Yuan,Min Sun : Collaborative filtering recommendation algorithm based on
hybrid user model ,in Proceedings of the Seventh International Conference on fuzzy Systems and
Discovery,2010,pp.1985-1990
[21] Ericson.K & Pallickara.S ,On the performance of distributed clustering algorithm in file and
streaming processing systems.,In Utility and Cloud Computing(UCC),2011 Fourth IEEE
International ,pp.33-40,IEEE.
[22] Arenas-Garcia.J,Meng.A,Petersen.K.B,Lehn-Schioler.T,Hansen.L.K & Larsen.J (2007
August),Unveiling music structure via PLSA similarity fusion.In Machine Learning for Signal
Processing,2007 IEEE Workshop .(pp.419-424).IEEE.
[23] Schelter.S,Boden.C,Schenck.M,Alexander roy.A & Markl.v(2013 October),Distributed Matrix
factorization with mapreduce using a series of broadcast joins.,In proceedings of the 7 th ACM
Conference on the Recommendation Systems(pp.281-284),ACM.
[24] Takacs.G,Pilaszy.I,Nemeth.B & Tikk.D(2009).Scalable collaborative filtering approaches for
large recommendation systems.The Journal of Machine Learning Research,10,623-656.
[25] Tewari .N.C,Koduvely.H.M,Guha.S,Yadav.A & David.G(2013 October),Mapreduce
implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm in Big
Data,2013 IEEE International Conference (pp.145-152),IEEE.
[26] Pu WANG, HongWuYE: A Personalized Recommendation Algorithm Combining Slope One
Scheme andUser Based Collaborative Filtering.In2009 International Conference on Industrial and
Information Systems (pp.152-154)
[27] Robert Bell Yehuda Koren and Chris Volinsky. Matrix factorization techniques for recommender
systems. In IEEE Computer, volume 42 (8), pages30–37, 2009
[28] ParitoshNagarnaik and Prof.A.Thomas: Survey on Recommendation System Methods.In IEEE
SPONSORED 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND
COMMUNICATION SYSTEM(ICECS 2015)
[29] K. Yu, A. Schwaighofer, V. Tresp, X. Xu, and H.-P. Kriegel, “Probabilistic memory-based
collaborative filtering,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 1,
pp. 56–69, 2004.
[30] D. M. Pennock, E. Horvitz, S. Lawrence, and C. L. Giles, “Collaborative filtering by personality
diagnosis: a hybrid memory- and model-based approach,” in Proceedings of the 16th Conference
on Uncertainty in Artificial Intelligence (UAI '00), pp. 473–480, 2000.
[31] Claypool M, Gokhale A, Miranda T, Murnikov P, NetesD, Sartin M. Combining content- based
and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR workshop on
recommender systems: algorithms and evaluation, Berkeley, California; 1999.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 72


PERSONALIZATION USING RECOMMENDATION SYSTEMS: STATE-OF-THE ART REVIEW

[32] Smyth B, Cotter P. A personalized TV listings service for the digital TV age. J Knowledge-Based
Systems 2000;13(2–3):53–9.
[33] Wasfi AM. Collecting user access patterns for building user profiles and collaborative filtering.
In: Proceedings of the 1999international conference on intelligent user, Redondo Beach,
CA;1999. p. 57–64.
[34] Burke R, Hammond K, Young B. The FindMe approach to assisted browsing. IEEE Expert
1997;12(4):32–40.
[35] Basu C, Hirsh H, Cohen W. Recommendation as classification: using social and content-based
information in recommendation.In: Proceedings of the 15th national conference on artificial
intelligence, Madison, WI; 1998. p. 714–20.
[36] Mooney RJ, Roy L. Content-based book recommending using learning for text categorization.
In: Proceedings of the fifth ACMconference on digital libraries. ACM; 2000. p. 195–204.
[37] HERLOCKER, J. L., KONSTAN, J. A., TERVEEN, L. G., AND RIEDL, J. T. 2004. Evaluating
collaborative filtering recommender systems. ACM Trans. Inform. Syst. 22, 1, 5–53.
[38] Willmott, C. J., Matsuura, K., and Robeson, S. M.: Ambiguitiesinherent in sums-of-squares-
based error statistics, Atmos. Environ.,43, 749–752, 2009.
[39] T.Chai, R.R Draxler : Root mean square error (RMSE) or mean absolute error (MAE)? –
Arguments against avoiding RMSE in the literature, Geo-scientific Model Development.,2014.
[40] Cacheda, F., Carneiro, V., Fernandez, D., and Formoso, V. 2011. Comparison of collaborative
filtering algorithms: Limitations of current techniques and proposals for scalable, high-
performance recommendersystems. ACM Trans. Web 5, 1, Article 2 (February 2011), 33 pages.
[41] SWETS, J. A. 1963. Information retrieval systems. Science 141, 245–250.
[42] SWETS, J. A. 1969. Effectiveness of information retrieval methods. Amer. Doc. 20, 72–89.
[43] BREESE, J. S., HECKERMAN, D., AND KADIE, C. 1998. Empirical analysis of predictive
algorithms for collaborative filtering. In Proceedings of the 14th Conference on Uncertainty in
Artificial Intelligence(UAI-98). G. F. Cooper, and S. Moral, Eds. Morgan-Kaufmann, San
Francisco, Calif.,43–52.
[44] M.G. Vozalis, K.G. Margaritis, Using SVD and demographic data for the enhancement of
generalized collaborative filtering, Information Sciences 177(2007) 3017–3037.
[45] S. Zhang, W. Wang, J. Ford, F. Makedon, Using singular value decomposition approximation for
collaborative filtering, in IEEE International Conference on E-Commerce Technology, 2005, pp.
1–8.
[46] Y. Koren, R. Bell, CH. Volinsky, Matrix factorization techniques for recommender systems, IEEE
Computer 42 (8) (2009) 42–49.
[47] X. Luo, Y. Xia, Q. Zhu, Incremental collaborative filtering recommender based on regularized
matrix factorization, Knowledge-Based Systems 27 (2012)271–280
[48] X. Luo, Y. Xia, Q. Zhu, Applying the learning rate adaptation to the matrix factorization based
collaborative filtering, Knowledge-Based Systems 37(2013) 154–164.

V Lakshmi Chetana, Dr.SP Syed Ibrahim, Dr.B Rama, Dr.P KiranSree 73


International Journal of Computer Engineering and Applications,
Volume XI, Issue IX, September 17, www.ijcea.com ISSN 2321-3469

AUTHORS INTRODUCTION

V Lakshmi Chetana has obtained her MCA degree from Acharya Nagarjuna Uni versity and
M.Tech in Computer Science and Engineering from JNTU Kakinada with distinctions. She is now
working as Assistant Professor in the department of CSE at DVR & Dr. HS MIC College of
Technology,Vijayawada. She has 8yrs of Experience in teaching. She has more 10 international Journal
and Conference papers. Her areas of interest in research are Big Data Analytics, Data Science,Artificial
Intelligence and Machine Learning.

Dr. S.P SYED IBRAHIM has received his B.E(CSE) degree from Bharathidasan University ,
M.E(CSE) degree from Anna University and PhD in CSE from Anna University .He is now working as
a professor in the School of Computing Sciences and Engineering at VIT University,Chennai Campus
and also coordinates Data Analytics Research Group.His research interests includes Data Mining,Big
Data Analytics.He has published more than 40 International Journal and CDonference papers and
completed five Industrial consultancy worl related to Big Data Analytics.

Dr.B Rama has obtained her PhD from Sri Padmavathi Mahila Viswa Vidyalayam, Tirupati.She
has 5 years industrial and 8 years of teaching experience .She is now working as HoD in the Department
of Informatics/ Computer Science, University college Kakatiya University, Warangal .She is the
recipient of best paper award for the paper “ A Data mining perspective for forest fire prediction”
presented in International Conference. She has more than 45 international journal and IEEE and
Springer conference papers. She organized various workshops and delivered many guest lectures.Her
research areas of interest are Data Mining,Artificial Intelligence and Big Data Analytics.

Dr P Kiran Sree has obtained his B.Tech degree in Computer Science & Engineering from
JNTU Hyderabad and M.E in Computer Science & Engineering from Anna University with
distinctions. He has obtained his Ph.D in Artificial Intelligence from Jawaharlal Nehru Technological
University-Hyderabad. His interests include Parallel Algorithms, Artificial Intelligence, and
Bioinformatics. His bibliography was listed Listed in Marquis Who’s Who in the World, 28 Edition
(2012), U.S.A. He is the recipient of Bharat Excellence Award from Dr GV Krishna Murthy, Former
Election Commissioner of India in 2013. He worked as Principal I/C of the N.B.K.R. Institute of
Science & Technology, the second oldest private engineering college of the state for two years
(04/12/2009-02/12/2011). He is the editor in chief of international journal of Parallel and Cloud
Computing Research (PCCR). He has published 52 research papers in international journals and
conferences. He is recognized supervisor in the stream of Artificial Intelligence for guiding Ph.D
scholars at VIT and KL universities.

V Lakshmi Chetana*, Dr. SP Syed Ibrahim, Dr. B Rama, Dr. P KiranSree 74

View publication stats

You might also like