You are on page 1of 22

Part- II Outline

 Collaborative filtering based


recommendation
 Intro: Memory based vs Model Based

 CF – User based vs Item based

 CF – K-nearest neighbor ( KNN)

 CF – Limitations
Collaborative RS
Introduction
 Key is to find users/user groups whose interests
match with the current user
 More users, more ratings: better results
Collaborative RS
 User-based collaborative filtering
 Item-based collaborative filtering
Collaborative RS...
• Recommendation task 1:
– Predicting the rating on a target item for a given user
– Predicting John’s rating on Star Wars Movie

movie1 ??
Recommender

• Recommendation task 2:
– Recommending a list of items to a given user
– Recommending a list of movies to John for watching
List of Top
Movies ??
Recommender
Collaborative RS...
 Collaborative filtering (CF) is perhaps the most studied
and also the most widely-used recommendation
approach in practice.
 K-nearest neighbor (KNN)

 User-based KNN CF

 Item-based KNN CF

 Association Rules

 Matrix factorization (MF) e.g. SVD, NMF, PCA,..

 Key characteristic of CF: it predicts the utility of items for a user


based on the items previously rated by other like-minded users.
Collaborative RS: Memory based

 K-NN (which is also called the memory-based


approach) utilizes the entire user-item database to
generate predictions directly, i.e., there is no model
building.
 This approach includes both
 User-based KNN CF

 Item-based KNN CF
Collaborative RS Using KNN
1. User Based KNN CF
 A user-based KNN collaborative filtering method consists of
two primary phases:
 the neighborhood formation phase

 the recommendation phase.

 There are many specific methods for both. Here we only


introduce one for each phase.
Collaborative RS Using KNN...
1. User Based KNN CF
i. Neighborhood formation phase
 Let the record (or profile) of the target user be x
(represented as a vector), and the record of another user
be y.
 The similarity between the target user, x, and a neighbor,
y, can be calculated using the Pearson’s correlation
coefficient:

sim x,𝑦
Collaborative RS Using KNN...
1. User Based KNN CF
ii. Recommendation Phase
 Use the following formula to compute the rating prediction of item i for
target user x
sim x,𝑦 × 𝑟y,i − 𝑟
𝑝 u,i = 𝑟𝑥 +
sim x,𝑦
where x is the set of k similar users,
ry,i is the rating of user y given to item i,
Collaborative RS Using KNN
Example: Using Pearson Correlation
Consider Alice and all the other users ratings of news media in the table
below. Predict the ratings of Alice to I5(BBC) based user based similarity
using Pearson correlation.

User/News I1(Inshorts) I2(HT) I3(NYT) I4(TOI) I5(BBC)


Alice 5 4 1 4 ?
U1 3 1 2 3 3
U2 4 3 4 3 5
U3 3 3 1 5 4

Step 1: Calculating the similarity between Alice and all the other users
At first we calculate the averages of the ratings of all the user excluding I5 as it is
not rated by Alice. Therefore, we calculate the average as ,
Collaborative RS Using KNN…

User-Based CF Using Pearson Correlation..

and calculate the new ratings as,


Collaborative RS Using KNN…
User-Based CF Using Pearson Correlation..

Step 2: Predicting the rating of the app not rated by Alice


Now, we predict Alice’s rating for BBC News App,

Hence, with the help of a small example we tried to understand the working of
User-based collaborative filtering.

Note: Prediction computation: We compute this using a formula which computes


rating for a particular item using weighted sum of the ratings of the other similar
products.
Collaborative RS Using KNN…
User based similarity
Harry Harry Harry Twilight Star Star Star
Potter 1 Potter 2 Potter 3 Wars 1 Wars 2 Wars 3
A 4 5 1
B 5 5 4
C 2 4 5
D 3 3

Recommending: Which users have similar interests/similar


ratings?
• Which pair of users do you consider as the most similar?
• What is the right definition of similarity?
Collaborative RS Using KNN…
User Similarity : Using Jaccard Similarity
Harry Harry Harry Twilight Star Star Star
Potter 1 Potter 2 Potter 3 Wars 1 Wars 2 Wars 3
A 1 1 1
B 1 1 1
C 1 1 1
D 1 1

Jaccard Similarity: users are sets of movies

Disregards the ratings.


• Jsim(A,B) = 1/5
• Jsim(A,C) = Jsim(B,D) = 1/2
Collaborative RS Using KNN…
User Similarity : Using Cosine Similarity
Harry Harry Harry Twilight Star Star Star
Potter 1 Potter 2 Potter 3 Wars 1 Wars 2 Wars 3
A 2/3 5/3 -7/3
B 1/3 1/3 -2/3
C -5/3 1/3 4/3
D 0 0

Normalized Cosine Similarity:


• Subtract the mean and then compute Cosine
(correlation coefficient)
• Corr(A,B) = 0.092
• Cos(A,C) = -0.559
Collaborative RS Using KNN…
Drawbacks of User-based KNN CF
 The problem with lack of scalability as it requires the
real-time comparison of the target user to all user
records in order to generate predictions.
 A variation of this approach that remedies this problem is
called item-based KNN CF.
Collaborative RS Using KNN…
2. Item Based KNN CF
i. Neighborhood formation phase
 The item-based approach works by comparing

items based on their pattern of ratings across


users. The similarity of items i and j is
computed as follows:
𝑟u,i − 𝑟𝑢 𝑟u,j − 𝑟𝑢
sim i,j =
Collaborative RS Using KNN…
2. Item Based KNN CF
ii. Recommendation Phase
 After computing the similarity between items we
select a set of k most similar items to the target item
and generate a predicted value of user u’s rating
𝑟u,j × sim i,j
𝑝 u, i =
sim i,j

where J is the set of k similar items


Collaborative RS Using KNN…
Item to Item Similarity: The very first step is to build the model by
finding similarity between all the item pairs. The similarity between
item pairs can be found in different ways. One of the most common
methods is to use cosine similarity.

Formula for Cosine Similarity:


Collaborative RS Using KNN…
Example: Using Cosine similarity
Let us consider one example. Given below is a set table that contains some
items and the user who have rated those items. The rating is explicit and is on
a scale of 1 to 5. Each entry in the table denotes the rating given by a ith User
to a jth Item. In most cases majority of cells are empty as a user rates only for
few items. Here, we have taken 4 users and 3 items. We need to find the
missing ratings for the respective user.

User/Item Item1 Item2 Item3


User1 2 ? 3
User2 5 2 ?
User3 3 3 1
User4 ? 2 2
Collaborative RS Using KNN…
Step 1: Calculating cosine sim. of items based on non-missed ratings in the
table.

Step 2: Generating the missing ratings in the table. Now, in this step we
calculate the ratings that are missing in the table.
Collaborative RS
Collaborative RS - Limitations
 Finding similar users/user groups isn’t very easy
 New user: No preferences available (user cold start problem)
 New item: No ratings available (item cold start problem)
 Sparsity of rating matrix as dimension grows

You might also like