PROJECT REPORT

ON
“E-Commerce Recommendation”

By:
Aakash Vaishnav – 199278007
Amar Agrahari – 199278112
Epuri Sai Harish – 199278028
T Chaitanya Kiran – 199278056
Sameer Suman – 199278074

CONTENTS

1. Problem Statement

2. Introduction

3. About Recommender Systems

4. Advantages of Recommender Systems

5. Item-to-Item Collaborative Filtering for Books

6. Item-to-Item Collaborative Filtering for Movies

7. Item-to-Item Collaborative Filtering for Games

8. Results and Conclusion

Problem Statement
How would e-commerce recommendations work for the Books, Movies and Fashion
segments across different transaction types such as buy, add to cart and add to
wishlist?

Introduction
About Recommender Systems
E-commerce recommendation is a method of collecting user data, such as recent
browsing and recent purchases, to surface the products the user is most likely to
buy. It uses data analytics and decision making to generate recommendations.
Companies like Flipkart and Amazon use this technique widely to improve their
recommendations, and streaming services like Netflix and Amazon Prime do the
same to improve the user experience.
Recommendations give a company an advantage because they enhance the user
experience. Companies use techniques like collaborative filtering and content-based
filtering to generate them. Because relevant products are pushed to the user, sales
also tend to improve.
Better recommendations also improve customer engagement considerably, so a
recommender system tends to be a must for companies that want to punch above
their weight against the competition.

Source: https://www.researchgate.net/figure/Structure-of-a-recommender-system_fig2_220827211
The difference between collaborative filtering and content-based filtering is that
collaborative filtering uses the ratings and user data of one person to recommend
content to another, whereas content-based filtering does not. Content-based
filtering is more personalized and depends only on the data of the user receiving
the recommendation.

Source: https://towardsdatascience.com/brief-on-recommender-systems-b86a1068a4dd
Types of filtering that can be used –
1. Traditional collaborative filtering – It randomly samples customers from a
selected pool of customers, and the pool is then optimized by removing either
the most popular or the most unpopular items from the list.

Source: https://medium.com/hexagondata/collaborative-filtering-how-predictive-modelling-tells-us-what-we-want-37848533ece7

2. Search-based filtering – This method is based on the items searched for by the
user. Depending on what the user searches, suggestions are shown that match
the same keywords, similar sizes (in the case of attire) or similar brands.
Example of search-based filtering –

Source: https://docs.litium.com/documentation/litium-accelerators/develop/search-filter-and-navigation

3. Item-to-Item filtering – It is based on the items bought by different users:
the similarity between items is calculated from those purchases.

Source: https://neo4j.com/blog/collaborative-filtering-creating-teams/

Advantages of Recommender system in the real world –

1. It enhances the user experience by reducing the time the user spends
searching through items. A better recommendation reduces both the time
and the effort required to use the application, which leaves the user with a
positive feeling about it.
2. It helps in delivering the correct content to the users.
3. It drives traffic into the application.
4. It can convert shoppers into real customers.
5. It can be used to push items that are searched less often but are relevant to
the user.
6. It increases the average order value.

Item-Item Based Collaborative Filtering for Books
Following is the data set consisting of user ids and book ratings:

Following is the description of the data set:

• Rows : 100 rows, each row corresponding to a unique user
• Columns : 100 columns corresponding to 100 unique books
• Value : 1-5 (blank implying no rating provided by the user and 5 being the
highest rating)
Step 1 : Calculating the similarity between books:
To compute the similarity matrix using cosine similarity, the formula below is used:

where r(u, i) represents the rating of user u for item i and
r(u) represents the average rating of user u.
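The formula itself is not reproduced in this text. The definitions above (a rating and a per-user average rating) match the adjusted cosine form, so the following is a minimal Python sketch under that assumption. It is an illustration only and not the report's recommenderlab code; the toy `ratings` data and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user id -> {item: rating on a 1-5 scale}.
# A missing key means the user has not rated that item.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 3, "B": 1, "C": 5},
    "u3": {"A": 4, "B": 2},
}

def user_means(ratings):
    """Average rating of each user over the items that user has rated."""
    return {u: sum(r.values()) / len(r) for u, r in ratings.items()}

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    Only users who rated both items contribute, and each rating is
    centred on that user's own average before the cosine is taken.
    """
    means = user_means(ratings)
    num = den_i = den_j = 0.0
    for u, r in ratings.items():
        if i in r and j in r:
            di = r[i] - means[u]
            dj = r[j] - means[u]
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0  # similarity is undefined here; 0 is a neutral fallback
    return num / (sqrt(den_i) * sqrt(den_j))

print(adjusted_cosine(ratings, "A", "B"))
```

Computing this value for every pair of books gives the item-item similarity matrix used in the next step.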

Step 2 : Calculating the recommendation score


Step 3 : Predicting the top 5 books to be recommended to each user by selecting
the 5 highest scores for that user
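The score formula for Step 2 is not reproduced in this text. A common choice, assumed here, is a similarity-weighted average of the user's existing ratings. The sketch below illustrates Steps 2 and 3 in Python under that assumption; the `similarity` values and all function names are hypothetical, and this is not the report's own code.

```python
def predict_scores(user_ratings, similarity, items):
    """Score every item the user has not rated.

    Assumed score: sum(sim(i, j) * r(u, j)) / sum(|sim(i, j)|),
    taken over the items j that the user has already rated.
    """
    scores = {}
    for i in items:
        if i in user_ratings:
            continue  # only unrated items are candidates
        num = den = 0.0
        for j, r in user_ratings.items():
            s = similarity.get((i, j), similarity.get((j, i), 0.0))
            num += s * r
            den += abs(s)
        if den > 0:
            scores[i] = num / den
    return scores

def top_n(scores, n=5):
    """Return the n items with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical example: the user has rated A and B; C and D are candidates.
similarity = {("A", "C"): 0.9, ("B", "C"): 0.1, ("A", "D"): 0.2, ("B", "D"): 0.8}
user_ratings = {"A": 5, "B": 2}
scores = predict_scores(user_ratings, similarity, ["A", "B", "C", "D"])
print(top_n(scores, n=5))
```

Here C scores higher than D because it is most similar to A, the item the user rated highest.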
Following code illustrates the loading of required libraries and dataset

Below code illustrates the conversion of data into matrix and then into the class
“realRatingMatrix”

Then we split the dataset and use 80% of the data for training and 20% of the
data for testing the output of the model

Following is the explanation for various parameters:

• train – specifies the fraction of the data to be used for training the model
• given – specifies the number of ratings per test user that are given to the
model during evaluation; the remaining ratings are withheld to measure accuracy
• goodRating – specifies the rating at or above which a book is considered to be
preferred by the user
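As a rough illustration of what these three parameters control, the Python sketch below mimics an 80/20 split, the hold-out of ratings for a test user, and a good-rating threshold. It assumes that `given` is the number of ratings shown to the model for each test user, which is how recommenderlab documents it. The helper names `split_users`, `hold_out` and `liked` are hypothetical and are not part of recommenderlab.

```python
import random

def split_users(user_ids, train=0.8, seed=1):
    """Shuffle the user ids and split them into training and test lists."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train)
    return ids[:cut], ids[cut:]

def hold_out(user_ratings, given, seed=1):
    """Split one test user's ratings into `given` known ratings and the rest.

    The known ratings are shown to the model; the withheld ratings are
    compared against the model's predictions to measure the error.
    """
    items = sorted(user_ratings)
    random.Random(seed).shuffle(items)
    known = {i: user_ratings[i] for i in items[:given]}
    withheld = {i: user_ratings[i] for i in items[given:]}
    return known, withheld

def liked(user_ratings, good_rating):
    """Items whose rating is at or above the good-rating threshold."""
    return {i for i, r in user_ratings.items() if r >= good_rating}

train_ids, test_ids = split_users(range(1, 101), train=0.8)
known, withheld = hold_out({"A": 5, "B": 2, "C": 4, "D": 3}, given=2)
print(len(train_ids), len(test_ids))
```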
Using the above evaluation scheme, the model is built using recommenderlab and
top 5 book recommendations are generated for the test data

Following is the explanation for various parameters:

• getData – retrieves the training data from the evaluation scheme for input
into the model
• “IBCF” – specifies item-based collaborative filtering
• normalize – specifies the normalization to apply, if any
• method – specifies the method used to compute the similarity

Following output is generated for all the test user ids:

Different normalization methods were used and accuracy was calculated to find
the best fitting model. The following output highlights the error measurements for
the different normalization methods:

• IBCF_N – No Normalization
• IBCF_C – Centered Normalization
• IBCF_Z – Z-Score Normalization
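To make the three variants concrete, the following Python sketch shows centering and Z-score normalization of one user's ratings, together with the RMSE error measure used to compare models. It is an illustration under assumptions (per-user normalization, sample standard deviation) and not the report's recommenderlab output; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def center(user_ratings):
    """Centered normalization: subtract the user's mean rating."""
    m = mean(list(user_ratings.values()))
    return {i: r - m for i, r in user_ratings.items()}

def z_score(user_ratings):
    """Z-score normalization: center, then divide by the standard deviation."""
    values = list(user_ratings.values())
    m = mean(values)
    s = stdev(values) if len(values) > 1 else 0.0
    if s == 0:
        return {i: 0.0 for i in user_ratings}  # all ratings are identical
    return {i: (r - m) / s for i, r in user_ratings.items()}

def rmse(predicted, actual):
    """Root mean squared error over the items present in both dicts."""
    common = [i for i in predicted if i in actual]
    if not common:
        raise ValueError("no items in common")
    return sqrt(mean([(predicted[i] - actual[i]) ** 2 for i in common]))

print(center({"A": 5, "B": 3, "C": 4}))
print(rmse({"A": 4, "B": 2}, {"A": 5, "B": 4}))
```

A lower RMSE means the predicted ratings are closer to the withheld ratings, so the variant with the lowest RMSE is the best fit.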

Simulating the model for buy scenario for a user:
Consider a user who has previously rated some of the books and to whom new
books are to be recommended. Following is the subset of ratings of the user for
whom the recommendation is to be made

As can be seen, user id 2 has provided ratings for only 3 of the 5 books shown.
The model recommends the top 5 books from among the books the user has not
rated. Following are the recommendations generated by the model for user id 2

Following are the book recommendations:


1. Classical Mythology
2. The Kitchen God’s Wife
3. Icebound
4. From the Corner of His Eye
5. Decision in Normandy
None of these 5 books had been rated by the user; they are recommended on the
basis of their similarity to the books the user has already rated.

Item-Item Based Collaborative Filtering for Movies
Following is the data set consisting of user ids and movie ratings:

Following is the description of the data set:

• Rows : 100 rows, each row corresponding to a unique user
• Columns : 100 columns corresponding to 100 unique movies
• Value : 0-5 (0 implying no rating provided by the user and 5 being the
highest rating)
Calculating the similarity between movies:
To compute the similarity matrix using Pearson similarity, the formula below is used:
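The formula is not reproduced in this text. As a minimal Python sketch, the standard Pearson correlation between two movies can be computed over the users who rated both, treating 0 as "no rating" as in the data description. The function name and example data are hypothetical, and this is not the report's own code.

```python
from math import sqrt

def pearson_similarity(item_a, item_b):
    """Pearson correlation between two items.

    item_a, item_b: dicts mapping user id -> rating, where 0 means
    the user did not rate the item. Only co-rating users are used.
    """
    common = [u for u in item_a if item_a[u] > 0 and item_b.get(u, 0) > 0]
    n = len(common)
    if n < 2:
        return 0.0  # not enough overlap to measure a correlation
    mean_a = sum(item_a[u] for u in common) / n
    mean_b = sum(item_b[u] for u in common) / n
    cov = sum((item_a[u] - mean_a) * (item_b[u] - mean_b) for u in common)
    var_a = sum((item_a[u] - mean_a) ** 2 for u in common)
    var_b = sum((item_b[u] - mean_b) ** 2 for u in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # one item has constant ratings; correlation undefined
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings from four users; user 4 has not rated movie_x.
movie_x = {1: 5, 2: 3, 3: 1, 4: 0}
movie_y = {1: 4, 2: 3, 3: 2, 4: 5}
print(pearson_similarity(movie_x, movie_y))
```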

Following code illustrates the loading of required libraries and dataset

Below code illustrates the conversion of data into matrix and then into the class
“realRatingMatrix”

Then we split the dataset and use 80% of the data for training and 20% of the
data for testing the output of the model

Following is the explanation for various parameters:

• train – specifies the fraction of the data to be used for training the model
• given – specifies the number of ratings per test user that are given to the
model during evaluation; the remaining ratings are withheld to measure accuracy
• goodRating – specifies the rating at or above which a movie is considered to be
preferred by the user
Using the above evaluation scheme, the model is built using recommenderlab and
top 5 movie recommendations are generated for the test data

Following is the explanation for various parameters:
• getData – retrieves the training data from the evaluation scheme for input
into the model
• “IBCF” – specifies item-based collaborative filtering
• normalize – specifies the normalization to apply, if any
• method – specifies the method used to compute the similarity
Following output is generated for all the test user ids:

Different normalization methods were used and accuracy was calculated to find
the best fitting model. The following output highlights the error measurements for
the different normalization methods:

• IBCF_N – No Normalization
• IBCF_C – Centered Normalization
• IBCF_Z – Z-Score Normalization

Simulating the model for add to cart scenario for a user:
Consider a user who has browsed and rated some of the movies, has added a
movie to the cart, and is to be recommended new movies. Following is the subset
of ratings of the user for whom the recommendation is to be made

As can be seen, user id 9 has given a rating of 5 to the movie “Father of the Bride
Part II”. The model recommends the top 5 movies from among the movies rated
by the user whose ratings are similar to those of the movie already added to the
cart. Following are the recommendations generated by the model for user id 9

Following are the movie recommendations:


1. Money Train (1995)
2. Crossing Guard, The (1995)
3. Antonia’s Line (Antonia) (1995)
4. Once Upon a Time… When We Were Colored (1995)
5. Toy Story (1995)
All 5 of these movies were rated by the user and are recommended on the basis of
their similarity to the ratings of the movie already added to the cart.

Item-Item Based Collaborative Filtering for Games
Following is the data set consisting of user ids and game ratings:

Following is the description of the data set:

• Rows : 100 rows, each row corresponding to a unique user
• Columns : 100 columns corresponding to 100 unique games
• Value : 0-5 (0 implying no rating provided by the user and 5 being the
highest rating)
Calculating the similarity between games:
To compute the similarity matrix using Euclidean similarity, the formula below is used:

where Sp1 and Sp2 represent the scores for item 1 and item 2.
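The formula is not reproduced in this text. One common convention, assumed here, converts the Euclidean distance between the two items' scores into a similarity with 1 / (1 + distance), so that identical scores give 1 and larger distances approach 0. The Python sketch below illustrates that convention; the function name and data are hypothetical, and the report's exact formula may differ.

```python
from math import sqrt

def euclidean_similarity(item_a, item_b):
    """Euclidean-distance-based similarity between two items.

    item_a, item_b: dicts mapping user id -> score, where 0 means the
    user did not rate the item. Only users who rated both are compared.
    """
    common = [u for u in item_a if item_a[u] > 0 and item_b.get(u, 0) > 0]
    if not common:
        return 0.0  # no shared users, so no evidence of similarity
    distance = sqrt(sum((item_a[u] - item_b[u]) ** 2 for u in common))
    return 1.0 / (1.0 + distance)

# Hypothetical scores from two users for two games.
game_x = {1: 1, 2: 2}
game_y = {1: 4, 2: 5}
print(euclidean_similarity(game_x, game_y))
```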

Following code illustrates the loading of required libraries and dataset

Below code illustrates the conversion of data into matrix and then into the class
“realRatingMatrix”

Then we split the dataset and use 80% of the data for training and 20% of the
data for testing the output of the model

Following is the explanation for various parameters:

• train – specifies the fraction of the data to be used for training the model
• given – specifies the number of ratings per test user that are given to the
model during evaluation; the remaining ratings are withheld to measure accuracy
• goodRating – specifies the rating at or above which a game is considered to be
preferred by the user
Using the above evaluation scheme, the model is built using recommenderlab and
top 5 game recommendations are generated for the test data

Following is the explanation for various parameters:
• getData – retrieves the training data from the evaluation scheme for input
into the model
• “IBCF” – specifies item-based collaborative filtering
• normalize – specifies the normalization to apply, if any
• method – specifies the method used to compute the similarity
Following output is generated for all the test user ids:

Different normalization methods were used and accuracy was calculated to find
the best fitting model. The following output highlights the error measurements for
the different normalization methods:

• IBCF_N – No Normalization
• IBCF_C – Centered Normalization
• IBCF_Z – Z-Score Normalization

Simulating the model for add to wishlist scenario for a user:
Consider a user who has previously rated some of the games and to whom new
games are to be recommended. Following is the subset of ratings of the user for
whom the recommendation is to be made

As can be seen, user id 1 has provided ratings for 4 of the 5 games shown.
The model recommends the top 5 games from among the games the user has not
rated. Following are the recommendations generated by the model for user id 1

Following are the game recommendations:


1. Rocket League
2. Cities in Motion 2
3. Company of Heroes (New Steam Version)
4. Portal 2
5. THE KING OF FIGHTERS XIII STEAM EDITION
None of these 5 games had been rated by the user; they are recommended on the
basis of their similarity to the games the user has already rated.

Results and Conclusion

Recommender systems have helped e-commerce platforms since their early days,
but with wider exposure and increasing internet literacy, these platforms are now
able to recommend the product the consumer is most likely looking to purchase.

In our tests using Cosine, Pearson and Euclidean similarity for Books, Movies
and Games recommendation, we computed the RMSE for each method and used it
to identify the best method for calculating the similarity matrix in a
recommendation system.

RMSE for Cosine:

Average RMSE = 1.86001

RMSE for Pearson:

Average RMSE = 2.219257

RMSE for Euclidean:

Average RMSE = 2.247149

So, the cosine similarity formula has the lowest average RMSE of the three and
could be used effectively in the recommendation system.
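The comparison can be restated as a short Python check using only the three average RMSE values reported above; the variable names are hypothetical.

```python
# Average RMSE values as reported in this section.
average_rmse = {
    "Cosine": 1.86001,
    "Pearson": 2.219257,
    "Euclidean": 2.247149,
}

# The method with the lowest average RMSE has the smallest prediction error.
best_method = min(average_rmse, key=average_rmse.get)

# Rank all three methods from lowest to highest error.
ranking = sorted(average_rmse, key=average_rmse.get)

print(best_method)
print(ranking)
```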
