Kathmandu Bernhardt College
Tribhuvan University
Institute of Science and Technology
A Proposal On
“Book Recommendation System”
Submitted to:
Department of Computer Science and
Information Technology
Submitted by:
Kalpit Pandey
Rajan Neupane
Rajun Pandey
Table of Contents
1. Introduction
2. Problem Statement
3. Objectives
4. Methodology
   A. Requirement Identification
      i. Literature Review
      ii. Requirement Analysis
   B. Feasibility Study
      i. Technical Feasibility
      ii. Operational Feasibility
      iii. Economic Feasibility
      iv. Schedule Feasibility
   C. High Level Design of System
      i. Development Model
      ii. Use-case Diagram
      iii. Block diagram of the System
      iv. Algorithms
5. Expected Outcome
6. References
1. Introduction
A recommendation system is a subclass of information filtering systems that seeks to
predict the rating or preference a user would give to an item. In simple terms, it is an
algorithm that suggests relevant items to users. A book recommendation system
recommends books to a reader based on that reader's interests. Such systems are used
by online platforms that offer e-books or book catalogs, such as Google Play Books,
Open Library, and Goodreads. One approach to building a book recommendation system
is collaborative filtering, which uses the past behavior and preferences of users to make
recommendations. A popular similarity measure in collaborative filtering is cosine
similarity, which calculates the cosine of the angle between two vectors in an inner
product space. Many readers still lack personalized book recommendations, which leads
to a suboptimal reading experience. In this proposal, we propose to develop a book
recommendation system using collaborative filtering based on cosine similarity. By
analyzing the ratings data of a group of users, we can identify patterns in their
preferences and use these patterns to make personalized recommendations to
individual users.
2. Problem Statement
There is a lack of personalized recommendations for books, leading to a suboptimal
reading experience for many readers. Traditional approaches to book recommendation
suffer from limitations such as the "cold start" problem and may not be able to handle
large and complex data sets. We therefore propose to develop a collaborative filtering
system for book recommendations that overcomes these limitations and provides
high-quality personalized recommendations to readers. The proposed recommendation
system will be based on a data set of user behavior and preferences, and will use
machine learning techniques to learn patterns and predict which books a user is likely to
enjoy. The system will be designed to handle new users and items with little or no past
behavior data, and to make recommendations for a diverse range of books. The
performance of the system will be evaluated and fine-tuned to ensure that it makes
accurate and diverse recommendations.
3. Objectives
To accurately predict which books a user is likely to enjoy based on their past
behavior and preferences.
To improve the overall reading experience for users by providing personalized
recommendations.
To implement a model-based collaborative filtering system that is able to handle new
users and items with little or no past behavior data, overcoming the "cold start" problem.
4. Methodology
A. Requirement Identification
i. Literature Review
In the ever-expanding realm of digital information, recommendation systems have
become the unsung heroes that guide us through the vast seas of content. Whether it’s
suggesting our next favorite movie, helping us discover a new book, or
recommending products tailored to our preferences, these systems play a pivotal role
in shaping our online experiences.[1]
In [2], the authors proposed a model that generates recommendations for buyers
through an enhanced CF algorithm, a quicksort algorithm, and the Object-Oriented
Analysis and Design Methodology (OOADM). They implemented the model with
Django, a Python model-view-controller (MVC) framework. The system used
Firebase, a real-time, cloud-hosted NoSQL database, to provide scalability, and it
performed well on the evaluation metrics. In [3], the authors proposed a system that
saves details of the books purchased by the user. From these book contents and
ratings, a hybrid algorithm using collaborative filtering, content-based filtering, and
association rule mining generates book recommendations. Rather than Apriori, they
recommended Equivalence Class Clustering and bottom-up Lattice Traversal
(ECLAT), as this algorithm is faster because it examines the entire dataset only
once.
The most widely used algorithms in recommendation systems are content-based
filtering (CBF) and collaborative filtering (CF). Cosine similarity is a common metric
for measuring similarity. It calculates the cosine of the angle between two vectors
(either user preferences or item attributes) and ranges from -1 (vectors pointing in
opposite directions) to 1 (vectors pointing in the same direction); for non-negative
data such as ratings, it ranges from 0 to 1. In CF, cosine similarity is used to compare
users or items based on their interaction histories. In CBF, it is used to compare items
based on their attributes or to match user profiles to items.
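The range of cosine similarity described above can be checked with a small sketch. The reader names and rating vectors below are hypothetical, and returning 0.0 for an all-zero vector is a convention assumed here rather than something specified in this proposal.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumed convention: a vector with no ratings gives no evidence of similarity.
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical ratings of three books by three readers (0 means "not rated").
alice = [5, 3, 0]
bob = [4, 2, 0]
carol = [0, 0, 5]

sim_alice_bob = cosine_similarity(alice, bob)      # close to 1: similar tastes
sim_alice_carol = cosine_similarity(alice, carol)  # 0: no books in common
sim_opposite = cosine_similarity([1, 2], [-1, -2]) # -1: opposite directions
```

Because raw ratings are never negative, negative similarities only appear after a transformation such as mean-centering.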
ii. Requirement Analysis
a. Functional requirement
Data storage and management:
The system will be able to store and manage the data required for generating
recommendations, such as user ratings for books, metadata for books, etc.
Data preprocessing:
The system will be able to preprocess the data by performing tasks such as missing
value imputation, normalization, etc.
Similarity computation:
The system will be able to compute the similarity between users or between books
using the cosine similarity measure.
Recommendation generation:
The system will be able to generate recommendations for a user by selecting the
most highly rated books by similar users or the most similar books to the ones that
the user has rated highly.
User interface:
The system will have a user interface through which users can provide ratings
for books, view their recommendations, and perform other tasks.
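As an illustration of the data preprocessing requirement, the sketch below shows one common way to normalize ratings and handle missing values: subtract each user's mean rating, then treat unrated books as 0 (that is, as "average for this user"). The data layout and function names are hypothetical; the proposal does not fix a particular preprocessing method.

```python
def mean_center(ratings):
    """Subtract each user's mean rating from that user's ratings.

    ratings: dict mapping user -> dict mapping book -> numeric rating.
    """
    centered = {}
    for user, books in ratings.items():
        if not books:
            centered[user] = {}
            continue
        mean = sum(books.values()) / len(books)
        centered[user] = {book: r - mean for book, r in books.items()}
    return centered

def to_matrix(centered, all_books):
    """Build a dense user-by-book matrix, imputing 0.0 for missing ratings."""
    users = sorted(centered)
    rows = [[centered[u].get(b, 0.0) for b in all_books] for u in users]
    return users, rows

# Hypothetical raw ratings on a 1-5 scale.
raw = {"u1": {"A": 5, "B": 3}, "u2": {"A": 2}}
users, matrix = to_matrix(mean_center(raw), ["A", "B"])
```

Mean-centering removes the effect of users who rate everything high or low, so the imputed 0.0 does not read as a strong dislike.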
b. Non-Functional Requirement
Scalability:
The system will be able to handle a large number of users and books and generate
recommendations in a timely manner.
Reliability:
The system will be able to generate accurate and reliable recommendations.
Ease of use:
The system will have a user-friendly interface that is easy for users to navigate and
use.
Security:
The system will be secure and protect user data from unauthorized access.
Performance:
The system will have a fast response time and be able to generate
recommendations quickly.
B. Feasibility Study
i. Technical Feasibility
A sufficient amount of data is available to train the recommendation model and
generate accurate recommendations: we have user data, book ratings, and book
metadata. The project also has sufficient computational resources to process and
analyze the data, compute similarities between users or books, and generate
recommendations.
ii. Operational Feasibility
From an operational perspective, the system requires regular updates to incorporate
new books and user data. It should also ensure a seamless integration with existing
platforms and a user-friendly interface. Training personnel for system management
and maintenance is necessary to ensure smooth operation.
iii. Economic Feasibility
The economic feasibility is promising, given the increasing popularity of digital
reading and online book purchasing. An effective recommendation system can drive
more sales and enhance user engagement, providing a good return on investment.
Initial costs include data acquisition, development, and integration, but potential
revenue from increased sales and premium subscriptions can offset these expenses
over time.
iv. Schedule Feasibility
The development timeline of this project is set to be 3-4 months, including phases for
planning, development, testing, and deployment. Key milestones include project
kickoff, development completion, successful testing, and deployment. The schedule ensures
a structured approach with clear deadlines for each phase of the project.
Figure 1: Gantt Chart
C. High Level Design of System
i. Development Model
The incremental build model is a method of software development where the
product is designed, implemented and tested incrementally (a little more is added
each time) until the product is finished. It involves both development and
maintenance.
Figure 2: Incremental Model
ii. Use-case Diagram
In the context of the book recommendation system, the Use Case Diagram illustrates the
essential interactions between actors and the system's functionalities. The Dataset actor,
representing the data source, initiates two crucial actions: loading datasets into the system
and retrieving pre-existing datasets. Loading datasets involves importing raw data for
analysis, while retrieving datasets entails accessing stored data. Once the data is within
the system, the System actor takes charge. It preprocesses the data, cleaning it, handling
missing values, and transforming it into a suitable format for further analysis.
Subsequently, the system builds a recommendation model, employing similarity measures
such as cosine similarity to predict user preferences based on the available
data. The final use case involves the User actor, who interacts with the system to receive
personalized book recommendations. After the preprocessing, model building, and
similarity calculations, the system provides tailored book suggestions to the user based
on their preferences, past interactions, or search queries. This interaction enhances the
user experience, ensuring that the recommendations align closely with the user's interests
and preferences.
Figure 3: Use-case diagram
iii. Block diagram of the System
Figure 4: Block diagram of the System
The block diagram illustrates a hybrid book recommendation system. When a user logs into
the system, their interaction data (such as books they have rated or viewed) and personal
details are collected and stored in both a main database and an ontology database, which organizes
relationships between books and genres. The system uses a hybrid recommender approach,
combining two primary techniques: collaborative filtering and content-based filtering. The
collaborative filtering technique recommends books based on ratings and preferences of
similar users, while the content-based technique suggests books by analyzing the features of
books the user has already interacted with. Together, these techniques generate a personalized
list of recommended books for the user.
iv. Algorithms
a) Hybrid recommendation system
A hybrid recommendation system combines the two methods described below,
content-based filtering and collaborative filtering. It is a more complex model that
recommends items based on the user's own history as well as on the behavior of similar
users. Some organizations use this approach; for example, Facebook shows news that is
relevant both to the user and to others in the user's network, and LinkedIn uses a similar
approach.
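One simple way to combine the two methods is a weighted blend of their scores. The sketch below is only an illustration under that assumption: the proposal does not specify how the two scores are merged, and the function names, the weight alpha, and the example scores are hypothetical.

```python
def hybrid_scores(cf_scores, cb_scores, alpha=0.5):
    """Blend collaborative (cf) and content-based (cb) scores per book.

    alpha is the weight given to the collaborative score; a book missing
    from one source contributes 0.0 from that source.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    books = set(cf_scores) | set(cb_scores)
    return {
        b: alpha * cf_scores.get(b, 0.0) + (1 - alpha) * cb_scores.get(b, 0.0)
        for b in books
    }

def top_n(scores, n):
    """Return the n best-scoring books, breaking ties alphabetically."""
    return sorted(scores, key=lambda b: (-scores[b], b))[:n]

# Hypothetical scores, both already scaled to the range 0-1.
cf = {"Book A": 0.8, "Book B": 0.2}
cb = {"Book B": 0.6, "Book C": 1.0}
blended = hybrid_scores(cf, cb, alpha=0.5)
best = top_n(blended, 1)
```

A blend like this only makes sense when both scores are on a comparable scale, so predicted ratings would need to be rescaled before being mixed with cosine similarities.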
b) Content Based Filtering
Content-based filtering is a recommendation system technique that suggests items to users based
on the content characteristics of items and user preferences. It relies on the similarity between the
content of items and the user's historical interactions or explicitly stated preferences. Here's how
content-based filtering works, with a simplified explanation and formula:
Content-Based Filtering:
Feature Representation:
Each item (e.g., a book or product) and each user is represented using a set of features.
These features describe the content of the items and the user's preferences. For example,
in a book recommendation system, item features could include author, title, publication
year, etc., while user features might represent their preferred genres.
Vector Space Representation:
Transform these feature representations into numerical vectors. One common technique for
this is one-hot encoding, where each feature corresponds to a binary value (0 or 1). In more
advanced content-based systems, embeddings or numerical representations can be used for
text-based features.
Similarity Calculation:
To make recommendations, you calculate the similarity between the user's preferences and
item characteristics using a similarity metric like cosine similarity:
Cosine_Similarity(user, item) = (user_vector · item_vector) / (||user_vector|| * ||item_vector||)
Here, user_vector and item_vector represent the numerical feature vectors for the user and
the item, and ||user_vector|| and ||item_vector|| represent their Euclidean norms (lengths).
Recommendation Calculation:
Calculate the similarity between the user's feature vector and the feature vectors of all
items. This results in a similarity score for each item, reflecting how similar the item's
features are to the user's preferences.
Rank the items based on their similarity scores, and recommend the top-N items with the
highest scores to the user.
In this content-based filtering approach, items are recommended to the user if their
content features align with the user's preferences. For example, if a user prefers
action books, the system will recommend action books with similar content
characteristics.
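The content-based steps above (feature representation, vector encoding, similarity calculation, and top-N ranking) can be sketched as follows. The genre vocabulary, book titles, and function names are hypothetical, and building the user profile as the average of the liked books' vectors is one simple choice assumed here, not a method fixed by this proposal.

```python
import math

GENRES = ["action", "romance", "mystery"]  # hypothetical feature vocabulary

def encode_genres(genres):
    """Binary (one-hot style) vector: 1.0 if the book has the genre, else 0.0."""
    return [1.0 if g in genres else 0.0 for g in GENRES]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical catalog.
books = {
    "Book A": encode_genres({"action"}),
    "Book B": encode_genres({"action", "mystery"}),
    "Book C": encode_genres({"romance"}),
    "Book D": encode_genres({"mystery"}),
}

def user_profile(liked):
    """Average the feature vectors of the books the user liked."""
    vectors = [books[title] for title in liked]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def recommend(liked, n=2):
    """Rank unseen books by similarity to the profile; drop zero-similarity books."""
    profile = user_profile(liked)
    scored = [(cosine(profile, vec), title)
              for title, vec in books.items() if title not in liked]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:n]]

recommended = recommend(["Book A"])
```

For a reader who liked only the action title, only the book sharing the action genre is returned, which also illustrates the limited serendipity of this approach.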
Content-based filtering is effective at making recommendations, particularly when the content
features are well-defined and users have explicit preferences. However, it has limitations, such as
the inability to capture serendipity (suggesting items outside the user's usual preferences) and
challenges related to feature engineering. In practice, hybrid recommendation systems that
combine content-based and collaborative filtering techniques are often used to provide more
robust and diverse recommendations.
c) Collaborative Filtering
Collaborative filtering is a recommendation system technique that provides personalized
recommendations by leveraging the behavior and preferences of a group of users to make
predictions for an individual user. There are two main types of collaborative filtering: user-based
and item-based. This section explains user-based collaborative filtering with a
simplified formula.
User-Based Collaborative Filtering:
In user-based collaborative filtering, recommendations are made based on the similarity
between users' preferences. The basic idea is to find users who are similar to the target user
and recommend items that those similar users have liked. The formula can be described as
follows:
1) User-Item Interaction Matrix (M): This matrix represents user-item interactions, with rows
representing users and columns representing items. The entries in the matrix contain user
ratings, purchase history, or any relevant interaction data.
2) User Similarity Calculation: Calculate the similarity between the target user (u) and other
users (v) based on their interactions with items. Common similarity metrics include cosine
similarity or Pearson correlation coefficient. The formula for cosine similarity is:
Cosine_Similarity(u, v) = (u · v) / (||u|| * ||v||)
Where:
u · v represents the dot product of user vectors u and v.
||u|| and ||v|| represent the Euclidean norms (lengths) of user vectors u and v.
3) Recommendation Calculation: To make recommendations for the target user, find users
with high similarity to the target user and recommend items that those similar users have
liked but the target user has not yet interacted with. The formula for the predicted rating for
item i by user u (P(u, i)) can be calculated as a weighted average of the ratings given by
similar users to that item:
P(u, i) = Σ(sim(u, v) * R(v, i)) / Σ(|sim(u, v)|)
Where:
sim(u, v) represents the similarity between users u and v.
R(v, i) represents the rating of user v for item i. The summation is
performed over all users v who are similar to user u and have rated item i.
4) Top-N Recommendations: After calculating predicted ratings for all items, you can
recommend the top-N items with the highest predicted ratings to the target user.
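The four steps above can be sketched in a few lines. The ratings, user names, and function names below are hypothetical; unrated books are treated as 0 when computing similarity, and every other user who rated the item is used as a neighbor, which are simplifying assumptions rather than choices fixed by this proposal.

```python
import math

# Step 1: hypothetical user-item interaction matrix M (missing = not rated).
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 5, "B": 4, "C": 5},
    "u3": {"A": 1, "C": 2, "D": 4},
}
items = ["A", "B", "C", "D"]

def vector(user):
    return [ratings[user].get(i, 0) for i in items]

# Step 2: cosine similarity between two users.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Step 3: P(u, i) = sum(sim(u, v) * R(v, i)) / sum(|sim(u, v)|).
def predict(u, i):
    num = den = 0.0
    for v in ratings:
        if v == u or i not in ratings[v]:
            continue  # only other users who actually rated item i
        s = cosine(vector(u), vector(v))
        num += s * ratings[v][i]
        den += abs(s)
    return num / den if den else None

# Step 4: top-N unrated items by predicted rating.
def top_n(u, n):
    preds = [(predict(u, i), i) for i in items if i not in ratings[u]]
    preds = [(p, i) for p, i in preds if p is not None]
    preds.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in preds[:n]]

recommendations = top_n("u1", 2)
```

When only one neighbor has rated an item, the weighted average reduces to that neighbor's rating regardless of how similar the neighbor is, which is one reason practical systems often require a minimum number of neighbors.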
5. Expected Outcome
This Book Recommendation System is expected to analyze the user data and leverage
algorithms to recommend books that align with each user's tastes, enhancing their reading
experience. By offering relevant and diverse book options, the system will aim to
increase user satisfaction, engagement, and the likelihood of discovering new authors
and genres, ultimately fostering a more enjoyable and enriching reading journey.
6. References
[1] I. Saifudin and T. Widiyaningtyas, "Systematic Literature Review on Recommender
System: Approach, Problem, Evaluation Techniques, Datasets," IEEE Access, doi:
10.1109/ACCESS.2024.3359274.
[2] E. Okon, B. Eke, and P. Asagba, "An improved online book recommender
system using collaborative filtering algorithm," May 2018.
[3] P. Mathew, B. Kuriakose, and V. Hegde, "Book recommendation system through
content based and collaborative filtering method," pp. 47-52, Mar. 2016.