
AN INTERNSHIP REPORT ON

NETFLIX RECOMMENDATION SYSTEM


A Report Submitted to
Jawaharlal Nehru Technological University
Kakinada, Kakinada in partial fulfilment for the award
of the degree of

BACHELOR OF TECHNOLOGY

IN
INFORMATION TECHNOLOGY
Submitted by

Name: KALANEEDU LAVANYA

Roll No: 19KN1A1215

UNDER THE ESTEEMED GUIDANCE OF

MR. ANKITH

DEPARTMENT OF INFORMATION TECHNOLOGY

NRI INSTITUTE OF TECHNOLOGY


Autonomous
(Approved by AICTE, Permanently Affiliated to JNTUK,
Kakinada) Accredited by NBA (CSE, ECE & EEE), Accredited
by NAAC with ‘A’ Grade ISO 9001: 2015 Certified Institution
Pothavarappadu (V), (Via) Nunna, Agiripalli (M), Krishna Dist., PIN: 521212,
A.P, India.
2021-2022
NRI INSTITUTE OF TECHNOLOGY
(An Autonomous Institution, Approved by AICTE, Permanently Affiliated to JNTUK, Kakinada)
Accredited by NBA (CSE, ECE & EEE), Accredited by NAAC with ‘A’ Grade ISO
9001: 2015 Certified Institution
Pothavarappadu (V), (Via) Nunna, Agiripalli (M), Krishna Dist., PIN: 521212, A.P, India.

CERTIFICATE

This is to certify that the “Internship report” submitted by KALANEEDU LAVANYA (Regd. No.:
19KN1A1215) is work done by her and submitted during YEARS academic year, in partial fulfilment of
the requirements for the award of the degree of BACHELOR OF TECHNOLOGY IN
INFORMATION TECHNOLOGY, at BLACKBUCK ENGINEERS PVT LTD, Road No:36,
Jubilee Hills, Hyderabad, Telangana.

INTERNSHIP COORDINATOR HEAD OF THE DEPARTMENT


(R KATHYAYANI) (CH. CHITHANYA KISHORE REDDY)

EXTERNAL EXAMINER
CERTIFICATE OF INTERNSHIP
ACKNOWLEDGEMENT

I take this opportunity to thank all who have rendered their full support to
my work. The pleasure, the achievement, the glory, the satisfaction, the reward
and the appreciation of completing my project cannot be expressed in a
few words, and I am grateful for their valuable suggestions.

I express my heartfelt thanks to the Head of the Department, Dr. M.
CHAITANYA KISHORE REDDY, for his continuous guidance towards the completion
of my project work.

I extend my sincere thanks to the Dean of the Department, Dr. M.
CHAITANYA KISHORE REDDY, for his continuous guidance and support to
complete my project successfully.

I am thankful to the Principal, Dr. C. NAGA BHASKAR Garu, for his
encouragement to complete the project work.

I extend my sincere and honest thanks to the Chairman, Dr. R.
VENKATA RAO Garu, and the Secretary, Sri K. Sridhar Garu, for their continuous
support in completing the project work.

Finally, I thank the Administrative Officer, Staff Members, Faculty of the
Department of CSE, NRI Institute of Technology, and my friends, who directly or
indirectly helped me in the completion of this project.

Name: KALANEEDU LAVANYA


Roll No: 19KN1A1215
ABSTRACT
Providing useful product suggestions to online users in order to increase their
consumption on websites is the goal of many companies nowadays. People usually select or
purchase a new product based on friends’ recommendations, comparisons of similar
products, or feedback from other users. To perform all these tasks automatically, a
recommender system must be implemented. Recommender systems are tools that provide
suggestions that best suit the client’s needs, even when the client is not aware of them. These
offers of personalized content are based on past behavior and encourage the customer to keep
coming back to the website. In this project, a movie recommendation mechanism for Netflix is
built. Personalized recommendations on the Netflix Homepage are based on a user's
viewing habits and the behavior of similar users. These recommendations, organized for
efficient browsing, enable users to discover the next great video to watch and enjoy without
additional input or an explicit expression of their intents or goals. The Netflix Search
experience, on the other hand, allows users to take active control of discovering new videos
by explicitly expressing their entertainment needs via search queries.

Usually, basic recommender systems consider one of the following factors for generating
recommendations: the preference of the user (i.e., content-based filtering) or the preference of
similar users (i.e., collaborative filtering). To build a stable and accurate recommender system,
a hybrid of content-based filtering and collaborative filtering is used.

KEYWORDS: Netflix, Recommendation system, CBF - Content-based filtering, CF -
Collaborative filtering, Hybrid systems
TABLE OF CONTENTS

S.NO CHAPTERS

1 INTRODUCTION
1.1 INTRODUCTION TO NETFLIX RECOMMENDATION SYSTEM
1.2 INTRODUCTION TO RECOMMENDATION SYSTEM
1.2.1 WHAT IS RECOMMENDATION SYSTEM
1.2.2 TYPES OF RECOMMENDATION SYSTEM
2 ADVANTAGES & DISADVANTAGES
2.1 ADVANTAGES OF RECOMMENDATION SYSTEM
2.2 DISADVANTAGES OF RECOMMENDATION SYSTEM
3 SOFTWARE & HARDWARE REQUIREMENTS
3.1 SYSTEM CONFIGURATIONS
3.2 SOFTWARE REQUIREMENTS
3.3 ENVIRONMENT & TOOLS
3.4 HARDWARE REQUIREMENTS
4 MOTIVATION
5 LITERATURE REVIEW
5.1 CONTENT-BASED FILTERING
5.2 COLLABORATIVE FILTERING
5.3 HYBRID FILTERING
6 EXISTING SYSTEM
7 PROPOSED SYSTEM
8 KEYWORDS AND DEFINITION
9 IMPLEMENTATION & ARCHITECTURE
10 CHALLENGES FACED
10.1 DEFICIENCY OF INFORMATION
10.2 INFORMATION VARIABILITY
10.3 UNPREDICTABLE PERFORMANCE
11 TESTING
12 RESULTS
13 CONCLUSION
14 REFERENCES
LEARNING OBJECTIVES/INTERNSHIP OBJECTIVES

• Internships are generally thought to be reserved for college students looking to
gain experience in a particular field. However, a wide array of people can benefit
from training internships in order to receive real-world experience and develop
their skills.

• An objective for this position should emphasize the skills you already possess in
the area and your interest in learning more.

• Internships are utilized in several different career fields, including architecture,
engineering, healthcare, economics, advertising and many more.

• Some internships are used to allow individuals to perform scientific research, while
others are specifically designed to allow people to gain first-hand work
experience.

• Utilizing internships is a great way to build your resume and develop skills that can
be emphasized in your resume for future jobs. When you are applying for a
training internship, make sure to highlight any special skills or talents that can
make you stand apart from the rest of the applicants so that you have an improved
chance of landing the position.
INTRODUCTION

1.1 INTRODUCTION TO NETFLIX RECOMMENDATION SYSTEM

Recommendation algorithms are at the core of the Netflix product. They provide our
members with personalized suggestions to reduce the time and frustration involved in finding
great content to watch. Because of the importance of our recommendations, we
continually seek to improve them by advancing the state-of-the-art in the field. We do this by
using the data about what content our members watch and enjoy along with how they interact
with our service to get better at figuring out what the next great movie or TV show for them
will be. We go beyond validating our ideas on historical data to understand how people
respond to changes in our recommendation system by running online A/B tests and
measuring long-term satisfaction metrics. These experiments also provide us with new
insights to further improve our research and product. This cycle of experimentation has led us
to move beyond rating prediction, made famous by the Netflix prize, and into personalized
ranking, page generation, search, image selection, messaging, and much more.

While our research has made many significant improvements to our recommendation system
over the years, there is still a long way to go until every member sees the perfect
recommendation at the top of their page and understands why it will be great for them. Thus,
we continue to improve our algorithms and look for new areas we can personalize. We also
extend beyond the selection layer by looking for new ways we can present recommendations,
explain them, and have members interact with our systems.

1.2 INTRODUCTION TO RECOMMENDATION SYSTEM


1.2.1 What Is a Recommendation System?
A recommendation system is a subclass of information filtering systems that seeks to predict
the rating or the preference a user might give to an item. In simple words, it is an algorithm
that suggests relevant items to users, e.g., which movie to watch in the case of Netflix, which
product to buy in the case of e-commerce, or which book to read in the case of Kindle.

Use-Cases of Recommendation Systems: There are many use-cases. Some are:

A. Personalized Content: Helps to improve the on-site experience by creating dynamic
recommendations for different kinds of audiences, as Netflix does.

B. Better Product Search Experience: Helps to categorize products based on their
features, e.g., material, season, etc.

1.2.2 TYPES OF RECOMMENDATION SYSTEM

1. Content-Based Filtering
In this type of recommendation system, relevant items are shown using the content of
the items previously searched for by the user. Here, content refers to the attributes/tags of the
products that the user likes. In this type of system, products are tagged using certain keywords;
the system then tries to understand what the user wants, looks in its database, and finally
recommends different products that match the user’s interests.

2. Collaborative Filtering

Recommending new items to users based on the interests and preferences of other
similar users is, in essence, collaborative filtering.

There are 2 types of collaborative filtering:

A. User-Based Collaborative Filtering

The rating of an item is predicted using the ratings of neighboring users. In simple
words, it is based on the notion of user similarity.

B. Item-Based Collaborative Filtering

The rating of an item is predicted using the user’s own ratings of neighboring
items. In simple words, it is based on the notion of item similarity.

3. Hybrid Methods
Hybrid recommendation systems combine content-based and collaborative methods. This
solution can be more effective in practice than either of the two methods separately. A hybrid system is
used on Netflix, where the movie recommendations are the result of both comparing the watching
habits of similar users (collaborative filtering) and finding movies whose characteristics are similar to
films that the user has liked in the past (content-based filtering).

ADVANTAGES & DISADVANTAGES
For several years now, recommendation systems (or recommenders) have been essential
for everyone who uses the Internet daily. We encounter recommenders on large
e-commerce websites like Amazon and eBay and on online movie and streaming platforms like
Netflix, Hulu, and Spotify.

2.1 ADVANTAGES OF RECOMMENDATION SYSTEM

1. Revenue and sales increase

Revenue growth has long been probably the most closely watched indicator for
every business owner. Recommendation systems can be used in various
situations to serve different business goals. Let’s have a closer look at Netflix.

Netflix key statistics

• Netflix generated $24.9 billion revenue in 2021, a 23.8% increase year-on-year
• $12.97 billion of Netflix’s revenue was generated in North America, its largest market
• Netflix had an operating profit of $5.1 billion in 2021, an 85% increase year-on-year
• In 2022, Netflix had 222 million subscribers worldwide

Netflix revenue by region
Netflix earned 43% of its revenue from the US & Canada market, its largest region
for revenue and usage.

Netflix annual revenue by region 2018 to 2021 ($bn)

Year    US & Canada    EMEA    Latin America    Asia-Pacific
2018    8.28           3.95    2.22             0.94
2019    10.05          5.54    2.78             1.46
2020    11.45          7.77    3.13             2.37
2021    12.97          9.69    3.57             3.26

TABLE: REVENUE

2. User satisfaction growth

When a user sees a personalized feed generated by the recommendation system, no
matter whether we are talking about music, books, news, movies or e-commerce, he feels
less stress and more connected with the service.

So, if the service can offer a personalized feed (built according to the correlation
with recent purchases) and understands the user’s needs (from a similar look-alike
audience), it encourages him to buy more, which leads to both customer satisfaction and revenue
growth.

3. Turnover increase
Recommender systems can increase turnover for any business. As we
already know, the recommendation engine analyses users’ behaviour. It can also take into
consideration the connections between several users.
Everyone cares about what other people think, especially friends and
relatives. Such engines can provide quite precise recommendations based on the reviews
that our friends have left about a specific good or service.
Moreover, we can build a model that recommends goods or services taking
into consideration the reviews from the user’s friends and relatives, which helps the user
stay in touch with them.

2.2 DISADVANTAGES OF RECOMMENDATION SYSTEM


Recommendation systems have some limitations. Understanding these limitations is
important in order to build a successful recommendation system:

• The cold-start problem: Collaborative filtering systems rely on
available data from similar users. If you are building a brand-new recommendation
system, you have no user data to start with. You can use content-based filtering
first and then move on to the collaborative filtering approach.

• Scalability: As the number of users grows, the algorithms suffer scalability issues. If
you have 10 million customers and 100,000 movies, you would have to create a
sparse matrix with one trillion elements.

• The lack of right data: Input data may not always .
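The scalability figure quoted in this list can be checked with a few lines of Python. The sketch below is only an illustration: the three ratings and the dictionary-based storage are assumptions made for the example, not part of the project code. It shows why only the ratings that actually exist are usually stored instead of the full user-by-movie grid.

```python
# Sizes taken from the scalability example in the text.
n_users = 10_000_000
n_movies = 100_000

# A dense matrix needs one cell for every (user, movie) pair.
dense_cells = n_users * n_movies

# Sparse alternative (illustrative data): keep only the ratings that exist,
# keyed by (user_id, movie_id).
ratings = {
    (0, 42): 5.0,
    (0, 7): 3.0,
    (1, 42): 4.0,
}

def get_rating(user_id, movie_id):
    """Return the stored rating, or None if the user has not rated the movie."""
    return ratings.get((user_id, movie_id))

print(f"cells in a dense matrix: {dense_cells:,}")
print(f"ratings actually stored: {len(ratings)}")
```

The dense count is 10,000,000 x 100,000 = 1,000,000,000,000 cells, while the sparse store grows only with the number of ratings users have really given.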

SOFTWARE & HARDWARE REQUIREMENTS

3.1 SYSTEM CONFIGURATIONS:

• Text Editor (VS Code/WebStorm)
• Anaconda distribution package (PyCharm Editor)
• Python Libraries

3.2 SOFTWARE REQUIREMENTS:

• Platform - Windows, Linux or macOS
• Operating System - Windows, Linux or macOS
• Technology - Python machine learning
• Scripting language - Python
• IDE - Jupyter Notebook

3.3 ENVIRONMENT & TOOLS:

• Jupyter Notebook
• NumPy
• Pandas
• Seaborn
• Matplotlib

3.4 HARDWARE REQUIREMENTS

• A PC with Windows/Linux
• Processor with 1.7-2.4 GHz speed
• Minimum of 8 GB RAM
• 2 GB graphics card

MOTIVATION
The key reason why many people seem to care about recommender systems is
money. For companies such as Amazon, Netflix, and Spotify, recommender systems drive
significant engagement and revenue. But this is the more cynical view of things. The reason
these companies (and others) see increased revenue is that they deliver actual value to
their customers: recommender systems provide a scalable way of personalizing content for
users in scenarios with many items.

To address search intents and provide a meaningful product experience for users, we
need to rethink how effective search results should be defined in the entertainment context.
We define a search match as a video retrieved by the search engine by keyword-matching the
query with the indexed videos (or by applying techniques such as query expansion). A search
recommendation, on the other hand, is a video selected by the search engine by relaxing the
match constraints, i.e., a video retrieved via traditional recommender systems approaches
(e.g., collaborative filtering) in the query context. We use the term search results to refer to
the union of search matches and search recommendations, i.e., all videos returned in response
to a user query.

A recommendation system has become an indispensable component in various e-commerce
applications. Recommender systems collect information about the user’s
preferences for different items (e.g., movies, shopping, tourism, TV, taxi) in two ways, either
implicitly or explicitly. Implicit acquisition of user information typically involves
observing the user’s behavior, such as watched movies, purchased products, and downloaded
applications. Explicit acquisition, on the other hand, typically involves
collecting the user’s previous ratings or history. Collaborative filtering (CF) is a way of
filtering or evaluating items through the opinions of other people. It first gathers the movie
ratings given by individuals and then recommends movies to the target user based on
like-minded people with similar tastes and interests in the past.

LITERATURE REVIEW
There are three techniques for recommendation systems: collaborative filtering, content-based
filtering and hybrid filtering. In a content-based recommender system, the user provides
data either explicitly (rating) or implicitly (by clicking on a link). The system captures this
data and generates a user profile for every user. The recommendation is generated by making
use of this user profile. In content-based filtering, the recommendation is based only
on a single user’s profile: the system tries to recommend items similar to those in the
user’s past activity. Unlike content-based filtering, collaborative filtering finds those users whose
likings are similar to those of a given user. It then recommends an item or product on the assumption that the
given user will also like the items that the other users like, because their tastes are similar.
Both these techniques have their own strengths and weaknesses, so to overcome these the hybrid
technique came into the picture, which is a combination of both techniques. Hybrid
filtering can be applied in various ways. We can use content-based filtering first and then pass
those results to a collaborative recommender (or vice versa), or integrate both filters
into one model to generate the result. These kinds of modifications are also used to cope
with the cold-start, data sparsity and scalability problems.

5.1 Content-Based Filtering
Content-based filtering is also known as cognitive filtering.
This filtering recommends items to the user based on his past experience. For example, if a user
likes only action movies, then the system recommends only action movies similar to those
he has rated highly.

More broadly, suppose the user likes only politics-related content; the
system then suggests websites, blogs or news similar to that content.

Unlike collaborative filtering, content-based filtering does not face the new-user problem in
the same way, because it does not involve other users’ interactions. It deals only with the
user’s own interests. Content-based filtering
first checks the user’s preferences and then suggests movies or other products to
him.

It focuses only on a single user’s ideas and thoughts and gives predictions based on his interests. So if
we talk about movies, the content-based filtering technique checks the ratings given by
the user.

The approach checks which movies the user has rated highly by checking the genre
categories in the user profile.
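The idea of building a user profile from the genres of highly rated movies can be sketched as follows. This is a minimal illustration under stated assumptions: the catalogue, the ratings, the threshold of 4 and the function names are all hypothetical and are not taken from the project code.

```python
from collections import Counter

# Hypothetical catalogue: title -> set of genres (illustrative data only).
catalogue = {
    "Movie A": {"action", "thriller"},
    "Movie B": {"action", "adventure"},
    "Movie C": {"romance", "drama"},
    "Movie D": {"action", "sci-fi"},
    "Movie E": {"drama"},
}

# Hypothetical ratings given by one user on a 1-5 scale.
user_ratings = {"Movie A": 5, "Movie B": 4, "Movie C": 2}

def build_profile(ratings, catalogue, min_rating=4):
    """Count the genres of the titles the user rated at least `min_rating`."""
    profile = Counter()
    for title, rating in ratings.items():
        if rating >= min_rating:
            profile.update(catalogue[title])
    return profile

def recommend(ratings, catalogue, top_n=2):
    """Rank unseen titles by how often their genres occur in the user profile."""
    profile = build_profile(ratings, catalogue)
    scores = {
        title: sum(profile[genre] for genre in genres)
        for title, genres in catalogue.items()
        if title not in ratings
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, score in ranked[:top_n] if score > 0]

print(recommend(user_ratings, catalogue))  # ['Movie D']
```

Only this user's own ratings are consulted, which is the defining property of content-based filtering; "Movie E" is not suggested because none of its genres appear in the profile.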

Figure 1: Content-Based Filtering
5.2 Collaborative Filtering
The concept of collaborative filtering was first introduced in
1991 by Goldberg et al. The Tapestry system applies only to smaller user groups (e.g., a
single unit) and places too many demands on the user.

Figure 2: Collaborative Filtering

As a prototype of a collaborative filtering recommendation system, Tapestry presented a
new approach to recommendation, but it had many technical deficiencies. Since then,
rating-based collaborative filtering recommendation systems have appeared, such as GroupLens,
which recommends news and films. At present, many e-commerce sites use
recommendation systems, such as Amazon, CD Now, Drugstore and Movie finder.

There is a massive amount of data available. In today’s busy
life, no one has time to search hundreds of thousands of items and select the ones that
suit their taste. Collaborative filtering is one way to filter the data and
provide the relevant information in which the user is interested. Collaborative
filtering is one of the best-known techniques for recommending items. This
technique suggests relevant items to the user based on the choices of his neighbors. It first finds
the similarity between the user and his neighbors and then predicts the items. There can
be any number of users; the technique finds the similar users from the list of users. The
similarity between users is computed from the ratings that the users have given to
particular items. The approach continues in this way until the desired result is generated. This
strategy takes the ratings given by users for items from a large catalog of
ratings. This large catalog is referred to as the user-item matrix.
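The two steps described above, measuring similarity between users from their ratings and then predicting a rating from the neighbors, can be sketched as below. This is a rough illustration, not the project implementation: the small user-item matrix, the user names and the choice of cosine similarity over co-rated movies are assumptions made for the example.

```python
import math

# Hypothetical user-item matrix: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"M1": 5, "M2": 3, "M3": 4},
    "bob":   {"M1": 4, "M2": 3, "M3": 5, "M4": 4},
    "carol": {"M1": 1, "M2": 5, "M4": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the movies both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = math.sqrt(sum(u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)

def predict(user, movie, ratings):
    """Similarity-weighted average of the neighbors' ratings for `movie`."""
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        similarity = cosine(ratings[user], other_ratings)
        numerator += similarity * other_ratings[movie]
        denominator += similarity
    return numerator / denominator if denominator else None

print(predict("alice", "M4", ratings))
```

Because Alice's ratings resemble Bob's more than Carol's, the prediction for "M4" is pulled towards Bob's rating of 4 rather than Carol's rating of 2. Production systems usually also subtract each user's mean rating first, which this sketch omits for brevity.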

5.3 Hybrid Filtering
This is an information filtering system that takes ratings of
movies as input from users, applies both collaborative and content-based filtering,
and generates a recommendation list [49]. It is a combination of the two techniques, i.e.,
collaborative filtering and content-based filtering. When a single method, i.e.,
collaborative filtering or content-based filtering alone, cannot solve the problem, the hybrid
filtering concept comes into the picture. By using hybrid filtering, many problems of collaborative
filtering and content-based filtering can be resolved. The cold-start problem, for example, is a
major challenge in collaborative filtering. Applying content-based filtering first and
then using collaborative filtering can be a solution to it, so making the system hybrid can resolve the
problem.
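One simple way to combine the two filters is a weighted blend that leans on the content-based score while a user has few ratings and shifts towards the collaborative score as ratings accumulate. The sketch below is an assumption made for illustration: the function name, the linear weighting and the `full_trust_at` threshold are hypothetical, and other hybrid designs (cascading one filter into the other, or a single joint model) are equally valid.

```python
def hybrid_score(content_score, collab_score, n_user_ratings, full_trust_at=20):
    """Blend two scores in [0, 1].

    The collaborative weight grows linearly with the number of ratings the
    user has given and is capped at 1. When no collaborative score exists
    (cold start), the content-based score is returned unchanged.
    """
    if collab_score is None:
        return content_score
    weight = min(n_user_ratings / full_trust_at, 1.0)
    return (1 - weight) * content_score + weight * collab_score

# Brand-new user: no collaborative score is available yet.
print(hybrid_score(0.8, None, n_user_ratings=0))
# Established user: the collaborative score dominates.
print(hybrid_score(0.8, 0.2, n_user_ratings=20))
```

With this scheme a new user still receives content-based suggestions, which is how the hybrid approach sidesteps the cold-start problem of pure collaborative filtering.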

Figure 3: Hybrid Filtering

EXISTING SYSTEM

Netflix’s system shows personalized recommendations based on several factors, including:

• Each user’s previous interactions (e.g., viewing history, searches, and ratings)
• Other members’ choices (especially those with similar tastes and preferences)
• Information about the specific title (genre, category, the year of release, etc.)
• The device used to watch videos on Netflix
• The watching time

Combined, all that data is a useful input for Netflix recommendation algorithms. They
process and analyze the data and use machine learning to turn it into useful and accurate
movie recommendations. When a user creates a new Netflix account, they have to choose
several movie titles that they like. This is the starting point for the recommendation
algorithm. As time goes by and you watch more movies and TV shows, the
recommendation system learns your habits and preferences. As a result, the suggestions that
you get become more and more accurate. Additionally, the recommender system used
by Netflix focuses on each user’s most recent choices. So if a few years ago you were
interested in fantasy movies, but now you mainly watch romance movies, the algorithm will
suggest mostly romance titles.
The Netflix recommendation system is actually very complex, and it uses various
technologies and machine learning models to provide millions of users with accurate
suggestions. There are several algorithmic approaches in place, and they comprise:
• Reinforcement learning (RL algorithms don’t need any information in
advance; they learn from data during the process)
• Neural networks (they try to imitate the way the human cortex works; neural
networks are extremely important in deep learning)
• Causal modelling (an analytical technique concentrated on cause-and-effect
relationships)
• Probabilistic graphical models (a PGM expresses the conditional dependence
structure between random variables)
• Matrix factorization (a class of collaborative filtering algorithms used
specifically in recommendation systems)
• Ensemble learning (a technique using multiple learning algorithms to achieve
better results)

PROPOSED SYSTEM

We propose a recommender system based on cosine similarity. We use
the cosine similarity function from scikit-learn (sklearn) as the metric to compute the similarity between two
movies. Cosine similarity is a metric used to measure how similar two items are.
Mathematically, it measures the cosine of the angle between two vectors projected in a
multi-dimensional space. For non-negative feature vectors, the output value ranges from 0 to 1:
0 means no similarity, whereas 1 means that both items are 100% similar.

COSINE SIMILARITY:
A similarity measure is defined in terms of a distance whose dimensions represent features of the data
objects in a dataset. If this distance is small, there is a high degree of similarity;
when the distance is large, there is a low degree of similarity.

Some of the popular similarity measures are –

1. Euclidean Distance.
2. Manhattan Distance.
3. Jaccard Similarity.
4. Minkowski Distance.
5. Cosine Similarity.
Cosine similarity is a metric that helps determine how similar data objects are,
irrespective of their size. For example, we can measure the similarity between two sentences in
Python using cosine similarity. In cosine similarity, data objects in a dataset are treated as
vectors.
The formula to find the cosine similarity between two vectors is –
Cos(x, y) = (x . y) / (||x|| * ||y||)

where,

• x . y = dot product of the vectors ‘x’ and ‘y’.
• ||x|| and ||y|| = lengths (magnitudes) of the two vectors ‘x’ and ‘y’.
• ||x|| * ||y|| = product of the lengths of the two vectors ‘x’ and ‘y’.
Example:
Consider an example to find the similarity between two vectors – ‘x’ and ‘y’, using Cosine
Similarity.

The ‘x’ vector has values, x = { 3, 2, 0, 5 }
The ‘y’ vector has values, y = { 1, 0, 0, 0 }
The formula for calculating the cosine similarity is: Cos(x, y) = (x . y) / (||x|| * ||y||)
x . y = 3*1 + 2*0 + 0*0 + 5*0 = 3

||x|| = √((3)^2 + (2)^2 + (0)^2 + (5)^2) = √38 = 6.16

||y|| = √((1)^2 + (0)^2 + (0)^2 + (0)^2) = 1

∴ Cos(x, y) = 3 / (6.16 * 1) = 0.49

The dissimilarity between the two vectors ‘x’ and ‘y’ is given by –

∴ Dis(x, y) = 1 - Cos(x, y) = 1 - 0.49 = 0.51
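The worked example can be reproduced with a short Python sketch. The function below is written with the standard library only so that it is self-contained; it is an illustration of the formula, not the scikit-learn routine used in the project, and the handling of zero-length vectors is an assumption.

```python
import math

def cosine_similarity(x, y):
    """Cos(x, y) = (x . y) / (||x|| * ||y||) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a zero vector is treated as similar to nothing
    return dot / (norm_x * norm_y)

x = [3, 2, 0, 5]
y = [1, 0, 0, 0]

similarity = cosine_similarity(x, y)
print(round(similarity, 2))      # 0.49
print(round(1 - similarity, 2))  # 0.51
```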

• The cosine similarity between two vectors depends on the angle ‘θ’ between them.
• If θ = 0°, the ‘x’ and ‘y’ vectors overlap (Cos = 1), meaning they are similar.
• If θ = 90°, the ‘x’ and ‘y’ vectors are dissimilar (Cos = 0).

Figure 4: Cosine Similarity between two vectors

Advantages:

• Cosine similarity is beneficial because even if two similar data objects are far
apart in Euclidean distance because of their size, they can still have a small angle
between them. The smaller the angle, the higher the similarity.
• When plotted in a multi-dimensional space, cosine similarity captures the
orientation (the angle) of the data objects and not their magnitude.

KEYWORDS & DEFINITIONS

TF-IDF: TF-IDF stands for Term Frequency-Inverse Document Frequency. It
can be defined as a measure of how relevant a word is to a document within a collection (corpus).
The weight increases proportionally to the number of times a word appears in the document but
is offset by the frequency of the word in the corpus (dataset).

• Term Frequency: In document d, the term frequency represents the number of instances of a
given word t. A word therefore becomes more relevant the more often it appears
in the text, which is reasonable. Since the ordering of terms is not significant, we can use a
vector to describe the text in the bag-of-words model. For each specific term in the
document, there is an entry whose value is the term frequency.
The weight of a term that occurs in a document is simply proportional to the term
frequency.
tf(t,d) = count of t in d / number of words in d

• Document Frequency: This measures the importance of a term across the whole corpus
and is very similar to TF. The only difference is that TF is the
frequency counter for a term t in a single document d, while DF is the number of documents in
the document set N in which the term t occurs. In other words, DF is the number of documents
in which the word is present.
df(t) = number of documents containing t

• Inverse Document Frequency: Mainly, it measures how informative the word is. The key aim
of a search is to locate the appropriate documents that fit the query. Since tf considers
all terms equally significant, term frequencies alone cannot be used
to measure the weight of a term in a document. First, find the document frequency of a
term t by counting the number of documents containing the term:
df(t) = N(t)
where

• df(t) = Document frequency of a term t
• N(t) = Number of documents containing the term t

Term frequency is the number of instances of a term in a single document only, whereas
document frequency is the number of separate documents in which the term
appears, so it depends on the entire corpus.
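The definitions of tf and df above can be turned into a small Python sketch. The text stops at df(t), so the inverse document frequency used here, idf(t) = log(N / df(t)) with N the number of documents, is an assumed (though common) definition; the three-sentence corpus is hypothetical. Library implementations such as scikit-learn's TfidfVectorizer use a smoothed variant, so their numbers will differ from this sketch.

```python
import math

# Hypothetical three-document corpus (illustrative descriptions only).
corpus = [
    "the detective hunts the killer",
    "the detective falls in love",
    "the friends fall in love",
]
docs = [text.split() for text in corpus]

def tf(term, doc):
    """tf(t, d) = count of t in d / number of words in d."""
    return doc.count(term) / len(doc)

def df(term, docs):
    """df(t) = number of documents containing t."""
    return sum(1 for doc in docs if term in doc)

def idf(term, docs):
    """Assumed definition: idf(t) = log(N / df(t)); t must occur in the corpus."""
    return math.log(len(docs) / df(term, docs))

def tfidf(term, doc, docs):
    """Weight of a term in one document: tf(t, d) * idf(t)."""
    return tf(term, doc) * idf(term, docs)

for term in ("the", "detective", "killer"):
    print(term, round(tfidf(term, docs[0], docs), 3))
```

In the first document, "the" occurs most often but appears in every document, so its weight is zero, while "killer" occurs once and only in that document, so it receives the highest weight. This is the compensation effect described in the TF-IDF definition.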
IMPLEMENTATION & ARCHITECTURE

ARCHITECTURE:

Figure 5: architecture

In this diagram, we break down the system architecture of the Netflix
recommendation system, analyzing the technical component of our socio-technical system, the
recommendation engine. We identified four different layers in the architecture of the
Netflix recommendation engine: user interaction, function, algorithm, and database.
Furthermore, we identify the elements that the user interaction, function, algorithm, and
database layers shape.

In the user interaction layer of the Netflix recommendation engine architecture, there is a
variety of elements: the search feature, the feedback users share, and
featured rows such as the “genre”, “top pick”, and “because you watched” rows. These user
interaction features do a variety of different things. The feedback feature allows users to
voice their opinion on whether they feel that the recommendations given to them are
useful or not. The “genre” row is personalized according to the preferred genre
identified by the user. The “top pick” row is curated based on popularity and the user’s
identified viewing trends. The “because you watched” row is a series of recommendations
based on the user’s viewing history. The Netflix recommendation engine is built on a hybrid
filter.

Within the algorithm layer, the search function returns results
for the user’s query, predicts the user’s request from a partial input, and finds videos to recommend as
options for the user’s search. When users give feedback, it changes the overall result
selection. The “genre” rows personalize top picks using the user’s genre preferences. For the
“top pick” row, the algorithm uses data from user viewing patterns
to rank the top videos based on their popularity among users. For
the “because you watched” row, the algorithm organizes videos based on their
similarity.

In conclusion, all the factors of user interaction, functionality, and algorithm go
hand in hand and flow together in order to improve recommendations and tailor them to the
user’s taste preferences.

IMPLEMENTATION:

Figure 6: Importing packages

Figure 7: Reading Data

Figure 8: Counting the data set

Figure 9: Filling Null Values

Figure 10: Term Frequency (TF) and Inverse Document Frequency (IDF)

Figure 11: Cosine Similarity
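The TF-IDF and cosine similarity steps named in Figures 10 and 11 can be sketched end to end as follows. This is not the code shown in the figures: the project uses scikit-learn, whereas this sketch uses only the standard library so that it is self-contained, and the four titles, their descriptions and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical title -> description pairs standing in for the Netflix dataset.
titles = {
    "Dark Trail": "detective hunts serial killer in small town",
    "Cold Case Files": "detective reopens unsolved killer case",
    "Paris Hearts": "two friends fall in love in paris",
    "Love Again": "widow learns to love again in small town",
}

def tfidf_vectors(descriptions):
    """Return one {term: tf * idf} dict per description, with idf = log(N / df)."""
    tokenised = [text.split() for text in descriptions]
    n_docs = len(tokenised)
    doc_freq = Counter(term for doc in tokenised for term in set(doc))
    vectors = []
    for doc in tokenised:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(title, titles, top_n=2):
    """Rank the other titles by description similarity to `title`."""
    names = list(titles)
    vectors = dict(zip(names, tfidf_vectors(list(titles.values()))))
    scores = [(cosine(vectors[title], vectors[name]), name)
              for name in names if name != title]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:top_n]]

print(recommend("Dark Trail", titles))
```

Titles whose descriptions share rare words with the query title ("detective", "killer") rank above titles that share only common words ("in"), which is the behaviour the TF-IDF weighting is meant to produce.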

Figure 12: Analyzing Datasets through Various PLOTS for Various Information

• Year in which the maximum number of movies and TV shows were released

Figure 16: Netflix Movie Rating

Figure 18: Netflix TV-Show Rating

CHALLENGES FACED

It is not customary to talk about shortcomings, but we will. It should be noted
that building a recommendation system and gaining all of its benefits is not as natural
and straightforward a process as it may seem.
1. DEFICIENCY OF INFORMATION

Perhaps the most common and significant difficulty is a lack of high-quality data to
train the neural network.

It takes a lot of clean data to create a recommendation system that works
efficiently and makes precise suggestions. The neural network that powers
recommender software is quite sensitive to any data distortion. This means that if we feed it
inaccurate and uncleaned data, we cannot expect precise results.

2. INFORMATION VARIABILITY

Recommendation engines are always based on data collected during past or current
periods, and this is an unavoidable limitation.

There are several industries (Media, Online Gaming, Marketing) where information
changes swiftly. In such sectors, we use “old data” for training, while the situation may
already have changed. By the time training finishes, fresher and more precise data may
already be available, so the suggestions become less relevant or even irrelevant.

There is a simple way to address this issue: we retrain the neural network after every
new period.
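
The periodic retraining described above needs a rule for deciding when the model is stale. One possible sketch is shown here; the helper `needs_retraining` and the seven-day period are assumptions, since the report does not state how long a period is.

```python
from datetime import datetime, timedelta


def needs_retraining(last_trained, now, period=timedelta(days=7)):
    """True when at least one full period has elapsed since the last training run."""
    return now - last_trained >= period


# Hypothetical timestamps used only to exercise the check.
last_trained = datetime(2022, 1, 1)
stale = needs_retraining(last_trained, datetime(2022, 1, 9))
fresh = needs_retraining(last_trained, datetime(2022, 1, 3))
```

A scheduler could call such a check before serving recommendations and trigger a new training run on the latest data whenever it returns True.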

3. UNPREDICTABLE PERFORMANCE

Today we live in an extremely fast-changing world, and businesses all over the globe
are chasing constant growth in revenue and turnover. The best way to achieve year-on-year
growth is to plan everything: develop a strategy and follow it. Every business strategy is
primarily based on numbers and data.

Business owners count not only every minute but also every cent, and they calculate
the Return on Investment (ROI) for every operation they perform. It is rather challenging to
estimate the value of a recommendation engine at first.

Although it is easy to understand which benefits a recommender system can bring, it
is close to impossible to evaluate those benefits in monetary terms until you try
implementing the system in the existing business processes. For this reason, businesses
prefer to invest money in something more predictable. However, there is always an
enthusiastic company that invents or integrates new technologies.
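
For reference, ROI is conventionally computed as net gain divided by cost. The worked figures below are purely hypothetical and serve only to show the arithmetic; they are not measurements from this project.

```latex
\[
\mathrm{ROI} = \frac{\text{gain from the system} - \text{cost of the system}}{\text{cost of the system}} \times 100\%
\]
% Hypothetical example: a system costing 100,000 that brings in 150,000
\[
\mathrm{ROI} = \frac{150{,}000 - 100{,}000}{100{,}000} \times 100\% = 50\%
\]
```

The difficulty described above lies in the numerator: the gain attributable to a recommendation engine cannot be measured until the engine is actually running.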

TESTING
Software testing is an investigation conducted to provide stakeholders with
information about the quality of the product or service under test. It provides an
objective, independent view of the software that allows the business to appreciate and
understand the risks of software implementation, including the risk of its failure to
users and/or sponsors. Test techniques include, but are not limited to, the process of
executing a program or application with the intent of finding software bugs (errors or
other defects).
TESTING TECHNOLOGIES
Testing is the process of detecting errors. It plays a critical role in quality
assurance and in ensuring the reliability of the software. The results of testing are
also used later, during maintenance.

11.1 TESTING OBJECTIVES

The main objective of testing is to uncover a host of errors systematically, with
minimum effort and time. Stated formally:
1. Testing is the process of executing a program with the intent of finding an error.
2. A successful test is one that uncovers an as-yet-undiscovered error.
3. A good test case is one that has a high probability of finding an error, if it exists.

11.2 WHITE BOX TESTING


1. This is a unit testing method in which one unit is taken at a time and tested
thoroughly at the statement level to find the maximum number of errors.
2. We have tested every piece of code step by step, taking care that every statement in
the code is executed at least once.
3. White box testing is also called glass box testing.

11.3 BLACK BOX TESTING

This testing method models a single unit and checks the unit at its interface and in its
communication with other modules rather than getting into detail levels. Here the module
is treated as a black box that takes input and generates output. The outputs for given
input combinations are forwarded to other modules.

11.4 UNIT TESTING:

Unit testing focuses verification effort on the smallest unit of software, the module.
Using the detailed design and the process specifications, testing is done to uncover errors
within the boundary of the module. All modules must pass the unit test before integration
testing starts.
In our project, unit testing involves checking each feature specified in the component. A
component performs only a small part of the functionality of the system and relies on
cooperating with other parts of the system.
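
As an illustration of unit testing a single component in isolation, the sketch below uses Python's standard `unittest` module. The `recommend` function and the similarity table are hypothetical stand-ins for a component of the system, not the project's actual code or test suite.

```python
import unittest


def recommend(title, similarity, top_n=1):
    """Hypothetical unit under test: rank other titles by a precomputed similarity table."""
    scores = similarity[title]
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [t for t in ranked if t != title][:top_n]


class RecommendTests(unittest.TestCase):
    def setUp(self):
        # Small hand-made similarity table used as the test fixture.
        self.similarity = {
            "A": {"A": 1.0, "B": 0.9, "C": 0.1},
            "B": {"A": 0.9, "B": 1.0, "C": 0.2},
            "C": {"A": 0.1, "B": 0.2, "C": 1.0},
        }

    def test_most_similar_title_comes_first(self):
        self.assertEqual(recommend("A", self.similarity), ["B"])

    def test_title_is_never_recommended_to_itself(self):
        self.assertNotIn("C", recommend("C", self.similarity, top_n=3))

    def test_unknown_title_raises(self):
        with self.assertRaises(KeyError):
            recommend("Z", self.similarity)


def run_tests():
    """Run the test case programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RecommendTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test checks one feature of the component against a fixed input, so a failure points directly at the unit rather than at its collaborators.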

11.5 INTEGRATION TESTING

In this project, integrating all the modules forms the main system. When integrating
the modules, we checked whether the integration affected the working of any of the services
by supplying the different combinations of inputs with which the services ran perfectly
before integration.

RESULTS
OUTPUT:

CONCLUSION
Netflix would not be as successful as it is today without its recommendation system, which
not only selects the most relevant content for each individual user but also customizes how
content is displayed. Netflix has reinvented television by combining skillful storytelling with
customized experiences for every single user and providing a massive library of content available
to users anywhere and anytime. Netflix will continue to develop smarter recommendation
algorithms and create original content. However, we as users need to be mindful of how we
interact with Netflix and technology as a whole. Next time, before you open Netflix, take a
minute to think about some shows you may want to watch today, and then see how accurate
Netflix’s recommendations are in comparison to your own preferences. You could even try to
decide on what to watch before you open Netflix to regain some control over your entertainment
experience.

We propose a recommendation system based on cosine similarity. We use the cosine
similarity implementation from scikit-learn (sklearn) as the metric to compute the
similarity between two movies. Cosine similarity measures how similar two items are:
mathematically, it is the cosine of the angle between two vectors projected in a
multi-dimensional space.
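
Written out, the standard definition of cosine similarity for two n-dimensional vectors A and B is given below, followed by a small worked example. The example vectors are arbitrary and chosen only to keep the arithmetic easy to follow.

```latex
\[
\cos\theta
  = \frac{\mathbf{A}\cdot\mathbf{B}}{\lVert\mathbf{A}\rVert\,\lVert\mathbf{B}\rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]
% Worked example with A = (1, 1, 0) and B = (1, 0, 1)
\[
\cos\theta = \frac{1\cdot 1 + 1\cdot 0 + 0\cdot 1}{\sqrt{2}\,\sqrt{2}} = \frac{1}{2}
\]
```

For vectors with non-negative entries, such as term weights, the value lies between 0 (no shared terms) and 1 (same direction), so a higher score means two movies are described in more similar terms.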

