
Module Code & Module Title

CU6051NT- Artificial Intelligence


Assessment Weightage & Type
25% Individual Coursework
Year and Semester

2021-22 Autumn
Student Name: Yogesh Limbu
London Met ID: 19031898
College ID: NP05CP4A190185
Assignment Due Date: 22nd December, 2021
Assignment Submission Date: 22nd December, 2021

I confirm that I understand my coursework needs to be submitted online via Google Classroom under the relevant
module page before the deadline in order for my assignment to be accepted and marked. I am fully aware that late
submissions will be treated as non-submission and a mark of zero will be awarded.
Contents

1. Introduction
1.1. Recommendation System
1.2. Problem Scenario
2. Background
2.1. Research work on Problem Domain
2.2. Review and Analysis of similar systems
2.2.1. Movie Recommendation System using genre correlation
2.2.2. Content-Based Movie Recommendation Using Different Feature Sets
2.2.3. Netflix Recommendation system
2.2.4. User Trends Modeling for a Content-based Recommender System
3. Problem Solution
3.1. Explanation of the proposed solution
3.2. Explanation of the Steps Involved in Proposed Solution
3.3. Pseudocode
3.4. Flowchart
Conclusion
3.5. Analysis of work done
3.6. How the solution addresses the real-world problem?
3.7. Further Work


Table of Figures

Figure 1: Recommendation System with collaborative filtering
Figure 2: Recommendation system with content based filtering
Figure 3: Recommendation system with hybrid system
Figure 4: Content-Based Movie Recommendation System Using Genre Correlation
Figure 5: Content-Based Movie Recommendation Using Different Feature Sets
Figure 6: Recommendation System in Netflix
Figure 7: User Trends Modeling for a Content-based Recommender System
Figure 8: Content based filtering workings
Figure 9: Angles for cos θ (cosine similarity)
Figure 10: cosine similarity mathematically
Figure 11: Similarity matrix
Figure 12: Flowchart


CU6051NT Artificial Intelligence

1. Introduction

The term artificial intelligence combines two words, 'artificial' and 'intelligence'. 'Artificial' refers to something made by humans, and 'intelligence' refers to the ability to think (Dhankar & Walia, 2020). Artificial intelligence (AI) therefore means the process of creating intelligent machines. The field was founded on the idea of making machines think or, in other words, giving them human-like capabilities. AI is a broad field that has been studied extensively over the past years, and its community is growing rapidly. It is set to become one of the core components of modern software and will likely remain so in the coming years. This may lead to new problems, but it will also create many opportunities. AI opens up new ways of doing things, in some cases at a speed and scale that humans cannot match. Its applications include reducing human error, digital assistance, faster decision-making and many more. With AI in software, many tasks are becoming simpler than ever. Well-designed systems can reduce the mistakes people make in repetitive work, although they are not error-free, and software integrated with AI can often reach decisions faster and more consistently than a person could. In this way, AI helps people overcome many of their limitations.

1.1. Recommendation System

AI has many subfields, one of which is recommendation systems. A recommendation system is dedicated software, together with its methods, that estimates the ranking or preference a user would give to a particular item (Recommender System with Machine Learning and Artificial Intelligence, 2020). Recommendation systems help users by suggesting which product to buy, which book to read or which type of music to listen to. They help users make informed decisions quickly. For example, if a customer buys a bag from an e-commerce app, the app's recommendation system might also suggest related items such as pens, books or notebooks. This benefits both parties: the customer sees relevant items, and the e-commerce site increases its sales. These days recommendation systems are used in almost every kind of business. On e-commerce sites, data on the online behavior of millions of customers is tracked and used to make recommendations.

A recommendation system collects data from people and analyzes it to make recommendations. It relies on both explicit data, such as user feedback and reviews, and implicit data, such as browsing history, cookies and purchases. There are generally three ways of creating a recommendation system:

• Recommendation System with Collaborative Filtering: Collaborative filtering is the process of filtering or selecting items based on the opinions of other people. It recommends new items to a user based on the preferences of other, similar users (Schafer, Frankowski, Herlocker, & Sen, 2007). For example, in Figure 1, suppose the yellow person buys pizza, donuts and chips. The white person has bought two of the same items (donuts and pizza). Collaborative filtering compares the ratings and purchase histories of the two people, and if the white person's preferences are similar to the yellow person's, it recommends the yellow person's remaining items to the white person. Hence, the system would recommend chips to the white person.

Figure 1: Recommendation System with collaborative filtering


In collaborative filtering, recommendations can be made in several ways. Two popular approaches are given below:

1. Item-Based Collaborative Filtering: The rating of an item is predicted from the ratings given to neighboring (similar) items. For example, if a user has bought and rated three items (A, B and C), a fourth item is recommended when it is similar to the items the user rated highly.

2. User-Based Collaborative Filtering: The rating of an item is predicted from the ratings given by neighboring (similar) users. For example, if a user has bought and rated three items (A, B and C), a fourth item is recommended when users who rated A, B and C similarly also rated that fourth item highly.

• Recommendation System with Content Based Filtering: In content-based filtering, items are selected based on the correlation between the content of the items and the preferences of the user (Meteren & Someren, n.d.). It attempts to guess what a user may like by tracking the user's activity. For example, suppose a user has read Book A in an e-commerce application and wants to read more. If Book B shares much of its content or many of its features with Book A, content-based filtering would recommend Book B to that user.


Figure 2: Recommendation system with content based filtering

In content-based filtering, recommendations are made by taking the keywords, attributes or features that describe the items in the dataset and comparing them with the user's previous history or profile, e.g. purchases, downloads and searches. Unlike collaborative filtering, it does not require data from other users. As soon as a user searches for or buys something in the system, content-based filtering can start making recommendations for that user.

• Hybrid recommender system: A hybrid recommender system uses multiple data filtering methods to recommend products, combining the content-based and collaborative filtering approaches. Netflix is a well-known application of a hybrid recommender system. It makes recommendations by observing a user's watch history and search habits and by finding similarities with other users. By using both approaches, it recommends movies with similar genres along with those that are highly rated by other people.


Figure 3: Recommendation system with hybrid system
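
The collaborative filtering example from Figure 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any cited system: the purchase data and the names `purchases`, `jaccard` and `recommend_from_neighbour` are hypothetical, and Jaccard overlap is assumed as the similarity measure because the text does not name one.

```python
# Hypothetical purchase histories mirroring the Figure 1 example.
purchases = {
    "yellow": {"pizza", "donuts", "chips"},
    "white": {"pizza", "donuts"},
    "other": {"salad"},
}


def jaccard(a, b):
    """Overlap between two item sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_neighbour(user, purchases):
    """Suggest items bought by the most similar other user but not by `user`."""
    own = purchases[user]
    others = [name for name in purchases if name != user]
    # Assumption: only the single nearest neighbour is consulted.
    nearest = max(others, key=lambda name: jaccard(own, purchases[name]))
    return sorted(purchases[nearest] - own)


print(recommend_from_neighbour("white", purchases))  # ['chips']
```

A content-based system would instead compare item descriptions and would not need the other users' histories at all.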

1.2. Problem Scenario

In a world driven by data, managing and searching that data has become very complex. The creation and consumption of data is constantly increasing due to mobile devices such as smartphones, tablets and iPads, as well as improvements in mobile networks and Wi-Fi. It was estimated that in 2020 each person created on average 1.7 MB of data every second (Bulao, 2021). On YouTube, a 480p video is said to consume about 8.3 MB per minute, or roughly 500 MB per hour. Overall, people create about 2.5 quintillion bytes of data per day. The volume of data people produce is therefore huge and difficult to handle.

YouTube, one of the largest social media platforms for video sharing, is said to receive more than 500 hours of fresh video every minute (Hale, 2019). As more and more data is added, it becomes difficult for an organization to manage it. At this volume, the organization cannot review content manually or hand-pick what to share with or recommend to each user. To overcome this problem, many automated algorithms and systems have been developed to manage data accurately and in less time. With automation, huge amounts of data can be read and processed, and predefined actions can be performed on it. In the case of YouTube, a recommendation system recommends the videos and content shared by millions of users around the world on the basis of similar users or similar content.

2. Background

2.1. Research work on Problem Domain

Machine learning is a type of artificial intelligence that allows software to learn from data and improve its accuracy without being explicitly programmed to do so. To predict new output values, machine learning algorithms use historical data as input (Burns, 2021). Recommendation systems are one of the best-known applications of machine learning: they provide recommendations based on the preferences of the consumer. These days most organizations use recommendation systems to suggest suitable products and services to consumers. Suggesting relevant items to buy can in turn increase the company's revenue.

Recommendation systems are a broad topic and are used in almost every field. People use recommendations because they save time, so they play an important role in real-world applications such as entertainment, e-commerce and social media. Several popular applications are well known for their recommendation systems. For example, Netflix uses an algorithm to recommend movies based on the interests of its users. Hotstar, SonyLIV, Voot and ALTBalaji are other popular platforms that provide recommendations (Goyani & Chaurasiya, 2020).

This report is about creating a movie recommendation system based on genres, actors and directors. The user would be shown movies whose genres, directors or actors are similar to those of movies the user has previously watched. Measuring the similarity between movies on these features, genre in particular, is therefore crucial for the proposed recommendation system.


2.2. Review and Analysis of similar systems

2.2.1. Movie Recommendation System using genre correlation

Figure 4: Content-Based Movie Recommendation System Using Genre Correlation

In this project, a movie recommendation system is created using genre correlation: data is filtered by analyzing the user's past behavior, and movies are recommended based on the genres of movies the user has watched. The dataset is divided into two parts. One part contains movie details such as the movie name, genre and release date, while the other contains the ratings that users gave to the movies. The user profile matrix was created by computing the dot product of the genre matrix and the ratings matrix. Finally, a similarity measure is computed from the resulting dot-product matrix by finding the shortest distance between the user under consideration and the others.

The system then recommends the movies that deviate least from the current user's preferences. In this recommendation system, the rating a user gives to a movie of a particular genre therefore determines which similar movies are recommended.
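
A small Python sketch may make the profile-building step more concrete. It is only an illustration under stated assumptions: the movies, ratings and function names are hypothetical, the profile is scaled by its largest entry before comparison, and Euclidean distance stands in for the "shortest distance" mentioned above. The reviewed project may differ in all of these details.

```python
import math

# Hypothetical genre flags per movie (columns: action, comedy, drama).
movie_genres = {
    "M1": [1, 0, 0],
    "M2": [1, 1, 0],
    "M3": [0, 0, 1],
    "M4": [1, 0, 0],
    "M5": [0, 0, 1],
}
# Hypothetical ratings the user gave to the movies already watched.
ratings = {"M1": 5, "M2": 4, "M3": 1}


def build_profile(ratings, movie_genres):
    """Dot product of the user's ratings with the genre matrix."""
    n_genres = len(next(iter(movie_genres.values())))
    profile = [0] * n_genres
    for movie, rating in ratings.items():
        for g, flag in enumerate(movie_genres[movie]):
            profile[g] += rating * flag
    return profile


def rank_unseen(profile, ratings, movie_genres):
    """Order unseen movies by Euclidean distance to the scaled profile."""
    peak = max(profile) or 1  # assumption: scale the profile to the 0..1 range
    scaled = [value / peak for value in profile]
    unseen = [m for m in movie_genres if m not in ratings]
    return sorted(unseen, key=lambda m: math.dist(scaled, movie_genres[m]))


profile = build_profile(ratings, movie_genres)
print(profile)                                      # [9, 4, 1]
print(rank_unseen(profile, ratings, movie_genres))  # ['M4', 'M5']
```

The action movie M4 ranks first because the user's highest ratings went to action titles.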


2.2.2. Content-Based Movie Recommendation Using Different Feature Sets

Figure 5: Content-Based Movie Recommendation Using Different Feature Sets

This project is based on content-based filtering, where movies are recommended according to the user's past behavior. First, the viewing history of each user is converted into implicit ratings using the viewing duration. The features of the movies the user viewed in the past are then combined with these implicit ratings to obtain feature weights, which are calculated separately for each user. The feature weights are then used to predict the recommendation rating of new content. If the user watched a movie entirely or for a significant portion of its length, the features extracted from that movie are given correspondingly higher weights. To calculate the total effect of all the features, the rating computed for each feature has to be normalized. Finally, the performance of the recommendation system was evaluated using the precision, recall and F-measure metrics (Uluyagmur, Cataltepe, & Tayfur, 2012).
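
The three evaluation metrics named above have standard definitions, which the following sketch computes for a single user. The function name and the sample lists are hypothetical, and the balanced F1 form of the F-measure is assumed.

```python
def precision_recall_f1(recommended, relevant):
    """Return (precision, recall, F1) for one list of recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 4 movies recommended, 3 movies the user actually liked.
p, r, f = precision_recall_f1(["A", "B", "C", "D"], ["B", "D", "E"])
# Two hits (B and D): precision = 2/4, recall = 2/3, F1 = 4/7.
```

Precision penalizes irrelevant recommendations, recall penalizes missed relevant items, and the F-measure balances the two.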

The main objective of this project was to create a recommendation system that can recommend movies even when the user has viewed very little content. The challenge was to capture people's interests even when very little data about them was available. This project thus gives an overall view of how content-based filtering for a recommendation system is supposed to work.


2.2.3. Netflix Recommendation system

Figure 6: Recommendation System in Netflix

It has been estimated that more than 80 percent of the shows watched on Netflix are discovered through its recommendation system (Blattmann, 2018). Netflix's recommender systems use a variety of algorithmic approaches, including reinforcement learning, neural networks, causal modeling, probabilistic graphical models, matrix factorization, ensembles and bandits. They follow a hybrid filtering approach, which combines collaborative filtering and content-based filtering. The recommendation system estimates the likelihood that a user will watch a particular title based on the following factors:

• It uses data gathered from the user, such as viewer ratings and viewing history.

• It finds similarities with other viewers by using their viewing preferences and tastes.

• It takes into account the time of day at which the viewer watches a show, because Netflix has data showing that viewing behavior varies with the time of day, the day of the week and the location (Springboard India, 2019).


The main objective of the Netflix recommendation system is to provide personalized suggestions by displaying appropriate content to the viewer at the right time. The system learns from its own users: every time a viewer watches a movie, the collected data is fed to the machine learning algorithm, which in turn recalculates and refreshes the recommendations.

2.2.4. User Trends Modeling for a Content-based Recommender System

Figure 7: User Trends Modeling for a Content-based Recommender System

This research takes into account a user profile that includes the content-based features of the resources the user has used. An evolutionary user model (based on the user profile) was proposed for developing the recommendation system. The user profile considered in this project was made up of user activities, including the items the user chose in the past. To extract the user's interests, the user profile was clustered. Clustering is an unsupervised machine learning technique that automatically discovers natural groupings in data. In contrast to supervised learning, clustering algorithms only interpret the input data and find natural groups, or clusters, in feature space (Brownlee, 2020).

The user's preferences were then calculated from the influence of each cluster. The concept of a trend was used for creating user models. A trend represents a cluster of user activities (or a user's interest) along with a probability value indicating how much the user prefers to choose a resource from that cluster. The recommendation system used trend vectors to suggest new items based on the more influential trends, for which the user had strong preferences. In this way, it combined a behavioral user model with the concept of trends to capture the dynamic nature of the user's interests and preferences. The project also showed that the behavioral user model improved the performance of the recommendation system.
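
As a rough illustration of the trend idea, the sketch below turns a user's clustered history into a trend vector. It assumes that the clustering has already been done and that a trend's probability is simply the share of past activities falling into that cluster. The reviewed research may compute the probability differently, and all names and labels here are hypothetical.

```python
from collections import Counter

# Hypothetical cluster label for each item the user chose in the past;
# the clustering step itself is assumed to have been done already.
history_clusters = ["sci-fi", "sci-fi", "comedy", "sci-fi", "drama", "comedy"]


def trend_vector(cluster_labels):
    """Map each cluster to the share of the user's activities it contains."""
    counts = Counter(cluster_labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


trends = trend_vector(history_clusters)
strongest = max(trends, key=trends.get)
print(strongest)  # sci-fi
```

New items would then be drawn mainly from the strongest trends, and the vector can be recomputed as the user's history grows.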

3. Problem Solution

3.1. Explanation of the proposed solution

The proposed project is a movie recommendation system in which movies are recommended on the basis of a few features: genre, actor and director. The system would be built with the content-based filtering approach. In this approach, the model does not need any data about other users, since recommendations are specific to the user of the system. The model (user profile) captures the specific content that interests the user and recommends items based on their similarity to it. The proposed system would therefore not involve other users: based only on the user's personal preferences, the algorithm would pick movies with similar genres, actors and directors.

Figure 8: Content based filtering workings

Although content-based filtering has some real benefits, it also has drawbacks that cannot be overlooked:

• A content-based recommendation system tends to over-specialize, recommending items very similar to those the user has already used or been recommended.

• A content-based recommendation system cannot suggest items on the basis of similarity between users.

3.2. Explanation of the Steps Involved in Proposed Solution

Content-based filtering is a machine learning approach that filters data by finding content similar to what the user has previously used. It tries to predict the items the user might want based on the items previously bought.

Step-1: To build a recommendation system that recommends movies on the basis of the genre of previously watched movies, the first task is to obtain appropriate data and analyze it. In my case, I would use an IMDb dataset containing data on movies from 1972 to 2019.

Step-2: Specific features, namely title, genre, actor and director, would be selected as the main features, as the recommendation would be performed on the basis of these features.

Step-3: Machine learning algorithms cannot operate on raw text directly, so the words need to be converted into vectors. To find the similarities between the movies, the movies therefore have to be vectorized. For this, CountVectorizer would be used. It is a tool provided by the Python library scikit-learn that converts a collection of texts into vectors based on the count of each word in each text (Brownlee, 2017).
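
The vectorization in Step-3 can be tried on a few short strings. The sketch below uses scikit-learn's `CountVectorizer` with its default settings. The three sample strings are hypothetical stand-ins for merged movie features, not rows from the actual dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical merged feature strings, one per movie.
documents = [
    "action adventure nolan",
    "action drama nolan",
    "romance drama",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(documents)  # sparse matrix, one row per movie

# The vocabulary is sorted alphabetically, one column per distinct word:
# action, adventure, drama, nolan, romance
print(sorted(vectorizer.vocabulary_))

# Each row counts how often each vocabulary word occurs in that movie's text:
# [1, 1, 0, 1, 0]
# [1, 0, 1, 1, 0]
# [0, 0, 1, 0, 1]
print(counts.toarray())
```

With the default settings the text is lowercased and single-character tokens are dropped, which is worth keeping in mind when names or short codes carry meaning.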

Step-4: The similarity between movies based on genre would be calculated using cosine similarity. Cosine similarity is a metric that calculates the cosine of the angle formed by two vectors. The smaller the angle between the two vectors, the more similar they are (Prabhakaran, 2018).

Cosine similarity is represented by cos θ.


Figure 9: Angles for cos θ (cosine similarity)

Cosine similarity is defined mathematically as the ratio of the dot product of two vectors to the product of their Euclidean norms (magnitudes). Thus, if a and b are two vectors, the cosine similarity of the two vectors is:

Figure 10: cosine similarity mathematically
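
For reference, the standard definition of cosine similarity for two n-dimensional vectors a and b can be written as follows. The index i and the dimension n are notation introduced here for illustration.

```latex
\[
\cos\theta
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}}\;\sqrt{\sum_{i=1}^{n} b_i^{2}}}
\]

% Worked example with a = (1, 1, 0) and b = (1, 0, 1):
\[
\cos\theta = \frac{1\cdot 1 + 1\cdot 0 + 0\cdot 1}{\sqrt{2}\,\sqrt{2}} = \frac{1}{2}
\]
```

For count vectors, whose entries are never negative, the value lies between 0 (no shared words) and 1 (same direction).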

Hence, using cosine similarity, the similarity between the contents of movies can be computed easily, and movies with similar contents would be recommended according to their similarity value.


Figure 11: Similarity matrix

After the cosine similarity is calculated, the similarity values between each pair of movies are represented in the form of a symmetric matrix. The matrix in Figure 11 is symmetric because the similarity between A and B is equal to the similarity between B and A. The figure shows, for example, a similarity value of 0.158 between Movie 1 and Movie 2.
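
A similarity matrix of this kind can be built by hand to see where the symmetry comes from. The sketch below uses only the Python standard library. The three count vectors are hypothetical and are not the values behind Figure 11.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical count vectors for three movies.
vectors = [
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1],
]

# Entry [i][j] holds the similarity between movie i and movie j.
matrix = [[cosine(u, v) for v in vectors] for u in vectors]

for row in matrix:
    print([round(value, 3) for value in row])
# [1.0, 0.667, 0.0]
# [0.667, 1.0, 0.408]
# [0.0, 0.408, 1.0]
```

The diagonal is 1 because every movie is identical to itself, and each off-diagonal value appears twice, once on either side of the diagonal.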

After obtaining the similarity matrix, the final task would be to test the accuracy of the recommendation system and the quality of its recommendations.

Algorithm

1. The features of the movie dataset are analyzed.

2. Specific features (title, genre, actor and director) are used as the main features, because movies would be recommended on the basis of similar genre along with similar actor and director.

3. The data is vectorized so that the machine learning algorithm can process it.

4. The cosine similarity metric is used to find the similarity between each pair of vectors.

5. A user model is created that stores the interests of the user.

6. A list of similar movies is generated.

7. Finally, the movies in the list are recommended in order of their similarity value.


3.3. Pseudocode

START

IMPORT Libraries

IMPORT CSV file (imdb_1972-2019.csv)

SELECT the specific features and store them in a variable

MERGE the selected features and RETURN them as a single string

CONVERT the merged features into vectors using CountVectorizer

CALCULATE the similarity value using cosine similarity metric

CREATE a variable to store user’s likes/inputs

CREATE a function to get the index from the name of the movie

LIST the movies based on similarity score of the similar movies

CHECK the accuracy

SORT the movies according to similarity score

RECOMMEND to the user

END
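
The pseudocode above could be translated into Python roughly as follows. This is a sketch under several assumptions, not the final implementation: the four movies are hypothetical stand-ins for rows of imdb_1972-2019.csv (whose column names are not given in this report), the title is used only to look a movie up, people's names are collapsed into single tokens, and the accuracy check is left out.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in rows. The real project would load imdb_1972-2019.csv,
# whose column names are not given in this report.
movies = [
    {"title": "Alpha", "genre": "Action Thriller", "actor": "Alice Stone", "director": "Dan Reed"},
    {"title": "Bravo", "genre": "Action Thriller", "actor": "Alice Stone", "director": "Eve Hart"},
    {"title": "Charlie", "genre": "Romance Drama", "actor": "Bob Lane", "director": "Eve Hart"},
    {"title": "Delta", "genre": "Action Comedy", "actor": "Carl West", "director": "Dan Reed"},
]


def as_token(name):
    """Collapse a person's name into one token so 'Alice Stone' stays together."""
    return name.replace(" ", "")


def merge_features(movie):
    """Merge the selected features of one movie into a single string."""
    return " ".join([movie["genre"], as_token(movie["actor"]), as_token(movie["director"])])


titles = [movie["title"] for movie in movies]
merged = [merge_features(movie) for movie in movies]

# Vectorize the merged strings and compute the pairwise similarity matrix.
vectors = CountVectorizer().fit_transform(merged)
similarity = cosine_similarity(vectors)


def recommend(liked_title, top_n=3):
    """Return up to top_n titles most similar to liked_title, best first."""
    index = titles.index(liked_title)  # raises ValueError for an unknown title
    ranked = sorted(enumerate(similarity[index]), key=lambda pair: pair[1], reverse=True)
    return [titles[i] for i, _ in ranked if i != index][:top_n]


print(recommend("Alpha", top_n=2))  # ['Bravo', 'Delta']
```

"Bravo" ranks first because it shares both genres and the lead actor with "Alpha", while "Delta" shares one genre and the director.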


3.4. Flowchart

Figure 12: Flowchart


Conclusion

3.5. Analysis of work done

This report presents the AI concepts and methods available for recommendation systems, along with research done on related topics. Through this study, the workings of recommendation systems became clear, which made the development of a system with content-based filtering possible.

The proposed system uses movie features such as genre, title, director and actor as the main basis for recommending movies. It is based on the idea that if a user likes a movie, he or she would also enjoy movies with a similar genre, a similar cast or the same director. A content-based filtering algorithm is used to recommend similar movies based on the user's watch history.

The documentation and development of the proposed topic was quite difficult. A good deal of research had to be done to find the appropriate methods and algorithms discussed in the sections above. In the end, however, the coursework was completed successfully with the guidance of our module leader.


3.6. How the solution addresses the real-world problem?

Recommendation systems are mainly created to provide users with relevant items without making them go through a large volume of data. Their main objective is to reduce information overload and provide users with personalized recommendations.

In the same way, the proposed movie recommendation system helps the user by recommending movies similar in genre to those the user has already watched. With the content-based filtering approach, users would no longer have to go through many movies and search for similar content themselves. The system would analyze the user's past behavior (including the watch history and the genres, directors and so on of the movies watched) and recommend movies based on that content.

Many companies use recommendation systems to enhance their business and offer good service to customers. Effective use of a recommendation system is also one of the reasons why some companies are very successful and well known for their customer service. The proposed movie recommendation system would therefore help users make relevant choices based on content, and in a real-world scenario it would also help service providers with their business.


3.7. Further Work

In this report, relevant research was done to identify appropriate approaches for creating a recommendation system. Pseudocode, a flowchart and an appropriate algorithm were included in the documentation.

In the next phase, an appropriate dataset would be obtained from the web. The algorithm would then be implemented in code to develop the recommender system. After the implementation, various tests would be carried out to check the precision of the system. The end result would be a working, well-documented movie recommendation system.


References

Blattmann, J. (2018, August 2). uxplanet. Retrieved from uxplanet: https://uxplanet.org/netflix-binging-on-the-algorithm-a3a74a6c1f59

Brownlee, J. (2017, September 29). machinelearningmastery. Retrieved from machinelearningmastery: https://machinelearningmastery.com/prepare-text-data-machine-learning-scikit-learn/

Brownlee, J. (2020, April 6). machinelearningmastery. Retrieved from machinelearningmastery: https://machinelearningmastery.com/clustering-algorithms-with-python/

Bulao, J. (2021, December 7). techjury. Retrieved from techjury: https://techjury.net/blog/how-much-data-is-created-every-day/#gref

Burns, E. (2021). techtarget. Retrieved from techtarget: https://www.techtarget.com/searchenterpriseai/definition/machine-learning-ML

Dhankar, M., & Walia, N. (2020). An Introduction to Artificial Intelligence (p. 105).

Goyani, M., & Chaurasiya, N. (2020). A Review of Movie Recommendation System: Limitations,.

Gudivada, V. N. (2018). sciencedirect. Retrieved from sciencedirect: https://www.sciencedirect.com/topics/computer-science/vector-space-models

Hale, J. (2019). tubefilter. Retrieved from tubefilter: https://www.tubefilter.com/2019/05/07/number-hours-video-uploaded-to-youtube-per-minute/

Meteren, R. v., & Someren, M. v. (n.d.). Using Content-Based Filtering for Recommendation.

Prabhakaran, S. (2018, October 22). machinelearningplus. Retrieved from machinelearningplus: https://www.machinelearningplus.com/nlp/cosine-similarity/

Recommender System with Machine Learning and Artificial Intelligence. (2020). 111 River Street, Hoboken: John Wiley & Sons.

Schafer, J. B., Frankowski, D., Herlocker, J., & Sen, S. (2007, January). Collaborative Filtering Recommender Systems. p. 292.

Springboard India. (2019, November 5). Springboard India. Retrieved from Springboard India: https://medium.com/@springboard_ind/how-netflixs-recommendation-engine-works-bd1ee381bf81

Uluyagmur, M., Cataltepe, Z., & Tayfur, E. (2012). Content-Based Movie Recommendation Using.
