01-235181-047
BSIT-7A
Group Member:
Asjad Ali
01-235181-007
Project Phase_1
Expected Submission
Using this data set, we have to build a recommendation engine. The engine will return
at most 10 movie names when a user searches for a particular movie.
Evaluation
The recommendation engine must return at least 5 movie names, and at most 10, when a
user searches for a particular movie. It should not return 1 to 4 names or 6 to 9 names:
the result must contain either exactly 5 or exactly 10 movie names.
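The evaluation rule can be checked in code. The following is a minimal sketch, assuming the rule means that a result list must hold exactly 5 or exactly 10 titles. The function name `clamp_recommendations` and the input `ranked_titles` (a list already sorted from most to least similar) are hypothetical and not part of the original project.

```python
def clamp_recommendations(ranked_titles):
    """Trim a ranked list of titles to a length the evaluation rule allows.

    Assumption: only lists of exactly 5 or exactly 10 titles are valid.
    """
    if len(ranked_titles) >= 10:
        return ranked_titles[:10]  # enough candidates for the maximum
    if len(ranked_titles) >= 5:
        return ranked_titles[:5]   # fall back to the minimum allowed size
    # Fewer than 5 candidates cannot satisfy the rule at all.
    raise ValueError("need at least 5 candidate titles")


eight = [f"Movie {i}" for i in range(8)]
twelve = [f"Movie {i}" for i in range(12)]
print(len(clamp_recommendations(eight)), len(clamp_recommendations(twelve)))
```

Raising an error for fewer than 5 candidates is one possible design choice; padding the list with popular titles would be another.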
Import Dataset:
Identifying types of titles:
We dropped the duration and description columns, among others, because they do not
affect our analysis of the data set.
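Dropping the columns might look like the sketch below. It assumes the data set is loaded into a pandas DataFrame; the small frame built here is a made-up stand-in, and its column names are assumptions rather than the project's actual schema.

```python
import pandas as pd

# Tiny stand-in for the titles table (hypothetical rows and column names).
df = pd.DataFrame({
    "title": ["Example Movie", "Example Show"],
    "type": ["Movie", "TV Show"],
    "duration": ["90 min", "2 Seasons"],
    "description": ["A first description.", "A second description."],
})

columns_to_drop = ["duration", "description"]

# drop() returns a new DataFrame; errors="ignore" skips names that are absent.
reduced = df.drop(columns=columns_to_drop, errors="ignore")
print(list(reduced.columns))
```

Because `drop` returns a copy, the original `df` still holds the description column, which the second method in Phase 2 uses as its feature source.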
Encoding:
2. Standardization:
Data Visualization:
Histogram:
Boxplot:
Line Chart:
Bar Plot:
Scatter Plot:
Project Phase_2
Models and algorithms such as TF-IDF scoring and word2vec are used to capture similarity
in content-based recommender systems (RS).
The goal of this project is to develop a content-based recommendation engine for movies and TV
shows on Netflix. I will compare two different methods:
For Movies:
For TV:
Search function for Movies and TV:
Recommended result for movies:
Second Method
Using the words in the movie/TV show descriptions as features
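One way to turn description words into features is TF-IDF weighting followed by cosine similarity. The sketch below uses only the Python standard library so that each step is visible; a library vectorizer such as scikit-learn's `TfidfVectorizer` would be the more usual choice in practice. The titles, descriptions, stop-word list, and function names are all hypothetical examples, not the project's data or code.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def tfidf_vectors(descriptions):
    """Return one sparse {term: weight} dict per description."""
    docs = [Counter(tokenize(d)) for d in descriptions]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())  # count documents containing each term
    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vec = {}
        for term, count in counts.items():
            # Smoothed inverse document frequency.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vec[term] = (count / total) * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(title, titles, descriptions, top_n=5):
    """Rank every other title by description similarity to the given title."""
    vectors = tfidf_vectors(descriptions)
    query = vectors[titles.index(title)]
    scored = [(cosine(query, vec), other)
              for other, vec in zip(titles, vectors) if other != title]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scored[:top_n]]


titles = ["Space Heist", "Galaxy Robbers", "Baking Days"]
descriptions = [
    "A crew plans a heist on a space station.",
    "Robbers plan a heist across the galaxy in space.",
    "A baker opens a small pastry shop.",
]
print(recommend("Space Heist", titles, descriptions, top_n=2))
```

Words that appear in many descriptions get a lower weight, so titles are matched on their distinctive vocabulary rather than on common words.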