
INTERNATIONAL BURCH UNIVERSITY

FACULTY OF ENGINEERING AND NATURAL SCIENCES


DEPARTMENT OF INFORMATION TECHNOLOGIES

MOVIE RECOMMENDATION SYSTEM USING MICROSOFT AZURE


MACHINE LEARNING STUDIO

CLOUD COMPUTING PROJECT REPORT


TARIK SULIĆ

Supervisor
Assist. Prof. Dr. Nejra Beganović

SARAJEVO
January, 2018
Contents

Abstract
Introduction
Literature Review
Methodology
    Data set information
    Building the Model
Results
    Predicting Ratings
    Recommendations from Catalog
Web Service
References

Abstract

This paper shows the use of the Matchbox recommender modules to train a movie recommender system on the Azure Machine Learning platform. A pure collaborative filtering approach is used for training the model. The model learns from a collection of users who have given ratings to some of the movies in the dataset. Matrix factorization is used to deduce latent user preferences and movie traits from these ratings. These preferences and traits can later be used to predict what rating a specific user will give to unseen movies, so that movies the user is most likely to enjoy can be recommended. After training the model, a web service is deployed to provide a simpler user interface.

Introduction

Nowadays, almost everyone has had an online experience where a website made custom-made recommendations in the hope of sustaining traffic and driving future sales. Amazon gives suggestions like “Customers Who Bought This Item Also Bought”, and Udemy gives the similar suggestion “Students Who Viewed This Course Also Viewed”. In 2009, Netflix awarded $1 million to a development team for an algorithm which increased the accuracy of the company’s recommendation system by 10 percent.

In the last few years recommender systems have become increasingly popular and are used in a variety of areas such as movies, music, news, books, research articles, search queries, social tags, and products in general. There are also recommender systems for experts, collaborators, jokes, restaurants, garments, financial services, life insurance, romantic partners (online dating), and Twitter pages.

Recommender systems are a useful alternative to search algorithms since they help users

discover items they might not have found otherwise.

Building recommender systems nowadays requires specialized knowledge in analytics, machine learning and software engineering, and acquiring these skills and tools is difficult and time-consuming.

There are three different groups of recommendation systems:

• Collaborative filtering systems – Collaborative systems generate recommendations based on input from users. They recommend items based on user behavior and similarities between users. An example of this is Google PageRank, which recommends similar web pages based on a web page’s backlinks.

• Content-based filtering systems – Content-based systems generate suggestions based on

items and similarities between them. Pandora, a popular music streaming service, uses

content-based filtering to make its music recommendations.

• Hybrid recommendation systems – Hybrid recommendation systems mix both

collaborative and content-based algorithms. They help improve recommendations that

are derived from sparse datasets. Netflix is one example of a hybrid recommender

(“Building a recommendation system in Python - as easy as 1-2-3!,” 2017).

Collaborative systems often use a nearest-neighbor technique. The end objective of collaborative

filtering systems is to make recommendations based on users’ behavior, purchasing patterns,

and favorite items, along with item characteristics, price ranges, and product categories. This

paper analyzes the implementation of a collaborative filtering recommendation system.

Literature Review

Neighborhood-based collaborative filtering algorithms, also referred to as memory-based

algorithms, were among the earliest algorithms developed for collaborative filtering. These

algorithms are based on the fact that similar users display similar patterns of rating behavior

and similar items receive similar ratings (Aggarwal, 2016).

In the last few years recommender systems have become increasingly popular and have been used in a variety of areas such as movies, music, news, books, research articles, search queries, social tags, and products in general. A great deal of research has been done on this topic.

One of the most popular examples in this field is the work of Ruslan Salakhutdinov and Andriy Mnih on a recommender system developed in the context of Netflix’s 2009 competition for improving its recommendation system. They used a Probabilistic Matrix Factorization (PMF) model which scales linearly with the number of observations and, more importantly, performs well on the large, sparse, and very imbalanced Netflix dataset. They achieved an error rate of 0.8861, which is nearly 7% better than the score of Netflix’s old system (Salakhutdinov & Mnih, 2008).

Michael J. Pazzani and Daniel Billsus in their paper (Pazzani & Billsus, n.d.) discuss content-

based recommendation systems, i.e., systems that recommend an item to a user based upon a

description of the item and a profile of the user’s interests. Content-based recommendation

systems may be used in a variety of domains, such as recommending web pages, news articles, restaurants, television programs, and items for sale. Although the details of various systems differ, content-based recommendation systems share in common a means for describing the items that may be recommended.

Ivens Portugal, Paulo Alencar and Donald Cowan in their paper (Portugal, Alencar, & Cowan, 2018) discuss why choosing a suitable machine learning algorithm for a recommender system is difficult given the number of algorithms available today. They analyse different learning algorithms used in recommender systems and give useful guidance on which ones to use, along with their advantages and disadvantages.

In another paper, Shreya Agrawal and Pooja Jain analyze how to improve the quality of a movie recommendation system using a hybrid approach that combines content-based filtering and collaborative filtering, with a Support Vector Machine as a classifier and a genetic algorithm, as described in their methodology.

Methodology

This paper shows the use of the Matchbox recommender modules to train a movie recommender system on the Azure Machine Learning platform. To get started, a free trial account (with $200 of credit) needs to be created on the Azure Portal. Once the account is created, implementation of the recommendation system can begin.

Data set information

The training data consists of approximately 225,000 ratings for 15,742 movies, given by 26,770 users. It was gathered from Twitter using techniques described in the original paper by Dooms, De Pessemier and Martens (Dooms & Martens, 2014). The data can be found on the following website: https://github.com/sidooms/MovieTweetings.

Each instance of data consists of a user identifier, a movie identifier, and the rating. The dataset also contains a timestamp, but it was not used in this analysis. A short overview of the dataset can be seen in Figure 1. To this data, a file containing movie names extracted from IMDb was added. Because a movie ID alone does not give any insight into which movie it refers to, the two files were joined on the movie identifier from the ratings data. (A pandas sketch of loading the two files is shown after Figure 1.)

Figure 1. Statistical overview of the dataset
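As an illustration, the following is a minimal sketch of how the two files could be loaded with pandas, assuming the “::”-delimited layout used in the MovieTweetings repository (user_id::movie_id::rating::timestamp for ratings and movie_id::title::genres for movies); the file names are placeholders.

```python
import pandas as pd

# Load the ratings and movie files (layout assumed from the repository).
ratings = pd.read_csv(
    "ratings.dat", sep="::", engine="python",
    names=["user_id", "movie_id", "rating", "timestamp"],
)
movies = pd.read_csv(
    "movies.dat", sep="::", engine="python",
    names=["movie_id", "title", "genres"],
)

# The timestamp is not used in this analysis.
ratings = ratings.drop(columns=["timestamp"])

# Quick statistical overview of the ratings, cf. Figure 1.
print(ratings.describe())
```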

Building the Model

First, because the Train Recommender module will be used, the data needs to be prepared for it. The module requires triplets in the format <user, item, rating>. The ratings and movie datasets have already been uploaded and are available in Azure ML Studio, so they only need to be connected to the experiment.

1. The rating field looks like an integer, but is actually of numeric type. Since the trainer requires an integer rating, the Metadata Editor module is used to convert it to an integer.

2. The Train Recommender module is more tolerant with respect to the user and item identifiers. To make the results easier to work with, the ratings and movie title datasets are merged using the Join module. A key column common to both the left and right datasets needs to be chosen; in this case it is the “Movie Id” column.

3. The Train Recommender module requires that the input contain only the three fields used for training, so the Project Columns module is used to select the user ID, movie name, and rating fields.

4. This dataset contains a few inconsistent ratings for the same user-movie pairs. These introduce noise into training and evaluation, so the duplicates are removed, keeping only the first occurrence of each user-movie pair that is encountered.

The overview of the data preprocessing can be seen in Figure 2; a pandas sketch of the same steps is given after the figure.

Figure 2. Overview of Data Preprocessing
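The same preprocessing steps could be sketched in pandas as follows (a rough equivalent only; in the experiment itself they are performed by the Metadata Editor, Join, Project Columns and duplicate-removal modules). The ratings and movies frames are assumed to come from the loading sketch above.

```python
# 1. The rating arrives as a numeric type; force it to an integer.
ratings["rating"] = ratings["rating"].astype(int)

# 2. Join the ratings with the movie titles on the movie identifier.
merged = ratings.merge(movies, on="movie_id", how="inner")

# 3. Keep only the three fields required by the trainer: <user, item, rating>.
triplets = merged[["user_id", "title", "rating"]]

# 4. Remove inconsistent duplicate ratings, keeping only the first
#    occurrence of each user-movie pair.
triplets = triplets.drop_duplicates(subset=["user_id", "title"], keep="first")
```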

As with any statistical model, the parameters need to be fit on one set of data and the accuracy tested on a hold-out set. In a collaborative filtering approach, the model needs to see some information about each user and each item, so simply taking a random sample of all the observations will not work. Fortunately, Azure ML Studio provides a special Recommender split option in the Split module that gives the user control over how the training and test samples are selected.

For this experiment, the following settings were used (a simplified sketch of this kind of per-user split is given after the list):

• Fraction of training-only users: 0.75. This means that 75% of the users will be used to

train.

• Fraction of test-user ratings for testing: 0.25. For each user in the testing group, 25% of

that user's ratings will be used for testing the model.

• Fraction of cold users: 0. Cold users are users for whom no prior training data is known. Usually, the Matchbox algorithm can use optional user metadata to make recommendations for users even before a single rating has been seen. However, for this experiment no user metadata is given, so the fraction of cold users is set to 0.

• Fraction of cold items: 0. Cold items are treated the same way as cold users, so the model is evaluated only on movies for which ratings are known.

• Fraction of ignored users: 0. In some cases the user might want to test an algorithm or

settings on a subset of the data. Here the full dataset is used.

• Fraction of ignored items: 0. Same as for users.
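The sketch below illustrates this kind of per-user split, assuming the triplets frame from the preprocessing sketch above; the Recommender split option in Azure ML Studio handles this logic (together with the cold-user, cold-item and ignore options) internally.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
users = triplets["user_id"].unique()
rng.shuffle(users)

# 75% of the users are used only for training.
n_train_only = int(0.75 * len(users))
train_only_users = set(users[:n_train_only])

train_parts, test_parts = [], []
for user_id, group in triplets.groupby("user_id"):
    if user_id in train_only_users:
        train_parts.append(group)                 # all ratings go to training
    else:
        group = group.sample(frac=1.0, random_state=42)
        n_test = max(1, int(0.25 * len(group)))   # 25% of this user's ratings held out
        test_parts.append(group.iloc[:n_test])
        train_parts.append(group.iloc[n_test:])

train_set = pd.concat(train_parts, ignore_index=True)
test_set = pd.concat(test_parts, ignore_index=True)
```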

Now, everything is ready to train the model. The Train Recommender module requires two

parameters:

• Number of features: This determines the number of hidden parameters that will be extracted for each user and each item. More features yield a more powerful model, but carry the risk of overfitting the training data. The parameter is typically determined through experimentation, with the goal of finding the smallest number that achieves acceptable performance. For this experiment, the default value of 20 features is used.

• Number of iterations: Model parameters are found by initializing them arbitrarily and then minimizing the residual error (the difference between the true and predicted ratings for each user-movie pair) using an iterative gradient descent technique. The error typically decreases exponentially, meaning that most of the benefit occurs in the initial iterations. Therefore, it is common practice not to run the optimization all the way to convergence, but instead to limit the number of iterations to a reasonable value in order to keep training time down. For this experiment the default value of 30 iterations is used. (A toy illustration of this kind of iterative factorization is given after this list.)
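The Matchbox algorithm itself is a Bayesian recommender and is not reproduced here; the sketch below only illustrates the generic idea described above (random initialization of latent features, then iteratively reducing the residual between true and predicted ratings by gradient descent). All names and parameter values are illustrative.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, n_features=20, n_iters=30,
             lr=0.005, reg=0.02):
    """Toy matrix factorization: ratings is a list of (user_idx, item_idx, rating)."""
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(n_users, n_features))   # latent user features
    Q = rng.normal(scale=0.1, size=(n_items, n_features))   # latent item features
    for _ in range(n_iters):
        for u, i, r in ratings:
            pu, qi = P[u].copy(), Q[i].copy()
            err = r - pu @ qi                    # residual for this user-movie pair
            P[u] += lr * (err * qi - reg * pu)   # gradient step on the user features
            Q[i] += lr * (err * pu - reg * qi)   # gradient step on the item features
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating for user index u and item index i."""
    return float(P[u] @ Q[i])

# Usage (index maps from user IDs / movie titles to row indices are assumed
# to have been built beforehand):
# P, Q = train_mf(indexed_train_triplets, n_users, n_items)
```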

Results

In this experiment, two different ways of using the trained recommender model are shown: predicting ratings and making n recommendations from the full catalog for each user. The first method is used simply to evaluate the performance of the learned model, while the second method represents a typical production use case.

Figure 3. Overview of Training the Matchbox Recommender System

To perform different types of predictions, the Score Recommender module is used. The module

has two required and two optional inputs.

• The first required input is a trained model. In this case the output of the trainer has been connected directly, but in production the trained model would be saved and then connected to the scorer.

• The second input is a dataset to be scored. The format of this dataset will depend on the

task, which will be described below.

• The two optional ports are for user and item metadata, similar to the optional inputs

when training. Here no metadata was given, so these fields were left blank.

Predicting Ratings

Prediction is a straightforward task. An input dataset for which the scores are needed is

provided, using the three-item tuple format used for training. The Score Recommender module

will use the trained model to predict a rating for each user-movie pair, and will output a tuple

consisting of <user, item, predicted rating>.
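Continuing the toy factorization sketch from the training section, scoring could look like the following, where P and Q are the factors returned by train_mf and user_to_idx / movie_to_idx are assumed index maps from user IDs and movie titles to the row indices used during training.

```python
# Predict a rating for each user-movie pair in the held-out test set,
# producing <user, item, predicted rating> tuples like the Score
# Recommender output.
scored = test_set.copy()
scored["predicted_rating"] = [
    predict(P, Q, user_to_idx[u], movie_to_idx[m])
    for u, m in zip(scored["user_id"], scored["title"])
]
scored = scored[["user_id", "title", "predicted_rating"]]
```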

For evaluating the accuracy of predictions, the Evaluate Recommender module is used. The first input is the testing dataset, containing tuples (user-movie-rating) similar to those provided for training. Typically, this data is obtained from the test output port of the Split module that was used when setting up the experiment. The Evaluate

Recommender module requires two parameters:

• Minimum number of items

• Minimum number of users

By using these parameters, the user can limit the evaluation to users who have rated at least n items, and to items that have been rated by at least m users, respectively.

In this experiment, the second input contains the same set of tuples that were used earlier for training the model; thus, evaluation will compare the predicted ratings with the actual ratings, using the following two metrics (defined formally after the list):

• Mean Absolute Error (MAE): MAE measures the average magnitude of the errors in a

set of predictions, without considering their direction. It is the average over the test

sample of the absolute differences between prediction and actual observation where all

individual differences have equal weight.

• Root mean squared error (RMSE): RMSE is a quadratic scoring rule that also measures

the average magnitude of the error. It is the square root of the average of squared

differences between prediction and actual observation. This measures how well the

model approximates the true expected value of the ratings and penalizes large errors

more heavily (JJ, 2016).
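For reference, with r_j the actual rating and r̂_j the predicted rating of the j-th scored pair, the two metrics are defined as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{j=1}^{n}\left|\hat{r}_j - r_j\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(\hat{r}_j - r_j\right)^2}$$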

Figure 4. MAE and RMSE Values

The real value of these metrics is for comparing different parameter settings for the trainer.

For this run, values of MAE = 1.77 and RMSE = 2.46 were obtained, which is reasonable considering the 1–10 rating scale.

Recommendations from Catalog

A typical usage of a recommendation system is to request the top n items most likely to be of interest to a user from the catalog of all items. For this mode, the input to the scorer should contain only one column, containing the user IDs for which to generate recommendations.

To demonstrate this approach, a list of 100 user IDs was generated by taking the test data, extracting the list of unique user IDs, and then using the Head option in the Partition and Sample module to select the first 100.
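In terms of the toy factor model sketched earlier, generating such recommendations amounts to scoring every movie in the catalog for a given user and keeping the top n. The snippet below assumes the P and Q factors, the user_to_idx map, and an idx_to_movie map from row indices back to movie titles.

```python
import numpy as np

def recommend(P, Q, user_idx, top_n=3):
    """Return the top_n movie titles with the highest predicted rating for one user."""
    scores = Q @ P[user_idx]                  # predicted rating for every movie
    best = np.argsort(scores)[::-1][:top_n]   # indices of the top_n scores
    return [idx_to_movie[i] for i in best]

# First 100 unique user IDs from the test data, as in the experiment.
sample_user_ids = test_set["user_id"].drop_duplicates().head(100)
recommendations = {u: recommend(P, Q, user_to_idx[u]) for u in sample_user_ids}
```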

The output, which can be seen in Figure 5, shows the three recommendations for each of the 100 user IDs provided. The Shawshank Redemption and The Dark Knight appear to be popular choices, which is not surprising because they are among the highest-rated movies on IMDb.

Figure 5. Movie Recommendations for 100 User IDs

Web Service

A key feature of Azure Machine Learning is the ability to publish models as web services on the Windows Azure platform in a straightforward way. In order to publish the movie recommender, the first step is to save the trained model. This can be done by clicking the output port of Train Recommender and selecting the option Save as Trained Model.

A new experiment is then created which contains only the scoring module, and the trained model is added to it. Sample input data needs to be provided, so in this case the data pipeline that was built for sampling 100 user IDs was used. To specify the web service entry and exit points, the special Web Service modules were used. The Web Service Input module is attached to the node where input data would enter the experiment, and the Web Service Output module is attached to the output of the Matchbox recommender.

After the experiment has been run successfully, it can be published by clicking Publish Web Service at the bottom of the experiment canvas. The overview of the experiment can be seen in Figure 6.

Figure 6. Web Service Experiment Overview
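Once published, the web service can be called from client code. The following is a sketch of such a call, assuming the request-response JSON format used by Azure ML Studio (classic) web services; the URL, API key, input port name and column name are placeholders that would in practice be taken from the service's API help page.

```python
import json
import urllib.request

API_URL = "https://ussouthcentral.services.azureml.net/workspaces/<workspace>/services/<service>/execute?api-version=2.0"
API_KEY = "<your api key>"

# One-column input containing the user ID to generate recommendations for.
body = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["User"],
            "Values": [["<user id>"]],
        }
    },
    "GlobalParameters": {},
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    },
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```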

References

Aggarwal, C. C. (2016). Recommender Systems. Cham: Springer International Publishing.

Building a recommendation system in Python - as easy as 1-2-3! (2017, May 2). Retrieved

January 22, 2018, from http://www.data-mania.com/blog/recommendation-system-python/

Pazzani, M. J., & Billsus, D. (n.d.). Content-Based Recommendation Systems. In Lecture Notes

in Computer Science (pp. 325–341).

Portugal, I., Alencar, P., & Cowan, D. (2018). The use of machine learning algorithms in

recommender systems: A systematic review. Expert Systems with Applications, 97, 205–227.

Salakhutdinov, R., & Mnih, A. (2008). Bayesian probabilistic matrix factorization using

Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine

learning - ICML ’08. https://doi.org/10.1145/1390156.1390267

Dooms, S., & Martens, L. (2014). Harvesting movie ratings from structured data in social media. ACM SIGWEB Newsletter, (Winter), 1–5.

JJ. (2016, March 23). MAE and RMSE — Which Metric is Better? Human in a Machine World, Medium. Retrieved January 31, 2018, from https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d

