
1 Objective

The goal of this assignment is to try out different ways of implementing and configuring a recommender, and
to evaluate different approaches. Start with an existing dataset of user-item ratings, and implement at least
two of these recommendation algorithms:

 Content-Based Filtering
 User-User Collaborative Filtering
 Item-Item Collaborative Filtering

Evaluate and compare different approaches, and provide at least one graph and a summary of findings and
recommendations.

2 Data source and approach

For this project, we use dataset 1_2 from the Jester database of joke ratings, which contains ratings of 100 jokes by
23,500 users. Ratings are given as real values ranging from -10.00 (worst) to +10.00 (best).
We use the recommenderlab package to develop and test our recommender system with different
recommendation algorithms, and we draw on ideas and code suggestions from outside references.

library(tidyverse)
library(recommenderlab)
library(knitr)

3 Data preparation
We start by preparing the data for the recommender system. We note from the data description for the Jester
database that the rating value “99” corresponds to “null” or “not rated”, and that the first column in the
dataset is the number of jokes rated by each user. So, our data preparation includes the following steps:

 Confirm that the number of non-blank ratings per row matches the number in the first column
 Drop the first column
 Convert rating values of “99” to NA
 Set row names to user ID and column names to joke ID
 Create the ratings matrix as type realRatingMatrix
 View the ratings matrix.

In the framework of the user-item utility matrix, the “users” are the raters and the “items” are the jokes. We
see that the ratings matrix includes 1.71 million ratings in total, with 23,500 users and 100 jokes. This implies
that the ratings matrix is 73% populated with non-blank ratings (or conversely, 27% sparse), compared to a
total of 2.35 million possible ratings.

# load data from github or from disk


# github repo
file <- "https://raw.githubusercontent.com/kecbenson/DATA612/master/jester-data-2.csv"
# local directory
#file <- "C:/Users/Kevin/Documents/CUNY - MSDS/DATA 612/jester_dataset_1_2/jester-data-2.csv"
df <- read.csv(file, header = FALSE)

# check that rowcounts match, then discard first column
all.equal(rowSums(df[, 2:ncol(df)] != 99), df[, 1])
## [1] TRUE
df <- df[-1]

# set values of 99 to NA and set row & column names
df[df == 99] <- NA
rownames(df) <- paste0("u", 0:(nrow(df) - 1))
colnames(df) <- paste0("j", 0:(ncol(df) - 1))

glimpse(df)
## Observations: 23,500
## Variables: 100
## $ j0 <dbl> NA, -4.37, NA, 0.34, NA, NA, 3.50, -7.67, 1.02, 3.64, 9.27...
## $ j1 <dbl> 8.11, -3.88, NA, -6.55, NA, NA, 2.28, 7.14, 6.07, 1.60, 6....
## $ j2 <dbl> NA, 0.73, NA, 2.86, NA, NA, 3.01, -6.31, 6.65, -0.39, -9.8...
## $ j3 <dbl> NA, -3.20, NA, NA, NA, NA, -0.63, NA, -0.87, 3.74, -8.83, ...
## $ j4 <dbl> -2.28, -6.41, 0.73, -3.64, 9.13, -1.46, 4.95, -9.32, 6.80,...
## $ j5 <dbl> -4.22, 1.17, NA, 1.12, NA, NA, -0.49, 6.07, 0.68, -9.61, 0...
## $ j6 <dbl> 5.49, 7.82, 5.53, 5.34, -9.32, 2.72, -0.49, 3.59, 5.34, 8....
## $ j7 <dbl> -2.62, -4.76, 3.25, 2.33, -2.04, -3.83, -0.68, -2.72, 4.56...
## $ j8 <dbl> NA, -6.41, NA, NA, NA, NA, 3.30, NA, -0.39, -7.09, -6.17, ...
## $ j9 <dbl> -2.28, 0.73, NA, 2.33, NA, NA, -1.41, -8.01, 8.25, 3.88, -...
## $ j10 <dbl> 5.34, 1.99, NA, 3.54, NA, NA, 4.71, -9.13, 8.93, -6.36, 9....
## $ j11 <dbl> 3.54, 0.63, NA, 0.92, 2.04, NA, 3.30, -8.06, 6.21, -9.76, ...
## $ j12 <dbl> -9.71, -1.41, -2.04, -5.87, -2.04, -3.83, 1.26, 2.67, 0.10...
## $ j13 <dbl> 3.30, -4.32, 2.91, 2.77, NA, 0.97, -4.03, 7.43, 9.08, -7.9...
## $ j14 <dbl> -5.58, 0.63, 2.62, 4.76, -2.23, -3.83, -5.49, -7.23, -0.73...
## $ j15 <dbl> -6.07, -9.22, -1.70, -3.54, -8.50, -1.26, -5.44, -7.23, 2....
## $ j16 <dbl> -0.29, 0.87, 2.28, -7.38, 3.16, 0.00, 2.86, -0.53, 6.46, -...
## $ j17 <dbl> 2.33, 0.68, 3.11, -4.37, 2.18, -4.90, 1.17, 1.60, -5.34, -...
## $ j18 <dbl> 3.93, 0.29, 2.91, -1.75, 3.93, -2.72, 1.80, -7.91, 0.97, 2...
## $ j19 <dbl> -0.34, -4.51, 6.75, -4.81, 2.96, -0.39, 1.07, -8.11, 3.83,...
## $ j20 <dbl> -6.89, -6.65, 2.72, -1.75, 6.41, 1.50, 2.62, -7.23, 8.59, ...
## $ j21 <dbl> -2.09, -4.03, 2.82, 2.72, 2.62, NA, 0.73, -9.13, 4.71, 4.8...
## $ j22 <dbl> -8.83, 1.94, NA, 2.96, NA, NA, -0.39, -7.77, 8.45, -9.13, ...
## $ j23 <dbl> NA, -3.11, NA, NA, NA, NA, -1.94, NA, -0.44, 6.12, -8.98, ...
## $ j24 <dbl> 5.15, -4.32, NA, 1.02, NA, NA, 1.99, -8.88, 8.59, 1.60, -6...
## $ j25 <dbl> 3.98, -0.63, 4.22, 2.96, 5.10, 2.62, 1.21, 7.38, 7.52, -9....
## $ j26 <dbl> 4.47, 5.83, 2.77, 3.25, 0.05, 3.74, 2.09, 0.00, 6.26, -9.3...
## $ j27 <dbl> 3.88, 3.64, 3.35, 0.10, 6.12, NA, 3.74, -8.59, 9.08, -3.83...
## $ j28 <dbl> 5.29, 0.39, 4.22, 2.18, 7.18, 3.06, 0.63, 1.65, 7.28, -0.4...
## $ j29 <dbl> NA, -0.44, NA, 0.97, NA, NA, 3.30, -8.16, 6.17, -6.17, -6....
## $ j30 <dbl> 7.09, 0.73, 3.20, -6.89, 8.35, 2.04, 2.04, 8.40, 6.99, -8....
## $ j31 <dbl> -2.23, 1.07, 4.51, -3.93, 7.43, 4.03, -2.48, -6.55, 8.30, ...
## $ j32 <dbl> NA, 3.06, NA, NA, NA, NA, 3.35, -7.86, -7.91, -2.67, -0.49...
## $ j33 <dbl> NA, -4.51, NA, -2.43, NA, NA, 2.18, -8.01, 6.80, -4.90, 4....
## $ j34 <dbl> 3.11, -0.39, 5.24, 4.17, 7.57, 4.27, 4.32, 0.44, 2.82, 3.2...
## $ j35 <dbl> -4.42, 3.11, 2.33, -4.47, 5.49, 5.49, 1.89, -1.50, 3.50, 4...
## $ j36 <dbl> NA, 1.02, NA, NA, NA, NA, -1.99, NA, 0.15, 3.30, -9.08, 4....
## $ j37 <dbl> -5.87, -2.09, 0.00, 0.63, 8.98, 3.06, 4.17, -8.88, 1.99, 0...
## $ j38 <dbl> NA, -6.21, 1.84, 1.50, NA, NA, 2.43, 8.20, 9.17, -5.87, -0...
## $ j39 <dbl> -7.28, -4.71, 4.42, 3.45, NA, NA, 2.86, -9.13, 8.20, -4.17...
## $ j40 <dbl> NA, 1.07, NA, 2.91, NA, NA, 4.47, -8.45, 8.40, -3.54, -9.6...
## $ j41 <dbl> -4.71, 2.77, 2.82, 1.65, 5.63, 2.72, -1.02, 7.38, 6.80, -9...
## $ j42 <dbl> -8.93, 2.04, NA, NA, NA, NA, 2.38, -7.48, 8.30, -2.62, -0....
## $ j43 <dbl> NA, -5.97, NA, 3.06, NA, NA, -0.63, NA, -1.26, 1.75, -8.83...
## $ j44 <dbl> 3.40, -0.58, NA, 0.58, 4.66, NA, -4.76, 7.86, 8.01, -2.57,...
## $ j45 <dbl> -7.18, -3.74, 4.17, 3.93, NA, 0.97, -2.09, 7.67, 7.48, -1....
## $ j46 <dbl> 2.28, 3.30, NA, 3.69, NA, NA, 3.16, 7.48, 8.11, -8.88, -8....
## $ j47 <dbl> 7.48, -1.46, 3.69, 5.15, NA, 0.97, 3.79, -8.69, 7.91, 1.31...
## $ j48 <dbl> 5.15, 6.02, 3.79, 3.30, 4.17, 4.08, 4.27, -7.57, 9.03, -3....
## $ j49 <dbl> 5.73, 0.44, 2.48, 1.55, 8.11, 1.99, 1.99, 1.60, 8.88, -3.5...
## $ j50 <dbl> NA, -4.22, NA, -4.90, 7.33, NA, 1.70, 8.20, 7.28, -9.66, -...
## $ j51 <dbl> NA, -0.44, NA, -6.99, NA, NA, 4.08, 7.82, 8.11, -9.66, -7....
## $ j52 <dbl> 5.78, -1.94, 3.64, 3.06, 6.02, 4.90, 0.87, 0.68, 9.08, -4....
## $ j53 <dbl> 3.20, 8.45, 3.01, 3.64, 4.71, 0.97, -2.62, 7.38, 8.88, -2....
## $ j54 <dbl> -0.44, -3.01, NA, 1.21, -0.58, NA, 2.04, -8.69, 6.21, -3.3...
## $ j55 <dbl> 5.73, -4.51, 2.52, 0.68, 7.09, 1.50, 3.30, -7.23, 8.83, -8...
## $ j56 <dbl> NA, 3.54, NA, NA, -2.23, 0.68, 3.06, NA, 3.54, -8.01, -9.8...
## $ j57 <dbl> NA, -0.44, 4.71, NA, NA, NA, -3.74, NA, 0.19, -5.78, -4.13...
## $ j58 <dbl> NA, 4.56, NA, 1.02, NA, NA, 0.34, -8.16, 6.65, -9.76, -4.1...
## $ j59 <dbl> NA, 2.04, NA, 2.28, NA, NA, -0.92, -7.67, 6.07, -0.34, -0....
## $ j60 <dbl> -8.25, 5.78, -0.34, 1.41, 5.78, 2.09, 1.75, 8.35, 8.93, -3...
## $ j61 <dbl> 4.61, 5.63, 3.35, 1.21, 1.94, 3.06, 0.78, -7.23, 3.40, -1....
## $ j62 <dbl> NA, 6.02, NA, 0.83, NA, NA, 3.69, -7.72, 8.11, -9.71, -6.8...
## $ j63 <dbl> NA, -4.37, NA, -0.10, NA, NA, -0.68, -7.67, 8.54, 6.75, -0...
## $ j64 <dbl> -3.98, -8.98, 3.88, 4.71, 5.24, 3.20, 3.35, -1.80, 7.82, 1...
## $ j65 <dbl> 4.90, 0.87, -0.15, 1.12, 6.36, 1.89, 1.65, -8.01, 7.67, -5...
## $ j66 <dbl> NA, -4.66, NA, 3.01, NA, NA, 0.92, -6.07, 5.29, 0.24, -0.3...
## $ j67 <dbl> -4.32, 2.82, 3.06, 2.67, NA, 1.50, 2.52, -7.23, -0.29, -0....
## $ j68 <dbl> -2.72, 2.86, 2.96, 3.06, NA, 3.69, 1.70, -8.01, 0.78, -6.0...
## $ j69 <dbl> NA, -4.03, NA, 1.31, NA, NA, 5.00, -8.01, 8.06, -9.71, -3....
## $ j70 <dbl> NA, 3.06, NA, -3.64, NA, NA, 7.18, NA, -0.05, 0.44, NA, NA...
## $ j71 <dbl> NA, 2.91, NA, NA, NA, NA, 6.75, NA, 8.50, -2.14, NA, NA, N...
## $ j72 <dbl> NA, 5.44, 4.56, 2.43, NA, NA, 4.71, NA, 8.88, -9.56, NA, N...
## $ j73 <dbl> NA, -4.76, NA, NA, NA, NA, 3.20, NA, 4.27, -9.71, NA, NA, ...
## $ j74 <dbl> NA, -8.93, NA, NA, NA, NA, 2.23, NA, 9.08, -9.27, NA, NA, ...
## $ j75 <dbl> NA, -0.49, NA, NA, NA, NA, 4.47, NA, 8.54, -9.08, NA, NA, ...
## $ j76 <dbl> NA, 8.16, NA, NA, -0.05, NA, 3.50, -4.32, 6.17, -8.01, NA,...
## $ j77 <dbl> NA, -3.16, NA, NA, NA, NA, 5.97, NA, 8.74, -9.61, NA, NA, ...
## $ j78 <dbl> NA, 0.83, NA, NA, NA, NA, 0.44, NA, 3.20, -9.61, NA, NA, N...
## $ j79 <dbl> NA, 1.94, NA, -4.66, NA, NA, 4.51, NA, 8.50, -9.56, NA, NA...
## $ j80 <dbl> NA, -0.49, NA, NA, NA, 1.80, 7.14, NA, 0.92, 4.61, 9.13, 8...
## $ j81 <dbl> NA, 2.62, NA, NA, NA, NA, 5.00, -6.21, 9.13, -0.29, -9.71,...
## $ j82 <dbl> NA, 1.99, NA, NA, NA, NA, 8.98, NA, 8.54, -9.56, NA, NA, N...
## $ j83 <dbl> NA, -3.11, NA, NA, NA, NA, 5.39, NA, 6.36, -3.45, NA, NA, ...
## $ j84 <dbl> NA, -0.44, NA, NA, NA, NA, 4.27, NA, 9.03, -9.71, NA, NA, ...
## $ j85 <dbl> NA, -6.50, NA, NA, NA, NA, 4.47, NA, 9.03, 1.12, NA, NA, N...
## $ j86 <dbl> NA, 2.04, NA, NA, NA, NA, 3.20, NA, 3.54, -6.50, 9.03, NA,...
## $ j87 <dbl> NA, -3.16, NA, NA, NA, NA, 4.56, NA, 4.66, -6.94, -5.83, N...
## $ j88 <dbl> 1.21, 2.09, NA, NA, NA, NA, 4.22, 1.65, 8.93, -5.10, NA, N...
## $ j89 <dbl> NA, 5.34, 2.38, NA, NA, NA, 3.83, NA, 6.46, -6.46, NA, NA,...
## $ j90 <dbl> NA, 5.73, NA, NA, NA, NA, 1.89, NA, 6.75, 8.59, NA, NA, NA...
## $ j91 <dbl> NA, -6.70, NA, NA, NA, NA, 3.54, NA, 8.74, -3.69, NA, NA, ...
## $ j92 <dbl> NA, 1.99, NA, NA, NA, NA, 2.38, NA, 9.17, -1.84, NA, NA, N...
## $ j93 <dbl> NA, 2.62, NA, NA, NA, 1.89, 2.77, NA, 7.52, 3.11, NA, 7.04...
## $ j94 <dbl> NA, -0.49, 3.16, NA, NA, NA, -0.24, NA, 5.73, 0.87, NA, NA...
## $ j95 <dbl> -5.92, 3.45, NA, NA, NA, NA, 2.28, NA, 5.92, 1.02, NA, NA,...
## $ j96 <dbl> NA, 3.20, NA, NA, NA, NA, 5.05, 1.60, -6.60, 4.17, NA, NA,...
## $ j97 <dbl> NA, -0.53, NA, NA, NA, NA, 4.51, NA, 0.24, -6.55, NA, NA, ...
## $ j98 <dbl> NA, -0.53, NA, NA, NA, NA, 4.08, NA, 9.08, -8.93, NA, NA, ...
## $ j99 <dbl> NA, -2.96, NA, NA, NA, NA, 2.96, NA, 8.98, -9.42, NA, NA, ...
# create raw ratings matrix
r <- as.matrix(df) %>% as("realRatingMatrix")
r
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
getRatingMatrix(r)[1:10, 1:10]
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ... ]]
##
## u0 . 8.11 . . -2.28 -4.22 5.49 -2.62 . -2.28
## u1 -4.37 -3.88 0.73 -3.20 -6.41 1.17 7.82 -4.76 -6.41 0.73
## u2 . . . . 0.73 . 5.53 3.25 . .
## u3 0.34 -6.55 2.86 . -3.64 1.12 5.34 2.33 . 2.33
## u4 . . . . 9.13 . -9.32 -2.04 . .
## u5 . . . . -1.46 . 2.72 -3.83 . .
## u6 3.50 2.28 3.01 -0.63 4.95 -0.49 -0.49 -0.68 3.30 -1.41
## u7 -7.67 7.14 -6.31 . -9.32 6.07 3.59 -2.72 . -8.01
## u8 1.02 6.07 6.65 -0.87 6.80 0.68 5.34 4.56 -0.39 8.25
## u9 3.64 1.60 -0.39 3.74 -6.36 -9.61 8.88 -4.08 -7.09 3.88
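
As a quick check, the density and sparsity figures quoted above can be computed directly from the ratings matrix
using recommenderlab accessors:

# verify the density / sparsity of the ratings matrix
n_ratings <- length(getRatings(r))   # number of non-missing ratings (1,708,993)
n_cells <- nrow(r) * ncol(r)         # 23,500 users x 100 jokes = 2,350,000 cells
round(n_ratings / n_cells, 2)        # ~0.73 dense, i.e. ~0.27 sparse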

For comparison, we also create the normalized forms of the rating matrix, using centering and Z-score
methods for normalization.

# create normalized ratings matrix


r_n1 <- normalize(r)
r_n1
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
## Normalized using center on rows.
getRatingMatrix(r_n1)[1:10, 1:10] %>% round(2)
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ...]]
##
## u0 . 8.17 . . -2.22 -4.16 5.55 -2.56 . -2.22
## u1 -4.09 -3.60 1.01 -2.92 -6.13 1.45 8.10 -4.48 -6.13 1.01
## u2 . . . . -2.17 . 2.63 0.35 . .
## u3 -0.18 -7.07 2.34 . -4.16 0.60 4.82 1.81 . 1.81
## u4 . . . . 5.65 . -12.80 -5.52 . .
## u5 . . . . -2.79 . 1.39 -5.16 . .
## u6 1.39 0.17 0.90 -2.74 2.84 -2.60 -2.60 -2.79 1.19 -3.52
## u7 -4.87 9.94 -3.51 . -6.52 8.87 6.39 0.08 . -5.21
## u8 -4.57 0.48 1.06 -6.46 1.21 -4.91 -0.25 -1.03 -5.98 2.66
## u9 7.07 5.03 3.04 7.17 -2.93 -6.18 12.31 -0.65 -3.66 7.31
r_n2 <- normalize(r, method = "Z-score")
r_n2
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
## Normalized using z-score on rows.
getRatingMatrix(r_n2)[1:10, 1:10] %>% round(2)
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ... ]]
##
## u0 . 1.57 . . -0.43 -0.80 1.07 -0.49 . -0.43
## u1 -1.02 -0.90 0.25 -0.73 -1.54 0.36 2.03 -1.12 -1.54 0.25
## u2 . . . . -1.23 . 1.49 0.20 . .
## u3 -0.05 -2.09 0.69 . -1.23 0.18 1.42 0.53 . 0.53
## u4 . . . . 1.27 . -2.87 -1.24 . .
## u5 . . . . -1.08 . 0.54 -2.00 . .
## u6 0.51 0.06 0.33 -1.01 1.05 -0.96 -0.96 -1.03 0.44 -1.30
## u7 -0.76 1.54 -0.54 . -1.01 1.38 0.99 0.01 . -0.81
## u8 -1.20 0.13 0.28 -1.70 0.32 -1.29 -0.07 -0.27 -1.57 0.70
## u9 1.41 1.00 0.61 1.43 -0.59 -1.23 2.45 -0.13 -0.73 1.46
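
To make explicit what these normalizations do, the values for a single user can be reproduced by hand: centering
subtracts the user's mean rating, and the Z-score additionally divides by the user's rating standard deviation. The
hand-rolled sketch below (for user u1) should match the normalized matrices shown above, up to rounding; it is not
recommenderlab's internal code.

# reproduce the row normalization by hand for one user
x <- as(r, "matrix")["u1", ]                # raw ratings for u1 (NA if unrated)
centered <- x - mean(x, na.rm = TRUE)       # row-centering
zscored <- centered / sd(x, na.rm = TRUE)   # row-wise Z-score
round(head(centered, 10), 2)
round(head(zscored, 10), 2)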
We use image to visually represent portions of the raw and normalized ratings matrices.

# show part of ratings matrices


image(r[1:25, 1:50], main = "Raw Ratings")

image(r_n1[1:25, 1:50], main = "Normalized Ratings - Centered")


image(r_n2[1:25, 1:50], main = "Normalized Ratings - Z-Score")
4 Exploratory data analysis

4.1 Distribution of overall ratings


Next, we do some preliminary data analysis. Reviewing the distributions of the raw and normalized ratings, we
see that the mean and median raw ratings are about 0.8 and 1.4, respectively. As expected, the normalized ratings are
centered at 0 and are approximately normally distributed, as seen in the histogram of rating Z-scores. It
is interesting that the ratings distribution is shifted toward positive values, with a higher frequency of ratings above 0
than ratings below 0. This remains true even after normalizing the ratings, which suggests a broad user bias
toward positive ratings. Perhaps some people are too polite to give bad jokes a negative rating?

# distribution of ratings
par(mfrow = c(1, 3))
hist(getRatings(r), main = "Raw Ratings")
hist(getRatings(r_n1), main = "Norm. Ratings - Centered")
hist(getRatings(r_n2), prob = TRUE, xlim = c(-5, 5), breaks = 40,
     main = "Norm. Ratings - Z-Score")
curve(dnorm(x), add = TRUE, col = "blue")

par(mfrow = c(1, 1))


summary(getRatings(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.9500 -3.2000 1.3600 0.7573 5.1000 10.0000
summary(getRatings(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -18.7149 -2.9704 0.4446 0.0000 3.1672 18.2654
summary(getRatings(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.3382 -0.7244 0.1254 0.0000 0.7520 8.8608
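
The positive tilt can be quantified directly: the raw and centered medians above imply that more than half of the
ratings lie above 0. A quick tabulation makes the comparison explicit:

# share of positive vs. negative ratings
mean(getRatings(r) > 0)     # proportion of raw ratings above 0
mean(getRatings(r) < 0)     # proportion of raw ratings below 0
mean(getRatings(r_n1) > 0)  # same check on the centered ratings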

4.2 Distribution of rating count and mean rating per user


Turning to ratings per user, we can see that the number of jokes rated per user averages about 72-73. There
are two modes in the distribution, from users who rated all 100 jokes and from those who rated 70-74 jokes.
Also there appears to be a slight mode at the minimum of 36; this probably arises from the user cutoff of 36
ratings applied when creating the dataset (the Jester data description indicates that only users with at least 36
ratings are included). In addition, the average raw rating per user is approximately 1 and ranges from -10 to 9;
as expected the average normalized ratings per user is 0.

# distribution of rating count & avg rating per user


hist(rowCounts(r), breaks = 40, main = "Distribution of Rating Count per User")

summary(rowCounts(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 36.00 53.00 72.00 72.72 100.00 100.00
par(mfrow = c(1, 3))
hist(rowMeans(r), main = "Avg. Rating per User")
hist(rowMeans(r_n1), main = "Avg. Rating (Centered) per User")
hist(rowMeans(r_n2), main = "Avg. Rating (Z-Score) per User")

par(mfrow = c(1, 1))


summary(rowMeans(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.9474 -0.8841 0.8597 0.7576 2.5702 9.3133
summary(rowMeans(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.460e-14 -2.842e-16 0.000e+00 4.248e-18 2.961e-16 1.296e-14
summary(rowMeans(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -6.399e-13 -6.550e-17 2.000e-19 -2.500e-17 6.990e-17 8.982e-14
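
The modes noted above can be counted directly from the rating-count distribution:

# counts behind the modes in the rating-count distribution
sum(rowCounts(r) == 100)                        # users who rated all 100 jokes
sum(rowCounts(r) >= 70 & rowCounts(r) <= 74)    # users in the 70-74 range
sum(rowCounts(r) == 36)                         # users at the 36-rating minimum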

4.3 Distribution of rating count and mean rating per joke


Reviewing the rating matrix along the second dimension (by joke), we can see that the number of ratings per joke
has a mean of 17,090 and a median of 18,648. The distribution appears bimodal, with many jokes
receiving under 11,000 ratings and some jokes receiving well over 20,000 ratings (with a particular spike from
jokes rated by virtually all 23,500 users); no jokes fall in the range of 11,000 to 14,000 ratings. Further, the
average rating per joke is about 0.7, with a high of 3.5 and a low of -3.6. As expected, the average normalized
rating per joke is approximately 0.

# distribution of rating count & avg rating per joke


hist(colCounts(r), breaks = 40, main = "Distribution of Rating Count per Joke")

summary(colCounts(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8164 9732 18648 17090 23329 23499
par(mfrow = c(1,3))
hist(colMeans(r), main = "Avg. Rating per Joke")
hist(colMeans(r_n1), main = "Avg. Rating (Centered) per Joke")
hist(colMeans(r_n2), main = "Avg. Rating (Z-Score) per Joke")
par(mfrow = c(1, 1))
summary(colMeans(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.5738 -0.4375 0.9445 0.7163 1.8245 3.4593
summary(colMeans(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -4.25191 -1.19483 0.20238 -0.06176 0.97356 2.69231
summary(colMeans(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.95941 -0.25569 0.05488 -0.01312 0.22596 0.60609

We can see that the most frequently rated jokes were rated by virtually all 23,500 users.

# top N most frequently rated jokes


N <- 10
t <- sort(colCounts(r), decreasing = TRUE) %>% head(N)
topN <- cbind(Rank = 1:N,
              Joke = names(t),
              Frequency = t)
rownames(topN) <- NULL
kable(topN, caption = "Top 10 Most Frequently Rated Jokes")

Top 10 Most Frequently Rated Jokes

Rank  Joke  Frequency
   1   j12      23499
   2   j14      23499
   3   j16      23499
   4   j35      23499
   5   j49      23499
   6    j4      23498
   7   j19      23498
   8   j52      23498
   9    j6      23497
  10    j7      23497

On the other hand, the least frequently rated jokes were rated by under 9,000 users.

# top N least frequently rated jokes


N <- 10
t <- sort(colCounts(r), decreasing = FALSE) %>% head(N)
topN <- cbind(Rank = 1:N,
              Joke = names(t),
              Frequency = t)
rownames(topN) <- NULL
kable(topN, caption = "10 Least Frequently Rated Jokes")
10 Least Frequently Rated Jokes

Rank  Joke  Frequency
   1   j70       8164
   2   j72       8231
   3   j71       8288
   4   j73       8392
   5   j74       8393
   6   j77       8494
   7   j75       8513
   8   j76       8551
   9   j78       8586
  10   j79       8643

Finally, we summarize the top 10 most highly rated jokes based on the raw ratings as well as the normalized
ratings (for both centering and Z-scores). Note that the top 10 lists are almost identical regardless of rating
method; only a few jokes move slightly in position.

# top N most highly rated jokes


N <- 10
t <- sort(colMeans(r), decreasing = TRUE) %>% head(N)
t1 <- sort(colMeans(r_n1), decreasing = TRUE) %>% head(N)
t2 <- sort(colMeans(r_n2), decreasing = TRUE) %>% head(N)
topN <- cbind(Rank = 1:N,
              Joke = names(t), Raw = round(t, 2),
              Joke1 = names(t1), Centered = round(t1, 2),
              Joke2 = names(t2), "Z-Score" = round(t2, 2))
rownames(topN) <- NULL
kable(topN, caption = "Top 10 Most Highly Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)")

Top 10 Most Highly Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)

Rank  Joke   Raw  Joke1  Centered  Joke2  Z-Score
   1   j88  3.46    j49      2.69    j49     0.61
   2   j49  3.45    j88      2.55    j88     0.58
   3   j31  3.00    j31      2.24    j31     0.50
   4   j26  2.95    j26      2.19    j26     0.50
   5   j34  2.94    j34      2.18    j35     0.49
   6   j35  2.92    j35      2.16    j34     0.49
   7   j61  2.66    j61      1.90    j28     0.43
   8   j71  2.66    j28      1.90    j61     0.43
   9   j28  2.65    j52      1.78    j52     0.40
  10   j52  2.53    j71      1.75    j48     0.39

Here are the top three jokes:

 Joke 88: “A Czechoslovakian man felt his eyesight was growing steadily worse, and felt it was time to
go see an optometrist. The doctor started with some simple testing, and showed him a standard eye
chart with letters of diminishing size: CRKBNWXSKZY … ‘Can you read this?’ the doctor asked. ‘Read
it?’ the Czech answered. ‘Doc, I know him!’”

 Joke 49: “Three engineering students were gathered together discussing the possible designers of the
human body. One said, ‘It was a mechanical engineer. Just look at all the joints.’ Another said, ‘No, it
was an electrical engineer. The nervous system has many thousands of electrical connections.’ The last
said, ‘Actually it was a civil engineer. Who else would run a toxic waste pipeline through a recreational
area?’”

 Joke 31: “President Clinton looks up from his desk in the Oval Office to see one of his aides nervously
approach him. ‘What is it?’ exclaims the President. ‘It’s this Abortion Bill Mr. President, what do you
want to do about it?’ the aide replies. ‘Just go ahead and pay it.’ responds the President.”
Likewise, the lowest rated jokes are consistent regardless of rating method.

# top N lowest-rated jokes


N <- 10
t <- sort(colMeans(r), decreasing = FALSE) %>% head(N)
t1 <- sort(colMeans(r_n1), decreasing = FALSE) %>% head(N)
t2 <- sort(colMeans(r_n2), decreasing = FALSE) %>% head(N)
topN <- cbind(Rank = 1:N,
Joke = names(t), Raw = round(t, 2),
Joke1 = names(t1), Centered = round(t1, 2),
Joke2 = names(t2), "Z-Score" = round(t2, 2)
)
rownames(topN) <- NULL
kable(topN, caption = "10 Lowest-Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)")

10 Lowest-Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)

Rank  Joke    Raw  Joke1  Centered  Joke2  Z-Score
   1   j57  -3.57    j57     -4.25    j57    -0.96
   2   j15  -2.89    j15     -3.64    j15    -0.87
   3   j14  -2.18    j14     -2.94    j14    -0.70
   4   j43  -1.87    j43     -2.54    j12    -0.58
   5   j56  -1.81    j56     -2.50    j43    -0.56
   6   j23  -1.78    j12     -2.50    j56    -0.55
   7   j12  -1.74    j23     -2.41    j23    -0.52
   8   j36  -1.41    j73     -2.30    j73    -0.52
   9   j32  -1.41    j32     -2.10    j16    -0.49
  10   j73  -1.40    j36     -2.10    j19    -0.48

Here are the three worst-rated jokes in the dataset:

 Joke 57: “Why are there so many Jones’s in the phone book? Because they all have phones.”

 Joke 15: “Q: What did the blind person say when given some matzah? A: Who the hell wrote this?”
 Joke 14: “The father was very anxious to marry off his only daughter so he wanted to impress her
date. ‘Do you like to screw,’ he says. ‘Huh’ replied the surprised first date. ‘My daughter she loves to
screw and she’s good at it, you and her should go screw,’ carefully explained the father. Now very
interested the boy replied, ‘Yes, sir.’ Minutes later the girl came down the stairs, kissed her father
goodbye and the couple left. After only a few minutes she reappeared, furious, dress torn, hair a mess
and screamed ‘Dammit, Daddy, it’s the TWIST, get it straight!’”

5 Model development

5.1 Sample the dataset


Because of the large size of the dataset, we create a smaller, more manageable dataset by random sampling;
this allows more efficient model development on a PC. We randomly sample 2,000 users from the dataset,
which results in a 2,000-by-100 rating matrix with 145,509 ratings. Again, the rating matrix is roughly 73%
dense (27% sparse), out of 200,000 possible ratings.

# random sample of dataset


set.seed(42)
n_samp <- 2000
idx <- sort(sample(1:nrow(r), n_samp))
rs <- r[idx]
rs
## 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.

The distributions of rating count per user and per joke are similar to the prior distributions (seen above in the
exploratory data analysis), other than the fact that we have a smaller set of 2,000 users rather than 23,500.

hist(rowCounts(rs), breaks = 40, main = "Distribution of Rating Count per User")


summary(rowCounts(rs))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 36.00 53.00 72.00 72.75 100.00 100.00
hist(colCounts(rs), breaks = 20, main = "Distribution of Rating Count per Joke")

summary(colCounts(rs))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 674 827 1597 1455 1988 2000

5.2 Create and test sample recommender models


Now we test the recommenderlab infrastructure by creating and testing sample recommender algorithms.

5.2.1 Top-N lists
We start by creating a sample recommender system using the “POPULAR” algorithm (see list of prediction
algorithms below). First we show top 5 and top 3 recommended jokes for various users. Note that the top-N
lists are returned already sorted in descending order.

## create recommender, show sample outputs


rec <- Recommender(rs, method = "POPULAR")
rec
## Recommender of type 'POPULAR' for 'realRatingMatrix'
## learned using 2000 users.
names(getModel(rec))
## [1] "topN" "ratings" "normalize"
## [4] "aggregation Ratings" "aggregation Popularity" "verbose"
# sample predicted top5 list, for 5 users
rec_top5 <- predict (rec, rs[51:55], n = 5)
rec_top5
## Recommendations as 'topNList' with n = 5 for 5 users.
as(rec_top5, "list")
## $u529
## [1] "j88" "j71" "j92" "j75" "j27"
##
## $u556
## [1] "j88" "j71" "j92" "j75" "j87"
##
## $u563
## [1] "j88" "j71" "j92" "j75" "j87"
##
## $u583
## [1] "j88" "j11" "j71" "j37" "j92"
##
## $u586
## [1] "j88" "j71" "j92" "j75" "j87"
# sample predicted top3 list, for 5 users
rec_top3 <- predict(rec, rs[51:55], n = 3)
rec_top3
## Recommendations as 'topNList' with n = 3 for 5 users.
as(rec_top3, "list")
## $u529
## [1] "j88" "j71" "j92"
##
## $u556
## [1] "j88" "j71" "j92"
##
## $u563
## [1] "j88" "j71" "j92"
##
## $u583
## [1] "j88" "j11" "j71"
##
## $u586
## [1] "j88" "j71" "j92"

5.2.2 Predicted ratings
Second, we show predicted ratings for various users. In the ratings version, the users’ actual ratings show up as
“NA”, but they are included in the ratingMatrix version. Also note that the ratings are normalized using
centering on rows.

# sample predicted ratings (NA for actual ratings), for 5 users


rec_rtg <- predict(rec, rs[51:55], type = "ratings")
rec_rtg
## 5 x 100 rating matrix of class 'realRatingMatrix' with 244 ratings.
as(rec_rtg, "matrix")[ , 1:10] %>% round(2)
## j0 j1 j2 j3 j4 j5 j6 j7 j8 j9
## u529 -0.34 -1.13 -1.25 -2.73 NA -0.26 NA NA -1.93 NA
## u556 NA NA NA NA NA NA NA NA NA NA
## u563 -1.31 NA -2.23 -3.71 NA NA NA NA -2.91 -1.28
## u583 -3.48 -4.27 -4.40 -5.88 NA -3.40 NA NA -5.08 -3.45
## u586 1.31 NA NA -1.08 NA 1.40 NA NA -0.28 NA
# sample predicted ratings (with actual ratings), for 5 users
rec_rtg1 <- predict(rec, rs[51:55], type = "ratingMatrix")
rec_rtg1
## 5 x 100 rating matrix of class 'realRatingMatrix' with 500 ratings.
as(rec_rtg1, "matrix")[ , 1:10] %>% round(2)
## j0 j1 j2 j3 j4 j5 j6 j7 j8 j9
## u529 -0.34 -1.13 -1.25 -2.73 -0.27 -0.26 -2.17 -3.53 -1.93 2.54
## u556 1.24 -0.46 0.36 -3.71 -6.92 -3.96 -3.52 -2.31 0.51 -4.59
## u563 -1.31 -6.73 -2.23 -3.71 3.80 7.93 10.26 -2.22 -2.91 -1.28
## u583 -3.48 -4.27 -4.40 -5.88 0.34 -3.40 -2.91 6.46 -5.08 -3.45
## u586 1.31 -5.90 -8.77 -1.08 5.21 1.40 -10.81 3.66 -0.28 -7.94
5.2.3 Evaluation of predicted ratings
Third, we measure and compare error metrics of predicted ratings from competing models. Below we compare
the UBCF (user-based collaborative filtering) and IBCF (item-based collaborative filtering) algorithms. To do
this, we set up an evaluation scheme based on the following criteria:

 90/10 split of the data into training and holdout / test samples.
 For the holdout sample, the model is given 20 ratings (“given-20 protocol”) and then it predicts
ratings; recall that for this dataset, each user has rated at least 36 jokes.
 For predicted ratings, a rating of 5 or higher (on a scale from -10.0 to +10.0) will be classified as a
good rating.

Calculated error metrics include the root mean squared error (RMSE), the mean squared error (MSE), and the
mean absolute error (MAE).

# sample errors of predicted ratings: UBCF vs IBCF


es <- evaluationScheme(rs, method = "split", train = 0.9, given = 20, goodRating = 5)
rec1 <- Recommender(getData(es, "train"), "UBCF")
rec1
## Recommender of type 'UBCF' for 'realRatingMatrix'
## learned using 1800 users.
rec2 <- Recommender(getData(es, "train"), "IBCF")
rec2
## Recommender of type 'IBCF' for 'realRatingMatrix'
## learned using 1800 users.
pred1 <- predict(rec1, getData(es, "known"), type = "ratings")
pred1
## 200 x 100 rating matrix of class 'realRatingMatrix' with 16000 ratings.
pred2 <- predict(rec2, getData(es, "known"), type = "ratings")
pred2
## 200 x 100 rating matrix of class 'realRatingMatrix' with 15994 ratings.
error <- rbind(
  UBCF = calcPredictionAccuracy(pred1, getData(es, "unknown")),
  IBCF = calcPredictionAccuracy(pred2, getData(es, "unknown"))
)
error %>% round(3) %>% kable(caption = "Evaluation of Rating Prediction Errors: UBCF vs. IBCF")

Evaluation of Rating Prediction Errors: UBCF vs. IBCF

       RMSE     MSE    MAE
UBCF  4.523  20.454  3.584
IBCF  5.107  26.079  4.018
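
For reference, these metrics can be reproduced by hand from the predicted and held-out ratings. The sketch below
mirrors what calcPredictionAccuracy reports for the UBCF predictions (pooling squared and absolute errors over all
ratings present in both matrices) and should agree with the UBCF row above, up to rounding.

# hand-rolled RMSE / MSE / MAE for the UBCF predictions (illustrative sketch)
p <- as(pred1, "matrix")
a <- as(getData(es, "unknown"), "matrix")
err <- (p - a)[!is.na(p) & !is.na(a)]
c(RMSE = sqrt(mean(err^2)), MSE = mean(err^2), MAE = mean(abs(err)))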

5.2.4 Evaluation of top-N recommender system

Finally, we evaluate a recommender system with three measures of model performance:

 Confusion matrix (contingency table)
 Receiver Operating Characteristic (ROC) curve
 Precision-recall plot.

We illustrate with a top-N recommender system using the UBCF algorithm under the following evaluation
scheme:

 Cross validation with k=4 folds


 Given-20 protocol (same as before, i.e., for each user in the holdout sample, the model will be given
20 ratings and then will predict ratings)
 Predicted ratings of >= 5 are classified as good ratings (same as before)
 N varies from 1 to 80 for the top-N lists (recall that the dataset includes a total of 100 jokes).
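
For reference, the rate columns in the confusion-matrix table below follow the standard definitions sketched here.
Note that recommenderlab averages these quantities per user, so plugging the averaged TP/FP/FN/TN counts into the
formulas reproduces the tabulated precision, recall, TPR, and FPR only approximately.

# standard confusion-matrix rates (reference sketch, not recommenderlab code)
conf_rates <- function(TP, FP, FN, TN) {
  c(precision = TP / (TP + FP),  # share of recommended jokes that were good
    recall    = TP / (TP + FN),  # share of good jokes that were recommended (= TPR)
    TPR       = TP / (TP + FN),
    FPR       = FP / (FP + TN))  # share of not-good jokes that were recommended
}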

# sample evaluation of top-N recommender algorithm: UBCF


es1 <- evaluationScheme(rs, method = "cross", k = 4, given = 20, goodRating = 5)
es1
## Evaluation scheme with 20 items given
## Method: 'cross-validation' with 4 run(s).
## Good ratings: >=5.000000
## Data set: 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.
results <- evaluate(es1, method = "UBCF", type = "topNList",
                    n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/1.03sec]
## 2 [0.01sec/1.27sec]
## 3 [0.02sec/1.05sec]
## 4 [0.01sec/1.05sec]
results
## Evaluation results for 4 folds/samples using method 'UBCF'.
avg(results) %>% round(3) %>%
  kable(caption = "Confusion Matrix (k=4): Top-N Lists for N=1 to 80")

Confusion Matrix (k=4): Top-N Lists for N=1 to 80

  N      TP      FP      FN      TN  precision  recall    TPR    FPR
  1   0.414   0.586  13.160  65.839      0.414   0.042  0.042  0.008
  5   1.924   3.075  11.650  63.350      0.385   0.193  0.193  0.045
 10   3.543   6.457  10.032  59.968      0.354   0.339  0.339  0.094
 20   5.878  14.122   7.696  52.303      0.294   0.512  0.512  0.207
 30   7.468  22.532   6.107  43.894      0.249   0.613  0.613  0.333
 40   8.834  31.166   4.741  35.259      0.221   0.698  0.698  0.462
 50  10.076  39.924   3.498  26.502      0.202   0.770  0.770  0.594
 60  11.293  48.707   2.281  17.719      0.188   0.840  0.840  0.727
 80  13.575  66.425   0.000   0.000      0.170   1.000  1.000  1.000

plot(results, annotate = TRUE, main = "ROC Curve for UBCF Method")
plot(results, "prec/rec", annotate = TRUE, main = "Precision-Recall Plot for UBCF Method")
5.3 Develop recommender models

5.3.1 Select algorithm parameters

To develop a recommender system for the Jester dataset, we first select parameters for the candidate algorithms,
using the same evaluation scheme as before (90/10 train/test split with the given-20 protocol):

 IBCF: item-based collaborative filtering; depends on the number of similar items k, the similarity measure, and
the normalization method
 UBCF: user-based collaborative filtering; depends on the number of nearest neighbors nn, the similarity measure,
and the normalization method (defaults are 25, cosine, and center, respectively); a toy sketch of the user-based
CF idea follows this list
 SVD: singular value decomposition method; depends on the parameter k and the normalization method
(defaults are 10 and center).
 Predicted ratings >= 5 are classified as good ratings.
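
To make the collaborative-filtering mechanics concrete, here is a minimal, illustrative sketch of the user-based CF
idea on a made-up toy matrix. It is not recommenderlab's implementation, which additionally normalizes ratings,
restricts to the nn nearest neighbors, and handles missing data more carefully; the toy matrix m and the helper
cos_sim are invented for illustration only.

# toy sketch of user-based CF: predict a missing rating for the "active" user
m <- matrix(c(5,  3, NA, 1,
              4, NA,  4, 1,
              1,  1,  5, 4), nrow = 3, byrow = TRUE)
active <- m[1, ]                       # user we want a prediction for
others <- m[-1, , drop = FALSE]        # candidate neighbors
cos_sim <- function(a, b) {            # cosine similarity over co-rated items
  ok <- !is.na(a) & !is.na(b)
  sum(a[ok] * b[ok]) / (sqrt(sum(a[ok]^2)) * sqrt(sum(b[ok]^2)))
}
sims <- apply(others, 1, cos_sim, a = active)
target <- 3                            # item the active user has not rated
w <- sims * !is.na(others[, target])   # zero weight for neighbors missing the item
sum(w * others[, target], na.rm = TRUE) / sum(w)   # similarity-weighted average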

## compare recommender algorithms


scheme <- evaluationScheme(rs, method = "split", train = 0.9, given = 20, goodRating = 5)
scheme
## Evaluation scheme with 20 items given
## Method: 'split' with 1 run(s).
## Training set proportion: 0.900
## Good ratings: >=5.000000
## Data set: 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.
# compare IBCF for various k, similarity=cosine (default)
alg_ic <- list(
  "ib5"  = list(name = "IBCF", param = list(k = 5)),
  "ib15" = list(name = "IBCF", param = list(k = 15)),
  "ib30" = list(name = "IBCF", param = list(k = 30)),
  "ib50" = list(name = "IBCF", param = list(k = 50))
)
results_ic <- evaluate(scheme, alg_ic, type = "topNList",
                       n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## IBCF run fold/sample [model time/prediction time]
## 1 [0.14sec/0.03sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.14sec/0.06sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.13sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.13sec/0.04sec]
# compare IBCF for various k, similarity=pearson
alg_ip <- list(
  "ib5"  = list(name = "IBCF", param = list(k = 5, method = "pearson")),
  "ib15" = list(name = "IBCF", param = list(k = 15, method = "pearson")),
  "ib30" = list(name = "IBCF", param = list(k = 30, method = "pearson")),
  "ib50" = list(name = "IBCF", param = list(k = 50, method = "pearson"))
)
results_ip <- evaluate(scheme, alg_ip, type = "topNList",
                       n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.03sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.18sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.04sec]
par(mfrow = c(1, 2))
# plot ROC curves
plot(results_ic, annotate = 2, legend = "bottomright")
title(main = "ROC: IBCF-Cosine")
plot(results_ip, annotate = 2, legend = "bottomright")
title(main = "ROC: IBCF-Pearson")
# plot precision-recall plots
plot(results_ic, "prec/rec", annotate = 2, legend = "bottomright", ylim = c(0, 0.3))
title(main = "Prec-Recall: IBCF-Cosine")
plot(results_ip, "prec/rec", annotate = 2, legend = "bottomright", ylim = c(0, 0.3))
title(main = "Prec-Recall: IBCF-Pearson")
# compare UBCF for various nearest neighbors, similarity=cosine (default)
alg_uc <- list(
  "ub5"  = list(name = "UBCF", param = list(nn = 5)),
  "ub15" = list(name = "UBCF", param = list(nn = 15)),
  "ub25" = list(name = "UBCF", param = list(nn = 25)),
  "ub50" = list(name = "UBCF", param = list(nn = 50))
)
results_uc <- evaluate(scheme, alg_uc, type = "topNList",
                       n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.43sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.44sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.01sec/0.44sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.46sec]
# compare UBCF for various nearest neighbors, similarity=pearson
alg_up <- list(
  "ub5"  = list(name = "UBCF", param = list(nn = 5, method = "pearson")),
  "ub15" = list(name = "UBCF", param = list(nn = 15, method = "pearson")),
  "ub25" = list(name = "UBCF", param = list(nn = 25, method = "pearson")),
  "ub50" = list(name = "UBCF", param = list(nn = 50, method = "pearson"))
)
results_up <- evaluate(scheme, alg_up, type = "topNList",
                       n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.48sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.03sec/0.45sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.49sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.51sec]
# plot ROC curves
plot(results_uc, annotate = 4, legend = "bottomright")
title (main = "ROC: UBCF-Cosine")
plot (results_up, annotate = 4, legend = "bottomright")
title (main = "ROC: UBCF-Pearson")

# plot precision-recall plots


plot (results_uc, "prec/rec", annotate = 4, legend = "bottomleft", ylim = c(0, 0.4))
title (main = "Prec-Recall: UBCF-Cosine")
plot (results_up, "prec/rec", annotate = 4, legend = "bottomleft", ylim = c(0, 0.4))
title (main = "Prec-Recall: UBCF-Pearson")
# compare SVD for various k
alg_s <- list(
  "svd3"  = list(name = "SVD", param = list(k = 3)),
  "svd5"  = list(name = "SVD", param = list(k = 5)),
  "svd10" = list(name = "SVD", param = list(k = 10)),
  "svd20" = list(name = "SVD", param = list(k = 20))
)
results_s <- evaluate(scheme, alg_s, type = "topNList",
                      n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.04sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.08sec/0.04sec]
par(mfrow = c(1, 1))
plot(results_s, annotate = 3, legend = "bottomright")
title(main = "ROC Curve: SVD")

plot(results_s, "prec/rec", annotate = 3, legend = "bottomleft", ylim = c(0, 0.4))
title(main = "Precision-Recall Plot: SVD")
5.3.2 Compare the algorithms
Having chosen the algorithm parameters, we can now compare the performance of the five algorithms (RANDOM,
POPULAR, IBCF, UBCF, and SVD). In our case, recommenders based on the UBCF and POPULAR algorithms produce the
best performance, as measured by ROC curves, precision-recall plots, and error metrics.

# now compare different algorithms with optimal parameters


algorithms <- list(
  "random items"      = list(name = "RANDOM", param = NULL),
  "popular items"     = list(name = "POPULAR", param = NULL),
  "item-based CF"     = list(name = "IBCF", param = list(k = 5, method = "pearson")),
  "user-based CF"     = list(name = "UBCF", param = list(nn = 50, method = "pearson")),
  "SVD approximation" = list(name = "SVD", param = list(k = 10))
)
# run algorithms for top-N list
results <- evaluate(scheme, algorithms, type = "topNList",
                    n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.06sec]
## POPULAR run fold/sample [model time/prediction time]
## 1 [0.02sec/0.31sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.16sec/0.03sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.45sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.04sec/0.04sec]
results
## List of evaluation results for 5 recommenders:
## Evaluation results for 1 folds/samples using method 'RANDOM'.
## Evaluation results for 1 folds/samples using method 'POPULAR'.
## Evaluation results for 1 folds/samples using method 'IBCF'.
## Evaluation results for 1 folds/samples using method 'UBCF'.
## Evaluation results for 1 folds/samples using method 'SVD'.
plot(results, annotate = 4, legend = "bottomright")
title(main = "Comparison of ROC Curves")

plot(results, "prec/rec", annotate = 4, legend = "topright", ylim = c(0, 0.4))
title(main = "Comparison of Precision-Recall Plots")
# run algorithms for predicted ratings
results <- evaluate(scheme, algorithms, type = "ratings")
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.01sec]
## POPULAR run fold/sample [model time/prediction time]
## 1 [0.02sec/0.01sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.16sec/0.01sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.44sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.02sec]
results
## List of evaluation results for 5 recommenders:
## Evaluation results for 1 folds/samples using method 'RANDOM'.
## Evaluation results for 1 folds/samples using method 'POPULAR'.
## Evaluation results for 1 folds/samples using method 'IBCF'.
## Evaluation results for 1 folds/samples using method 'UBCF'.
## Evaluation results for 1 folds/samples using method 'SVD'.
plot(results)
title(main = "Comparison of Model Error vs. Actual Ratings")
plot(results, ylim = c(0, 10))
title(main = "Same Comparison: Close-up")
6 Summary of findings

 For both the IBCF and UBCF algorithms, Pearson similarity outperformed cosine similarity for this
dataset. It would be interesting to see whether this conclusion still holds under different evaluation schemes.
 Of the five algorithms considered, UBCF and POPULAR were the top performers, as measured by ROC
curves, precision-recall plots, and error metrics. It is disconcerting that simply recommending the most
popular items works about as well as our collaborative filtering models. Perhaps the UBCF model could be
improved with higher nearest-neighbor values.
 In the final analysis, the best models (UBCF and POPULAR) are still not particularly good. For
instance, the RMSE and MAE for both models are in the 4.0-4.5 range, meaning that model-predicted
ratings are accurate only to within roughly ±4 to ±4.5 points on average, a spread of 8-9 points on a
total rating scale of 20 points (from -10 to +10).
