- Existing CF-based methods can grasp only a single type of relation; for example, the
restricted Boltzmann machine distinctly captures either the user–user or the item–item
correlation.
- In the initial stage, the corresponding low-dimensional vectors of users and items are
learned separately, embedding the semantic information that reflects the user–user
and item–item correlations.
Prediction stage - a feed-forward neural network is employed.
Two effective CF-based methods with impressive performance:-
1) matrix factorization (MF)
2) restricted Boltzmann machine (RBM)
MF directly learns the latent vectors of the users and the items from the user-item ratings
matrix and captures the interaction between the user and the item.
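As a concrete illustration, the sketch below factorizes a toy ratings matrix with plain SGD; the matrix values, learning rate, rank, and iteration count are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Toy ratings matrix (0 = unobserved); sizes and values are illustrative.
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0]])
n_users, n_items, k = R.shape[0], R.shape[1], 2

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(n_users, k))  # user latent vectors
V = rng.normal(scale=0.1, size=(n_items, k))  # item latent vectors

lr, reg = 0.05, 0.01
for _ in range(300):                          # SGD over observed entries only
    for u, i in zip(*R.nonzero()):
        err = R[u, i] - U[u] @ V[i]           # interaction = inner product
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * U[u] - reg * V[i])

pred = U @ V.T                                # reconstructed ratings matrix
```

The inner product `U[u] @ V[i]` is the captured user–item interaction that the notes refer to.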
The RBM-like methods explicitly make recommendations from either the user or the item side
by constructing independent models for users or items.
Correlation can therefore be considered only from a single side, that is, item–item or user–user.
RBM is not deep enough to capture complex features.
The most powerful approach to capture complex relations is to use deep learning
techniques.
Specifically, for content-based models, numerous methods applying deep learning to
recommendation have been proposed.
There is very limited research on employing deep neural networks in CF-based models.
- neural network MF (NNMF)
- Collaborative Deep Learning (CDL) - constructs the latent vector of an item from the
item's content text with an autoencoder; the ratings are then estimated via MF using this
latent vector.
- Deng proposed a deep learning-based MF, which employs a deep autoencoder to
generate initial vectors of users and items and adopts MF with the pretrained vectors
for rating prediction in social rating networks.
Two crucial issues –
1) the representations of users and items
2) the architecture of the prediction neural networks
For the first issue, an effective feature representation can significantly improve
performance.
Generating the features of the user and the item from the ratings matrix (the usual data
source of CF-based methods) for deep neural networks is still an open problem.
For the second issue, a well-designed architecture of the neural networks for CF-based RS is a
pressing problem.
To effectively address the issues above, we propose to build prior knowledge on users and
items.
The framework can generally be divided into two major stages:-
1) Understanding - the embeddings of the user and the item capture the user–user and
item–item co-occurrence, respectively.
2) Prediction - the interaction between item and user can be simulated by the predictive
neural network.
Information is collected from:-
1) current user and item
2) the historical records of the items and users
Two models:-
1) Constraint Model (CM)
2) Rating Independent Model (RIM)
A multi-view feed-forward neural network is used to take the information of both the given
user and item, together with the corresponding historical information, into consideration.
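One plausible way to assemble such a multi-view input is sketched below, assuming pretrained embeddings and mean-pooled histories; the pooling choice, embedding sizes, and index values are assumptions for illustration, not details from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4                                   # embedding size (illustrative)
user_emb = rng.normal(size=(3, d))      # hypothetical pretrained user embeddings
item_emb = rng.normal(size=(5, d))      # hypothetical pretrained item embeddings

def multiview_input(u, i, user_history, item_history):
    """Assemble one input vector from four views: the current user, the
    current item, the user's rated items, and the item's raters."""
    hist_items = item_emb[user_history].mean(axis=0)   # user-history view
    hist_users = user_emb[item_history].mean(axis=0)   # item-history view
    return np.concatenate([user_emb[u], item_emb[i], hist_items, hist_users])

x = multiview_input(0, 2, user_history=[1, 3], item_history=[1, 2])
```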
Datasets:- MovieLens 1M and MovieLens 10M
Advantages of models:-
1) The pretrained representations contain the co-occurrence information with respect to
the user and the item. Because they are weakly coupled, the pretrained representations
are flexible and can be freely used in the multi-view neural network.
2) Training the embedding and neural networks separately is more effective.
where E˜ and Eˆ are two different matrices of item embeddings that contain different
semantic information, and e˜i and eˆi are the column vectors denoting item i in E˜ and Eˆ,
respectively.
Thus, e˜i and eˆj can be learned by minimizing the difference between x_ji and the inner
product e˜i·eˆj.
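A minimal sketch of this objective follows, assuming x_ji is an item–item co-occurrence count and using plain SGD; the matrix values and hyperparameters are illustrative assumptions, not the paper's.

```python
import numpy as np

# Hypothetical symmetric item-item co-occurrence counts x[j, i].
X = np.array([[0., 3., 1.],
              [3., 0., 2.],
              [1., 2., 0.]])
n, d = X.shape[0], 2
rng = np.random.default_rng(0)
E_tilde = rng.normal(scale=0.1, size=(n, d))   # rows play the role of e~i
E_hat   = rng.normal(scale=0.1, size=(n, d))   # rows play the role of e^j

lr = 0.05
for _ in range(500):
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # minimise (x_ji - e~i . e^j)^2 by gradient steps on both vectors
            err = X[j, i] - E_tilde[i] @ E_hat[j]
            E_tilde[i] += lr * err * E_hat[j]
            E_hat[j]   += lr * err * E_tilde[i]
```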
CM:- counts only co-occurrences of items given the same rating and drops those given
different ratings; this is named the Constraint Model (CM).
Rating Independent Model:- CM focuses on capturing the global relationship between items.
However, an item given different ratings also carries different local information, which
cannot be revealed by CM.
In RIM – items given different ratings are denoted by different, independent embeddings.
CM – Captures global co-occurrences
RIM – Captures local co-occurrences
E.g., for two items A and B, there are three kinds of rating combinations:
1) A@4-B@5
2) A@4-B@4
3) A@5-B@5
RIM keeps two embeddings for item B, eˆ4B and eˆ5B, one per rating; for example, the
combination A@5-B@5 is modeled by e˜5A and eˆ5B.
CM keeps only two global embeddings, e˜A and eˆB, for items A and B to capture their
overall co-occurrence.
Under CM, only the same-rating combinations A@4-B@4 and A@5-B@5 are counted.
Combination of CM and RIM is useful.
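The indexing difference between CM and RIM can be sketched as follows; this is a hypothetical illustration of the keying scheme, not the paper's code, and the embedding size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding size (illustrative)

# CM: one global embedding per item, shared across all ratings.
cm_emb = {item: rng.normal(size=d) for item in ["A", "B"]}

# RIM: an independent embedding per (item, rating) pair.
rim_emb = {(item, r): rng.normal(size=d)
           for item in ["A", "B"] for r in [4, 5]}

# The pair A@4-B@5 looks up cm_emb["A"] and cm_emb["B"] under CM,
# but rim_emb[("A", 4)] and rim_emb[("B", 5)] under RIM.
```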
Compared with traditional MF, the advantage of our overall method is that it captures both
the co-occurrence and the interaction relationships.
Optimization
The cost function of the neural networks is the mean squared error (MSE).
The prediction neural networks are trained via stochastic gradient descent (SGD).
Experiment Setting
The dimensions of the vectors of users and items for CM and RIM are equal to 400.
Training method - stochastic gradient descent (SGD).
Regularization – L2 - to prevent over-fitting.
Number of layers – 5 (one input layer, one transformation layer, one merge layer, one hidden
layer, and one output layer).
Learning Rate = 0.0005
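A forward pass through the five-layer architecture described above might look as follows; the hidden size, ReLU activation, and random placeholder weights are assumptions for illustration, and the real weights would be learned via SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 400, 128        # embedding size 400 from the notes; hidden size assumed

def relu(z):
    return np.maximum(z, 0.0)

# Random placeholder weights; a trained model learns these.
W_u = rng.normal(scale=0.01, size=(h, d))
W_i = rng.normal(scale=0.01, size=(h, d))
W_m = rng.normal(scale=0.01, size=(h, 2 * h))
W_h = rng.normal(scale=0.01, size=(h, h))
w_o = rng.normal(scale=0.01, size=h)

def predict(user_vec, item_vec):
    tu = relu(W_u @ user_vec)                   # transformation layer (user side)
    ti = relu(W_i @ item_vec)                   # transformation layer (item side)
    m = relu(W_m @ np.concatenate([tu, ti]))    # merge layer
    hid = relu(W_h @ m)                         # hidden layer
    return w_o @ hid                            # output layer: predicted rating

rating = predict(rng.normal(size=d), rng.normal(size=d))
```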
Future Scope
1) Construct an end-to-end neural network on the history view: a) an RNN to consider
the temporal relations between the history items of a user, and b) a CNN to address
the overall historical data of users and items.
2) Consider content information, such as text, images, and video, by appending an
additional content view to our multi-view neural network.
3) Improve the training time of our model.