Professional Documents
Culture Documents
So in this case, you can see that there are many ways to go.
And people at the time when these algorithms were a popular
designed variety of metrics.
Like, for instance, we can say that if the user--
maybe we as users-- and actually,
people found that this is true in practice--
that our mediian kind of what is my basic score
differs from user to user, meaning that maybe I
think that a movie that is OK, I should just give to it 3.
And I would be 5 to some super, super exceptional movie.
The majority of movies that I like, I give to them 3.
Well, somebody else may give to vast majorities of movies 5,
like this happy user.
And only if they don't like something
they would give to it 4.
So what people have done to that,
they actually compared not the scores themselves,
but how you deviate from your average score.
So instead of using the original matrix,
you would take an average score and record your deviation,
either positive or negative from your average,
and then you'd compare.
So there are lots of different heuristics
how people try to approach this problem.
And they produce some OK results, but they're very far,
all these techniques, from the state-of-the-art method.
And the question is, why?
So clearly, the goodness of this method
would depend on how do you compute the similarity
between me and somebody else.
And it can be cosine similarities.
It can be any other similarity.
But the problem is that if you look at people--
like look at me, I may buy some books in machine learning,
because this is an area that I work in,
but I can also buy books on plants,
because I really love plants.
So how many people do you have which
would have the same tastes both in plants
and in technical books like me?
Maybe not that many.
But there clearly will be people who
would buy the same machine learning textbooks and also
people who are interested in gardening.
So the idea is that this method doesn't
enable you to detect the hidden structures that is there
in the data, which is I may be similar to some pool of users
in one dimension, but similar to some other set of users
in a different dimension.
So when we are moving from K nearest neighbor,
we are going to go to this new approach, which
goes by the name collaborative filtering or matrix
factorization, where the algorithm would be actually
able to detect these hidden groupings among the users,
both in terms of products and in terms of users.
So then you don't have to engineer very sophisticated
similarity measure, your algorithm
would be able to pick up these very complex dependencies.
And for us, as human, it would be definitely not tractable
to come up with such measures.
So now, we are ready to start talking
about the next approach, which is called matrix factorization.