This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
CS8080
INFORMATION
RETRIEVAL
TECHNIQUES
Department: CSE UNIT I
Created by:
Dr S.SRINIVASAN, Professor,
CSE Department, RMDEC
Date: 06.11.2023
Table of Contents
Course Objectives
Pre Requisites
Syllabus
Course outcomes
CO- PO/PSO Mapping
Lecture Plan
Unit I:
Lecture Notes
Lecture Notes – Links to Videos
Lecture Notes – e book reference
Lecture Notes – PPTs
Assignments
Part A Q & A (with K level and CO)
Assessment Schedule
Prescribed Text Books & Reference Books
UNIT I INTRODUCTION 9
Information Retrieval – Early Developments – The IR Problem – The User's Task –
Information versus Data Retrieval - The IR System – The Software Architecture of
the IR System – The Retrieval and Ranking Processes - The Web – The e-Publishing
Era – How the web changed Search – Practical Issues on the Web – How People
Search – Search Interfaces Today – Visualization in Search Interfaces.
TOTAL: 45 PERIODS
Course outcomes
To understand the basics of Information Retrieval
CO   PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
1    3    3    3    -    -    -    -    -    -    -     -     -     3     -     -
2    3    2    2    -    -    -    -    -    -    -     -     -     3     -     -
3    3    3    3    -    -    -    -    -    -    -     -     -     3     -     -
4    3    3    3    -    -    -    -    -    -    -     -     -     3     -     -
Unit V
RECOMMENDER
SYSTEM
Lecture Plan
S.No.  Topics to be covered                                  No. of Periods  Proposed Date  Actual Lecture Date  Pertaining CO  Taxonomy Level  Mode of Delivery
1      Recommender Systems Functions                         1                                                   1              K1              PPT
6      Advantages and Drawbacks of Content-based Filtering   1                                                   1              K1              PPT
7      Collaborative Filtering                               1                                                   1              K2              PPT
8      Matrix factorization models                           1                                                   1              K2              PPT
9      Neighborhood models                                   1                                                   1              K2              PPT
Recommender Systems
Function
Introduction
Most of today’s internet businesses deeply root their success in the ability to provide
users with strongly personalized experiences.
Definition
Recommender Systems (RSs) are software tools and techniques providing
suggestions for items to be of use to a user.
Choicestream: 28% of the people would buy more music if they found what they
liked.
Recommender Systems Function
Its primary function is to locate documents that are relevant to the user’s
information need, but it can also be used to check the importance of a Web page or
to discover the various usages of a word in a collection of documents.
• Find some good items: Recommend to a user some items as a ranked list along
with predictions of how much the user would like them (e.g., on a one- to five-star
scale). This is the main recommendation task that many commercial systems
address. Some systems do not show the predicted rating.
• Find all good items: Recommend all the items that can satisfy some user needs. In
such cases it is insufficient to just find some good items. This is especially true when
the number of items is relatively small or when the RS is mission-critical, such as in
medical or financial applications. In these situations, in addition to the benefit
derived from carefully examining all the possibilities, the user may also benefit from
the RS ranking of these items or from additional explanations that the RS generates.
• Annotation in context: Given an existing context, e.g., a list of items, emphasize
some of them depending on the user’s long-term preferences. For example, a TV
recommender system might annotate which TV shows displayed in the electronic
program guide (EPG) are worth watching.
• Recommend a sequence: Instead of focusing on the generation of a single
recommendation, the idea is to recommend a sequence of items that is pleasing as
a whole. Typical examples include recommending a TV series; a book on RSs after
having recommended a book on data mining; or a compilation of musical tracks.
• Recommend a bundle: Suggest a group of items that fits well together. For instance,
a travel plan may be composed of various attractions, destinations, and
accommodation services that are located in a delimited area. From the point of view
of the user these various alternatives can be considered and selected as a single
travel destination.
• Just browsing: In this task, the user browses the catalog without any imminent
intention of purchasing an item. The task of the recommender is to help the user to
browse the items that are more likely to fall within the scope of the user’s interests
for that specific browsing session. This is a task that has been also supported by
adaptive hypermedia techniques.
• Improve the profile: This relates to the capability of the user to provide (input)
information to the recommender system about what he likes and dislikes. This is a
fundamental task that is strictly necessary to provide personalized
recommendations. If the system has no specific knowledge about the active user
then it can only provide him with the same recommendations that would be
delivered to an “average” user.
• Express self: Some users may not care about the recommendations at all. Rather,
what is important to them is that they be allowed to contribute their ratings
and express their opinions and beliefs. The user satisfaction from that activity can still
act as leverage for holding the user tightly to the application (as we mentioned
above in discussing the service provider’s motivations).
• Help others: Some users are happy to contribute information, e.g., their
evaluation of items (ratings), because they believe that the community benefits from
their contribution. This could be a major motivation for entering information into a
recommender system that is not used routinely. For instance, with a car RS, a user
who has already bought her new car is aware that the rating entered in the system
is more likely to be useful for other users than for the next time she buys a
car.
• Influence others: In Web-based RSs, there are users whose main goal is to
explicitly influence other users into purchasing particular products. As a matter of
fact, there are also some malicious users that may use the system just to promote
or penalize certain items.
Applications
Increase the number of items sold
Sell more diverse items
Increase the user satisfaction
Increase user fidelity
Better understand what the user wants
Data and Knowledge Sources
• RSs are information processing systems that actively gather various kinds of
data in order to build their recommendations.
• Data is primarily about the items to suggest and the users who will receive these
recommendations. But, since the data and knowledge sources available for
recommender systems can be very diverse, ultimately, whether they can be
exploited or not depends on the recommendation technique.
• In general, there are recommendation techniques that are knowledge poor, i.e.,
they use very simple and basic data, such as user ratings/evaluations .
• Other techniques are much more knowledge dependent, e.g., using ontological
• Items are the objects that are recommended. Items may be characterized by
their complexity and their value or utility. The value of an item may be positive if
the item is useful for the user, or negative if the item is not appropriate and the
user made a wrong decision when selecting it.
• We note that when a user is acquiring an item she will always incur a cost,
which includes the cognitive cost of searching for the item and the real monetary
cost eventually paid for the item. For instance, the designer of a news RS must
take into account the complexity of a news item, i.e., its structure, the textual
representation, and the time-dependent importance of any news item. But, at the
same time, the RS designer must understand that even if the user is not paying
for reading news, there is always a cognitive cost associated with searching and
reading news items.
• If a selected item is relevant for the user, this cost is outweighed by the benefit of
having acquired useful information, whereas if the item is not relevant, the net
value of that item for the user, and of its recommendation, is negative. In other
domains, e.g., cars, or financial investments, the true monetary cost of the items
becomes an important element to consider when selecting the most appropriate
recommendation approach. Items with low complexity and value are: news, Web
pages, books, CDs, movies. Items with larger complexity and value are: digital
cameras, mobile phones, PCs, etc.
• The most complex items that have been considered are insurance policies, financial
investments, travels, jobs.
• RSs, according to their core technology, can use a range of properties and features
of the items. For example in a movie recommender system, the genre (such as
comedy, thriller, etc.), as well as the director, and actors can be used to describe a
movie and to learn how the utility of an item depends on its features. Items can be
represented using various information and representation approaches, e.g., in a
minimalist way as a single id code, or in a richer form, as a set of attributes, but
even as a concept in an ontological representation of the domain.
• Users:
• Users of a RS, as mentioned above, may have very diverse goals and
characteristics. In order to personalize the recommendations and the human-
computer interaction, RSs exploit a range of information about the users. This
information can be structured in various ways and again the selection of what
information to model depends on the recommendation technique. For instance, in
collaborative filtering, users are modeled as a simple list containing the ratings
provided by the user for some items. In a demographic RS, sociodemographic
attributes such as age, gender, profession, and education, are used. User data is
said to constitute the user model.
The user model profiles the user, i.e., encodes her preferences and needs. Various
user modeling approaches have been used and, in a certain sense, a RS can be
viewed as a tool that generates recommendations by building and exploiting user
models. Since no personalization is possible without a convenient user model, unless
the recommendation is non-personalized, as in the top-10 selection, the user model
will always play a central role. For instance, considering again a collaborative filtering
approach, the user is either profiled directly by her ratings of items or, using these
ratings, the system derives a vector of factor values, where users differ in how much
weight each factor carries in their model.
Users can also be described by their behavior pattern data, for example, site
browsing patterns (in a Web-based recommender system), or travel search patterns
(in a travel recommender system). Moreover, user data may include relations
between users such as the trust level of these relations between users. A RS might
utilize this information to recommend items to users that were preferred by similar or
trusted users.
Transactions.
We generically refer to a transaction as a recorded interaction between a user and
the RS. Transactions are log-like data that store important information generated
during the human-computer interaction and which are useful for the recommendation
That is, the user may request a recommendation and the system may produce a
suggestion list. But it can also request additional user preferences to provide the user
with better results. Here, in the transaction model, the system collects the various
requests-responses, and may eventually learn to modify its interaction strategy by
observing the outcome of the recommendation process.
Recommendation Techniques
• In order to implement its core function, identifying the useful items for the user, a
RS must predict that an item is worth recommending. In order to do this, the
system must be able to predict the utility of some of them, or at least compare
the utility of some items, and then decide what items to recommend based on this
comparison. The prediction step may not be explicit in the recommendation
algorithm but we can still apply this unifying model to describe the general role of
a RS.
• The rationale for using this approach is that in absence of more precise
information about the user’s preferences, a popular song, i.e., something that is
liked (high utility) by many users, will also be probably liked by a generic user, at
least more than another randomly selected song. Hence the utility of these
popular songs is predicted to be reasonably high for this generic user
• The core recommendation computation can thus be viewed as the prediction of the
utility of an item for a user. The degree of utility of the user u for the item i is
modeled as a (real-valued) function $R(u,i)$, as is normally done in collaborative
filtering by considering the ratings of users for items. The fundamental task of a
collaborative filtering RS is then to predict the value of $R$ over pairs of users and
items, i.e., to compute $\hat{R}(u,i)$, where $\hat{R}$ denotes the estimate, computed
by the RS, of the true function $R$. Consequently, having computed this prediction for
the active user u on a set of items, i.e., $\hat{R}(u,i_1), \ldots, \hat{R}(u,i_N)$, the
system will recommend the items $i_{j_1}, \ldots, i_{j_K}$ ($K \le N$) with the largest
predicted utility. K is typically a small number, i.e., much smaller than the cardinality
of the item data set or the number of items on which a user utility prediction can be
computed; in this sense, RSs “filter” the items that are recommended to users.
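The filtering step described above can be sketched in a few lines. This is only an illustration, assuming the predicted utilities for the active user are already available as a dictionary; the function name `recommend_top_k` and the sample item ids and scores are hypothetical.

```python
import heapq

def recommend_top_k(predicted_utility, k):
    """Return the k item ids with the largest predicted utility, best first."""
    # heapq.nlargest avoids sorting the whole catalogue when k is small.
    return heapq.nlargest(k, predicted_utility, key=predicted_utility.get)

# Hypothetical predicted utilities R_hat(u, i) for one active user u.
r_hat_u = {"i1": 3.2, "i2": 4.7, "i3": 1.9, "i4": 4.1, "i5": 3.8}

top_items = recommend_top_k(r_hat_u, k=2)
print(top_items)
```

Because K is much smaller than the number of candidate items, a partial selection of this kind is usually enough; a full sort of the catalogue is not required.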
Some recommender systems do not fully estimate the utility before making a
recommendation but they may apply some heuristics to hypothesize that an item is of
use to a user. This is typical, for instance, in knowledge-based systems. These utility
predictions are computed with specific algorithms and use various kinds of knowledge
about users, items, and the utility function itself.
FILTERING COMPONENT – This module exploits the user profile to suggest relevant
items by matching the profile representation against that of items to be
recommended. The result is a binary or continuous relevance judgment (computed
using some similarity metrics), the latter case resulting in a ranked list of potentially
interesting items. In the above-mentioned example, the matching is realized by
computing the cosine similarity between the prototype vector and the item vectors.
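The matching step can be illustrated with a small sketch, assuming that both the user profile and the items are already represented as numeric vectors over the same features. The three features (comedy, thriller, drama), the item ids, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_items(profile, item_vectors):
    """Return (item_id, score) pairs sorted by decreasing similarity to the profile."""
    scored = [(cosine_similarity(profile, vec), item_id)
              for item_id, vec in item_vectors.items()]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored]

# Hypothetical feature order: comedy, thriller, drama.
profile = [1.0, 0.0, 1.0]
items = {
    "movie_a": [1.0, 0.0, 1.0],
    "movie_b": [0.0, 1.0, 0.0],
    "movie_c": [1.0, 1.0, 0.0],
}

ranking = rank_items(profile, items)
```

The continuous scores give the ranked list mentioned above; thresholding them would give the binary relevance judgment instead.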
The first step of the recommendation process is the one performed by the CONTENT
ANALYZER, that usually borrows techniques from Information Retrieval systems. Item
descriptions coming from Information Source are processed by the CONTENT
ANALYZER, that extracts features (keywords, n-grams, concepts, . . . ) from
unstructured text to produce a structured item representation, stored in the
repository Represented Items.
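As a rough illustration of what a CONTENT ANALYZER does, the sketch below turns an unstructured description into keyword counts. It assumes a plain bag-of-words representation; the tiny stop-word list, the function name `analyze_content`, and the sample description are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def analyze_content(description):
    """Extract lower-cased keyword counts from unstructured text."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

item = analyze_content("The detective thriller is a dark, dark thriller set in Chennai.")
```

The resulting keyword counts are one possible structured item representation; n-grams or concepts would need additional processing.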
A High Level Architecture of
Content-based Systems
In order to construct and update the profile of the active user ua (user
for which recommendations must be provided) her reactions to items
are collected in some way and recorded in the repository Feedback.
These reactions, called annotations or feedback, together with the
related item descriptions, are exploited during the process of learning
a model useful to predict the actual relevance of newly presented
items. Users can also explicitly define their areas of interest as an
initial profile without providing any feedback.
Typically, it is possible to distinguish between two kinds of relevance
feedback: positive information (i.e., inferring features liked by the user)
and negative information (i.e., inferring features the user is not
interested in).
Two different techniques can be adopted for recording user’s
feedback. When a system requires the user to explicitly evaluate
items, this technique is usually referred to as “explicit feedback”; the
other technique, called “implicit feedback”, does not require any
active user involvement, in the sense that feedback is derived from
monitoring and analyzing user’s activities.
Explicit evaluations indicate how relevant or interesting an item is to
the user. There are three main approaches to get explicit relevance
feedback:
• like/dislike – items are classified as “relevant” or “not relevant” by adopting a simple
binary rating scale, such as in [12];
• ratings – a discrete numeric scale is usually adopted to judge items. Alternatively,
symbolic ratings are mapped to a numeric scale, such as in Syskill & Webert, where
users have the possibility of rating a Web page as hot, lukewarm, or cold;
• text comments – comments about a single item are collected and presented to the
users as a means of facilitating the decision-making process. For instance, customers’
feedback at Amazon.com or eBay.com might help users in deciding whether an item
has been appreciated by the community.
In order to build the profile of the active user ua, the training set TRa for ua must be
defined. TRa is a set of pairs ⟨Ik ,rk⟩, where rk is the rating provided by ua on the
item representation Ik . Given a set of item representations labeled with ratings, the
PROFILE LEARNER applies supervised learning algorithms to generate a predictive
model – the user profile – which is usually stored in a profile repository for later use
by the FILTERING COMPONENT.
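A very small sketch of a PROFILE LEARNER is given below. It does not reproduce any particular algorithm from these notes: it assumes a Rocchio-style prototype, i.e., the mean vector of liked items minus the mean vector of disliked items, and the rating threshold, function name, and training pairs are hypothetical.

```python
def learn_profile(training_set, like_threshold=3.0):
    """Build a prototype vector from (item_vector, rating) pairs.

    Assumption: items rated at or above like_threshold count as liked,
    all others as disliked.
    """
    dim = len(training_set[0][0])

    def mean(vectors):
        if not vectors:
            return [0.0] * dim
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    liked = [vec for vec, rating in training_set if rating >= like_threshold]
    disliked = [vec for vec, rating in training_set if rating < like_threshold]
    return [p - n for p, n in zip(mean(liked), mean(disliked))]

# Hypothetical training set TR_a: pairs <I_k, r_k> over two features.
tr_a = [([1.0, 0.0], 5), ([1.0, 1.0], 4), ([0.0, 1.0], 1)]
profile = learn_profile(tr_a)
```

The learned vector plays the role of the stored user profile, which a filtering step can later compare against new item representations.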
SVD
Matrix factorization models map both users and items to a joint latent factor space of
dimensionality f , such that user-item interactions are modeled as inner products in
that space. The latent space tries to explain ratings by characterizing both products
and users on factors automatically inferred from user feedback. For example, when
the products are movies, factors might measure obvious dimensions such as comedy
vs. drama, amount of action, or orientation to children; less well defined dimensions
such as depth of character development or “quirkiness”; or completely
uninterpretable dimensions.
Accordingly, each item i is associated with a vector $q_i \in \mathbb{R}^f$, and each user u is
associated with a vector $p_u \in \mathbb{R}^f$. For a given item i, the elements of $q_i$ measure the
extent to which the item possesses those factors, positive or negative. For a given
user u, the elements of $p_u$ measure the extent of interest the user has in items that
are high on the corresponding factors (again, these may be positive or negative).
The resulting dot product, $q_i^T p_u$, captures the interaction between user u and item
i, i.e., the overall interest of the user in characteristics of the item. The final rating is
created by also adding in the aforementioned baseline predictors that depend only on
the user or item. Thus, a rating is predicted by the rule
\[ \hat{r}_{ui} = \mu + b_i + b_u + q_i^T p_u , \]
where $\mu$ is the overall average rating and $b_u$, $b_i$ are the user and item biases.
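The regularization constant λ4 mentioned next belongs to a least-squares objective that is not reproduced in these notes. As a sketch, assuming the usual biased matrix factorization form with overall mean μ and biases b_u, b_i, and writing K for the set of (u, i) pairs with known ratings (K is notation introduced here), the parameters are learned by solving:

```latex
\min_{b_*,\, q_*,\, p_*} \sum_{(u,i) \in K}
  \left( r_{ui} - \mu - b_i - b_u - q_i^T p_u \right)^2
  + \lambda_4 \left( b_i^2 + b_u^2 + \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right)
```

The first term is the squared prediction error on the known ratings; the second term penalizes large parameter values and so limits overfitting.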
The constant λ4, which controls the extent of regularization, is usually determined by
cross validation. Minimization is typically performed by either stochastic gradient
descent or alternating least squares. Alternating least squares techniques rotate
between fixing the pu’s to solve for the qi’s and fixing the qi’s to solve for the pu’s.
Notice that when one of these is taken as a constant, the optimization problem is
quadratic and can be optimally solved.
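The algorithm loops through all ratings in the training data. For each given rating $r_{ui}$,
a prediction $\hat{r}_{ui}$ is made, and the associated prediction error
$e_{ui} = r_{ui} - \hat{r}_{ui}$ is computed.
For a given training case $r_{ui}$, we modify the parameters by moving in the opposite
direction of the gradient, yielding
\[ b_u \leftarrow b_u + \gamma \, (e_{ui} - \lambda_4 b_u) \]
\[ b_i \leftarrow b_i + \gamma \, (e_{ui} - \lambda_4 b_i) \]
\[ q_i \leftarrow q_i + \gamma \, (e_{ui} \, p_u - \lambda_4 q_i) \]
\[ p_u \leftarrow p_u + \gamma \, (e_{ui} \, q_i - \lambda_4 p_u) \]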
One can expect better accuracy by dedicating separate learning rates (γ) and
regularization (λ) to each type of learned parameter. Thus, for example, it is advised
to employ distinct learning rates to user biases, item biases and the factors
themselves.
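The stochastic gradient descent procedure can be sketched as follows. This is a minimal illustration, not a tuned implementation: it assumes the biased prediction rule (overall mean plus user bias plus item bias plus the dot product of the factor vectors), a single learning rate `gamma` and a single regularization constant `lam` for all parameters, and a tiny hypothetical rating set. All function and variable names are illustrative.

```python
import math
import random

def train_biased_mf(ratings, n_factors=2, gamma=0.01, lam=0.02, n_epochs=300, seed=0):
    """Train a biased matrix factorization model by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    # Small random factor vectors p_u and q_i.
    p = {u: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(qf * pf for qf, pf in zip(q[i], p[u]))

    for _ in range(n_epochs):
        for u, i, r in ratings:
            e = r - predict(u, i)  # prediction error e_ui
            # Move each parameter against the gradient of the regularized error.
            b_u[u] += gamma * (e - lam * b_u[u])
            b_i[i] += gamma * (e - lam * b_i[i])
            for f in range(n_factors):
                p_uf, q_if = p[u][f], q[i][f]
                p[u][f] += gamma * (e * q_if - lam * p_uf)
                q[i][f] += gamma * (e * p_uf - lam * q_if)
    return predict, mu

def rmse(ratings, predict):
    """Root mean squared error of a predictor on a list of rating triples."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical training data.
ratings = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4),
           ("u2", "i3", 1), ("u3", "i2", 2), ("u3", "i3", 1)]

predict, mu = train_biased_mf(ratings)
rmse_model = rmse(ratings, predict)
rmse_mean = rmse(ratings, lambda u, i: mu)
```

Using separate learning rates and regularization constants for the biases and for the factors, as advised above, would mean replacing `gamma` and `lam` with one pair per parameter type.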
SVD++
Prediction accuracy is improved by considering also implicit feedback, which provides
an additional indication of user preferences. This is especially helpful for those users
that provided much more implicit feedback than explicit one. As explained earlier,
even in cases where independent implicit feedback is absent, one can capture a
significant signal by accounting for which items users rate, regardless of their rating
value. This led to several methods that modeled a user factor by the identity of the
items he/she has rated. Here we focus on the SVD++ method, which was shown to
offer accuracy superior to SVD.
Matrix Factorization models
To this end, a second set of item factors is added, relating each item i to a factor
vector $y_i \in \mathbb{R}^f$. Those new item factors are used to characterize users based on the
set of items that they rated. The exact model is as follows:
\[ \hat{r}_{ui} = \mu + b_i + b_u + q_i^T \Big( p_u + |R(u)|^{-1/2} \sum_{j \in R(u)} y_j \Big) \]
The set R(u) contains the items rated by user u. Now, a user u is modeled as
$p_u + |R(u)|^{-1/2} \sum_{j \in R(u)} y_j$.
We use a free user-factors vector, $p_u$, which is learnt from the given explicit ratings.
This vector is complemented by the sum $|R(u)|^{-1/2} \sum_{j \in R(u)} y_j$,
which represents the perspective of implicit feedback. Since the $y_j$'s are centered
around zero, the sum is normalized by $|R(u)|^{-1/2}$, in order to stabilize its variance
across the range of observed values of $|R(u)|$.
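A single SVD++ prediction can be sketched as below, assuming the model takes the usual form in which the user vector is p_u plus the sum of the y_j over the rated items, scaled by |R(u)|^(-1/2). The function name, argument names, and numbers are hypothetical, and the parameters are taken as given rather than learned.

```python
def svdpp_predict(mu, b_u, b_i, q_i, p_u, y_rated):
    """Predict one rating from already-learned SVD++ parameters.

    y_rated holds the implicit factor vectors y_j of the items in R(u).
    """
    n_factors = len(q_i)
    implicit = [0.0] * n_factors
    if y_rated:
        norm = len(y_rated) ** -0.5  # |R(u)|^(-1/2)
        implicit = [norm * sum(y[k] for y in y_rated) for k in range(n_factors)]
    # Explicit part p_u complemented by the normalized implicit part.
    user_vec = [p + s for p, s in zip(p_u, implicit)]
    return mu + b_u + b_i + sum(qk * uk for qk, uk in zip(q_i, user_vec))

# Hypothetical parameters: the user rated four items with identical y_j vectors.
y_rated = [[0.5, 0.25]] * 4
prediction = svdpp_predict(mu=3.0, b_u=0.5, b_i=-0.25,
                           q_i=[1.0, 2.0], p_u=[0.5, 0.0], y_rated=y_rated)
```

With no rated items the implicit part vanishes and the function falls back to the plain biased factor prediction.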
Similarity measures
Central to most item-item approaches is a similarity measure between
items. Frequently, it is based on the Pearson correlation coefficient,
$\rho_{ij}$, which measures the tendency of users to rate items i and j similarly.
Since many ratings are unknown, some items may share only a
handful of common observed raters.
The empirical correlation coefficient, $\hat{\rho}_{ij}$, is based only on the common
user support. It is advised to work with residuals from the baseline
predictors to compensate for user- and item-specific deviations. Thus
the approximated correlation coefficient is given by
\[ \hat{\rho}_{ij} = \frac{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})(r_{uj} - b_{uj})}
{\sqrt{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})^2 \cdot \sum_{u \in U(i,j)} (r_{uj} - b_{uj})^2}} \]
The set U(i, j) contains the users who rated both items i and j.
Because estimated correlations based on a greater user support are
more reliable, an appropriate similarity measure, denoted by $s_{ij}$, is a
shrunk correlation coefficient of the form
\[ s_{ij} = \frac{n_{ij} - 1}{n_{ij} - 1 + \lambda_8} \, \rho_{ij} \]
Neighborhood models
The variable $n_{ij} = |U(i, j)|$ denotes the number of users that rated both i and j. A
typical value for $\lambda_8$ is 100. Suppose that the true $\rho_{ij}$ are independent random
variables drawn from a normal distribution,
\[ \rho_{ij} \sim N(0, \tau^2) \]
for known $\tau^2$. The mean of 0 is justified if the $b_{ui}$ account for both user and item
deviations from average. Meanwhile, suppose that
\[ \hat{\rho}_{ij} \mid \rho_{ij} \sim N(\rho_{ij}, \sigma_{ij}^2) \]
for known $\sigma_{ij}^2$.
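The shrunk similarity can be sketched as follows. The sketch assumes the shrinkage form s_ij = (n_ij − 1) / (n_ij − 1 + λ8) · ρ_ij and computes the correlation on baseline residuals over the common raters only; the function name, the residual dictionaries, and the default `lam=100.0` (the typical value quoted above) are illustrative.

```python
import math

def shrunk_similarity(residuals_i, residuals_j, lam=100.0):
    """Shrunk correlation between two items.

    residuals_i / residuals_j map user id -> (rating minus baseline) for each item.
    """
    common = residuals_i.keys() & residuals_j.keys()  # U(i, j)
    n_ij = len(common)
    if n_ij < 2:
        return 0.0
    num = sum(residuals_i[u] * residuals_j[u] for u in common)
    den = math.sqrt(sum(residuals_i[u] ** 2 for u in common)
                    * sum(residuals_j[u] ** 2 for u in common))
    if den == 0.0:
        return 0.0
    rho = num / den
    # Fewer common raters means stronger shrinkage towards zero.
    return (n_ij - 1) / (n_ij - 1 + lam) * rho

# Hypothetical residuals r_ui - b_ui for two items.
res_i = {"u1": 1.0, "u2": -1.0, "u3": 0.5}
res_j = {"u1": 2.0, "u2": -2.0, "u4": 1.0}

s_ij = shrunk_similarity(res_i, res_j)
s_ij_unshrunk = shrunk_similarity(res_i, res_j, lam=0.0)
```

In this example only two users rated both items, so a perfect raw correlation is shrunk almost to zero, which reflects how little evidence supports it.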
Similarity-based interpolation
Similarity-based methods became very popular because they are intuitive and
relatively simple to implement. They also offer the following two useful properties:
1. Explainability. Users expect a system to give a reason for its predictions, rather
than presenting “black box” recommendations. Explanations not only enrich the user
experience, but also encourage users to interact with the system, fix wrong
impressions and improve long-term accuracy. The neighborhood framework allows
identifying which of the past user actions are most influential on the computed
prediction.
Some of these issues can be fixed to a certain degree, while others are
more difficult to solve within the basic framework. For example, the
third item, dealing with the sum-to-one constraint, can be alleviated by
using the following prediction rule:
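The prediction rule itself is not reproduced in these notes. As a sketch only, the code below assumes a similarity-weighted average of neighbor residuals added to the baseline, with an extra constant in the denominator so that the weights no longer have to sum to one. The parameter name `lam`, the function name, and the neighbor data are hypothetical.

```python
def predict_item_knn(b_ui, neighbors, lam=0.0):
    """Predict r_ui from the user's ratings on the neighbors of item i.

    neighbors is a list of (s_ij, r_uj, b_uj) triples.
    lam = 0 gives a plain weighted average (weights sum to one);
    lam > 0 pulls the prediction towards the baseline b_ui when the
    total similarity is small.
    """
    num = sum(s * (r - b) for s, r, b in neighbors)
    den = lam + sum(s for s, _, _ in neighbors)
    if den == 0.0:
        return b_ui
    return b_ui + num / den

# Hypothetical neighbors: (similarity, user's rating, baseline estimate).
neighbors = [(0.5, 4.0, 3.0), (0.5, 2.0, 3.5)]

plain = predict_item_knn(3.0, neighbors)
damped = predict_item_knn(3.0, neighbors, lam=1.0)
```

With `lam=1.0` the correction to the baseline is halved in this example, because the neighbors carry a total similarity of only 1.0.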
Formal model
To start, we consider a hypothetical dense case, where all users but u rated both i
and all its neighbors in $S^k(i;u)$. In that case, we could learn the interpolation weights
by modeling the relationships between item i and its neighbors through a least
squares problem.
The elements of $\bar{A}_{jl}$ or $\bar{b}_j$ may differ by orders of magnitude in terms of the
number of users included in the average. Let us denote this baseline value by avg; its
precise computation is described in the next subsection. Accordingly, we define the
corresponding $k \times k$ matrix $\bar{A}_{jl}$ and the vector $\bar{b} \in \mathbb{R}^k$:
The parameter β controls the extent of the shrinkage. A typical value would be β =
500.
Therefore, we modify so that the interpolation weights are defined as the solution of
the linear system $\hat{A} w = b$. The resulting interpolation weights are used to predict $r_{ui}$.
A global neighborhood model
4. The model naturally allows integrating different forms of user input, such as
explicit and implicit feedback.
5. A highly scalable implementation allows linear time and space complexity, thus
facilitating both item-item and user-user implementations to scale well to very large
datasets.
6. Time drifting aspects of the data can be integrated into the model, thereby
improving its accuracy.
The weight from j to i is denoted by wij and will be learned from the data through
optimization. An initial sketch of the model describes each rating rui by the equation
This rule starts with the crude, yet robust, baseline predictors (bui). Then, the
estimate is adjusted by summing over all ratings by u.
Let us consider the interpretation of the weights. Usually the weights in a
neighborhood model represent interpolation coefficients relating unknown ratings to
existing ones. Here, we adopt a different viewpoint, that enables a more flexible
usage of the weights. We no longer treat weights as interpolation coefficients.
Similarly, one could employ here another set of implicit feedback, N(u)—e.g., the set
of items rented or purchased by the user—leading to the rule
For two items i and j, an implicit preference by u for j leads us to adjust our estimate
of rui by cij, which is expected to be high if j is predictive of i.
Employing global weights, rather than user-specific interpolation coefficients,
emphasizes the influence of missing ratings. In other words, a user’s opinion is
formed not only by what he rated, but also by what he did not rate. However, here
we do not use interpolation, so we can decouple the definitions of bui and bu j.
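The two rules referred to in this section are not reproduced in these notes. As a sketch, assuming the usual form of the global neighborhood model, with R(u) the set of items rated by u, N(u) the set of items with implicit feedback from u, and global weights w_ij and c_ij, they can be written as:

```latex
% Initial sketch: baseline adjusted by all of the user's ratings
\hat{r}_{ui} = b_{ui} + \sum_{j \in R(u)} (r_{uj} - b_{uj}) \, w_{ij}

% With an additional set of implicit feedback N(u)
\hat{r}_{ui} = b_{ui} + \sum_{j \in R(u)} (r_{uj} - b_{uj}) \, w_{ij}
             + \sum_{j \in N(u)} c_{ij}
```

Each term (r_uj − b_uj) w_ij is an offset added to the baseline: if u rated j higher than expected and w_ij is large, the estimate for i is raised accordingly.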
Neighborhood Models :
https://www.youtube.com/watch?v=M72Ez5YvO0I
Link:
https://www.cse.iitk.ac.in/users/nsrivast/HCC/Recommender_systems_handbook.pdf
Lecture Notes - PPTs
https://drive.google.com/file/d/12qQ1OzwndBOUn4_OEJzzgqGS30OOL5dn/view?usp=sharing
Lecture Notes – Quiz
Link for Quiz
https://forms.gle/ToQSAmU6S95wtS6p6
Lecture Notes – References
https://www.tutorialspoint.com/human_computer_interface/information_search_and_visualization.htm
http://what-when-how.com/artificial-intelligence/artificial-intelligence-for-information-retrieval/
Assignments
Numerical ratings such as the 1-5 stars provided in the book recommender associated
with Amazon.com.
Ordinal ratings, such as “strongly agree, agree, neutral, disagree, strongly disagree”
where the user is asked to select the term that best indicates her opinion regarding an
item (usually via questionnaire).
Binary ratings that model choices in which the user is simply asked to decide if a certain
item is good or bad.
Unary ratings can indicate that a user has observed or purchased an item, or otherwise
rated the item positively. In such cases, the absence of a rating indicates that we have
no information relating the user to the item (perhaps she purchased the item somewhere
else).
Part A Q & A (with K level and CO)
16. What is the difference between content-based filtering and collaborative filtering?
(CO4)(K2)
Content-based filtering - A content-based filtering system needs to know the content
of both the user and the item. Usually it constructs a user profile and an item profile and
then compares them in a shared attribute space. For example, a movie can be represented
by the stars appearing in it and by its genres (using a binary coding, for example).
A user profile can be built in the same way, based on the movie stars and genres that the
user likes.
Collaborative filtering - Collaborative algorithms use “user behavior” for recommending
items. They exploit the behavior of other users in terms of transaction history, ratings,
selection, and purchase information. Other users’ behavior and preferences over the items
are used to recommend items to new users. In this case, features of the items need not be
known.
https://www.coursera.org/learn/datavisualization
https://www.coursera.org/learn/web-data
https://www.udemy.com/course/information-retrieval-and-mining-massive-data-sets/
Real time Applications in day to day
life and to Industry
Adversarial information retrieval.
Question answering.
Model Exam
REFERENCE BOOKS:
1. C. Manning, P. Raghavan, and H. Schütze, “Introduction to Information Retrieval”,
Cambridge University Press, 2008.
Document summarization
Title recommendation