
 

Information Retrieval 
ASSIGNMENT-2  
 

DEVANSHU MODI 
179301064 

Sub: Information Retrieval (CS1759)  

   

 
 

Q1.) Explain the concept of the Probability Ranking Principle in Information Retrieval.

If the Information Retrieval system’s response to each query is a ranking of the documents in the collection in order of decreasing probability of relevance to the query, where the probabilities are estimated as accurately as possible on the basis of whatever data have been made available to the system for this purpose, then the overall effectiveness of the system to its users will be the best that is obtainable on the basis of those data.

● Let d represent a document in the collection.
● Let R represent the relevance of a document with respect to a query q.
● Let R = 1 represent relevant and R = 0 not relevant.
● Our goal is to estimate:
   ○ p(r = 1 | q, d) = p(d, q | r = 1) · p(r = 1) / p(d, q)
   ○ p(r = 0 | q, d) = p(d, q | r = 0) · p(r = 0) / p(d, q)
● PRP in action: rank all documents by p(r = 1 | q, d).
   ○ Theorem: Using the PRP is optimal, in that it minimizes the loss (Bayes risk) under 1/0 loss.
   ○ Provable if all probabilities are correct, etc.
● Using odds, we reach a more convenient formulation of ranking (illustrated in the sketch below):
   ○ O(R | q, d) = p(r = 1 | q, d) / p(r = 0 | q, d)
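
Below is a minimal Python sketch of the PRP in action, assuming the per-document estimates of p(r = 1 | q, d) are already available (the probability values used are illustrative placeholders, not the output of a real model): documents are ranked by decreasing estimated probability of relevance, and the odds formulation is computed from the same estimates.

# A sketch of the PRP: rank documents by the estimated probability of
# relevance p(r = 1 | q, d). The estimates below are assumed placeholders.

def rank_by_prp(prob_relevant):
    """prob_relevant maps a document id to an estimated p(r = 1 | q, d)."""
    return sorted(prob_relevant, key=prob_relevant.get, reverse=True)

def odds(p):
    """Odds formulation: O(R | q, d) = p(r = 1 | q, d) / p(r = 0 | q, d)."""
    return p / (1.0 - p)

estimates = {"d1": 0.82, "d2": 0.35, "d3": 0.64}        # assumed estimates
print(rank_by_prp(estimates))                            # ['d1', 'd3', 'd2']
print({d: round(odds(p), 2) for d, p in estimates.items()})

Ranking by the odds gives exactly the same ordering as ranking by p(r = 1 | q, d), since the odds are a monotonic function of the probability.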

Q2.) Explain Language Modeling versus other approaches in information retrieval.


 

The Language Modeling Approach provides a different way of scoring matches between queries and documents, and the hope is that
the probabilistic language modeling foundation improves the weights 
that are used, and hence the performance of the model. The major 
issue is the estimation of the document model, such as choices of how 
to smooth it effectively. The model has achieved very good retrieval 
results. Compared to other probabilistic approaches, such as BIM, the 
main difference initially appears to be that the LM approach does away 
with explicitly modeling relevance (whereas this is the central variable 
evaluated in the BIM approach). But this may not be the correct way to 
think about things. The LM approach assumes that documents and 
expressions of information needs are objects of the same type, and 
assesses their match by importing the tools and methods of language 
modeling from speech and natural language processing. The resulting 
model is mathematically precise, conceptually simple, computationally 
tractable, and intuitively appealing. This seems similar to the situation 
with XML retrieval: there the approaches that assume queries and 
documents are objects of the same type are also among the most 
successful. 

On the other hand, like all IR models, you can also raise objections to 
the model. The assumption of equivalence between document and 
information need representation is unrealistic. Current LM approaches 
use very simple models of language, usually unigram models. Without 
an explicit notion of relevance, relevance feedback is difficult to 
integrate into the model, as are user preferences. It also seems 
necessary to move beyond a unigram model to accommodate notions 
of phrase or passage matching or Boolean retrieval operators. 


 

Subsequent work in the LM approach has looked at addressing some of these concerns, including putting relevance back into the model and
allowing a language mismatch between the query language and the 
document language. 

The model has significant relations to traditional tf-idf models. Term frequency is directly represented in tf-idf models, and much recent
work has recognized the importance of document length normalization. 
The effect of doing a mixture of document generation probability with 
collection generation probability is a little like idf: terms rare in the 
general collection but common in some documents will have a greater 
influence on the ranking of documents. In most concrete realizations, 
the models share treating terms as if they were independent. On the 
other hand, the intuitions are probabilistic rather than geometric, the 
mathematical models are more principled rather than heuristic, and
the details of how statistics like term frequency and document length 
are used differ. 
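
As an illustration of the mixture just described, the following is a minimal query-likelihood sketch in Python, assuming Jelinek-Mercer style smoothing with an equal-weight mixture and a two-document toy collection (the documents, the query, and the mixture weight lambda are all made up for illustration; query terms are assumed to occur somewhere in the collection so the logarithm is defined):

import math
from collections import Counter

def lm_score(query, doc, collection, lam=0.5):
    """log p(q | d), mixing the document model p(t | d) with the collection model p(t | C)."""
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    score = 0.0
    for t in query:
        p_doc = doc_tf[t] / len(doc)              # relative frequency of t in the document
        p_coll = coll_tf[t] / len(collection)     # relative frequency of t in the collection
        score += math.log(lam * p_doc + (1 - lam) * p_coll)
    return score

docs = {"d1": "the price of the book is low".split(),
        "d2": "the book describes retrieval models".split()}
collection = [t for d in docs.values() for t in d]
query = "book price".split()
ranking = sorted(docs, key=lambda d: lm_score(query, docs[d], collection), reverse=True)
print(ranking)   # d1 ranks first: it contains both query terms, d2 only one

The collection component rewards terms that are rare in the collection overall, which is the idf-like effect described above.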


 

Q3.) Explain the following with an example:
a. Text Classification and Naive Bayes.
b. Vector Space Classification and k-nearest neighbor.

a.) The Naive Bayes classifier is a simple probabilistic classifier: it classifies based on the probabilities of events, and it is commonly applied to text classification. Though it is a simple algorithm, it performs well on many text classification problems.

Example:

Our training set consists of four labelled sentences: two statements (class Stmt) and two questions (class Question). We need to find out whether a new sentence, say ‘What is the price of the book’, is a question or not.
Bayes’ Theorem:

P(class | sentence) = P(sentence | class) · P(class) / P(sentence)

We need to find out which class has the bigger probability for the new sentence, i.e., whether P(Stmt | What is the price of the book) or P(Question | What is the price of the book) is larger.

P(Stmt | What is the price of the book) = P(What is the price of the book | Stmt) · P(Stmt) / P(What is the price of the book)

P(Question | What is the price of the book) = P(What is the price of the book | Question) · P(Question) / P(What is the price of the book)

P(Stmt) = Number of sentences in the Stmt class / Total number of sentences = 0.5

P(Question) = Number of sentences in the Question class / Total number of sentences = 0.5

Since the denominator P(What is the price of the book) is the same for both classes, only the numerators need to be compared. Under the Naive Bayes independence assumption, each likelihood factors over the words of the sentence:

P(What is the price of the book | Stmt) = P(What | Stmt) × P(is | Stmt) × P(the | Stmt) × P(price | Stmt) × P(of | Stmt) × P(the | Stmt) × P(book | Stmt)

P(What is the price of the book | Question) = P(What | Question) × P(is | Question) × P(the | Question) × P(price | Question) × P(of | Question) × P(the | Question) × P(book | Question)

Using the frequencies of the words, we can calculate their respective probabilities in the Statement class and in the Question class.
Therefore, 
P(What is the price of the book | Stmt) = 1.2583314328
P(What is the price of the book | Question) = 1.7624289971
P(Stmt | What is the price of the book) ∝ 1.2583314328 × 0.5 = 0.6291657164
P(Question | What is the price of the book) ∝ 1.7624289971 × 0.5 = 0.8812144986

Therefore the new sentence ‘What is the price of the book’ will be 
classified as ‘Question’. 
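
A minimal sketch of this calculation in Python is given below. Since the original four training sentences are not reproduced above, the training set here is a hypothetical one (two statements, two questions), and add-one (Laplace) smoothing is used so that words unseen in a class, such as ‘of’, do not zero out the product:

from collections import Counter

# Hypothetical training set: two questions and two statements.
train = [("what is the price", "Question"),
         ("where is the book", "Question"),
         ("the book is on the table", "Stmt"),
         ("this table is red", "Stmt")]

classes = {"Question", "Stmt"}
vocab = {w for sent, _ in train for w in sent.split()}
word_counts = {c: Counter(w for sent, lbl in train if lbl == c for w in sent.split())
               for c in classes}
priors = {c: sum(1 for _, lbl in train if lbl == c) / len(train) for c in classes}

def score(sentence, c):
    """P(c) multiplied by P(w | c) for every word w, with add-one smoothing."""
    total = sum(word_counts[c].values())
    p = priors[c]
    for w in sentence.split():
        p *= (word_counts[c][w] + 1) / (total + len(vocab))
    return p

sentence = "what is the price of the book"
print(max(classes, key=lambda c: score(sentence, c)))    # prints: Question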

  


 

Q4.) How Machine Learning Methods are used in ad hoc Information Retrieval

Rather than coming up with term and document weighting functions by 
hand, we can view different sources of relevance signal (cosine score, 
title match, etc.) as features in a learning problem. A classifier that has 
been fed examples of relevant and nonrelevant documents for each of 
a set of queries can then figure out the relative weights of these 
signals. If we configure the problem so that there are pairs of a 
document and a query which are assigned a relevance judgment of 
relevant or nonrelevant, then we can think of this problem too as a text 
classification problem. Taking such a classification approach is not necessarily best, and there are alternatives. Nevertheless, given
the material we have covered, the simplest place to start is to approach 
this problem as a classification problem, by ordering the documents 
according to the confidence of a two-class classifier in its relevance 
decision. And this move is not purely pedagogical; exactly this 
approach is sometimes used in practice. 
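
As a sketch of this idea, the following Python code trains a two-class logistic regression classifier on judged (query, document) pairs described by two relevance signals, a cosine score and a title-match flag, and then orders new documents by the classifier’s confidence that they are relevant. The feature values and relevance judgments are assumptions made up for illustration, not real data.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row describes one judged (query, document) pair: [cosine score, title match].
X_train = np.array([[0.90, 1], [0.75, 0], [0.40, 1], [0.20, 0], [0.10, 0]])
y_train = np.array([1, 1, 1, 0, 0])              # 1 = relevant, 0 = nonrelevant

clf = LogisticRegression().fit(X_train, y_train)

# Documents retrieved for a new query: order them by P(relevant | features).
X_new = np.array([[0.55, 1], [0.65, 0], [0.15, 1]])
confidence = clf.predict_proba(X_new)[:, 1]      # classifier's confidence in relevance
order = np.argsort(-confidence)                  # document indices, most confident first
print(order, confidence[order])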
