
Before the final implementation of an information retrieval system, an evaluation of the system is usually carried out. The first type of evaluation is functional checking, in which the specified system functionalities are tested one by one (checking for errors). After the functional check, the performance of the system is evaluated. The most common measures of system performance are time and space: the shorter the response time and the smaller the space used, the better the system is considered to be. In a system designed for information retrieval, however, other metrics besides time and space are also of interest. We have to test the relevance of the retrieved documents and also rank them, i.e., retrieval performance must be evaluated.

**Retrieval Performance Evaluation**

Given a retrieval strategy S, an evaluation measure quantifies the similarity between the set of documents retrieved by S and the set of relevant documents provided by the specialist. Common test collections include TIPSTER/TREC, CACM, CISI and Cystic Fibrosis. Visit the site http://trec.nist.gov/ and click the overview to learn about TREC.

**Recall and Precision**

Recall is the fraction of the relevant documents which has been retrieved: Recall = |Ra|/|R|. Precision is the fraction of the retrieved documents which is relevant: Precision = |Ra|/|A|. Here R is the set of relevant documents for the query, A is the answer set retrieved, and Ra is the set of relevant documents in the answer set (Ra = R ∩ A).

Precision and recall as defined above assume that all the documents in the answer set A have been examined. However, the user is not presented with all the documents in A at once. Instead, the documents in A are first sorted according to their degree of relevance, and the user examines this ranked list starting from the top document. In this situation the precision and recall measures vary as the user proceeds with the examination of the answer set A. Thus a proper evaluation requires plotting a precision versus recall curve.

Consider Rq = {d3, d5, d9, d25, d39, d44, d56, d71, d89, d123}, the set of relevant documents for a query q.

Suppose a retrieval algorithm under evaluation returns, for the query q, the following ranked answer set (relevant documents are marked with an asterisk):

1. d123 *   2. d84   3. d56 *   4. d6   5. d8
6. d9 *   7. d511   8. d129   9. d187   10. d25 *
11. d38   12. d48   13. d250   14. d113   15. d3 *

Examining this ranking from the top, each newly seen relevant document yields a (recall, precision) point: d123 gives 100% precision at 10% recall, d56 gives 66% precision at 20% recall, and so on down to d3, which gives 33% precision at 50% recall. These precision and recall figures are for a single query. Usually retrieval algorithms are evaluated by running several distinct queries, and for each query a distinct precision versus recall curve is generated. To evaluate the retrieval performance of an algorithm over all test queries, we average the precision figures at each recall level as follows:

P(r) = ∑_{i=1}^{Nq} Pi(r) / Nq

where P(r) is the average precision at the recall level r, Nq is the number of queries used, and Pi(r) is the precision at recall level r for the i-th query. The curve of precision versus recall which results from averaging the results for various queries is usually referred to as the average precision versus recall figures. Such average figures are normally used to compare the retrieval performance of distinct retrieval algorithms, and they are now a standard evaluation strategy for information retrieval systems, used extensively in the information retrieval literature: they are simple, intuitive, and can be combined into a single curve.

[Figure: average precision versus recall curves for two distinct retrieval algorithms.]
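To make these figures concrete, here is a minimal Python sketch (not part of the original notes) that walks down the example ranking and prints recall and precision each time a new relevant document is seen:

```python
# Precision and recall down the example ranking; the relevant documents
# are those in the set Rq from the text.
relevant = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}
ranking = ["d123", "d84", "d56", "d6", "d8", "d9", "d511", "d129",
           "d187", "d25", "d38", "d48", "d250", "d113", "d3"]

seen = 0
for rank, doc in enumerate(ranking, start=1):
    if doc in relevant:
        seen += 1
        recall = seen / len(relevant)   # fraction of relevant docs retrieved so far
        precision = seen / rank         # fraction of retrieved docs that are relevant
        print(f"{doc}: recall {recall:.0%}, precision {precision:.0%}")
```

Running this over Nq queries and averaging the precision at each recall level produces the P(r) figures defined above.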

**Single Value Summaries**

When comparing two retrieval algorithms we might be interested in investigating whether one of them outperforms the other for each query in a given set of example queries. In such situations a single precision value for each query can be used. This single value should be interpreted as a summary of the corresponding precision versus recall curve; usually it is taken as the precision at a specified recall level. Several strategies are used for this, as described below.

**Average Precision at Seen Relevant Documents**

The idea here is to generate a single value summary of the ranking by averaging the precision figures obtained after each new relevant document is observed. For the example ranking above, 1, 0.66, 0.5, 0.4 and 0.3 are the precision figures observed after each new relevant document is retrieved, so the average precision at seen relevant documents is (1 + 0.66 + 0.5 + 0.4 + 0.3)/5, or 0.57. This measure favours systems which retrieve relevant documents quickly.

**R-Precision**

The idea here is to generate a single value summary of the ranking by computing the precision at the R-th position in the ranking, where R is the total number of relevant documents for the current query, i.e., the number of documents in the set Rq. In the above example R = 10, and four of the first ten ranked documents are relevant, so the R-precision is 0.4. The R-precision measure is a useful parameter for observing the behaviour of an algorithm for each individual query in an experiment. Additionally, one can compute an average R-precision figure over all queries.

**Precision Histograms**

The R-precision measures for several queries can be used to compare the retrieval history of two algorithms as follows. Let RPa(i) and RPb(i) be the R-precision values of retrieval algorithms A and B for the i-th query, and define the difference RPa/b(i) = RPa(i) - RPb(i). A value of RPa/b(i) equal to 0 indicates that both algorithms have equivalent performance in terms of R-precision for the i-th query. A positive value indicates better retrieval performance by algorithm A, while a negative value indicates better retrieval performance by algorithm B.
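A sketch of these two single value summaries applied to the same example follows; the document sets are repeated so the snippet runs on its own, and both helper functions are illustrative rather than taken from the notes:

```python
relevant = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}
ranking = ["d123", "d84", "d56", "d6", "d8", "d9", "d511", "d129",
           "d187", "d25", "d38", "d48", "d250", "d113", "d3"]

def avg_precision_at_seen_relevant(ranking, relevant):
    """Average the precision figures observed at each new relevant document."""
    precisions, seen = [], 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            seen += 1
            precisions.append(seen / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def r_precision(ranking, relevant):
    """Precision at position R, where R = |Rq| for the current query."""
    r = len(relevant)
    return sum(1 for doc in ranking[:r] if doc in relevant) / r

print(avg_precision_at_seen_relevant(ranking, relevant))  # ~0.58 (0.57 with the text's rounded figures)
print(r_precision(ranking, relevant))                     # 0.4

# For precision histograms, compute r_precision for algorithms A and B on
# each query and plot the per-query differences RPa(i) - RPb(i).
```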

**Summary Table Statistics**

Single value measures can also be stored in a table to provide a statistical summary regarding the set of all queries in a retrieval task. For instance, these summary table statistics could include: the number of queries used in the task, the total number of documents retrieved by all queries, the total number of relevant documents which were effectively retrieved when all queries are considered, the total number of relevant documents which could have been retrieved by all queries, etc.

**Precision and Recall Appropriateness**

Precision and recall have been used extensively to evaluate the retrieval performance of retrieval algorithms. However, a more careful reflection reveals problems with these two measures. First, the proper estimation of maximum recall for a query requires detailed knowledge of all the documents in the collection; with large collections such knowledge is unavailable, which implies that recall cannot be estimated precisely. Second, recall and precision are related measures which capture different aspects of the set of retrieved documents; in many situations the use of a single measure which combines the two could be more appropriate. Third, recall and precision measure effectiveness over a set of queries processed in batch mode; with modern systems, however, interactivity is the key aspect of the retrieval process, so measures which quantify the informativeness of the retrieval process might now be more appropriate. Fourth, recall and precision are easy to define when a linear ordering of the retrieved documents is enforced; for systems which require a weak ordering, though, they might be inadequate.
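As an illustration of such a table, here is a small sketch that aggregates these counts over per-query results; the document sets below are made up for the example:

```python
# Each entry pairs the answer set retrieved for a query with its relevant set.
runs = [
    ({"d1", "d2", "d3"}, {"d2", "d3", "d7"}),
    ({"d4", "d5"},       {"d5"}),
]

summary = {
    "queries used":              len(runs),
    "documents retrieved":       sum(len(a) for a, _ in runs),
    "relevant docs retrieved":   sum(len(a & r) for a, r in runs),
    "relevant docs retrievable": sum(len(r) for _, r in runs),
}
print(summary)
```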

**Alternative Measures**

**The Harmonic Mean**

A single measure which combines recall and precision is the harmonic mean, computed as

F(j) = 2 / (1/r(j) + 1/p(j))

where r(j) is the recall for the j-th document in the ranking, p(j) is the precision for the j-th document in the ranking, and F(j) is the harmonic mean of r(j) and p(j). The function F assumes values in the interval [0, 1]: it is 0 when no relevant documents have been retrieved and 1 when all the ranked documents are relevant. The harmonic mean assumes a high value only when both recall and precision are high. Therefore, determination of the maximum value for F can be interpreted as an attempt to find the best possible compromise between recall and precision.

**The E-measure**

Another measure which combines recall and precision was proposed by van Rijsbergen and is called the E evaluation measure. The idea is to allow the user to specify whether he is more interested in recall or in precision. The E-measure is defined as

E(j) = 1 - (1 + b²) / (b²/r(j) + 1/p(j))

where b is a user-specified parameter which reflects the relative importance of recall and precision. For b = 1, the E(j) measure works as the complement of the harmonic mean F(j). Values of b greater than 1 indicate that the user is more interested in recall than in precision, while values of b smaller than 1 indicate that the user is more interested in precision than in recall.
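A small sketch of both measures, following the formulas above; the recall/precision pairs passed in are made-up values for illustration:

```python
def f_measure(r, p):
    """Harmonic mean of recall r and precision p; high only when both are high."""
    return 0.0 if r == 0 or p == 0 else 2 / (1 / r + 1 / p)

def e_measure(r, p, b=1.0):
    """Van Rijsbergen's E(j) = 1 - (1 + b^2) / (b^2 / r + 1 / p)."""
    if r == 0 or p == 0:
        return 1.0
    return 1.0 - (1 + b ** 2) / (b ** 2 / r + 1 / p)

print(f_measure(0.5, 0.25))       # 0.333...
print(e_measure(0.5, 0.25))       # 0.666... = 1 - F(j) when b = 1
print(e_measure(0.5, 0.25, b=3))  # ~0.545, closer to 1 - recall: large b stresses recall
```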

[Figure: relationship among the answer set A, the relevant set R, and the user-known subset U.]

Let Rk be the intersection of A and U, i.e., the retrieved documents which the user already knew to be relevant, and let Ru be the set of retrieved relevant documents previously unknown to the user. Then:

Coverage = |Rk| / |U|, the fraction of the documents known to be relevant which has actually been retrieved.

Novelty = |Ru| / (|Ru| + |Rk|), the fraction of the relevant documents retrieved which was previously unknown to the user.

Relative recall is the ratio between the number of relevant documents found and the number of relevant documents the user expected to find. Recall effort is the ratio between the number of relevant documents the user expected to find and the number of documents examined in an attempt to find them.
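Finally, a sketch of the coverage and novelty ratios using made-up document sets:

```python
A = {"d1", "d2", "d3", "d4"}         # answer set retrieved for the request
U = {"d1", "d2", "d9"}               # relevant docs already known to the user
R = {"d1", "d2", "d3", "d9", "d12"}  # all relevant docs; U is a subset of R

Rk = A & U        # retrieved docs the user already knew to be relevant
Ru = (A & R) - U  # retrieved relevant docs previously unknown to the user

coverage = len(Rk) / len(U)              # 2/3 of the known relevant docs retrieved
novelty = len(Ru) / (len(Ru) + len(Rk))  # 1/3 of retrieved relevant docs are new
print(coverage, novelty)
```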
