Automated information retrieval: theory and methods
Text retrieval and filtering: analytic models of performance
Information Retrieval
Information Retrieval: Computational and Theoretical Aspects
Information Retrieval: Algorithms and Heuristics
Introduction to Modern Information Retrieval
Vector retrieval, fuzzy retrieval and the universal fuzzy IR surface for IR evaluation
Information Processing and Management: an International Journal
Information Processing and Management: an International Journal - Special issue: Formal methods for information retrieval
Percent perfect performance (PPP)
Information Processing and Management: an International Journal
Journal of Information Science
Existence theorem of the quadruple (P, R, F, M): precision, recall, fallout and miss
Information Processing and Management: an International Journal
Information Processing and Management: an International Journal - Special issue: Formal methods for information retrieval
The paper shows that the current evaluation measures in information retrieval (essentially recall R and precision P, and in some cases fallout F) lack universal comparability, in the sense that their values depend on the generality of the IR problem. A solution is obtained by using all "parts" of the database, including the non-relevant documents and the not-retrieved documents. This requires introducing the measure M, the fraction of the not-retrieved documents that are relevant (hence the "miss" measure). We prove that, independent of the IR problem or of the IR action, the quadruple (P, R, F, M) belongs to a universal IR surface that is the same for all IR activities. This universality is then exploited to define a new evaluation measure for IR that allows unbiased comparisons of all IR results. We also show that using only one, two or even three measures from the set {P, R, F, M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
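As an illustration of the four measures named in the abstract, the following minimal Python sketch computes (P, R, F, M) from a 2x2 retrieval contingency table and checks one identity that follows directly from these definitions: the odds of P and F, divided by the odds of R and M, multiply to 1 for any table with non-zero cells. The cell names `a`, `b`, `c`, `d` and the helper names `quadruple`, `odds` and `surface_product` are hypothetical, and presenting this identity as the "universal IR surface" is an assumption; the abstract does not give the equation of the surface.

```python
from fractions import Fraction


def quadruple(a, b, c, d):
    """Return (P, R, F, M) for a 2x2 retrieval contingency table.

    Hypothetical cell names (not taken from the abstract):
      a: retrieved and relevant
      b: retrieved and non-relevant
      c: not retrieved and relevant
      d: not retrieved and non-relevant
    """
    P = Fraction(a, a + b)  # precision: retrieved documents that are relevant
    R = Fraction(a, a + c)  # recall: relevant documents that are retrieved
    F = Fraction(b, b + d)  # fallout: non-relevant documents that are retrieved
    M = Fraction(c, c + d)  # miss: not-retrieved documents that are relevant
    return P, R, F, M


def odds(x):
    """Odds x / (1 - x) of a fraction strictly between 0 and 1."""
    return x / (1 - x)


def surface_product(P, R, F, M):
    """odds(P) * odds(F) / (odds(R) * odds(M)).

    With the definitions above this is (a/b) * (b/d) / ((a/c) * (c/d)),
    which cancels to 1 whenever all four cells are non-zero.
    """
    return odds(P) / odds(R) * odds(F) / odds(M)


if __name__ == "__main__":
    # Two made-up tables with very different shares of relevant documents.
    for table in [(30, 20, 10, 940), (5, 45, 200, 50)]:
        P, R, F, M = quadruple(*table)
        print(table, (P, R, F, M), surface_product(P, R, F, M))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the check is not blurred by floating-point rounding. The sketch assumes every cell is non-zero; a table with an empty cell raises `ZeroDivisionError`.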