We present a model, based on the maximum entropy method, for analyzing measures of retrieval performance such as average precision, R-precision, and precision-at-cutoffs. Our methodology treats the value of such a measure as a constraint on the distribution of relevant documents in an unknown ranked list and determines the maximum entropy distribution subject to that constraint. As demonstrated on TREC data, for good measures of overall performance (such as average precision), the resulting maximum entropy distributions are highly correlated with the actual distributions of relevant documents in ranked lists; for poor measures of overall performance, the correlation is weaker. The maximum entropy method can therefore be used to quantify the overall quality of a retrieval measure. Furthermore, we show that for good measures of overall performance, the corresponding maximum entropy distributions can be used to accurately infer precision-recall curves and the values of other performance measures, and that the quality of these inferences far exceeds what simple retrieval-measure correlation alone would predict.
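To make the construction concrete, the following is a minimal sketch of the constraint-based idea, not the paper's actual formulation: it assumes relevance at each rank is an independent Bernoulli variable and uses precision-at-cutoff values as the constraints. Under those assumptions, maximizing the total entropy subject to linear constraints on disjoint rank segments yields probabilities that are uniform within each segment (by concavity and symmetry), so the maximum entropy distribution can be computed in closed form. The cutoff values and function names here are illustrative.

```python
def maxent_relevance(cutoffs, precisions):
    """Given precision@k constraints at increasing cutoffs, return the
    per-rank relevance probabilities that maximize the entropy of
    independent Bernoulli relevance variables.  With such linear
    constraints, the maximizer is uniform within each segment between
    consecutive cutoffs, so no numerical optimization is needed."""
    probs = []
    prev_k, prev_rel = 0, 0.0
    for k, p in zip(cutoffs, precisions):
        rel = k * p                            # expected relevant docs in top k
        seg = (rel - prev_rel) / (k - prev_k)  # uniform probability in segment
        probs.extend([seg] * (k - prev_k))
        prev_k, prev_rel = k, rel
    return probs

def expected_precision_at(probs, k):
    """Infer an unconstrained precision@k from the maxent distribution."""
    return sum(probs[:k]) / k

# Hypothetical run with precision@5 = 0.6 and precision@10 = 0.4:
# ranks 1-5 get probability 0.6; ranks 6-10 get (4 - 3)/5 = 0.2.
probs = maxent_relevance([5, 10], [0.6, 0.4])
```

Inferring a measure that was *not* used as a constraint, e.g. `expected_precision_at(probs, 7)`, illustrates the abstract's second point: once the maximum entropy distribution is fixed by a few constraint values, other performance measures can be read off from it.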