We consider Information Retrieval evaluation, in particular as carried out at TREC with the trec_eval program. It turns out that the scores systems obtain depend not only on the relevance of the retrieved documents, but also on document names in case of ties (i.e., when several documents are retrieved with the same score). We regard this tie-breaking strategy as an uncontrolled parameter that influences measure scores, and argue for fairer tie-breaking strategies. A study of 22 TREC editions reveals significant differences between TREC's conventional, unfair strategy and the fairer strategies we propose. This experimental result advocates using these fairer strategies when conducting evaluations.
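As a minimal sketch of the phenomenon the abstract describes (not the authors' code, and not trec_eval itself), the Python example below computes Average Precision for a single invented query under two tie-breaking strategies. The run, document IDs, and relevance judgments are made up for illustration; the only point is that reordering documents within a group sharing the same retrieval score changes the measure score.

```python
def average_precision(ranking, relevant):
    """Average Precision of a ranked list of doc IDs against a set of relevant IDs."""
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# One query: d1, d2 and d4 are retrieved with exactly the same score (a tie group).
run = [("d3", 0.9), ("d1", 0.5), ("d2", 0.5), ("d4", 0.5)]
relevant = {"d3", "d1"}

# Strategy A: rank by descending score, breaking ties by document name ascending.
rank_a = [d for d, _ in sorted(run, key=lambda p: (-p[1], p[0]))]
# Strategy B: rank by descending score, breaking ties by document name descending
# (stable two-pass sort: first by name descending, then by score descending).
rank_b = [d for d, _ in sorted(sorted(run, key=lambda p: p[0], reverse=True),
                               key=lambda p: p[1], reverse=True)]

print(rank_a, average_precision(rank_a, relevant))  # ['d3', 'd1', 'd2', 'd4'] -> AP = 1.0
print(rank_b, average_precision(rank_b, relevant))  # ['d3', 'd4', 'd2', 'd1'] -> AP = 0.75
```

The relevance judgments are identical in both cases; only the ordering within the tie group differs, yet Average Precision drops from 1.0 to 0.75. This is the kind of uncontrolled, name-dependent effect the paper measures across TREC editions.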