TREC-style evaluation generally refers to the use of test collections, an evaluation methodology known as the Cranfield paradigm. This paper begins with a short description of the original Cranfield experiment, with an emphasis on the how and why of the Cranfield framework. The framework is then updated to cover more recent "batch" evaluations, examining the methodologies used in the various open evaluation campaigns such as TREC. Here again the focus is on the how and why, and in particular on how the older evaluation methodologies have evolved to handle new information access techniques. The final section offers advice on using existing test collections and on building new ones.
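To make the Cranfield-style batch setting concrete, the sketch below (not from the paper; the function names and the file-path arguments are illustrative) computes mean average precision, one of the standard TREC measures, from a set of relevance judgments ("qrels" in the usual four-column TREC format) and a ranked run. In practice this computation is performed by a tool such as trec_eval.

    # Minimal sketch of Cranfield-style batch evaluation: score a ranked run
    # against relevance judgments using mean average precision (MAP).
    from collections import defaultdict

    def load_qrels(path):
        """Read TREC qrels lines: topic  iteration  doc_id  relevance."""
        relevant = defaultdict(set)
        with open(path) as f:
            for line in f:
                topic, _, doc_id, rel = line.split()
                if int(rel) > 0:
                    relevant[topic].add(doc_id)
        return relevant

    def average_precision(ranked_docs, relevant_docs):
        """AP for one topic: mean of precision at each relevant rank."""
        hits, precision_sum = 0, 0.0
        for rank, doc_id in enumerate(ranked_docs, start=1):
            if doc_id in relevant_docs:
                hits += 1
                precision_sum += hits / rank
        return precision_sum / len(relevant_docs) if relevant_docs else 0.0

    def mean_average_precision(run, qrels):
        """MAP over all judged topics; run maps topic -> ranked doc-id list."""
        scores = [average_precision(run.get(t, []), rels)
                  for t, rels in qrels.items()]
        return sum(scores) / len(scores) if scores else 0.0

Because the judgments are fixed ahead of time, any number of retrieval runs can be scored and compared on the same collection without further human assessment, which is the central economy of the Cranfield approach.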