The problem of building test collections is central to the development of information retrieval systems such as search engines. Starting with a few relevant "nuggets" of information manually extracted from existing TREC corpora, we implement and test a methodology that finds and correctly assesses the vast majority of the relevant documents identified by TREC assessors, as well as up to four times as many additional relevant documents. Our methodology produces highly accurate test collections that hold the promise of addressing the issues of scalability, reusability, and applicability.
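To make the idea concrete, the following is a minimal sketch of nugget-based relevance assessment: given a handful of manually extracted nugget strings, each document is scored by how well it covers some nugget and labeled relevant if the score clears a threshold. This is only an illustration under assumed choices; the tokenizer, the term-overlap scoring, the 0.6 threshold, and the function names (nugget_score, assess) are placeholders and not the authors' actual matching method, which in practice would use a stronger retrieval model over the full corpus.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> set:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def nugget_score(document: str, nuggets: List[str]) -> float:
    """Return the best fraction of any single nugget's terms covered by the document."""
    doc_terms = tokenize(document)
    best = 0.0
    for nugget in nuggets:
        nugget_terms = tokenize(nugget)
        if not nugget_terms:
            continue
        overlap = len(nugget_terms & doc_terms) / len(nugget_terms)
        best = max(best, overlap)
    return best


def assess(corpus: Dict[str, str], nuggets: List[str], threshold: float = 0.6) -> Dict[str, int]:
    """Label each document relevant (1) or non-relevant (0) by its nugget score."""
    return {doc_id: int(nugget_score(text, nuggets) >= threshold)
            for doc_id, text in corpus.items()}


if __name__ == "__main__":
    # Hypothetical nugget and two-document corpus, purely for illustration.
    nuggets = ["tropical storms cause coastal flooding"]
    corpus = {
        "d1": "Coastal flooding after the tropical storm displaced thousands.",
        "d2": "The committee reviewed the annual budget proposal.",
    }
    print(assess(corpus, nuggets))  # e.g. {'d1': 1, 'd2': 0}
```

In this toy run, document d1 covers three of the five nugget terms and is labeled relevant, while d2 covers none; the point is simply that a small set of nuggets can drive automatic assessment of many documents that human assessors never judged.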