Ranking retrieval systems for focused tasks requires a large number of relevance judgments. We propose an approach that minimizes the number of relevance judgments, in which the performance measures are approximated using a Monte-Carlo sampling technique. Partial measures are computed from the available relevance judgments, while the remaining passages are annotated using a relevance probability distribution generated from their result rank. We define two conditions for stopping the assessment procedure once the ranking between systems is stable.
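The following is a minimal sketch of the general idea, not the paper's actual procedure: a retrieval metric is estimated by Monte-Carlo sampling, where judged passages contribute their known relevance and unjudged passages are drawn from a rank-based relevance prior, and assessment stops once one system wins in nearly all samples. The prior `rank_relevance_prob`, the precision@k metric, and the 0.95 stability threshold are illustrative assumptions.

```python
import random

def rank_relevance_prob(rank, p0=0.8, decay=0.05, floor=0.05):
    """Hypothetical rank-based prior: relevance probability decays with rank."""
    return max(p0 - decay * rank, floor)

def sampled_precision_at_k(ranked_ids, judgments, k=10):
    """One Monte-Carlo draw of precision@k.

    Judged passages use their known relevance; unjudged passages are
    sampled from the rank-based relevance prior.
    """
    hits = 0
    for rank, pid in enumerate(ranked_ids[:k]):
        if pid in judgments:
            rel = judgments[pid]                                  # known judgment (0 or 1)
        else:
            rel = random.random() < rank_relevance_prob(rank)     # sampled relevance
        hits += rel
    return hits / k

def ranking_is_stable(run_a, run_b, judgments, n_samples=1000, threshold=0.95):
    """Stop assessing when one system outscores the other in >= threshold of samples."""
    a_wins = sum(
        sampled_precision_at_k(run_a, judgments) > sampled_precision_at_k(run_b, judgments)
        for _ in range(n_samples)
    )
    frac = a_wins / n_samples
    return frac >= threshold or frac <= 1 - threshold

# Toy usage: two runs over passage ids, with only a handful of judgments so far.
run_a = [f"p{i}" for i in range(1, 21)]
run_b = list(reversed(run_a))
judgments = {"p1": 1, "p2": 1, "p20": 0}            # partial relevance judgments
print(ranking_is_stable(run_a, run_b, judgments))
```

In this sketch, more judgments shrink the variance contributed by the sampled passages, so the win fraction converges and the stopping condition fires; the paper's two stopping conditions and its rank-based distribution would replace the assumed ones here.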