Both research and evaluation in human language technology have enjoyed a major surge over the last fifteen years. Performance has advanced substantially, due in part to the availability of resources and to the interest generated by the many evaluation forums active today. But much more remains to be done, both in terms of new areas of research and in improved evaluation for those areas. This paper describes the current state-of-the-art in evaluation and then discusses some ideas for improving it.