This paper investigates the agreement between official TREC relevance judgments and those generated by users in an interactive IR experiment. Results show that 63% of the documents judged relevant by our users matched the official TREC judgments. Several factors contributed to differences in agreement: the number of relevant documents retrieved, the number of relevant documents judged, per-topic system effectiveness, and the ranking of relevant documents.
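As a rough illustration of how such an agreement rate can be computed, the following is a minimal Python sketch; the function name, variable names, and data layout (a set of user-judged-relevant document IDs and a dict of official TREC judgments) are assumptions for illustration, not taken from the paper.

    def agreement_rate(user_relevant, trec_qrels):
        """Fraction of user-judged-relevant documents that the official
        TREC judgments also mark as relevant.

        user_relevant: set of doc IDs the experiment's users judged relevant
        trec_qrels:    dict mapping doc ID -> official TREC judgment
                       (values > 0 meaning relevant)
        """
        if not user_relevant:
            return 0.0
        trec_relevant = {doc for doc, rel in trec_qrels.items() if rel > 0}
        matched = user_relevant & trec_relevant
        return len(matched) / len(user_relevant)

    # Hypothetical example: of 5 user-relevant documents, 3 are also
    # relevant in the TREC qrels, giving an agreement rate of 0.6.
    user_relevant = {"d1", "d2", "d3", "d4", "d5"}
    trec_qrels = {"d1": 1, "d2": 1, "d3": 0, "d4": 1, "d5": 0, "d6": 1}
    print(agreement_rate(user_relevant, trec_qrels))  # 0.6

Under this reading, the paper's 63% figure would correspond to an agreement rate of roughly 0.63 taken over all documents the users judged relevant.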