This paper investigates the agreement between official TREC relevance judgments and those generated by users in an interactive IR experiment. Results show that 63% of the documents judged relevant by our users matched the official TREC judgments. Several factors contributed to differences in agreement: the number of relevant documents retrieved, the number of relevant documents judged, per-topic system effectiveness, and the ranking of relevant documents.
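As a rough illustration of how such an agreement rate can be computed, the following is a minimal Python sketch; the function name, variable names, and data layout (a set of user-judged-relevant document IDs and a dict of official TREC judgments) are assumptions for illustration, not taken from the paper.

    def agreement_rate(user_relevant, trec_qrels):
        """Fraction of user-judged-relevant documents that the official
        TREC judgments also mark as relevant.

        user_relevant: set of doc IDs the experiment's users judged relevant
        trec_qrels:    dict mapping doc ID -> official TREC judgment
                       (values > 0 meaning relevant)
        """
        if not user_relevant:
            return 0.0
        trec_relevant = {doc for doc, rel in trec_qrels.items() if rel > 0}
        matched = user_relevant & trec_relevant
        return len(matched) / len(user_relevant)

    # Hypothetical example: of 5 user-relevant documents, 3 are also
    # relevant in the TREC qrels, giving an agreement rate of 0.6.
    user_relevant = {"d1", "d2", "d3", "d4", "d5"}
    trec_qrels = {"d1": 1, "d2": 1, "d3": 0, "d4": 1, "d5": 0, "d6": 1}
    print(agreement_rate(user_relevant, trec_qrels))  # 0.6

Under this reading, the paper's 63% figure would correspond to an agreement rate of roughly 0.63 taken over all documents the users judged relevant.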