The notion of relevance differs from assessor to assessor, giving rise to assessor disagreement. Although such disagreement has been observed frequently, the factors that lead to it remain an open problem. In this paper we study the relationship between assessor disagreement and various topic-independent factors such as readability and cohesiveness. We build a logistic model that uses reading level and other simple document features to predict assessor disagreement, and we rank documents by decreasing probability of disagreement. We compare the predictive power of these document-level features with that of a meta-search feature that aggregates a document's rank across multiple retrieval runs. Our features prove to be on a par with the meta-search feature, without requiring a large and diverse set of retrieval runs to compute. Surprisingly, however, we find that the reading-level features are negatively correlated with disagreement, suggesting that they are detecting some other aspect of document content.
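To make the modeling step concrete, the following is a minimal sketch of the setup the abstract describes, assuming scikit-learn. The feature names (reading_level, cohesiveness) and all values are illustrative placeholders, not the paper's actual feature set or data; the point is only the pipeline: fit a logistic model on per-document features against a binary disagreement label, then rank documents by decreasing predicted probability of disagreement.

```python
# A minimal sketch, assuming scikit-learn. Features and values are
# illustrative; the paper's feature set is richer than shown here.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per document, with topic-independent
# features such as a reading-level score and a cohesiveness score.
# y = 1 if assessors disagreed on the document's relevance, else 0.
X_train = np.array([
    [8.2, 0.61],   # [reading_level, cohesiveness] -- made-up values
    [11.5, 0.34],
    [6.9, 0.72],
    [13.1, 0.28],
])
y_train = np.array([0, 1, 0, 1])

model = LogisticRegression()
model.fit(X_train, y_train)

# Rank unseen documents by decreasing predicted probability of
# disagreement, as the abstract describes.
X_test = np.array([[9.4, 0.55], [12.0, 0.31], [7.3, 0.68]])
p_disagree = model.predict_proba(X_test)[:, 1]
ranking = np.argsort(-p_disagree)
print("documents ordered by P(disagreement):", ranking, p_disagree[ranking])
```

The same ranking-by-probability scheme would apply unchanged if the meta-search feature (a document's aggregated rank across multiple retrieval runs) were added as an extra column, which is how the two feature groups can be compared head to head.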