Text classifiers are frequently used for high-yield retrieval from large corpora, for instance in e-discovery. The classifier is trained on example documents annotated for relevance. These examples, however, may be assessed by people other than those whose conception of relevance is authoritative. In this paper, we examine the impact that disagreement between the actual and the authoritative assessor has on classifier effectiveness, when evaluated against the authoritative conception. We find that using alternative assessors leads to a significant decrease in binary classification quality, though less so in ranking quality. On average, a consumer of the ranking would have to go 25% deeper in a ranking produced by alternative-assessor training to achieve the same yield as with authoritative-assessor training.
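The "25% deeper" figure is a depth-for-equal-yield comparison: the extra ranking depth needed before the alternative-assessor classifier has retrieved as many authoritatively relevant documents as the authoritative-assessor classifier. A minimal sketch of that style of measurement, using scikit-learn components and placeholder data as assumptions (not the paper's actual pipeline or corpus), might look like this:

```python
# Hypothetical sketch: train on an alternative assessor's labels, then measure
# how deep in the resulting ranking one must go to reach a target yield of
# documents judged relevant by the authoritative assessor.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def depth_for_yield(scores, authoritative_labels, target_yield):
    """Smallest ranking depth at which target_yield authoritative-relevant
    documents have been retrieved; None if the yield is never reached."""
    order = np.argsort(-scores)                           # rank by descending score
    cum_relevant = np.cumsum(authoritative_labels[order]) # relevant docs found so far
    hits = np.nonzero(cum_relevant >= target_yield)[0]
    return int(hits[0]) + 1 if hits.size else None

# Purely illustrative documents and judgments.
train_texts = ["contract dispute over payment terms",
               "quarterly newsletter and social events",
               "email chain discussing the disputed invoice",
               "cafeteria menu for next week"]
alt_labels = np.array([1, 0, 1, 0])    # alternative assessor's training labels

test_texts = ["memo on the contested payment schedule",
              "holiday party planning notes",
              "draft settlement of the invoice dispute",
              "printer maintenance schedule"]
auth_labels = np.array([1, 0, 1, 0])   # authoritative labels, used only for evaluation

vec = TfidfVectorizer()
clf = LinearSVC().fit(vec.fit_transform(train_texts), alt_labels)
scores = clf.decision_function(vec.transform(test_texts))

target = auth_labels.sum()             # e.g. require the full authoritative yield
print("depth needed:", depth_for_yield(scores, auth_labels, target))
```

Comparing this depth against the depth needed by a classifier trained on the authoritative labels themselves gives the relative overhead that the abstract summarizes as roughly 25%.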