Assessor disagreement and text classifier accuracy

  • Authors:
  • William Webber; Jeremy Pickens

  • Affiliations:
  • University of Maryland, College Park, MD, USA; Catalyst Repository Systems, Denver, CO, USA

  • Venue:
  • Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2013


Abstract

Text classifiers are frequently used for high-yield retrieval from large corpora, such as in e-discovery. The classifier is trained by annotating example documents for relevance. These examples may, however, be assessed by people other than those whose conception of relevance is authoritative. In this paper, we examine the impact that disagreement between the actual and the authoritative assessor has upon classifier effectiveness, when evaluated against the authoritative conception. We find that using alternative assessors leads to a significant decrease in binary classification quality, though a smaller decrease in ranking quality. A ranking consumer would have to go on average 25% deeper in the ranking produced by alternative-assessor training to achieve the same yield as with authoritative-assessor training.
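
The "25% deeper for the same yield" comparison can be read as follows: for a fixed number of relevant documents the consumer wants to find, measure how far down each ranking they must read. Below is a minimal sketch of that reading, with made-up document IDs, hypothetical rankings, and an invented helper name; it is an illustration of the depth-for-yield idea, not the authors' implementation or data.

```python
"""Hypothetical sketch: depth needed in a ranking to reach a target yield
of documents relevant under the authoritative conception of relevance."""

from typing import Sequence, Set


def depth_for_yield(ranking: Sequence[str], relevant: Set[str], target_yield: int) -> int:
    """Return how many documents must be read, in ranked order,
    before `target_yield` relevant documents have been found."""
    found = 0
    for depth, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            found += 1
            if found >= target_yield:
                return depth
    raise ValueError("ranking does not contain enough relevant documents")


# Toy example (made-up labels, not from the paper):
authoritative_relevant = {"d2", "d4", "d7", "d9"}
ranking_authoritative_trained = ["d2", "d4", "d1", "d7", "d9", "d3", "d5", "d6", "d8"]
ranking_alternative_trained = ["d2", "d1", "d4", "d3", "d7", "d5", "d9", "d6", "d8"]

target = 3  # number of relevant documents the consumer wants to retrieve
d_auth = depth_for_yield(ranking_authoritative_trained, authoritative_relevant, target)
d_alt = depth_for_yield(ranking_alternative_trained, authoritative_relevant, target)

print(f"Depth with authoritative-assessor training: {d_auth}")
print(f"Depth with alternative-assessor training:   {d_alt}")
print(f"Relative extra depth: {100 * (d_alt - d_auth) / d_auth:.0f}%")
```

In this toy example the alternative-assessor ranking requires reading to depth 5 rather than 4 for a yield of 3 relevant documents, i.e. 25% deeper, mirroring the average figure reported in the abstract.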