A methodology towards effective and efficient manual document annotation: addressing annotator discrepancy and annotation quality

  • Authors:
  • Ziqi Zhang;Sam Chapman;Fabio Ciravegna

  • Affiliations:
  • Department of Computer Science, University of Sheffield, UK;K-Now, UK;Department of Computer Science, University of Sheffield, UK and K-Now, UK

  • Venue:
  • EKAW'10 Proceedings of the 17th international conference on Knowledge engineering and management by the masses
  • Year:
  • 2010

Abstract

Manual document annotation is an essential technique for knowledge acquisition and capture. Creating high-quality annotations is difficult because of inter-annotator discrepancy: annotators can never agree completely on what to annotate or exactly how to annotate it. To address this, traditional document annotation has multiple domain experts work on the same annotation task in an iterative and collaborative manner, identifying and resolving discrepancies progressively. However, this detailed process is often ineffective despite taking significant time and effort, and discrepancies remain high in many cases. This paper proposes an alternative approach to document annotation. The approach first studies annotators' suitability based on the types of information to be annotated; it then identifies and isolates the most inconsistent annotators, who tend to cause the majority of discrepancies in a task; finally, it distributes the annotation workload among the most suitable annotators. Tested on a named entity annotation task in the domain of archaeology, the approach produces larger amounts of better-quality annotations than the traditional approach, resulting in higher machine learning accuracy while requiring significantly less time and effort.
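
The abstract does not say how suitability or inconsistency is measured. The following is a minimal sketch of one plausible reading, not the authors' method: it assumes agreement is scored as pairwise F1 over exact-match annotations, computed separately for each entity type, and that the annotators with the highest mean agreement are taken as the most suitable. The function names, the tuple layout and the entity labels "SITE" and "PERIOD" are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_f1(a, b):
    """Symmetric F1 between two sets of exact-match annotations."""
    if not a and not b:
        return 1.0  # two empty sets agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def mean_agreement(annotations, entity_type):
    """Mean pairwise F1 of each annotator against all others, for one type.

    `annotations` maps an annotator name to a set of
    (doc_id, start, end, entity_type) tuples (a hypothetical layout).
    """
    by_type = {
        name: {ann for ann in anns if ann[3] == entity_type}
        for name, anns in annotations.items()
    }
    scores = defaultdict(list)
    for p, q in combinations(sorted(by_type), 2):
        f1 = pairwise_f1(by_type[p], by_type[q])
        scores[p].append(f1)
        scores[q].append(f1)
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


def most_suitable(annotations, entity_type, k):
    """The k annotators with the highest mean agreement (ties broken by name)."""
    means = mean_agreement(annotations, entity_type)
    return sorted(means, key=lambda name: (-means[name], name))[:k]


# Toy data (invented for illustration, not taken from the paper).
annotations = {
    "A": {("d1", 0, 5, "SITE"), ("d1", 10, 15, "SITE"), ("d1", 20, 25, "PERIOD")},
    "B": {("d1", 0, 5, "SITE"), ("d1", 10, 15, "SITE"), ("d1", 20, 25, "PERIOD")},
    "C": {("d1", 0, 5, "SITE"), ("d1", 30, 35, "SITE")},
}

print(mean_agreement(annotations, "SITE"))
print(most_suitable(annotations, "SITE", 2))
```

In this toy data, annotator C agrees with the others on only one of the "SITE" spans, so C has the lowest mean agreement for that type and would be the one isolated; the "SITE" workload would then go to A and B. Scoring per entity type matters because an annotator who is inconsistent on one type may still be reliable on another.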