Detecting errors in discontinuous structural annotation

  • Authors:
  • Markus Dickinson;W. Detmar Meurers

  • Affiliations:
  • The Ohio State University;The Ohio State University

  • Venue:
  • ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Consistency of corpus annotation is an essential property for the many uses of annotated corpora in computational and theoretical linguistics. While some research addresses the detection of inconsistencies in positional annotation (e.g., part-of-speech) and continuous structural annotation (e.g., syntactic constituency), no approach has yet been developed for automatically detecting annotation errors in discontinuous structural annotation. This is significant since the annotation of potentially discontinuous stretches of material is increasingly relevant, from tree-banks for free-word order languages to semantic and discourse annotation.In this paper we discuss how the variation n-gram error detection approach (Dickinson and Meurers, 2003a) can be extended to discontinuous structural annotation. We exemplify the approach by showing how it successfully detects errors in the syntactic annotation of the German TIGER corpus (Brants et al., 2002).