Coreference resolution across corpora: languages, coding schemes, and preprocessing information

  • Authors:
  • Marta Recasens;Eduard Hovy

  • Affiliations:
  • CLiC - University of Barcelona, Barcelona, Spain;USC Information Sciences Institute, Marina del Rey, CA

  • Venue:
  • ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also expose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.