A Deeper Look into Features for Coreference Resolution

  • Authors:
  • Marta Recasens; Eduard Hovy

  • Affiliations:
  • CLiC, University of Barcelona, Barcelona, Spain; Information Sciences Institute, Marina del Rey, USA

  • Venue:
  • DAARC '09 Proceedings of the 7th Discourse Anaphora and Anaphor Resolution Colloquium on Anaphora Processing and Applications
  • Year:
  • 2009

Abstract

All automated coreference resolution systems consider a number of features, such as head noun, NP type, gender, or number. Although the choice of features is one of the key factors determining performance, the features themselves have received little attention, especially for languages other than English. This paper examines a considerable number of pairwise comparison features for coreference, both established and novel, with a special focus on Spanish. We consider the contribution of each feature as well as the interactions between them. In addition, given the problem of class imbalance in coreference resolution, we analyze the effect of sample selection. From experiments with TiMBL (Tilburg Memory-Based Learner) on the AnCora corpus, we draw conclusions from both linguistic and computational perspectives.
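
As a rough illustration of the two ideas named in the abstract, pairwise comparison features and sample selection under class imbalance, the following Python sketch builds mention-pair instances and downsamples the negative class. It is not the authors' feature set or selection procedure: the `Mention` fields, the four features, the `ratio` parameter, and all function names are hypothetical choices made for this example, and no TiMBL call is shown.

```python
from dataclasses import dataclass
from itertools import combinations
import random


@dataclass(frozen=True)
class Mention:
    """A toy mention record; the fields are assumptions for this sketch."""
    head: str      # head noun, lowercased
    np_type: str   # e.g. "proper", "common", "pronoun"
    gender: str    # "m", "f", or "unknown"
    number: str    # "sg", "pl", or "unknown"
    entity: int    # gold entity id, used only to label training pairs


def agree(a: str, b: str) -> str:
    """Three-valued agreement: 'unknown' if either value is unknown."""
    if "unknown" in (a, b):
        return "unknown"
    return "yes" if a == b else "no"


def pair_features(antecedent: Mention, anaphor: Mention) -> dict:
    """Compare two mentions on head noun, NP type, gender, and number."""
    return {
        "head_match": antecedent.head == anaphor.head,
        "np_types": (antecedent.np_type, anaphor.np_type),
        "gender_agree": agree(antecedent.gender, anaphor.gender),
        "number_agree": agree(antecedent.number, anaphor.number),
    }


def build_instances(mentions: list) -> list:
    """One (features, label) instance per ordered mention pair i < j."""
    instances = []
    for i, j in combinations(range(len(mentions)), 2):
        m1, m2 = mentions[i], mentions[j]
        instances.append((pair_features(m1, m2), m1.entity == m2.entity))
    return instances


def downsample_negatives(instances: list, ratio: float, seed: int = 0) -> list:
    """Keep all positives and at most ratio * (#positives) random negatives."""
    rng = random.Random(seed)
    positives = [inst for inst in instances if inst[1]]
    negatives = [inst for inst in instances if not inst[1]]
    keep = min(len(negatives), int(ratio * len(positives)))
    return positives + rng.sample(negatives, keep)


# Toy document: five mentions, only the first and third corefer.
mentions = [
    Mention("presidenta", "common", "f", "sg", entity=1),
    Mention("gobierno", "common", "m", "sg", entity=2),
    Mention("presidenta", "common", "f", "sg", entity=1),
    Mention("ministros", "common", "m", "pl", entity=3),
    Mention("barcelona", "proper", "unknown", "sg", entity=4),
]
instances = build_instances(mentions)
balanced = downsample_negatives(instances, ratio=2)
```

Because every pair of mentions yields an instance while only a few pairs corefer, the negative class grows much faster than the positive one; in the toy document, a single coreferent pair sits among ten instances. Capping negatives at a fixed multiple of the positives is only one possible selection strategy.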