Semantic and syntactic features for dutch coreference resolution

  • Authors:
  • Iris Hendrickx;Veronique Hoste;Walter Daelemans

  • Affiliations:
  • CNTS, Language Technology Group, University of Antwerp, Antwerp, Belgium;Language and Translation Technology Team, University College Ghent, Ghent, Belgium;CNTS, Language Technology Group, University of Antwerp, Antwerp, Belgium

  • Venue:
  • CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

We investigate the effect of encoding additional semantic and syntactic information sources in a classification-based machine learning approach to the task of coreference resolution for Dutch. We experiment both with a memory-based learning approach and a maximum entropy modeling method. As an alternative to using external lexical resources, such as the lowcoverage Dutch EuroWordNet, we evaluate the effect of automatically generated semantic clusters as information source. We compare these clusters, which group together semantically similar nouns, to two semantic features based on EuroWordNet encoding synonym and hypernym relations between nouns. The syntactic function of the anaphor and antecedent in the sentence can be an important clue for resolving coreferential relations. As baseline approach, we encode syntactic information as predicted by a memorybased shallow parser in a set of features. We contrast these shallow parse based features with features encoding richer syntactic information from a dependency parser. We show that using both the additional semantic information and syntactic information lead to small but significant performance improvement of our coreference resolution approach.