Dutch word sense disambiguation: optimizing the localness of context

  • Authors:
  • Véronique Hoste;Walter Daelemans;Iris Hendrickx;Antal van den Bosch

  • Affiliations:
  • University of Antwerp, Belgium;University of Antwerp, Belgium;Tilburg University, The Netherlands;Tilburg University, The Netherlands

  • Venue:
  • WSD '02 Proceedings of the ACL-02 workshop on Word sense disambiguation: recent successes and future directions - Volume 8
  • Year:
  • 2002

Abstract

We describe a new version of the Dutch word sense disambiguation system, trained and tested on a corrected version of the SENSEVAL-2 data. The system is an ensemble of word experts; each word expert is a memory-based classifier whose parameters are determined automatically through cross-validation on the training material. The original best-performing system, which used only local context features for disambiguation, is refined by running additional parallel cross-validation experiments that optimize both the algorithmic parameters and the amount of local context available to each word expert's memory-based kernel. This procedure yields an accuracy of 84.8% on the test material, improving on a baseline score of 77.2% and on the previous SENSEVAL-2 score of 84.2%. We show that cross-validation overfits: had the local context been held constant at two neighbouring words on either side, the system would have scored 85.0%.
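
The per-word selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain k-nearest-neighbour classifier with a feature-overlap metric stands in for the memory-based kernel, and the function names, the candidate grids (`widths`, `ks`), the fold count, and the `TOY` sentences are all invented for illustration. For one ambiguous word, it cross-validates every combination of context width and `k` on the training material and keeps the best-scoring pair.

```python
from collections import Counter

def window_features(tokens, index, width):
    """Return `width` words left and right of tokens[index], padded with '_'."""
    left = [tokens[i] if i >= 0 else "_" for i in range(index - width, index)]
    right = [tokens[i] if i < len(tokens) else "_"
             for i in range(index + 1, index + 1 + width)]
    return tuple(left + right)

def knn_predict(train, query, k):
    """Majority vote among the k stored instances sharing most features with query."""
    ranked = sorted(train, key=lambda inst: -sum(a == b for a, b in zip(inst[0], query)))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cv_accuracy(examples, width, k, folds=5):
    """Cross-validated accuracy for one (width, k) setting.

    examples: list of (tokens, index_of_ambiguous_word, sense_label).
    """
    instances = [(window_features(t, i, width), s) for t, i, s in examples]
    correct = 0
    for f in range(folds):
        test = instances[f::folds]
        train = [inst for j, inst in enumerate(instances) if j % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(instances)

def select_settings(examples, widths=(1, 2, 3), ks=(1, 3)):
    """Pick the (width, k) pair with the highest cross-validated accuracy."""
    grid = [(w, k) for w in widths for k in ks]
    return max(grid, key=lambda wk: cv_accuracy(examples, wk[0], wk[1]))

# Invented toy training material for one word expert (Dutch "bank").
TOY = [
    ("ik zit op de bank in het park".split(), 4, "furniture"),
    ("hij slaapt op de bank thuis".split(), 4, "furniture"),
    ("zij ligt op de bank vandaag".split(), 4, "furniture"),
    ("de bank verhoogt de rente".split(), 1, "finance"),
    ("de bank leent geld uit".split(), 1, "finance"),
    ("de bank sluit het kantoor".split(), 1, "finance"),
]

if __name__ == "__main__":
    width, k = select_settings(TOY)
    print("selected context width:", width, "k:", k)
```

Because the setting is chosen by its score on the training folds alone, the selected width need not be the one that performs best on held-out test material, which is the overfitting effect the abstract reports for the fixed two-word window.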