We describe a new version of the Dutch word sense disambiguation system, trained and tested on a corrected version of the SENSEVAL-2 data. The system is an ensemble of word experts; each word expert is a memory-based classifier whose parameters are determined automatically through cross-validation on the training material. The original best-performing system, which used only local context features for disambiguation, is refined further by running additional parallel cross-validation experiments that optimize both the algorithmic parameters and the amount of local context available to each word expert's memory-based kernel. This procedure yields an accuracy of 84.8% on the test material, improving on a baseline score of 77.2% and on the previous SENSEVAL-2 score of 84.2%. We show, however, that cross-validation overfits: had the local context been held constant at two neighbouring words on either side, the system would have scored 85.0%.
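The word-expert procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the memory-based learner or its parameter space, so the overlap distance, the majority vote over k nearest neighbours, the candidate grids `widths` and `ks`, the five-fold split, and all function names are assumptions made for illustration.

```python
from collections import Counter


def context_features(tokens, index, width):
    """Return `width` words on each side of tokens[index], padded with '_'."""
    padded = ["_"] * width + list(tokens) + ["_"] * width
    centre = index + width
    left = padded[centre - width:centre]
    right = padded[centre + 1:centre + width + 1]
    return tuple(left + right)


def knn_classify(memory, features, k):
    """Majority vote over the k stored instances nearest to `features`.

    Assumption: distance is the number of mismatching feature positions.
    """
    def distance(stored):
        return sum(a != b for a, b in zip(stored, features))

    nearest = sorted(memory, key=lambda item: distance(item[0]))[:k]
    votes = Counter(sense for _, sense in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(examples, width, k, folds=5):
    """Estimate accuracy of one (width, k) setting on the training examples."""
    data = [(context_features(t, i, width), s) for t, i, s in examples]
    if not data:
        return 0.0
    correct = 0
    for fold in range(folds):
        test = data[fold::folds]
        train = [d for j, d in enumerate(data) if j % folds != fold]
        if not train:
            continue
        correct += sum(knn_classify(train, f, k) == s for f, s in test)
    return correct / len(data)


def train_word_expert(examples, widths=(1, 2, 3), ks=(1, 3, 5)):
    """Pick context width and k by cross-validation, then store all instances."""
    grid = [(w, k) for w in widths for k in ks]
    width, k = max(grid, key=lambda p: cross_validate(examples, p[0], p[1]))
    memory = [(context_features(t, i, width), s) for t, i, s in examples]
    return {"width": width, "k": k, "memory": memory}


def disambiguate(experts, tokens, index):
    """Route the target word to its own expert (the ensemble step)."""
    expert = experts[tokens[index]]
    features = context_features(tokens, index, expert["width"])
    return knn_classify(expert["memory"], features, expert["k"])


# Invented toy data: (tokens, index of target word, sense label).
bank_examples = [
    ("the river bank was muddy".split(), 2, "shore"),
    ("a river bank is steep".split(), 2, "shore"),
    ("the river bank flooded again".split(), 2, "shore"),
    ("my savings bank closed early".split(), 2, "finance"),
    ("a savings bank opened today".split(), 2, "finance"),
    ("the savings bank charges fees".split(), 2, "finance"),
]
experts = {"bank": train_word_expert(bank_examples)}
prediction = disambiguate(experts, "that river bank eroded".split(), 2)
```

Because every word expert selects its own settings on its own training material, a setting that wins in cross-validation can still lose on held-out test data. That is the overfitting effect the abstract reports when it compares the tuned context against a fixed two-word window.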