Parameter optimization for machine-learning of word sense disambiguation

  • Authors:
  • V. Hoste; I. Hendrickx; W. Daelemans; A. van den Bosch

  • Affiliations:
  • V. Hoste: CNTS Language Technology Group, University of Antwerp, Belgium (hoste@uia.ua.ac.be); I. Hendrickx: ILK Computational Linguistics, Tilburg University, The Netherlands (I.H.E.Hendrickx@kub.nl); W. Daelemans: CNTS Language Technology Group, University of Antwerp, Belgium (daelem@uia.ua.ac.be) and ILK Computational Linguistics, Tilburg University, The Netherlands; A. van den Bosch: ILK Computational Linguistics, Tilburg University, The Netherlands (Antal.vdnBosch@kub.nl) and WhizBang! Labs – Research, Pittsburgh, PA, USA

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2002

Abstract

Various Machine Learning (ML) approaches have been shown to produce relatively successful Word Sense Disambiguation (WSD) systems. Since there are still unexplained differences among the performance measurements of different algorithms, a deeper investigation into which algorithm has the right ‘bias’ for this task is warranted. In this paper, we show that this is not easy to accomplish, due to intricate interactions between information sources, parameter settings, and properties of the training data. We investigate the impact of parameter optimization on generalization accuracy in a memory-based learning approach to English and Dutch WSD. A ‘word-expert’ architecture was adopted, yielding a set of classifiers, each specialized in a single wordform. Each word-expert consists of multiple memory-based learning classifiers, each taking a different information source as input, combined in a voting scheme. We optimized the architectural and parametric settings for each individual word-expert by performing cross-validation experiments on the learning material. The results of these experiments show that varying both the algorithmic parameters and the information sources available to the classifiers leads to large fluctuations in accuracy. We demonstrate that optimization per word-expert leads to a significant overall improvement in the generalization accuracy of the produced WSD systems.
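
To make the described architecture concrete, the sketch below illustrates, under simplifying assumptions, the per-word-expert workflow: several memory-based (k-nearest-neighbour) component classifiers, each trained on a different information source for one ambiguous word, have their k parameter tuned by leave-one-out cross-validation on the training material, and their predictions are combined by majority voting. This is not the authors' implementation; the data, feature encodings, parameter range, and helper names are hypothetical, and the actual system explores a much richer parameter space.

```python
# Minimal sketch (assumptions throughout) of one "word-expert":
# per-source k-NN classifiers, per-classifier parameter tuning by
# leave-one-out cross-validation, and majority voting at prediction time.
from collections import Counter


def knn_predict(train, query, k):
    """Memory-based (k-NN) prediction with a simple overlap distance."""
    scored = sorted(train, key=lambda ex: sum(a != b for a, b in zip(ex[0], query)))
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(train, k):
    """Leave-one-out accuracy, used to select k for one component classifier."""
    hits = 0
    for i, (feats, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += knn_predict(rest, feats, k) == label
    return hits / len(train)


def optimize_word_expert(sources, k_values=(1, 3, 5)):
    """Tune k separately for each information source of one word-expert."""
    experts = []
    for train in sources:
        best_k = max(k_values, key=lambda k: loo_accuracy(train, k))
        experts.append((train, best_k))
    return experts


def vote(experts, queries):
    """Majority vote over the tuned component classifiers' sense predictions."""
    preds = [knn_predict(train, q, k) for (train, k), q in zip(experts, queries)]
    return Counter(preds).most_common(1)[0][0]


# Toy word-expert for "bank" with two hypothetical information sources:
# local word context and keyword features, each with its own feature encoding.
local_ctx = [(("bank", "of"), "river"), (("the", "bank"), "finance"),
             (("a", "bank"), "finance"), (("river", "bank"), "river")]
keywords = [(("water", "yes"), "river"), (("money", "yes"), "finance"),
            (("loan", "yes"), "finance"), (("shore", "yes"), "river")]

experts = optimize_word_expert([local_ctx, keywords])
print(vote(experts, [("river", "bank"), ("water", "yes")]))  # -> "river"
```

The point the sketch tries to convey is the one argued in the paper: because each component classifier reacts differently to its parameter settings and information source, tuning is done per word-expert on its own training material rather than with a single global setting for all words.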