Vector-Based Unsupervised Word Sense Disambiguation for Large Number of Contexts

Authors:
Gyula Papp
Affiliations:
Faculty of Information Technology, Interdisciplinary Technical Sciences Doctoral School, Pázmány Péter Catholic University, Budapest, Hungary 1083
Venue:
TSD '09 Proceedings of the 12th International Conference on Text, Speech and Dialogue
Year:
2009

References:
Evaluation of hierarchical clustering algorithms for document datasets. Proceedings of the eleventh international conference on Information and knowledge management
Automatic word sense discrimination. Computational Linguistics - Special issue on word sense disambiguation
SenseClusters - finding clusters that represent word senses. AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Selecting the "right" number of senses based on clustering criterion functions. EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations
Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm. TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing

Abstract

This paper presents a possible improvement to unsupervised word sense disambiguation (WSD) systems: increasing the number of contexts available to the discrimination algorithms. We ran experiments with several WSD algorithms based on the vector space model, using the SenseClusters toolkit [1]. The algorithms were evaluated on a standard benchmark, the nouns of the Senseval-3 English lexical-sample task [2]. Paragraphs from the British National Corpus were added to the Senseval-3 contexts to increase the number of contexts used by the discrimination algorithms. After parameter optimization on the Senseval-2 English lexical-sample data, the performance measures show a slight improvement, and the optimized algorithm is competitive with the best unsupervised WSD systems evaluated on the same data, such as [3].
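
The general idea of clustering benchmark contexts together with additional unlabeled contexts can be illustrated with a minimal sketch. This is not SenseClusters and not the paper's actual configuration: the function names (`vectorize`, `cosine`, `cluster`), the stopword list, the fixed-seed spherical k-means style clustering, and the toy sentences are all assumptions made for illustration. The sentences are invented and are not drawn from Senseval or the British National Corpus.

```python
import math
from collections import Counter


def vectorize(context, stopwords=frozenset()):
    """Bag-of-words count vector for one context (lowercased, whitespace tokens)."""
    return Counter(w for w in context.lower().split() if w not in stopwords)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, seeds, iterations=10):
    """Simplified spherical k-means: `seeds` are indices of the initial centroids."""
    centroids = [Counter(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each context to the most similar centroid (first one wins ties).
        for i, vec in enumerate(vectors):
            sims = [cosine(vec, c) for c in centroids]
            labels[i] = sims.index(max(sims))
        # Recompute each centroid as the sum of its members; cosine ignores scale.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[k] = total
    return labels


# Hypothetical benchmark contexts for the target word "bank".
target_contexts = [
    "he sat on the bank of the river",
    "the bank approved the loan",
]
# Hypothetical extra unlabeled contexts added only to enlarge the clustering input.
extra_contexts = [
    "the river bank was muddy after the flood",
    "the loan from the bank had high interest",
]
# The target word itself is excluded because it occurs in every context.
STOPWORDS = frozenset({"the", "of", "on", "he", "was", "had", "from", "after", "bank"})

vectors = [vectorize(c, STOPWORDS) for c in target_contexts + extra_contexts]
labels = cluster(vectors, seeds=[0, 1])

# Only the benchmark contexts would be scored; the extra ones just shape the clusters.
target_labels = labels[: len(target_contexts)]
print(target_labels)
```

In this sketch the extra contexts take part in clustering but are dropped before evaluation, which mirrors the setup described in the abstract only in outline; feature selection, dimensionality reduction, and choosing the number of clusters are left out.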