From predicting predominant senses to local context for word sense disambiguation

Authors:
Rob Koeling;Diana McCarthy
Affiliations:
University of Sussex, UK;University of Sussex, UK
Venue:
STEP '08 Proceedings of the 2008 Conference on Semantics in Text Processing
Year:
2008

References:

Evaluating sense disambiguation across diverse parameter spaces. Natural Language Engineering.
Automatic retrieval and clustering of similar words. COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2.
A semantic concordance. HLT '93 Proceedings of the workshop on Human Language Technology.
Finding predominant word senses in untagged text. ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics.
Domain-specific sense distributions and predominant sense acquisition. HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing.
Unsupervised acquisition of predominant word senses. Computational Linguistics.
Integrating Domain and Paradigmatic Similarity for Unsupervised Sense Tagging. Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence, August 29 - September 1, 2006, Riva del Garda, Italy.

Abstract

Recent work on automatically predicting the predominant sense of a word has proven promising (McCarthy et al., 2004). The predicted sense can be applied as a first sense heuristic in Word Sense Disambiguation (WSD) tasks without needing expensive hand-annotated data sets. Because the sense distribution of many words is heavily skewed (Yarowsky and Florian, 2002), the first sense heuristic for WSD is often hard to beat. However, the local context of an ambiguous word can give important clues as to which of its senses was intended. The sense ranking method proposed by McCarthy et al. (2004) uses a distributional similarity thesaurus: the k nearest neighbours of a word in the thesaurus are used to establish its predominant sense. In this paper we report on a first investigation into how to use the grammatical relations in which the target word is involved to select a subset of the neighbours from the automatically created thesaurus, and so take the local context into account. This unsupervised method is quantitatively evaluated on SemCor. We found a slight improvement in precision over using the predicted first sense. Finally, we discuss strengths and weaknesses of the method and suggest ways to improve the results in the future.
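
The neighbour selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names `rank_senses` and `filter_by_context`, the toy similarity scores, the sense labels, and the relation tuples are all hypothetical. The abstract does not give the scoring formula, so the sketch uses one plausible weighting: each neighbour contributes its distributional similarity, shared among the senses in proportion to how similar each sense is to that neighbour.

```python
from collections import defaultdict


def rank_senses(neighbours, sense_sim, senses):
    """Rank senses by summing, over the neighbours, each neighbour's
    distributional similarity weighted by its normalised similarity to the sense.
    (Assumed weighting; the abstract does not specify the formula.)"""
    scores = defaultdict(float)
    for neighbour, dist_sim in neighbours.items():
        total = sum(sense_sim.get((s, neighbour), 0.0) for s in senses)
        if total == 0.0:
            continue  # this neighbour says nothing about any sense
        for s in senses:
            scores[s] += dist_sim * sense_sim.get((s, neighbour), 0.0) / total
    return sorted(senses, key=lambda s: scores[s], reverse=True)


def filter_by_context(neighbours, relation_index, context_relation):
    """Keep only neighbours observed in the same grammatical relation as the
    target occurrence; fall back to all neighbours if none qualify.
    (Hypothetical selection rule, assumed for this sketch.)"""
    kept = {n: sim for n, sim in neighbours.items()
            if context_relation in relation_index.get(n, set())}
    return kept or neighbours


# Toy data (invented): the k nearest neighbours of "bank" with similarity scores.
SENSES = ["bank#finance", "bank#river"]
NEIGHBOURS = {"lender": 0.30, "institution": 0.25, "shore": 0.20, "riverside": 0.15}
SENSE_SIM = {
    ("bank#finance", "lender"): 0.9, ("bank#river", "lender"): 0.1,
    ("bank#finance", "institution"): 0.8, ("bank#river", "institution"): 0.2,
    ("bank#finance", "shore"): 0.1, ("bank#river", "shore"): 0.9,
    ("bank#finance", "riverside"): 0.1, ("bank#river", "riverside"): 0.9,
}
# Grammatical relations each neighbour has been seen in (invented).
RELATION_INDEX = {
    "lender": {("dobj", "repay")},
    "institution": {("dobj", "fund")},
    "shore": {("dobj", "erode")},
    "riverside": {("dobj", "erode")},
}

# Predominant sense from all neighbours (the first sense heuristic).
predominant = rank_senses(NEIGHBOURS, SENSE_SIM, SENSES)[0]

# Sense for one occurrence: "bank" as direct object of "erode".
local = filter_by_context(NEIGHBOURS, RELATION_INDEX, ("dobj", "erode"))
in_context = rank_senses(local, SENSE_SIM, SENSES)[0]

print(predominant, in_context)
```

With these invented numbers the full neighbour set favours the financial sense, while restricting the neighbours to those sharing the occurrence's grammatical relation favours the river sense. The fallback to the full neighbour set when no neighbour matches is a design choice of this sketch, so that an occurrence with an unseen relation still receives the predicted first sense.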