Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance

  • Authors:
  • Khaled Abdalgader; Andrew Skabar

  • Affiliations:
  • La Trobe University, Bundoora, Australia (both authors)

  • Venue:
  • ACM Transactions on Speech and Language Processing (TSLP)
  • Year:
  • 2012

Abstract

The process of identifying the actual meanings of words in a given text fragment has a long history in the field of computational linguistics. Due to its importance in understanding the semantics of natural language, it is considered one of the most challenging problems facing this field. In this article we propose a new unsupervised similarity-based word sense disambiguation (WSD) algorithm that operates by computing the semantic similarity between glosses of the target word and a context vector. The sense of the target word is determined as the one for which the similarity between gloss and context vector is greatest. Thus, whereas conventional unsupervised WSD methods are based on measuring pairwise similarity between words, our approach is based on measuring semantic similarity between sentences. This enables it to utilize a higher degree of semantic information, and is more consistent with the way that human beings disambiguate; that is, by considering the greater context in which the word appears. We also show how performance can be further improved by incorporating a preliminary step in which the relative importance of words within the original text fragment is estimated, thereby providing an ordering that can be used to determine the sequence in which words should be disambiguated. We provide empirical results showing that our method compares favorably with state-of-the-art unsupervised word sense disambiguation methods, as evaluated on several benchmark datasets under different models of evaluation.
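The gloss-versus-context idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy sense inventory, the example sentence, and the simple word-overlap (Jaccard) score standing in for the paper's sentence semantic-similarity measure are all assumptions introduced here.

```python
# Hedged sketch of similarity-based WSD: compare each sense gloss of the
# target word against the surrounding context and pick the sense with the
# highest similarity. A Jaccard word-overlap score stands in for the
# paper's sentence semantic-similarity measure; the sense inventory below
# is a hypothetical toy example, not WordNet.

def tokenize(text):
    """Lowercase, whitespace-split bag of word types."""
    return set(text.lower().split())

def overlap_similarity(a, b):
    """Jaccard overlap between two sentences (stand-in similarity)."""
    ta, tb = tokenize(a), tokenize(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def disambiguate(context, sense_glosses):
    """Return the sense whose gloss is most similar to the context."""
    return max(sense_glosses,
               key=lambda sense: overlap_similarity(sense_glosses[sense],
                                                    context))

# Toy inventory for "bank" (illustrative glosses, not from the paper).
glosses = {
    "bank#finance": "a financial institution that accepts deposits "
                    "and lends money",
    "bank#river": "the sloping land alongside a river or stream",
}
context = "she opened a savings account and deposits money at the bank"
print(disambiguate(context, glosses))  # → bank#finance
```

In the paper this comparison uses sentence-level semantic similarity rather than raw overlap, and a word-importance estimate first orders the ambiguous words so that already-disambiguated words can sharpen the context for the rest.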