Measuring the impact of sense similarity on word sense induction

  • Authors:
  • David Jurgens; Keith Stevens

  • Affiliations:
  • HRL Laboratories, LLC, Malibu, California, and University of California, Los Angeles, California; University of California, Los Angeles, California

  • Venue:
  • EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
  • Year:
  • 2011

Abstract

Word Sense Induction (WSI) is an unsupervised learning approach to discovering the different senses of a word from its contextual uses. A core challenge for WSI approaches is distinguishing between related and possibly similar senses of a word. Current WSI evaluation techniques have yet to analyze the specific impact of sense similarity on accuracy. Therefore, we present a new WSI evaluation that quantifies the relationship between the relatedness of a word's senses and the ability of a WSI algorithm to distinguish between them. Furthermore, we analyze sense confusions in the SemEval-2 WSI task according to sense similarity. Both analyses, performed on a representative selection of clustering-based WSI approaches, reveal that performance is most sensitive to the clustering algorithm, not the lexical features used.