Unknown word sense detection as outlier detection

  • Authors:
  • Katrin Erk

  • Affiliations:
  • Saarland University Saarbrücken, Germany

  • Venue:
  • HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We address the problem of unknown word sense detection: the identification of corpus occurrences that are not covered by a given sense inventory. We model this as an instance of outlier detection, using a simple nearest neighbor-based approach to measuring the resemblance of a new item to a training set. In combination with a method that alleviates data sparseness by sharing training data across lemmas, the approach achieves a precision of 0.77 and recall of 0.82.