Using corpus statistics and WordNet relations for sense identification

  • Authors:
  • Claudia Leacock; George A. Miller; Martin Chodorow

  • Affiliations:
  • Educational Testing Service; Princeton University; Hunter College of CUNY

  • Venue:
  • Computational Linguistics - Special issue on word sense disambiguation
  • Year:
  • 1998

Abstract

Corpus-based approaches to word sense identification have flexibility and generality but suffer from a knowledge acquisition bottleneck. We show how knowledge-based techniques can be used to open the bottleneck by automatically locating training corpora. We describe a statistical classifier that combines topical context with local cues to identify a word sense. The classifier is used to disambiguate a noun, a verb, and an adjective. A knowledge base in the form of WordNet's lexical relations is used to automatically locate training examples in a general text corpus. Test results are compared with those from manually tagged training examples.
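
To make the two ideas in the abstract concrete, here is a minimal sketch rather than the authors' implementation. It assumes a naive Bayes model with add-one smoothing, because the abstract says only "statistical classifier". It also assumes training examples are found through unambiguous related words. The RELATIVES table, the CORPUS sentences, the sense labels, and every function name are hypothetical stand-ins, not WordNet data or the paper's API. Topical context is modeled as the other words in the sentence, and local cues as the words immediately to the left and right of the target.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for WordNet relations: each sense of "bank" is mapped
# to related words assumed to be unambiguous. Not real WordNet output.
RELATIVES = {
    "bank/finance": ["lender", "depository"],
    "bank/river": ["riverbank", "shoreline"],
}

# Hypothetical untagged general-text corpus.
CORPUS = [
    "the lender approved the loan for the house",
    "money sat in the depository earning interest",
    "the lender raised the interest on the loan",
    "we walked along the riverbank near the water",
    "the shoreline eroded after the river flooded",
    "fish swam in the water by the riverbank",
]

def locate_training_examples(corpus, relatives, target):
    """Find sentences containing a relative, substitute the target word,
    and label the sentence with the sense that relative belongs to."""
    examples = []
    for sentence in corpus:
        tokens = sentence.split()
        for sense, words in relatives.items():
            for i, tok in enumerate(tokens):
                if tok in words:
                    swapped = tokens[:i] + [target] + tokens[i + 1:]
                    examples.append((swapped, i, sense))
    return examples

def features(tokens, i):
    """Topical cues: every other word in the sentence.
    Local cues: the immediate left and right neighbors of position i."""
    feats = ["topic=" + t for j, t in enumerate(tokens) if j != i]
    feats.append("left=" + (tokens[i - 1] if i > 0 else "<s>"))
    feats.append("right=" + (tokens[i + 1] if i + 1 < len(tokens) else "</s>"))
    return feats

def train(examples):
    """Count senses and per-sense feature occurrences."""
    sense_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for tokens, i, sense in examples:
        sense_counts[sense] += 1
        for f in features(tokens, i):
            feat_counts[sense][f] += 1
            vocab.add(f)
    return sense_counts, feat_counts, vocab

def classify(model, tokens, i):
    """Return the sense with the highest smoothed log-probability."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)
        denom = sum(feat_counts[sense].values()) + len(vocab)
        for f in features(tokens, i):
            score += math.log((feat_counts[sense][f] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

examples = locate_training_examples(CORPUS, RELATIVES, "bank")
model = train(examples)
print(classify(model, "she deposited money at the bank for interest".split(), 5))
```

Comparing this automatically labeled training set against a manually tagged one, as the abstract describes, would amount to calling `train` on each set and scoring both models on the same held-out test sentences.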