Using WordNet to disambiguate word senses for text retrieval

  • Authors: Ellen M. Voorhees
  • Affiliations: -
  • Venue: SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1993

Abstract

This paper describes an automatic indexing procedure that uses the “IS-A” relations contained within WordNet and the set of nouns contained in a text to select a sense for each polysemous noun in the text. The result of the indexing procedure is a vector in which some of the terms represent word senses instead of word stems. Retrieval experiments comparing the effectiveness of these sense-based vectors vs. stem-based vectors show the stem-based vectors to be superior overall, although the sense-based vectors do improve the performance of some queries. The overall degradation is due in large part to the difficulty of disambiguating senses in short query statements. An analysis of these results suggests two conclusions: the IS-A links define a generalization/specialization hierarchy that is not sufficient to reliably select the correct sense of a noun from the set of fine sense distinctions in WordNet; and missing correct matches because of incorrect sense resolution has a much more deleterious effect on retrieval performance than does making spurious matches.
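
The general idea of using IS-A links and co-occurring nouns to pick a sense can be illustrated with a minimal Python sketch. This is not the procedure from the paper, whose details the abstract does not give. The tiny hierarchy, the sense labels, and the function names (`IS_A`, `SENSES`, `ancestors`, `disambiguate`, `index_terms`) are all hypothetical stand-ins for WordNet data, and the scoring rule (count how many other nouns in the text share a non-root hypernym with a candidate sense) is an assumed simplification.

```python
from collections import Counter

# Hypothetical toy IS-A hierarchy (child -> parent); a stand-in for WordNet.
ROOT = "entity"
IS_A = {
    "bank.river": "landform",
    "bank.finance": "financial_entity",
    "loan.money": "financial_entity",
    "interest.money": "financial_entity",
    "interest.curiosity": "cognitive_state",
    "landform": ROOT,
    "financial_entity": ROOT,
    "cognitive_state": ROOT,
}

# Hypothetical sense inventory: each noun maps to its candidate senses.
SENSES = {
    "bank": ["bank.river", "bank.finance"],
    "loan": ["loan.money"],
    "interest": ["interest.curiosity", "interest.money"],
}


def ancestors(sense):
    """Return the hypernyms reachable from a sense, excluding the root."""
    found = set()
    node = sense
    while node in IS_A:
        node = IS_A[node]
        found.add(node)
    found.discard(ROOT)  # the root is shared by everything, so it carries no signal
    return found


def disambiguate(nouns):
    """Pick, for each noun, the sense supported by the most other nouns.

    A sense is 'supported' by another noun when some sense of that noun
    shares a non-root hypernym with it. Returns None for a noun when no
    sense has any support (for example, in a one-word text).
    """
    chosen = {}
    for noun in nouns:
        best_sense, best_score = None, 0
        for sense in SENSES.get(noun, []):
            own = ancestors(sense)
            score = sum(
                1
                for other in nouns
                if other != noun
                and any(own & ancestors(s) for s in SENSES.get(other, []))
            )
            if score > best_score:
                best_sense, best_score = sense, score
        chosen[noun] = best_sense
    return chosen


def index_terms(nouns):
    """Build a term-count vector, using a sense where one was resolved
    and falling back to the plain noun otherwise."""
    chosen = disambiguate(nouns)
    return Counter(chosen[n] or n for n in nouns)


document_vector = index_terms(["bank", "loan", "interest"])
query_vector = index_terms(["bank"])
shared_terms = set(document_vector) & set(query_vector)
```

In this toy setup the three-noun text resolves `bank` to `bank.finance`, while the one-word query has no context and keeps the unresolved term `bank`, so `shared_terms` is empty even though the two texts are about the same thing. That mirrors, in miniature, the two points the abstract makes: short queries offer little context for sense selection, and a sense mismatch turns what would have been a stem match into a missed match.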