Lexical ambiguity and information retrieval

  • Authors: Robert Krovetz; W. Bruce Croft
  • Affiliations: Univ. of Massachusetts, Amherst; Univ. of Massachusetts, Amherst
  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 1992

Quantified Score

Hi-index 0.02

Abstract

Lexical ambiguity is a pervasive problem in natural language processing. However, little quantitative information is available about the extent of the problem or about the impact that it has on information retrieval systems. We report on an analysis of lexical ambiguity in information retrieval test collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. The experiments show that there is considerable ambiguity even in a specialized database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors contribute to determining whether disambiguation will make an improvement in performance. For example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Other uses of word sense disambiguation in an information retrieval context are discussed.