The impact on retrieval effectiveness of skewed frequency distributions

  • Authors: Mark Sanderson; C. J. Van Rijsbergen
  • Affiliations: Univ. of Sheffield, Sheffield, UK; Univ. of Glasgow, Glasgow, Scotland, UK
  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 1999

Abstract

We present an analysis of word senses that provides a fresh insight into the impact of word ambiguity on retrieval effectiveness, with potentially broader implications for other processes of information retrieval. Using a methodology of forming artificially ambiguous words, known as pseudowords, and through reference to other researchers' work, the analysis illustrates that the distribution of the frequency of occurrence of the senses of a word plays a strong role in ambiguity's impact on effectiveness. Further investigation shows that this analysis may also be applicable to other processes of retrieval, such as Cross Language Information Retrieval, query expansion, retrieval of OCR'ed texts, and stemming. The analysis appears to provide a means of explaining, at least in part, the reasons for these processes' impact (or lack of it) on effectiveness.
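
The pseudoword idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a pseudoword is formed by replacing every occurrence of a set of component words with one joined token, so that each component acts as a "sense" of the new ambiguous word. The function names, the "/" joiner, and the toy sentence are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter


def make_pseudoword(tokens, components):
    """Replace each occurrence of any component word with one joined token.

    Assumption: the pseudoword is written as the components joined by "/".
    """
    pseudo = "/".join(components)
    members = set(components)
    replaced = [pseudo if t in members else t for t in tokens]
    return replaced, pseudo


def sense_distribution(tokens, components):
    """Relative frequency of each component word in the original text.

    These proportions play the role of the pseudoword's sense frequencies.
    """
    members = set(components)
    counts = Counter(t for t in tokens if t in members)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: counts[w] / total for w in components}


# Toy text (hypothetical): "bank" occurs twice, "river" once.
tokens = "the bank raised rates while the bank closed and a river flowed".split()
components = ["bank", "river"]

distribution = sense_distribution(tokens, components)
ambiguous_tokens, pseudo = make_pseudoword(tokens, components)
```

In this toy text the component "bank" accounts for two of the three occurrences, so the resulting pseudoword has a skewed sense distribution; choosing components with very different corpus frequencies would make the skew stronger.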