Evaluation of the 2-Poisson model as a basis for using term frequency data in searching

  • Authors:
  • Vijay V. Raghavan; Hong-pao Shi; C. T. Yu

  • Affiliations:
  • University of Regina, Regina, Sask. S4S 0A2, Canada; Sian Jiao-Tong University, China; University of Illinois at Chicago, Chicago, Ill.

  • Venue:
  • SIGIR '83 Proceedings of the 6th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1983

Abstract

Early work on probabilistic models of retrieval assumed that the document representation is binary, indicating only the presence or absence of index terms. The 2-Poisson (TP) model, which was proposed as a model of how the occurrence frequency of specialty words is distributed in a collection, has since been used to develop retrieval strategies that incorporate term frequency information. This work further investigates the use of the TP model in this context. It is shown that search effectiveness, when no relevance information is assumed, can be further enhanced by using this model. Furthermore, when the term weights proposed in this work are used in conjunction with weights known as term significance weights, the results are very encouraging.
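
For readers unfamiliar with the TP model, the following is a minimal sketch of the 2-Poisson mixture as it is usually stated: the within-document frequency of a term is drawn from one of two Poisson distributions, depending on which of two classes the document belongs to. The function names, the parameter names (`p_class1`, `lam1`, `lam2`), and the numeric values are illustrative assumptions. They are not taken from the paper, and the posterior computed here is not the term weight the authors propose.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of k occurrences under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k: int, p_class1: float, lam1: float, lam2: float) -> float:
    """Mixture of two Poissons: class 1 with probability p_class1, class 2 otherwise."""
    return p_class1 * poisson_pmf(k, lam1) + (1.0 - p_class1) * poisson_pmf(k, lam2)


def class1_posterior(k: int, p_class1: float, lam1: float, lam2: float) -> float:
    """Bayes' rule: probability that a document with k occurrences is in class 1."""
    joint = p_class1 * poisson_pmf(k, lam1)
    return joint / two_poisson_pmf(k, p_class1, lam1, lam2)


if __name__ == "__main__":
    # Hypothetical parameters: class 1 uses the term more heavily (mean 3.0)
    # than class 2 (mean 0.5), and 20% of documents fall in class 1.
    for tf in range(6):
        print(tf, class1_posterior(tf, p_class1=0.2, lam1=3.0, lam2=0.5))
```

Under these assumed parameters the posterior rises with the observed term frequency, which illustrates why frequency data can carry more information than a binary presence or absence indicator.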