A discrimination gain hypothesis

  • Authors: C. J. van Rijsbergen
  • Affiliations: University College Dublin, Belfield, Dublin 4
  • Venue: SIGIR '83 Proceedings of the 6th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1983

Abstract

Underlying many of the probabilistic models for information retrieval are assumptions of stochastic dependence or independence of varying degrees of severity for the index terms describing the documents. These models generally specify a matching function, that is, a function which compares a query with each document. The form of that function is to a large extent determined by the particular dependence/independence assumption. For example, if the index terms are assumed to be independently distributed over both the set of relevant and non-relevant documents, then the matching function will in general be linear, whereas an assumption of dependence will lead to a non-linear function. Irrespective of the form that the matching function may take, it is always assumed that the search terms in the query are known. In this paper I wish to address the problem of the choice of search terms and how this choice may be affected by an independence assumption.
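
To make the remark about linearity concrete, the following is a minimal sketch of a linear matching function of the kind that arises under a term-independence assumption. It is not taken from the paper: it assumes the standard binary independence formulation, in which each term contributes a log-odds weight, and the names `term_weight`, `linear_match`, `p`, and `q`, along with the probability values, are hypothetical and chosen only for illustration.

```python
import math


def term_weight(p_t, q_t):
    """Log-odds weight for a single term.

    p_t: assumed probability that the term occurs in a relevant document.
    q_t: assumed probability that the term occurs in a non-relevant document.
    Both must lie strictly between 0 and 1.
    """
    return math.log((p_t * (1.0 - q_t)) / (q_t * (1.0 - p_t)))


def linear_match(query_terms, doc_terms, p, q):
    """Score a document as the sum of weights of query terms it contains.

    Because each term contributes independently, the score is a linear
    function of the binary term-presence indicators.
    """
    return sum(term_weight(p[t], q[t]) for t in query_terms if t in doc_terms)


# Illustrative (made-up) term probabilities, not estimates from any collection.
p = {"retrieval": 0.8, "probabilistic": 0.6}
q = {"retrieval": 0.2, "probabilistic": 0.3}

query = ["retrieval", "probabilistic"]
score_both = linear_match(query, {"retrieval", "probabilistic"}, p, q)
score_one = linear_match(query, {"retrieval"}, p, q)
```

In this sketch the score for a document containing both terms is just the sum of the two single-term scores, which is what linearity means here. Under a dependence assumption the score would also include terms involving pairs (or larger groups) of index terms, so it would no longer decompose in this way. Note that the query terms are supplied as given, which is exactly the assumption the paper sets out to examine.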