Some ideas for estimating the number of relevant documents

  • Authors: Robert T. Dattola
  • Venue: ACM SIGIR Forum
  • Year: 1980

Abstract

A model is proposed for estimating the total number of relevant documents in a collection for a given query. The total number of relevant documents is needed in order to compute recall values for use in evaluating document retrieval systems. If x represents document rank and y represents precision, then one of the following functions is fit to the points obtained by plotting precision vs. document rank after each retrieved document:

  1. y = A e^{Bx}  (exponential)
  2. y = A x^{B}  (power)
  3. y = A - B/x  (hyperbolic)
  4. y = 1/(A + Bx)  (hyperbolic)
  5. y = x/(A + Bx)  (hyperbolic)

That equation with the best fit satisfying certain constraints is used to estimate the total number of relevant documents for any given query. Experimental comparisons of this best fit are made with random sampling methods.
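
The curve-fitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure. It makes several assumptions that the abstract does not confirm: each of the five forms is fitted by ordinary least squares after linearizing it (for example, ln y = ln A + B ln x for the power form); "best fit" means the smallest sum of squared errors in the original precision values; and the total number of relevant documents is estimated as N · y(N), which follows from precision being relevant-retrieved divided by rank, with N a hypothetical collection size. The "certain constraints" mentioned in the abstract are not specified there, so none are applied. All function and variable names are illustrative.

```python
import math

def precision_by_rank(relevance):
    """relevance: 0/1 judgments in ranked order -> [(rank, precision)]."""
    points, hits = [], 0
    for rank, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((rank, hits / rank))
    return points

def linear_fit(us, vs):
    """Ordinary least squares for v = a + b*u; returns (a, b)."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    sxx = sum((u - mu) ** 2 for u in us)
    sxy = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    b = sxy / sxx
    return mv - b * mu, b

# name -> (linearizing transform, map from line (a, b) to (A, B), model y(x))
FORMS = {
    "exponential":  (lambda x, y: (x, math.log(y)),
                     lambda a, b: (math.exp(a), b),
                     lambda A, B, x: A * math.exp(B * x)),
    "power":        (lambda x, y: (math.log(x), math.log(y)),
                     lambda a, b: (math.exp(a), b),
                     lambda A, B, x: A * x ** B),
    "hyperbolic_3": (lambda x, y: (1 / x, y),
                     lambda a, b: (a, -b),
                     lambda A, B, x: A - B / x),
    "hyperbolic_4": (lambda x, y: (x, 1 / y),
                     lambda a, b: (a, b),
                     lambda A, B, x: 1 / (A + B * x)),
    "hyperbolic_5": (lambda x, y: (x, x / y),
                     lambda a, b: (a, b),
                     lambda A, B, x: x / (A + B * x)),
}

def best_fit(points):
    """Fit every form; return (name, A, B, sse) with the smallest squared error."""
    # Assumption: zero-precision points are skipped, since the
    # linearizations take logs and reciprocals of y.
    pts = [(x, y) for x, y in points if y > 0]
    best = None
    for name, (to_uv, to_AB, model) in FORMS.items():
        uv = [to_uv(x, y) for x, y in pts]
        a, b = linear_fit([u for u, _ in uv], [v for _, v in uv])
        A, B = to_AB(a, b)
        try:
            sse = sum((model(A, B, x) - y) ** 2 for x, y in pts)
        except (ZeroDivisionError, OverflowError):
            sse = math.inf  # treat a degenerate fit as the worst candidate
        if best is None or sse < best[3]:
            best = (name, A, B, sse)
    return best

def estimate_total_relevant(name, A, B, collection_size):
    """Assumed extrapolation: relevant(N) = N * y(N), since y = relevant / rank."""
    model = FORMS[name][2]
    return collection_size * model(A, B, collection_size)
```

In use, the judged prefix of a ranking would be passed through `precision_by_rank`, the result given to `best_fit`, and the winning form extrapolated with `estimate_total_relevant`. Fitting on linearized data is a convenience. It weights errors differently from a direct nonlinear fit, so the selected form may differ from what a nonlinear least-squares routine would choose.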