On score distributions and relevance

  • Authors: Stephen Robertson
  • Affiliations: Microsoft Research, Cambridge, UK
  • Venue: ECIR'07 Proceedings of the 29th European conference on IR research
  • Year: 2007

Abstract

We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistical distributions have been used for this purpose. Some theoretical considerations indicate problems with some of the choices of pairs of distributions. Specifically, we revisit a generalisation of the well-known inverse relationship between recall and precision: some choices of pairs of distributions violate this generalised relationship. We identify the choices and the violations, and explore some of the consequences of this theoretical view.
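As a rough illustration of the kind of violation the abstract refers to, the sketch below assumes one commonly used pairing: normally distributed scores for relevant documents and exponentially distributed scores for non-relevant ones. The abstract does not name the pairs it examines, and the paper's own formulation of the generalised relationship may differ. All parameter values (`mu`, `sigma`, `rate`, `generality`) and function names are hypothetical, chosen only for illustration. Raising the score threshold always lowers recall, so under an inverse relationship precision should not fall at the same time; the code flags thresholds where it does.

```python
import math


def normal_survival(t, mu, sigma):
    """P(score > t) when scores follow Normal(mu, sigma)."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))


def exponential_survival(t, rate):
    """P(score > t) when scores follow Exponential(rate), for t >= 0."""
    return math.exp(-rate * max(t, 0.0))


def precision_at(t, mu=2.0, sigma=1.0, rate=1.0, generality=0.1):
    """Precision of the set of documents scoring above threshold t.

    Assumed model (illustrative only): relevant scores are normal,
    non-relevant scores are exponential, and `generality` is the
    proportion of relevant documents in the collection.
    """
    recall = normal_survival(t, mu, sigma)     # fraction of relevant retrieved
    fallout = exponential_survival(t, rate)    # fraction of non-relevant retrieved
    relevant_mass = generality * recall
    nonrelevant_mass = (1.0 - generality) * fallout
    return relevant_mass / (relevant_mass + nonrelevant_mass)


def violating_thresholds(thresholds, **params):
    """Return each threshold at which precision is lower than at the
    previous (lower) threshold, even though recall has also dropped."""
    ordered = sorted(thresholds)
    violations = []
    for lower, higher in zip(ordered, ordered[1:]):
        if precision_at(higher, **params) < precision_at(lower, **params):
            violations.append(higher)
    return violations


if __name__ == "__main__":
    for t in range(7):
        print(t, round(precision_at(t), 4))
    print("violations at:", violating_thresholds(range(7)))
```

With these assumed parameters, precision rises up to a threshold of about 2 and then falls, because the exponential tail decays more slowly than the normal tail. At the top of the ranking, recall and precision therefore decrease together.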