Evaluation of retrieval effectiveness with incomplete relevance data: Theoretical and experimental comparison of three measures

  • Authors: Per Ahlgren; Leif Grönqvist

  • Affiliations: University College of Borås, Swedish School of Library and Information Science, Sweden; Växjö University, School of Mathematics and Systems Engineering, Sweden

  • Venue: Information Processing and Management: an International Journal
  • Year: 2008

Abstract

This paper investigates two relatively new measures of retrieval effectiveness in relation to the problem of incomplete relevance data. The measures, Bpref and RankEff, which do not take into account documents that have not been judged for relevance, are compared theoretically and experimentally. The experimental comparisons also involve a third measure, the well-known mean uninterpolated average precision. The results indicate that RankEff is the most stable of the three measures when the amount of relevance data is reduced, with respect to both system ranking and absolute values. In addition, RankEff has the lowest error rate.
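To make the distinction between the measures concrete, the following is a minimal Python sketch for a single query. It is not taken from the paper: the abstract gives no formulas, so the definitions below are assumptions based on commonly cited forms. Average precision treats every document outside the relevant set as non-relevant; the Bpref variant follows the form used in trec_eval; the RankEff form (penalty divided by the total number of judged non-relevant documents) is an assumed reading of the measure. The function names and the toy ranking are hypothetical.

```python
def average_precision(ranking, relevant):
    """Uninterpolated AP for one query; unjudged documents count as non-relevant."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def bpref(ranking, relevant, nonrelevant):
    """Bpref (assumed trec_eval-style form); unjudged documents are skipped."""
    r, n = len(relevant), len(nonrelevant)
    if r == 0:
        return 0.0
    bound = min(r, n)
    seen_nonrel, score = 0, 0.0
    for doc in ranking:
        if doc in nonrelevant:
            seen_nonrel += 1
        elif doc in relevant:
            # Penalize by judged non-relevant documents ranked above this one.
            score += 1.0 if bound == 0 else 1.0 - min(seen_nonrel, bound) / bound
    return score / r


def rank_eff(ranking, relevant, nonrelevant):
    """RankEff (assumed form); like Bpref, but normalized by all judged non-relevant."""
    r, n = len(relevant), len(nonrelevant)
    if r == 0:
        return 0.0
    seen_nonrel, score = 0, 0.0
    for doc in ranking:
        if doc in nonrelevant:
            seen_nonrel += 1
        elif doc in relevant:
            score += 1.0 if n == 0 else 1.0 - seen_nonrel / n
    return score / r


# Hypothetical toy data: d3 has no relevance judgment.
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d4"}
nonrelevant = {"d2", "d5"}
without_unjudged = [d for d in ranking if d != "d3"]

print(average_precision(ranking, relevant))           # 0.75
print(average_precision(without_unjudged, relevant))  # changes when d3 is removed
print(bpref(ranking, relevant, nonrelevant))          # 0.75
print(bpref(without_unjudged, relevant, nonrelevant)) # 0.75, unaffected by d3
print(rank_eff(ranking, relevant, nonrelevant))       # 0.75
```

In this sketch, removing the unjudged document d3 from the ranking changes average precision but leaves the Bpref and RankEff values untouched, which is the sense in which those two measures ignore documents without relevance judgments.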