Evaluating information retrieval system performance based on multi-grade relevance

  • Authors:
  • Bing Zhou; Yiyu Yao

  • Affiliations:
  • Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada; Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada

  • Venue:
  • ISMIS'08: Proceedings of the 17th International Conference on Foundations of Intelligent Systems
  • Year:
  • 2008


Abstract

One of the challenges of modern information retrieval is to rank the most relevant documents at the top of a large system output. This calls for evaluation methods that properly measure system performance. Traditional performance measures, such as precision and recall, are unable to distinguish different levels of relevance because they are based on binary relevance. The main objective of this paper is to review 10 existing evaluation methods based on multi-grade relevance and to compare their similarities and differences through theoretical and numerical examination. We find that the normalized distance performance measure is the best choice: it is sensitive to document rank order and gives higher credit to systems for retrieving highly relevant documents. The cumulated gain-based methods rely on the total relevance score and are not sufficiently sensitive to document rank order.
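To make the abstract's point about rank-order sensitivity concrete, the sketch below computes normalized discounted cumulated gain (nDCG), a standard cumulated gain-based measure for multi-grade relevance. The logarithmic discount and the example relevance grades (0–3) are common conventions, not details taken from this paper; the function names are illustrative.

```python
import math

def dcg(relevances):
    # Discounted cumulated gain: each graded relevance score is
    # discounted by the log of its rank position (rank 1 undiscounted).
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    # Normalize by the DCG of the ideal ranking, i.e. the same
    # relevance scores sorted in descending order.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Two rankings over the same documents, hence the same total
# relevance score (3 + 2 + 1 = 6), but different rank orders:
good_order = [3, 2, 1]  # highly relevant document ranked first
bad_order = [1, 2, 3]   # highly relevant document ranked last
```

The discount makes nDCG prefer `good_order` over `bad_order` even though their total relevance is equal, which illustrates why rank-order sensitivity (rather than total score alone) matters when comparing graded-relevance measures.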