Information retrieval evaluation with partial relevance judgment

  • Authors:
  • Shengli Wu; Sally McClean

  • Affiliations:
  • School of Computing and Mathematics, University of Ulster, Northern Ireland, UK (both authors)

  • Venue:
  • BNCOD'06: Proceedings of the 23rd British National Conference on Databases: Flexible and Efficient Information Handling
  • Year:
  • 2006


Abstract

Mean Average Precision has been widely used by researchers in information retrieval evaluation events such as TREC, and it is believed to be a good system measure because of its sensitivity and reliability. However, its drawbacks as regards partial relevance judgment has been largely ignored. In many cases, partial relevance judgment is probably the only reasonable solution due to the large document collections involved. In this paper, we will address this issue through analysis and experiment. Our investigation shows that when only partial relevance judgment is available, mean average precision suffers from several drawbacks: inaccurate values, no explicit explanation, and being subject to the evaluation environment. Further, mean average precision is not superior to some other measures such as precision at a given document level for sensitivity and reliability, both of which are believed to be the major advantages of mean average precision. Our experiments also suggest that average precision over all documents would be a good measure for such a situation.