Reliability tests for the XCG and inex-2002 metrics

  • Authors:
  • Gabriella Kazai; Mounia Lalmas; Arjen de Vries

  • Affiliations:
  • Dept. of Computer Science, Queen Mary University of London, London, UK; Dept. of Computer Science, Queen Mary University of London, London, UK; CWI, Amsterdam, The Netherlands

  • Venue:
  • INEX'04: Proceedings of the Third International Conference on Initiative for the Evaluation of XML Retrieval
  • Year:
  • 2004

Abstract

In this paper, we compare the effectiveness scores and system rankings obtained with the inex-2002 metric, the official measure of INEX 2004, against those obtained with the XCG metrics proposed in [4] and further developed here. For the comparisons, we use simulated runs, because for such runs we can easily derive, from a predefined set of user preferences, the system rankings that a reliable measure should produce. The results indicate that the XCG metrics are better suited for comparing systems on the INEX content-only (CO) task, in which systems aim to return the highest-scoring elements according to the user preferences reflected in a quantisation function while avoiding overlapping components.
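The reliability check described in the abstract can be illustrated with a minimal Python sketch. It is not the XCG or inex-2002 definition: the run names, the gain values, the use of plain cumulated gain as the score, and the pairwise-agreement measure are all assumptions made for illustration only. The idea shown is simply that simulated runs have a known desired ranking, so a metric can be judged by whether the ranking it produces matches that ranking.

```python
from itertools import combinations


def cumulated_gain(gains):
    """Return the running sum of per-rank gain values."""
    total, out = 0.0, []
    for g in gains:
        total += g
        out.append(total)
    return out


def rank_systems(scores):
    """Return system names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def pairwise_agreement(desired, observed):
    """Fraction of system pairs ordered the same way in both rankings."""
    pos_d = {s: i for i, s in enumerate(desired)}
    pos_o = {s: i for i, s in enumerate(observed)}
    pairs = list(combinations(desired, 2))
    same = sum((pos_d[a] < pos_d[b]) == (pos_o[a] < pos_o[b]) for a, b in pairs)
    return same / len(pairs)


# Hypothetical simulated runs: the quantised gain of the element at each rank.
# A gain of 0.0 stands for an element the assumed user model does not reward,
# for example one that overlaps an element already returned.
runs = {
    "ideal": [1.0, 1.0, 0.5, 0.5],
    "overlapping": [1.0, 0.0, 1.0, 0.0],
    "poor": [0.0, 0.0, 0.5, 0.0],
}

# Score each run by its final cumulated gain (an assumed, simplified score).
scores = {name: cumulated_gain(g)[-1] for name, g in runs.items()}
observed = rank_systems(scores)

# Desired ranking, fixed in advance from the assumed user preferences.
desired = ["ideal", "overlapping", "poor"]
agreement = pairwise_agreement(desired, observed)
```

Under these assumptions, a metric that ranks the simulated runs in the desired order gets an agreement of 1.0, and each swapped pair of systems lowers the value.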