Reliability tests for the XCG and inex-2002 metrics

Authors:
Gabriella Kazai;Mounia Lalmas;Arjen de Vries
Affiliations:
Dept. of Computer Science, Queen Mary University of London, London, UK;Dept. of Computer Science, Queen Mary University of London, London, UK;CWI, Amsterdam, The Netherlands
Venue:
INEX'04 Proceedings of the Third international conference on Initiative for the Evaluation of XML Retrieval
Year:
2004

Citing 5

A critical investigation of recall and precision as measures of retrieval system performance
ACM Transactions on Information Systems (TOIS)

Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)

Using graded relevance assessments in IR evaluation
Journal of the American Society for Information Science and Technology

The overlap problem in content-oriented XML retrieval evaluation
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

The interactive track at INEX 2004
INEX'04 Proceedings of the Third international conference on Initiative for the Evaluation of XML Retrieval

Cited 7

Controlling overlap in content-oriented XML retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval

eXtended cumulated gain measures for the evaluation of content-oriented XML retrieval
ACM Transactions on Information Systems (TOIS)

Contextualization models for XML retrieval
Information Processing and Management: an International Journal

HiXEval: highlighting XML retrieval evaluation
INEX'05 Proceedings of the 4th international conference on Initiative for the Evaluation of XML Retrieval

On effectiveness measures and relevance functions in ranking INEX systems
AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology

TRIX 2004: struggling with the overlap
INEX'04 Proceedings of the Third international conference on Initiative for the Evaluation of XML Retrieval

Extended structural relevance framework: a framework for evaluating structured document retrieval
Information Retrieval

Abstract

In this paper we compare the effectiveness scores and system rankings obtained with the inex-2002 metric, the official measure of INEX 2004, and with the XCG metrics proposed in [4] and further developed here. For the comparisons we use simulated runs, because for such runs the system rankings that a reliable measure should produce can easily be derived from a predefined set of user preferences. The results indicate that the XCG metrics are better suited for comparing systems for the INEX content-only (CO) task, where systems aim to return the highest scoring elements according to the user preferences reflected in a quantisation function, while also aiming to avoid returning overlapping components.
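
The ideas named in the abstract (a quantisation function that turns assessments into gain values, cumulated gain over a ranked run, and a penalty for overlapping components) can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the XCG definition from [4] or from the paper: the function names, the element paths, the strict quantisation rule, and the choice to give zero gain to any element that overlaps an earlier result are all hypothetical.

```python
from typing import Callable, List, Tuple

# An assessed element: (path, exhaustivity, specificity). Illustrative values only.
Element = Tuple[str, int, int]


def strict_quantisation(exh: int, spec: int) -> float:
    """Assumed strict quantisation: only elements assessed (3, 3) earn gain."""
    return 1.0 if (exh, spec) == (3, 3) else 0.0


def overlaps(a: str, b: str) -> bool:
    """Paths overlap if they are equal or one is an ancestor of the other."""
    return a == b or a.startswith(b + "/") or b.startswith(a + "/")


def cumulated_gain(run: List[Element],
                   quantise: Callable[[int, int], float]) -> List[float]:
    """Cumulated gain per rank; overlapping elements get zero gain (assumption)."""
    seen: List[str] = []
    total = 0.0
    cumulated: List[float] = []
    for path, exh, spec in run:
        overlapping = any(overlaps(path, earlier) for earlier in seen)
        gain = 0.0 if overlapping else quantise(exh, spec)
        seen.append(path)
        total += gain
        cumulated.append(total)
    return cumulated


def normalised_cumulated_gain(run: List[Element],
                              ideal: List[Element],
                              quantise: Callable[[int, int], float]) -> List[float]:
    """Divide the run's cumulated gain by the ideal cumulated gain at each rank."""
    run_cg = cumulated_gain(run, quantise)
    ideal_cg = cumulated_gain(ideal, quantise)
    result: List[float] = []
    for rank, value in enumerate(run_cg):
        # Past the end of the ideal ranking, reuse its final cumulated value.
        best = ideal_cg[min(rank, len(ideal_cg) - 1)] if ideal_cg else 0.0
        result.append(value / best if best > 0 else 0.0)
    return result


# Hypothetical ideal ranking: two non-overlapping, highly relevant sections.
ideal = [("/a/sec[1]", 3, 3), ("/a/sec[2]", 3, 3)]
# Hypothetical simulated run: returns a section, then a paragraph inside it.
overlapping_run = [("/a/sec[1]", 3, 3), ("/a/sec[1]/p[1]", 3, 3)]

print(normalised_cumulated_gain(overlapping_run, ideal, strict_quantisation))
# prints [1.0, 0.5]
```

In this toy setting the run that repeats already seen content falls behind the ideal ranking at rank 2, which is the kind of behaviour one would want from a measure that rewards avoiding overlap.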