We propose and evaluate a family of measures, the eXtended Cumulated Gain (XCG) measures, for the evaluation of content-oriented XML retrieval approaches. Our aim is to provide an evaluation framework that accounts for dependency among XML document components. In particular, two aspects of dependency are considered: (1) near-misses, which are document components structurally related to relevant components, such as a neighboring paragraph or a container section, and (2) overlap, the situation where the same text fragment is retrieved multiple times, for example when both a paragraph and its container section are returned. A further requirement is that the measures be flexible enough that different models of user behavior can be instantiated within them. Both system- and user-oriented aspects are investigated, and both recall- and precision-like qualities are measured. We evaluate the reliability of the proposed measures on the INEX 2004 test collection: the effects of assessment variation and topic set size on evaluation stability are investigated, and upper and lower bounds on expected error rates are established. The evaluation demonstrates that the XCG measures are stable and reliable, and in particular that the novel measures of effort-precision and gain-recall (ep/gr) behave comparably to established IR measures such as precision and recall.
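The measures build on the cumulated-gain family, in which graded relevance values are summed down a ranked list and compared against an ideal ranking. The following is a minimal, hypothetical sketch of that underlying idea (not the paper's exact XCG formulation, which additionally models near-misses and overlap); the function names are illustrative, not from the paper.

```python
def cumulated_gain(gains):
    """Running sum of graded relevance values down a ranked list."""
    total, cg = 0.0, []
    for g in gains:
        total += g
        cg.append(total)
    return cg


def normalised_cg(run_gains, all_gains):
    """Divide each rank's cumulated gain by the ideal cumulated gain,
    i.e. the gain obtained by ranking all known relevance values in
    decreasing order (a simplified nxCG-style normalisation)."""
    ideal = cumulated_gain(sorted(all_gains, reverse=True))
    cg = cumulated_gain(run_gains)
    return [c / i for c, i in zip(cg, ideal) if i > 0]
```

For instance, a run returning components with graded gains `[1, 2]` against a pool of known gains `[2, 2, 1]` scores `[0.5, 0.75]`: at each rank the run's accumulated gain is divided by what an ideal ranking would have accumulated by that rank.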