In information retrieval research, comparing retrieval approaches requires test collections consisting of documents, user requests and relevance assessments. Obtaining relevance assessments that are as sound and complete as possible is crucial for the comparison of retrieval approaches. In XML retrieval, the problem of obtaining sound and complete relevance assessments is further complicated by the structural relationships between retrieval results. A major difference between XML retrieval and flat document retrieval is that the relevance of elements (the retrievable units) is not independent of the relevance of related elements. This has major consequences for the gathering of relevance assessments. This article describes investigations into the creation of sound and complete relevance assessments for the evaluation of content-oriented XML retrieval, as carried out at INEX, the evaluation campaign for XML retrieval. The campaign, now in its seventh year, has used three substantially different approaches to gathering assessments and has finally settled on a highlighting method for marking relevant passages within documents, even though the objective is to collect assessments at element level. The different methods of gathering assessments at INEX are discussed and contrasted. The highlighting method is shown to be the most reliable of the methods.
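To make the highlighting approach concrete, the sketch below illustrates one plausible way element-level assessments could be derived from highlighted passages: score each element by the fraction of its text covered by the assessor's highlights. This is a minimal illustration, not the INEX procedure; the character-offset representation, the element paths, and the overlap-fraction scoring formula are assumptions made for the example.

```python
# Hedged sketch: deriving an element-level relevance score from passages
# highlighted within a document. Offsets, element paths, and the scoring
# rule (highlighted fraction of the element's text) are illustrative
# assumptions, not the actual INEX implementation.

def highlighted_overlap(element_span, highlighted_spans):
    """Count the characters of `element_span` covered by highlighted passages.

    Assumes the highlighted spans are pairwise disjoint, so overlaps are
    not double-counted.
    """
    start, end = element_span
    covered = 0
    for h_start, h_end in highlighted_spans:
        covered += max(0, min(end, h_end) - max(start, h_start))
    return covered


def element_relevance(element_span, highlighted_spans):
    """Score an element as the fraction of its text that is highlighted."""
    start, end = element_span
    length = end - start
    if length == 0:
        return 0.0
    return highlighted_overlap(element_span, highlighted_spans) / length


if __name__ == "__main__":
    # Hypothetical character offsets of XML elements within one document.
    elements = {
        "article[1]/sec[2]": (100, 400),
        "article[1]/sec[3]": (400, 700),
    }
    # Passages the assessor highlighted as relevant (character offsets).
    highlights = [(150, 350), (650, 700)]
    for path, span in elements.items():
        print(path, round(element_relevance(span, highlights), 2))
```

Under this kind of mapping, assessors only highlight relevant text once per document, and relevance values for every nested or overlapping element can then be computed automatically, which is one reason a passage-highlighting interface can still serve element-level evaluation.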