The Cranfield evaluation methodology has notable disadvantages, including its high labor cost and its inadequacy for evaluating interactive retrieval techniques. A promising alternative, recently the subject of much study, is the automatic comparison of retrieval systems based on users' observed clicking behavior. Several such comparison methods have been proposed, but there has so far been no systematic way to assess which strategy is better, making it difficult to choose a suitable method for real applications. In this paper, we propose a general way to evaluate these relative comparison methods with two measures: utility to users (UtU) and effectiveness of differentiation (EoD). We evaluate two state-of-the-art methods by systematically simulating different retrieval scenarios. Motivated by the weaknesses of these methods that our evaluation reveals, we further propose a novel method that takes the positions of clicked documents into account. Experimental results show that the new method outperforms the existing ones.
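The abstract does not spell out the comparison methods it evaluates, but the general idea of click-based relative comparison can be illustrated with Team Draft interleaving. The sketch below is a minimal Python illustration, not the paper's method: `team_draft_interleave` is standard Team Draft, while `credit_clicks` and its reciprocal-rank `discount` are hypothetical stand-ins for the position-aware crediting the abstract alludes to.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length=10):
    """Team Draft interleaving: each round a coin flip decides which
    ranker picks first; each ranker then contributes its highest-ranked
    document not yet in the interleaved list."""
    interleaved, teams, seen = [], [], set()
    while len(interleaved) < length:
        added = 0
        first = random.choice("AB")
        for team in (first, "B" if first == "A" else "A"):
            ranking = ranking_a if team == "A" else ranking_b
            doc = next((d for d in ranking if d not in seen), None)
            if doc is None:
                continue  # this ranker has no fresh documents left
            interleaved.append(doc)
            teams.append(team)
            seen.add(doc)
            added += 1
            if len(interleaved) == length:
                break
        if added == 0:  # both rankings exhausted
            break
    return interleaved, teams

def credit_clicks(teams, clicked_ranks, discount=lambda r: 1.0 / (r + 1)):
    """Hypothetical position-discounted crediting: a click at rank r
    (0-indexed) earns discount(r) for the team that contributed the
    clicked document, so clicks far down the list count for less."""
    score_a = score_b = 0.0
    for r in clicked_ranks:
        if teams[r] == "A":
            score_a += discount(r)
        else:
            score_b += discount(r)
    return score_a, score_b

# Example: interleave two four-document rankings, then credit a
# simulated user who clicked the results at ranks 0 and 2.
a = ["d1", "d2", "d3", "d4"]
b = ["d3", "d1", "d5", "d2"]
interleaved, teams = team_draft_interleave(a, b, length=4)
print(interleaved, teams, credit_clicks(teams, [0, 2]))
```

Repeating this over many simulated query impressions and declaring the ranker with the higher aggregate credit the winner is the kind of relative comparison the paper's UtU and EoD measures are designed to evaluate.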