Relative confidence sampling for efficient on-line ranker evaluation

  • Authors:
  • Masrour Zoghi, Shimon A. Whiteson, Maarten de Rijke, Rémi Munos

  • Affiliations:
  • University of Amsterdam, Amsterdam, Netherlands; University of Amsterdam, Amsterdam, Netherlands; University of Amsterdam, Amsterdam, Netherlands; INRIA Lille - Nord Europe, Villeneuve d'Ascq, France

  • Venue:
  • Proceedings of the 7th ACM international conference on Web search and data mining
  • Year:
  • 2014


Abstract

A key challenge in information retrieval is on-line ranker evaluation: determining which of a finite set of rankers performs best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, which interleave lists proposed by two different candidate rankers, the problem of minimizing the total regret accumulated while evaluating the rankers can be formalized as a K-armed dueling bandits problem. In this paper, we propose a new method called relative confidence sampling (RCS) that aims to reduce cumulative regret by being less conservative than existing methods in eliminating rankers from contention. In addition, we present an empirical comparison between RCS and two state-of-the-art methods, Relative Upper Confidence Bound (RUCB) and SAVAGE. The results demonstrate that RCS can substantially outperform these alternatives on several large learning-to-rank datasets.
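
To make the selection mechanism concrete, below is a minimal Python sketch of one relative-confidence-sampling-style round, not the authors' reference implementation: pairwise win probabilities are Thompson-sampled from Beta posteriors to pick a champion ranker, and a challenger is then chosen via an optimistic (UCB-style) estimate of its chance of beating that champion. The fallback rule used when the sampled matrix has no Condorcet winner, the exploration constant `alpha`, and the function name `rcs_step` are illustrative assumptions.

```python
import numpy as np

def rcs_step(wins, t, rng, alpha=0.5):
    """One selection round in the style of relative confidence sampling.

    wins[i, j] counts interleaved comparisons in which ranker i beat
    ranker j; t is the current round number (>= 1). Returns the indices
    of the champion and challenger to compare next via interleaving.
    """
    k = wins.shape[0]

    # Thompson-sample a plausible preference matrix from Beta posteriors.
    theta = np.full((k, k), 0.5)
    for i in range(k):
        for j in range(i + 1, k):
            p = rng.beta(wins[i, j] + 1, wins[j, i] + 1)
            theta[i, j], theta[j, i] = p, 1.0 - p

    # Champion: an arm that beats every other arm in the sampled matrix.
    # Falling back to the arm with the most sampled wins is a
    # simplification assumed here, not the paper's exact tie-breaking rule.
    beats_all = (theta >= 0.5).all(axis=1)
    if beats_all.any():
        champion = int(np.argmax(beats_all))
    else:
        champion = int(np.argmax((theta >= 0.5).sum(axis=1)))

    # Challenger: the arm with the most optimistic (UCB-style) estimate
    # of its probability of beating the champion.
    n = wins[:, champion] + wins[champion, :]
    mean = wins[:, champion] / np.maximum(n, 1)
    bonus = np.sqrt(alpha * np.log(t) / np.maximum(n, 1))
    ucb = np.where(n > 0, mean + bonus, 1.0)
    ucb[champion] = -np.inf
    challenger = int(np.argmax(ucb))
    return champion, challenger


# Example: 4 rankers, no comparisons observed yet.
rng = np.random.default_rng(0)
wins = np.zeros((4, 4))
champion, challenger = rcs_step(wins, t=1, rng=rng)
```

In this sketch, after the two selected rankers are compared via interleaving and a winner is observed, `wins[winner, loser]` would be incremented and the next round begins; cumulative regret is then governed by how often suboptimal rankers are chosen for comparison.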