Evaluating rankers using implicit feedback, such as clicks on documents in a result list, is an increasingly popular alternative to traditional evaluation methods based on explicit relevance judgments. Previous work has shown that so-called interleaved comparison methods can utilize click data to detect small differences between rankers and can be applied to learn ranking functions online. In this paper, we analyze three existing interleaved comparison methods and find that they are all either biased or insensitive to some differences between rankers. To address these problems, we present a new method based on a probabilistic interleaving process. We derive an unbiased estimator of comparison outcomes and show how marginalizing over possible comparison outcomes given the observed click data can make this estimator even more effective. We validate our approach using a recently developed simulation framework based on a learning to rank dataset and a model of click behavior. Our experiments confirm the results of our analysis and show that our method is both more accurate and more robust to noise than existing methods.
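To make the probabilistic interleaving process concrete, the following is a minimal sketch in Python. It is not the paper's exact estimator: the softmax-over-ranks distribution (with an assumed decay parameter `tau`), the function names, and the naive click-credit outcome are all illustrative assumptions; the paper's refinement marginalizes over possible ranker assignments rather than using the single observed assignment.

```python
import random

def softmax_probs(ranking, tau=3.0):
    # Treat a ranker as a distribution over its documents,
    # P(d) proportional to 1 / rank(d)^tau (ranks start at 1).
    weights = {d: 1.0 / (r + 1) ** tau for r, d in enumerate(ranking)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def probabilistic_interleave(ranking_a, ranking_b, length, rng=random):
    # At each slot, pick one of the two rankers uniformly at random,
    # then sample a document from that ranker's softmax distribution.
    probs = [softmax_probs(ranking_a), softmax_probs(ranking_b)]
    interleaved, assignments = [], []
    while len(interleaved) < length and (probs[0] or probs[1]):
        which = rng.randrange(2)
        if not probs[which]:          # this ranker has no documents left
            which = 1 - which
        docs, weights = zip(*probs[which].items())
        doc = rng.choices(docs, weights=weights)[0]
        interleaved.append(doc)
        assignments.append(which)
        for p in probs:
            p.pop(doc, None)          # sample without replacement from both
    return interleaved, assignments

def naive_outcome(assignments, clicked_positions):
    # Basic credit assignment: each click counts for the ranker whose
    # distribution contributed the clicked document. A positive value
    # favors ranker A, a negative value favors ranker B.
    credit = [0, 0]
    for pos in clicked_positions:
        credit[assignments[pos]] += 1
    return credit[0] - credit[1]
```

A single simulated comparison would interleave the two result lists, observe which positions a user clicks, and aggregate `naive_outcome` over many queries; the bias analysis in the paper concerns exactly how such per-click credit should be computed and averaged.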