Introduction to Reinforcement Learning
Introduction to Reinforcement Learning
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Eligibility Traces for Off-Policy Policy Evaluation
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Click data as implicit relevance feedback in web search
Information Processing and Management: an International Journal
Robust test collections for retrieval evaluation
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
How does clickthrough data reflect retrieval quality?
Proceedings of the 17th ACM conference on Information and knowledge management
Efficient multiple-click models in web search
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Comparative analysis of clicks and judgments for IR evaluation
Proceedings of the 2009 workshop on Web Search Click Data
Global ranking by exploiting user clicks
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Evaluation of methods for relative comparison of retrieval systems based on clickthroughs
Proceedings of the 18th ACM conference on Information and knowledge management
A model to estimate intrinsic document relevance from the clickthrough logs of a web search engine
Proceedings of the third ACM international conference on Web search and data mining
Comparing the sensitivity of information retrieval metrics
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Comparing click-through data to purchase decisions for retrieval evaluation
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
A search log-based approach to evaluation
ECDL'10 Proceedings of the 14th European conference on Research and advanced technology for digital libraries
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
Proceedings of the fourth ACM international conference on Web search and data mining
Evaluating new search engine configurations with pre-existing judgments and clicks
Proceedings of the 20th international conference on World wide web
A probabilistic method for inferring preferences from clicks
Proceedings of the 20th ACM international conference on Information and knowledge management
Reusing historical interaction data for faster online learning to rank for IR
Proceedings of the sixth ACM international conference on Web search and data mining
Practical online retrieval evaluation
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Using historical click data to increase interleaving sensitivity
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Lerot: an online learning to rank framework
Proceedings of the 2013 workshop on Living labs for information retrieval evaluation
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods
ACM Transactions on Information Systems (TOIS)
Hi-index | 0.00 |
Interleaved comparison methods, which compare rankers using click data, are a promising alternative to traditional information retrieval evaluation methods that require expensive explicit judgments. A major limitation of these methods is that they assume access to live data, meaning that new data must be collected for every pair of rankers compared. We investigate the use of previously collected click data (i.e., historical data) for interleaved comparisons. We start by analyzing to what degree existing interleaved comparison methods can be applied and find that a recent probabilistic method allows such data reuse, even though it is biased when applied to historical data. We then propose an interleaved comparison method that is based on the probabilistic approach but uses importance sampling to compensate for bias. We experimentally confirm that probabilistic methods make the use of historical data for interleaved comparisons possible and effective.