Balancing exploration and exploitation in learning to rank online

Authors:
Katja Hofmann; Shimon Whiteson; Maarten de Rijke
Affiliations:
ISLA, University of Amsterdam; ISLA, University of Amsterdam; ISLA, University of Amsterdam
Venue:
ECIR'11: Proceedings of the 33rd European Conference on Advances in Information Retrieval
Year:
2011

Abstract

As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank approaches, retrieval systems can learn directly from implicit feedback, while they are running. In such an online setting, algorithms need to both explore new solutions to obtain feedback for effective learning, and exploit what has already been learned to produce results that are acceptable to users. We formulate this challenge as an exploration-exploitation dilemma and present the first online learning to rank algorithm that works with implicit feedback and balances exploration and exploitation. We leverage existing learning to rank data sets and recently developed click models to evaluate the proposed algorithm. Our results show that finding a balance between exploration and exploitation can substantially improve online retrieval performance, bringing us one step closer to making online learning to rank work in practice.
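
The exploration-exploitation trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it assumes a simple epsilon-greedy style scheme in which each slot of the result list is filled from an exploratory ranking with some probability and from the current best (exploitative) ranking otherwise. The function name `mix_rankings` and the parameter `explore_rate` are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Optional


def mix_rankings(exploit: List[str], explore: List[str],
                 explore_rate: float, length: int,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Fill each result slot from one of two rankings (illustrative only).

    With probability `explore_rate` the next slot takes the highest-ranked
    unseen document from `explore`; otherwise it takes it from `exploit`.
    This is an assumed epsilon-greedy style mixing, not the paper's method.
    """
    if not 0.0 <= explore_rate <= 1.0:
        raise ValueError("explore_rate must be in [0, 1]")
    rng = rng or random.Random()
    shown: List[str] = []
    seen = set()
    while len(shown) < length:
        # Decide which ranking gets to contribute the next document.
        if rng.random() < explore_rate:
            preferred, fallback = explore, exploit
        else:
            preferred, fallback = exploit, explore
        # Take the first document not yet shown; fall back to the other
        # ranking if the preferred one is exhausted.
        doc = next((d for d in preferred if d not in seen), None)
        if doc is None:
            doc = next((d for d in fallback if d not in seen), None)
        if doc is None:
            break  # both rankings exhausted
        shown.append(doc)
        seen.add(doc)
    return shown


if __name__ == "__main__":
    current_best = ["d1", "d2", "d3", "d4"]
    candidate = ["d3", "d1", "d4", "d2"]
    print(mix_rankings(current_best, candidate, explore_rate=0.2, length=4))
```

Under these assumptions, setting `explore_rate` to 0 shows only the exploitative ranking (no new feedback about the candidate), while setting it to 1 shows only the exploratory ranking (maximum feedback, at the risk of weaker results for the user). Intermediate values trade one against the other.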