The most common environment in which ranking is used takes a very specific form: users sequentially issue queries to a digital library, and for each query a ranking function orders a set of relevant items from which the user selects a favorite. This is the case when ranking search results for pages on the World Wide Web or for merchandise on an e-commerce site. In this work, we present a new online ranking algorithm, called NoRegret KLRank. Our algorithm is designed to use the "clickthrough" feedback provided by users to improve future ranking decisions. More importantly, we show that its long-term average performance converges to the best rate achievable by any competing fixed ranking policy selected with the benefit of hindsight. We also show how to ensure that this property continues to hold as new items are added to the set, which requires a richer class of ranking policies. Finally, our empirical results show that, while in some contexts NoRegret KLRank might be considered conservative, a greedy variant of this algorithm actually outperforms many popular ranking algorithms.
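The abstract does not spell out NoRegret KLRank's update rule, so the following is only a sketch of the general online no-regret setup it describes, using a standard multiplicative-weights (Hedge) learner over a small set of fixed ranking policies. The policy names, item set, reward definition, and simulated click log below are all illustrative assumptions, not the paper's method.

```python
# Hedged sketch: a Hedge-style online learner whose average performance
# tracks the best fixed ranking policy in hindsight, learning from clicks.
# Everything concrete here (policies, items, reward) is an assumption.

# Each "policy" is a fixed ordering of a small illustrative item set.
POLICIES = {
    "alphabetical": ["a", "b", "c"],
    "reverse": ["c", "b", "a"],
    "swap_front": ["b", "a", "c"],
}

def hedge_rounds(clicks, eta=0.5):
    """Process a clickthrough log one click at a time: reward every policy
    by how highly it ranked the clicked item, then reweight it
    multiplicatively. In deployment, each round's ranking would be drawn
    with probability proportional to the current weights."""
    weights = {name: 1.0 for name in POLICIES}
    for clicked in clicks:
        for name, order in POLICIES.items():
            # Reward 1/(rank of clicked item): larger when the policy
            # placed the clicked item nearer the top.
            reward = 1.0 / (order.index(clicked) + 1)
            weights[name] *= (1 + eta) ** reward  # multiplicative update
    return weights

# Simulated clickthrough log in which users mostly click item "c":
log = ["c", "c", "b", "c", "c"]
final = hedge_rounds(log)
best = max(final, key=final.get)
print(best)  # "reverse" ranks "c" first, so it accumulates the most weight
```

Because the update is multiplicative, the weight on the best fixed policy grows exponentially relative to the others, which is the mechanism behind the hindsight-competitive guarantee the abstract claims.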