We propose novel multi-armed bandit (explore/exploit) schemes to maximize total clicks on a content module published regularly on Yahoo! Intuitively, one can "explore" each candidate item by displaying it to a small fraction of user visits to estimate the item's click-through rate (CTR), and then "exploit" high-CTR items in order to maximize clicks. While bandit methods that seek the optimal trade-off between exploration and exploitation have been studied for decades, existing solutions are not satisfactory for web content publishing applications, where a dynamic set of items with short lifetimes, delayed feedback, and non-stationary reward (CTR) distributions are typical. In this paper, we develop a Bayesian solution and extend several existing schemes to our setting. Through extensive evaluation with nine bandit schemes, we show that our Bayesian solution is uniformly better in several scenarios. We also study the empirical characteristics of our schemes and provide useful insights on the strengths and weaknesses of each. Finally, we validate our results with a "side-by-side" comparison of schemes through live experiments conducted on a random sample of real user visits to Yahoo!
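The explore/exploit intuition described above can be illustrated with a minimal epsilon-greedy simulation. This is only a sketch of the general idea, not the Bayesian solution or any of the nine schemes evaluated in the paper. The function name `epsilon_greedy`, the `epsilon` value, and the simulated CTRs are assumptions chosen for illustration, and the sketch deliberately ignores the difficulties the abstract names (changing item sets, delayed feedback, non-stationary CTRs).

```python
import random


def epsilon_greedy(true_ctrs, n_visits, epsilon=0.1, seed=0):
    """Simulate a fixed set of items with stationary CTRs (an assumption).

    Returns per-item impression counts and click counts.
    """
    rng = random.Random(seed)
    n_items = len(true_ctrs)
    shows = [0] * n_items   # times each item was displayed
    clicks = [0] * n_items  # clicks each item received

    def estimated_ctr(i):
        # Items never shown are treated as having an estimated CTR of 0.0.
        return clicks[i] / shows[i] if shows[i] else 0.0

    for _ in range(n_visits):
        if rng.random() < epsilon:
            # Explore: show a uniformly random item to a small fraction of visits.
            item = rng.randrange(n_items)
        else:
            # Exploit: show the item with the highest estimated CTR so far.
            item = max(range(n_items), key=estimated_ctr)
        shows[item] += 1
        # Simulated user feedback, observed immediately (no delay modeled).
        if rng.random() < true_ctrs[item]:
            clicks[item] += 1
    return shows, clicks


if __name__ == "__main__":
    shows, clicks = epsilon_greedy([0.02, 0.05, 0.20], n_visits=20000)
    for i, (s, c) in enumerate(zip(shows, clicks)):
        print(f"item {i}: shown {s} times, estimated CTR {c / s if s else 0.0:.3f}")
```

With a fixed exploration rate, a constant share of traffic keeps going to low-CTR items even after their estimates are reliable, which is one reason such a simple rule is a baseline rather than a satisfactory answer for short-lived items.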