Visual design plays an important role in online display advertising: changing the layout of an online ad can increase or decrease its effectiveness, measured in terms of click-through rate (CTR) or total revenue. The decision of which layout to use for an ad involves a trade-off: using a layout provides feedback about its effectiveness (exploration), but collecting that feedback requires sacrificing the immediate reward of using a layout we already know is effective (exploitation). To balance exploration with exploitation, we pose automatic layout selection as a contextual bandit problem. Many bandit algorithms exist, each generating a policy that must be evaluated, and it is impractical to test every policy on live traffic. However, we have found that offline replay (a.k.a. exploration scavenging) can be adapted to provide an accurate estimator for the performance of ad layout policies at LinkedIn, using only historical data about the effectiveness of layouts. We describe the development of our offline replayer and benchmark a number of common bandit algorithms.
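To make the idea of offline replay concrete, the following is a minimal sketch and not the replayer described in the paper. It assumes that each logged record is a (context, layout, reward) tuple and that the logged layout was chosen uniformly at random, which is the standard condition under which replay is unbiased. The names `EpsilonGreedy`, `replay`, `choose`, and `update` are hypothetical, and the epsilon-greedy policy ignores the context for brevity.

```python
import random
from collections import defaultdict


class EpsilonGreedy:
    """Hypothetical policy: explore with probability epsilon, else exploit."""

    def __init__(self, layouts, epsilon=0.1, seed=0):
        self.layouts = list(layouts)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = defaultdict(int)   # impressions credited to each layout
        self.clicks = defaultdict(int)  # clicks credited to each layout

    def _ctr(self, layout):
        shows = self.shows[layout]
        return self.clicks[layout] / shows if shows else 0.0

    def choose(self, context):
        # The context is ignored here; a contextual policy would condition on it.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.layouts)
        return max(self.layouts, key=self._ctr)

    def update(self, context, layout, reward):
        self.shows[layout] += 1
        self.clicks[layout] += reward


def replay(policy, logged_events):
    """Estimate a policy's CTR from logged (context, layout, reward) records.

    Assumption: the logged layout was selected uniformly at random.
    Returns (estimated CTR, number of matched events).
    """
    matched = 0
    total_reward = 0
    for context, logged_layout, reward in logged_events:
        if policy.choose(context) == logged_layout:
            # Only events where the policy agrees with the log are counted,
            # and only those events are revealed to the policy.
            matched += 1
            total_reward += reward
            policy.update(context, logged_layout, reward)
    estimate = total_reward / matched if matched else 0.0
    return estimate, matched
```

Calling `replay(EpsilonGreedy(["A", "B"]), log)` on a list of such records returns the estimated CTR together with the number of events that matched. Because non-matching events are discarded, the estimate rests on only a fraction of the log, so a large log is needed when there are many layouts.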