Predicting ad click-through rates (CTR) is a massive-scale learning problem that is central to the multi-billion-dollar online advertising industry. We present a selection of case studies and topics drawn from recent experiments in the setting of a deployed CTR prediction system. These include improvements in the context of traditional supervised learning based on an FTRL-Proximal online learning algorithm (which has excellent sparsity and convergence properties) and the use of per-coordinate learning rates. We also explore some of the challenges that arise in a real-world system and that may appear at first to be outside the domain of traditional machine learning research. These include useful tricks for memory savings, methods for assessing and visualizing performance, practical methods for providing confidence estimates for predicted probabilities, calibration methods, and methods for automated management of features. Finally, we detail several directions that did not turn out to be beneficial for us, despite promising results elsewhere in the literature. The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of the challenges that appear when applying traditional machine learning methods in a complex dynamic system.
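As an illustration of the FTRL-Proximal algorithm with per-coordinate learning rates mentioned above, the following is a minimal sketch of one commonly published form of the update, applied to logistic regression over sparse features. It is not taken from the deployed system: the class name `FTRLProximal`, the dictionary-based feature representation, and the default hyperparameter values (`alpha`, `beta`, `l1`, `l2`) are assumptions made for this example.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Illustrative per-coordinate FTRL-Proximal for logistic regression."""

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        # Hyperparameter defaults are assumptions, not tuned values.
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # per-coordinate adjusted gradient sums
        self.n = defaultdict(float)  # per-coordinate sums of squared gradients

    def weight(self, i):
        """Closed-form weight; exactly zero while |z_i| <= l1 (sparsity)."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(i, 0.0)
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - math.copysign(self.l1, z)) / denom

    def predict(self, x):
        """x maps feature id -> value; returns the predicted click probability."""
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clip to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            w = self.weight(i)
            g = (p - y) * v  # gradient of the log loss for coordinate i
            # The per-coordinate learning rate is alpha / (beta + sqrt(n_i));
            # sigma is the change in its inverse caused by this gradient.
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p
```

A hypothetical usage: create `model = FTRLProximal()`, call `model.update({"ad:42": 1.0, "query:shoes": 1.0}, 1)` for each logged impression, and call `model.predict(...)` at serving time. In this sketch only `z` and `n` are stored per feature, and a weight remains exactly zero until its accumulated gradient exceeds the L1 threshold, which is where the sparsity comes from.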