The automated targeting of online display ads at scale requires the simultaneous evaluation of a single prospect against many independent models. When deciding which ad to show to a user, one must calculate likelihood-to-convert scores for that user across all potential advertisers in the system. For modern machine-learning-based targeting, as conducted by Media6Degrees (M6D), this can mean scoring against thousands of models in a large, sparse feature space. Dimensionality reduction within this space is useful, as it decreases scoring time and model storage requirements. To meet this need, we develop a novel algorithm for scalable supervised dimensionality reduction across hundreds of simultaneous classification tasks. The algorithm performs hierarchical clustering in the space of model parameters from historical models in order to collapse related features into a single dimension. This allows us to implicitly incorporate feature and label data across all tasks without operating directly in a massive space. We present experimental results showing that for this task our algorithm outperforms other popular dimensionality-reduction algorithms across a wide variety of ad campaigns, as well as production results that showcase its performance in practice.
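The core idea — cluster features by the similarity of their learned weights across many historical models, then collapse each cluster into a single dimension — can be sketched as follows. This is a minimal illustration, not M6D's production implementation: the weight matrix `W`, the cosine distance metric, average linkage, and the sum-within-cluster collapse rule are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_features(W, n_clusters):
    """Group features whose weights behave similarly across historical models.

    W: (n_features, n_tasks) array; row i holds feature i's learned weight in
    each historical per-campaign model (hypothetical layout). Returns an array
    of cluster labels in 1..n_clusters, one per feature.
    """
    # Hierarchical (agglomerative) clustering in model-parameter space;
    # cosine distance groups features with proportional weight profiles.
    Z = linkage(W, method="average", metric="cosine")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def collapse(X, labels, n_clusters):
    """Project a raw feature matrix X (n_samples, n_features) into the reduced
    space by summing each example's feature values within a cluster."""
    reduced = np.zeros((X.shape[0], n_clusters))
    for j, c in enumerate(labels):
        reduced[:, c - 1] += X[:, j]  # fcluster labels are 1-based
    return reduced
```

New campaigns can then be trained and scored in the `n_clusters`-dimensional space, cutting both scoring time and model storage relative to the original sparse feature space.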