Scalable hierarchical multitask learning algorithms for conversion optimization in display advertising

Authors:
Amr Ahmed;Abhimanyu Das;Alexander J. Smola
Affiliations:
Research at Google, Mountain View, CA, USA;Microsoft Research, Mountain View, CA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 7th ACM international conference on Web search and data mining
Year:
2014


Abstract

Many estimation tasks come in groups and hierarchies of related problems. In this paper we propose a hierarchical model and a scalable algorithm to perform inference for multitask learning. The model infers task correlation and subtask structure in a joint sparse setting. Implementation is achieved by a distributed subgradient oracle and the successive application of prox-operators pertaining to groups and subgroups of variables. We apply this algorithm to conversion optimization in display advertising. Experimental results on over 1TB of data, with up to 1 billion observations and 1 million attributes, show that the algorithm provides significantly better prediction accuracy while scaling efficiently through distributed parameter synchronization.
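
The phrase "successive application of prox-operators pertaining to groups and subgroups of variables" can be illustrated with a small sketch. The abstract does not give the paper's regularizer or hierarchy, so everything here is an assumption made for illustration: a two-level hierarchy (one group split into disjoint subgroups), L2 group norms, and the hypothetical names `group_soft_threshold`, `hierarchical_prox`, and `prox_gradient_step`. For nested groups like these, applying the subgroup prox first and the group prox second gives the prox of the summed penalty.

```python
import math


def group_soft_threshold(values, threshold):
    """Prox of threshold * ||values||_2: shrink the whole block toward zero."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= threshold:
        # The block is too small to survive the penalty; zero it out entirely.
        return [0.0] * len(values)
    scale = 1.0 - threshold / norm
    return [scale * v for v in values]


def hierarchical_prox(weights, subgroups, lam_sub, lam_group):
    """Apply the prox to each subgroup, then to the enclosing group.

    weights   : flat list of parameters forming one group (assumed layout)
    subgroups : list of index lists that partition the group (assumed)
    lam_sub   : penalty strength on each subgroup norm (hypothetical name)
    lam_group : penalty strength on the whole-group norm (hypothetical name)
    """
    out = list(weights)
    # Inner level first: shrink each subgroup independently.
    for idx in subgroups:
        shrunk = group_soft_threshold([out[i] for i in idx], lam_sub)
        for i, v in zip(idx, shrunk):
            out[i] = v
    # Outer level second: shrink the group as a whole.
    return group_soft_threshold(out, lam_group)


def prox_gradient_step(weights, gradient, step, subgroups, lam_sub, lam_group):
    """One proximal (sub)gradient update: gradient move, then hierarchical prox.

    In a distributed setting the gradient would be aggregated from workers;
    here it is simply passed in as a list.
    """
    moved = [w - step * g for w, g in zip(weights, gradient)]
    return hierarchical_prox(moved, subgroups, step * lam_sub, step * lam_group)


if __name__ == "__main__":
    w = [3.0, 4.0, 0.1, 0.1]
    groups = [[0, 1], [2, 3]]
    print(hierarchical_prox(w, groups, lam_sub=1.0, lam_group=1.0))
```

In this toy input the second subgroup has a norm below `lam_sub`, so it is zeroed as a block, while the first subgroup is only shrunk. That block-level sparsity is what makes group penalties useful for selecting whole sets of related attributes.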