Web-scale multi-task feature selection for behavioral targeting
Proceedings of the 21st ACM international conference on Information and knowledge management
Many estimation tasks come in groups and hierarchies of related problems. In this paper we propose a hierarchical model and a scalable algorithm to perform inference for multitask learning. The model infers task correlation and subtask structure in a joint sparse setting. The implementation relies on a distributed subgradient oracle and the successive application of prox-operators pertaining to groups and subgroups of variables. We apply this algorithm to conversion optimization in display advertising. Experimental results on over 1 TB of data, with up to 1 billion observations and 1 million attributes, show that the algorithm provides significantly better prediction accuracy while scaling efficiently through distributed parameter synchronization.
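The successive application of prox-operators to subgroups and then groups of variables can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an l2-norm penalty on each block, a two-level hierarchy (subgroups nested inside one group), and hypothetical names (`prox_group_l2`, `prox_hierarchical`, `lam_sub`, `lam_group`) chosen only for illustration.

```python
import math


def prox_group_l2(v, threshold):
    """Block soft-thresholding: shrink vector v toward zero by `threshold` in l2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= threshold:
        # The whole block is zeroed out, which is what produces group sparsity.
        return [0.0] * len(v)
    scale = 1.0 - threshold / norm
    return [scale * x for x in v]


def prox_hierarchical(w, subgroups, lam_sub, lam_group):
    """Apply the prox to each subgroup first, then to the enclosing group.

    w         : list of floats (parameters of one group)
    subgroups : list of index lists that partition w (assumed structure)
    lam_sub   : threshold for each subgroup (assumed hyperparameter)
    lam_group : threshold for the enclosing group (assumed hyperparameter)
    """
    out = list(w)
    for idx in subgroups:
        shrunk = prox_group_l2([out[i] for i in idx], lam_sub)
        for i, val in zip(idx, shrunk):
            out[i] = val
    return prox_group_l2(out, lam_group)


if __name__ == "__main__":
    w = [3.0, 4.0, 0.1, 0.1]
    print(prox_hierarchical(w, [[0, 1], [2, 3]], lam_sub=1.0, lam_group=1.0))
```

In this toy input the second subgroup has a small norm and is removed entirely, while the first subgroup is only shrunk. In a distributed setting, a step like this would presumably run after the gradient contributions from the workers have been aggregated.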