Optimal distributed online prediction using mini-batches

  • Authors:
  • Ofer Dekel; Ran Gilad-Bachrach; Ohad Shamir; Lin Xiao

  • Affiliations:
  • Microsoft Research, Redmond, WA; Microsoft Research, Redmond, WA; Microsoft Research, Cambridge, MA; Microsoft Research, Redmond, WA

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2012


Abstract

Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot keep up with the high rate at which inputs arrive. In this work, we present the distributed mini-batch algorithm, a method of converting many serial gradient-based online prediction algorithms into distributed algorithms. We prove a regret bound for this method that is asymptotically optimal for smooth convex loss functions and stochastic inputs. Moreover, our analysis explicitly takes into account communication latencies between nodes in the distributed environment. We show how our method can be used to solve the closely related distributed stochastic optimization problem, achieving an asymptotically linear speed-up over multiple processors. Finally, we demonstrate the merits of our approach on a web-scale online prediction problem.
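
As a rough illustration of the template the abstract describes, the following is a minimal single-process simulation of the distributed mini-batch pattern: k workers each process b/k stochastic inputs under a fixed predictor, their gradients are summed, and a single averaged-gradient step updates the predictor once per batch. This is a sketch under simplifying assumptions, not the paper's exact method: it uses plain stochastic gradient steps rather than the dual-averaging or mirror-descent updates analyzed in the paper, it does not model communication latency, and all names (distributed_minibatch_sgd, loss_grad, sample_input) are illustrative.

```python
import numpy as np

def distributed_minibatch_sgd(loss_grad, sample_input, dim, k=4, b=128,
                              rounds=100, eta=0.1, rng=None):
    """Single-process simulation of the distributed mini-batch pattern.

    Hypothetical sketch: k workers each process b // k stochastic inputs
    with the current predictor w held fixed, their gradients are summed,
    and one averaged-gradient step updates w per mini-batch.
    """
    rng = rng or np.random.default_rng(0)
    w = np.zeros(dim)
    for _ in range(rounds):
        grad_sum = np.zeros(dim)
        for _worker in range(k):          # would run in parallel on k nodes
            for _ in range(b // k):
                z = sample_input(rng)     # stochastic input z_t
                grad_sum += loss_grad(w, z)
        w -= eta * grad_sum / b           # single update per mini-batch
    return w

# Usage example: stochastic least squares, f(w, (x, y)) = 0.5 * (w @ x - y)**2
w_star = np.array([1.0, -2.0, 0.5])

def sample_input(rng):
    x = rng.normal(size=3)
    return x, x @ w_star + 0.1 * rng.normal()

def loss_grad(w, z):
    x, y = z
    return (w @ x - y) * x

w_hat = distributed_minibatch_sgd(loss_grad, sample_input, dim=3)
print(np.round(w_hat, 2))  # close to w_star
```

The key design point visible even in this toy version is that the update rule is unchanged from the serial algorithm; only the number of gradients averaged per update grows with the number of workers, which is what lets the regret analysis carry over while throughput scales.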