A distributed algorithm for large-scale generalized matching

Authors:
Faraz Makari Manshadi;Baruch Awerbuch;Rainer Gemulla;Rohit Khandekar;Julián Mestre;Mauro Sozio
Affiliations:
Max-Planck-Institut für Informatik;Johns Hopkins University;Max-Planck-Institut für Informatik;Knight Capital Group;School of IT, The University of Sydney;Institut Mines-Telecom, Telecom ParisTech, CNRS
Venue:
Proceedings of the VLDB Endowment
Year:
2013

Abstract

Generalized matching problems arise in a number of applications, including computational advertising, recommender systems, and trade markets. Consider, for example, the problem of recommending multimedia items (e.g., DVDs) to users such that (1) users are recommended items that they are likely to be interested in, (2) every user gets neither too few nor too many recommendations, and (3) only items available in stock are recommended to users. State-of-the-art matching algorithms fail at coping with large real-world instances, which may involve millions of users and items. We propose the first distributed algorithm for computing near-optimal solutions to large-scale generalized matching problems like the one above. Our algorithm is designed to run on a small cluster of commodity nodes (or in a MapReduce environment), has strong approximation guarantees, and requires only a poly-logarithmic number of passes over the input. In particular, we propose a novel distributed algorithm to approximately solve mixed packing-covering linear programs, which include but are not limited to generalized matching problems. Experiments on real-world and synthetic data suggest that a practical variant of our algorithm scales to very large problem sizes and can be orders of magnitude faster than alternative approaches.
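
As an illustration only, the recommendation example above can be written as a linear program. The notation is assumed here and does not come from the abstract: $U$ is the set of users, $V$ the set of items, $E \subseteq U \times V$ the candidate user-item pairs, $w_{uv} \ge 0$ a predicted interest score, $\ell_u$ and $b_u$ the minimum and maximum number of recommendations for user $u$, and $c_v$ the stock of item $v$. This is a sketch of one standard fractional relaxation, not necessarily the exact formulation used in the paper.

```latex
\begin{align*}
\max_{x}\quad & \sum_{(u,v)\in E} w_{uv}\, x_{uv}
  && \text{(1) recommend items of high interest} \\
\text{s.t.}\quad
  & \sum_{v:(u,v)\in E} x_{uv} \;\ge\; \ell_u
  && \forall u \in U \quad \text{(2a) not too few: covering} \\
  & \sum_{v:(u,v)\in E} x_{uv} \;\le\; b_u
  && \forall u \in U \quad \text{(2b) not too many: packing} \\
  & \sum_{u:(u,v)\in E} x_{uv} \;\le\; c_v
  && \forall v \in V \quad \text{(3) stock limit: packing} \\
  & 0 \;\le\; x_{uv} \;\le\; 1
  && \forall (u,v) \in E
\end{align*}
```

Constraints (2b) and (3) are packing constraints and (2a) is a covering constraint, which is why the problem falls in the mixed packing-covering class named in the abstract. One common way to obtain a pure feasibility problem of the form "find $x \ge 0$ with $Px \le p$ and $Cx \ge c$" is to replace the objective by an additional covering constraint $\sum_{(u,v)\in E} w_{uv} x_{uv} \ge W$ and search over the target value $W$; whether the paper proceeds this way is an assumption of this sketch.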