Fast greedy algorithms in MapReduce and streaming

  • Authors:
  • Ravi Kumar;Benjamin Moseley;Sergei Vassilvitskii;Andrea Vattani

  • Affiliations:
  • Google, Mountain View, CA, USA;Toyota Technological Institute at Chicago, Chicago, IL, USA;Google, Mountain View, CA, USA;University of California, San Diego, CA, USA

  • Venue:
  • Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
  • Year:
  • 2013

Abstract

Greedy algorithms are practitioners' best friends: they are intuitive, simple to implement, and often lead to very good solutions. However, implementing greedy algorithms in a distributed setting is challenging, since the greedy choice is inherently sequential and it is not clear how to take advantage of the extra processing power. Our main result is a powerful sampling technique that aids in the parallelization of sequential algorithms. We then show how to use this primitive to adapt a broad class of greedy algorithms to the MapReduce paradigm; this class includes maximum cover and submodular maximization subject to p-system constraints. Our method yields efficient algorithms that run in a logarithmic number of rounds, while obtaining solutions that are arbitrarily close to those produced by the standard sequential greedy algorithm. We begin with algorithms for modular maximization subject to a matroid constraint, and then extend this approach to obtain approximation algorithms for submodular maximization subject to knapsack or p-system constraints. Finally, we empirically validate our algorithms and show that they achieve the same solution quality as standard greedy algorithms while running in substantially fewer rounds.
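To make the "inherently sequential" greedy choice concrete, here is a minimal sketch of the standard sequential greedy baseline for maximum cover, one of the problems the abstract names. It is not the paper's sampling technique or its MapReduce algorithm. The function name `greedy_max_cover`, its parameters, and the toy input are assumptions made for illustration only.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_cover(sets: Dict[Hashable, Set[int]], k: int) -> List[Hashable]:
    """Illustrative sequential greedy for maximum cover (hypothetical helper).

    Picks at most k sets, each time taking the set that covers the most
    elements not yet covered by earlier picks.
    """
    covered: Set[int] = set()
    chosen: List[Hashable] = []
    remaining = dict(sets)  # shallow copy so the caller's dict is not mutated

    for _ in range(k):
        if not remaining:
            break
        # The greedy choice depends on everything picked so far, which is
        # why the iterations cannot simply be run in parallel.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= remaining.pop(best)

    return chosen


if __name__ == "__main__":
    toy_sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
    print(greedy_max_cover(toy_sets, k=2))
```

Each iteration scans all remaining sets and needs the result of the previous iteration, so a direct distributed implementation would spend one round per selected set. That cost is what the abstract's logarithmic-round bound is contrasted with.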