Pairwise Element Computation with MapReduce

  • Authors:
  • Tim Kiefer;Peter Benjamin Volk;Wolfgang Lehner

  • Affiliations:
  • Dresden University of Technology, Dresden, Germany;Dresden University of Technology, Dresden, Germany;Dresden University of Technology, Dresden, Germany

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present a parallel method to evaluate functions on pairs of elements. It is a challenge to partition the Cartesian product of a set with itself in order to parallelize the function evaluation on all pairs. Our solution uses (a) replication of set elements to allow for partitioning and (b) aggregation of the results gathered for different copies of an element. Based on an execution model with nodes that execute tasks on local data without online communication, we present a generic algorithm and show how it can be implemented with MapReduce. Three different distribution schemes that define the partitioning of the Cartesian product are introduced, compared, and evaluated. Any one of the distribution schemes can be used to derive and implement a specific algorithm for parallel pairwise element computation.