MapReduce for the Cell Broadband Engine Architecture

  • Authors:
  • M. de Kruijf; K. Sankaralingam

  • Affiliations:
  • University of Wisconsin, Department of Computer Science, Madison, Wisconsin; University of Wisconsin, Department of Computer Science, Madison, Wisconsin

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2009

Abstract

MapReduce is a simple and flexible parallel programming model proposed by Google for large-scale distributed data processing. In this paper, we present a design and prototype implementation of MapReduce for the Cell Broadband Engine® Architecture (CBEA). The MapReduce model provides a simple machine abstraction that shields users from parallelization and other distributed programming complications. The goal of this paper is to describe the tradeoffs in the design of the runtime and demonstrate the potential for high performance. We study the basic characteristics of the MapReduce model and identify three types of MapReduce applications: map dominated, partition dominated, and sort dominated. We evaluate our runtime performance, scalability, and efficiency for microbenchmarks representing each of these application types as well as for complete applications. We find that map-dominated applications map well to the CBEA and that our prototype sustains high performance on these applications. For partition-dominated and sort-dominated applications, we analyze runtime performance, identify sources of inefficiency, and propose several future enhancements to significantly improve performance. Overall, we find that the simplicity and efficiency of the model make it an attractive tool for programming Cell Broadband Engine processor-based platforms.
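
To make the three application types concrete, the phases of the model can be sketched as a small sequential program. This is an illustrative sketch only, not the paper's CBEA runtime: the names `map_reduce`, `word_count_map`, `word_count_reduce`, and `num_partitions` are hypothetical, and hash partitioning is an assumed choice. An application is map dominated, partition dominated, or sort dominated according to which of the marked phases accounts for most of its running time.

```python
from itertools import groupby


def map_reduce(records, map_fn, reduce_fn, num_partitions=4):
    """Sequential sketch of the MapReduce phases (hypothetical helper)."""
    # Map phase: each input record yields zero or more (key, value) pairs.
    pairs = [kv for record in records for kv in map_fn(record)]

    # Partition phase: route each pair to a bucket chosen by its key,
    # so that all values for one key land in the same bucket.
    # Hash partitioning is an assumption made for this sketch.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[hash(key) % num_partitions].append((key, value))

    results = {}
    for bucket in partitions:
        # Sort phase: order each bucket by key so equal keys are adjacent.
        bucket.sort(key=lambda kv: kv[0])
        # Reduce phase: combine all values that share a key.
        for key, group in groupby(bucket, key=lambda kv: kv[0]):
            results[key] = reduce_fn(key, [value for _, value in group])
    return results


def word_count_map(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    return [(word, 1) for word in line.split()]


def word_count_reduce(key, values):
    # Sum the per-occurrence counts for one word.
    return sum(values)


if __name__ == "__main__":
    counts = map_reduce(["a b a", "b c"], word_count_map, word_count_reduce)
    print(sorted(counts.items()))
```

In this word-count example the map function does almost no work per record, so partitioning and sorting account for most of the time. An application with an expensive map function and few output pairs would instead be map dominated.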