Adapting MapReduce for HPC environments

  • Authors:
  • Zacharia Fadika; Elif Dede; Madhusudhan Govindaraju; Lavanya Ramakrishnan

  • Affiliations:
  • Binghamton University, Binghamton, USA; Binghamton University, Binghamton, USA; Binghamton University, Binghamton, USA; Lawrence Berkeley National Laboratory, Berkeley, USA

  • Venue:
  • Proceedings of the 20th International Symposium on High Performance Distributed Computing
  • Year:
  • 2011

Abstract

MapReduce is gaining popularity as a programming model for large-scale distributed processing. The model is most widely used when implemented on top of the Hadoop Distributed File System (HDFS). Reliance on HDFS, however, precludes applying the model directly in HPC environments, which use high-performance distributed file systems. In such environments, MapReduce can rarely make full use of the available resources, because local disks may not be available for data placement on all nodes. This work proposes a MapReduce implementation and design choices directly suited to such HPC environments.