Scheduling mapreduce jobs in HPC clusters

  • Authors:
  • Marcelo Veiga Neves;Tiago Ferreto;César De Rose

  • Affiliations:
  • Faculty of Informatics, PUCRS, Brazil;Faculty of Informatics, PUCRS, Brazil;Faculty of Informatics, PUCRS, Brazil

  • Venue:
  • Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

MapReduce (MR) has become a de facto standard for large-scale data analysis. Moreover, it has also attracted the attention of the HPC community due to its simplicity, efficiency and highly scalable parallel model. However, MR implementations present some issues that may complicate its execution in existing HPC clusters, specially concerning the job submission. While on MR there are no strict parameters required to submit a job, in a typical HPC cluster, users must specify the number of nodes and amount of time required to complete the job execution. This paper presents the MR Job Adaptor, a component to optimize the scheduling of MR jobs along with HPC jobs in an HPC cluster. Experiments performed using real-world HPC and MapReduce workloads have show that MR Job Adaptor can properly transform MR jobs to be scheduled in an HPC Cluster, minimizing the job turnaround time, and exploiting unused resources in the cluster.