Multi-domain job coscheduling for leadership computing systems

  • Authors:
  • Wei Tang; Narayan Desai; Venkatram Vishwanath; Daniel Buettner; Zhiling Lan

  • Affiliations:
  • Illinois Institute of Technology, Chicago, USA; Argonne National Laboratory, Argonne, USA; Argonne National Laboratory, Argonne, USA; Argonne National Laboratory, Argonne, USA; Illinois Institute of Technology, Chicago, USA

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2013

Abstract

Supercomputing centers today usually deploy a large-scale compute system together with an associated data analysis or visualization system. Several usage scenarios now require that associated jobs execute at the same time on different machines. We propose a multi-domain coscheduling mechanism that coordinates the execution of jobs across multiple resource management domains without manual intervention. We evaluated the mechanism using real job traces from Intrepid, the production Blue Gene/P system, and Eureka, a cluster with the largest GPU installation, both deployed at Argonne National Laboratory. The experimental results show that coscheduling can be achieved with limited impact on system performance under varying workloads.