Job co-allocation strategies for multiple high performance computing clusters

  • Authors:
  • Jinhui Qin; Michael A. Bauer

  • Affiliations:
  • Department of Computer Science, The University of Western Ontario, London, Canada N6A 5B7 (both authors)

  • Venue:
  • Cluster Computing
  • Year:
  • 2009

Abstract

To use a network of high performance computing clusters more effectively, allocating multi-process jobs across multiple connected clusters becomes an attractive possibility. This allocation process divides the processes of a job among several clusters, which we refer to as co-allocation. Co-allocation offers the possibility of more efficient use of computer resources, reduced turnaround time, and computations that use more processes than are available on any single cluster. Whether these possibilities are realized ultimately depends on the inter-cluster communication cost. In this paper, we introduce a scalable co-allocation strategy called the Maximum Bandwidth Adjacent cluster Set (MBAS) strategy. The strategy uses two thresholds to control allocation: one sets the bandwidth limit on usable inter-cluster communication links, and the other controls how jobs are split. We developed a simulator that models the dynamic behavior of jobs running across multiple clusters and used it to examine the performance of the MBAS co-allocation strategy. Our results indicate that, by adjusting the thresholds for link-level control and chunk-size control in splitting jobs, the MBAS co-allocation strategy can significantly improve both user satisfaction and system utilization.
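
The abstract does not give the details of MBAS, so the following is only a rough illustrative sketch of how two thresholds of this kind might interact, not the authors' algorithm. The function name `co_allocate`, the parameters `min_bandwidth` and `min_chunk`, the greedy ordering, and the choice of starting cluster are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Optional


def co_allocate(
    job_size: int,
    free: Dict[str, int],
    bandwidth: Dict[FrozenSet[str], float],
    min_bandwidth: float,
    min_chunk: int,
) -> Optional[Dict[str, int]]:
    """Hypothetical greedy co-allocation using two thresholds.

    free          -- free processors per cluster
    bandwidth     -- link bandwidth keyed by frozenset({cluster_a, cluster_b})
    min_bandwidth -- link-level threshold: slower links are not usable
    min_chunk     -- chunk-size threshold: smallest piece a job may be split into
    Returns {cluster: processes}, or None if the job cannot be placed.
    """
    # Assumption: start from the cluster with the most free processors.
    start = max(free, key=free.get)

    def link_bw(other: str) -> float:
        return bandwidth.get(frozenset({start, other}), 0.0)

    # Adjacent clusters reachable over usable links, highest bandwidth first.
    neighbours = sorted(
        (c for c in free if c != start and link_bw(c) >= min_bandwidth),
        key=link_bw,
        reverse=True,
    )

    allocation: Dict[str, int] = {}
    remaining = job_size
    for cluster in [start] + neighbours:
        if remaining == 0:
            break
        take = min(free[cluster], remaining)
        # Skip clusters that could only host a piece below the chunk
        # threshold, unless that piece finishes the job.
        if take < min(min_chunk, remaining):
            continue
        allocation[cluster] = take
        remaining -= take

    return allocation if remaining == 0 else None


free = {"A": 8, "B": 6, "C": 3, "D": 4}
bandwidth = {
    frozenset({"A", "B"}): 10.0,
    frozenset({"A", "C"}): 5.0,
    frozenset({"A", "D"}): 1.0,
}
placement = co_allocate(16, free, bandwidth, min_bandwidth=4.0, min_chunk=4)
print(placement)
```

In this toy setup, raising `min_bandwidth` shrinks the set of clusters a job may span, while raising `min_chunk` prevents a job from being scattered into many small pieces; either change can cause a job to be rejected rather than placed.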