Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters

  • Authors:
  • William M. Jones;Walter B. Ligon, III;Louis W. Pang;Dan Stanzione

  • Affiliations:
  • Parallel Architecture Research Lab, Department of Electrical and Computer Engineering, Clemson University, Clemson 29634-0915;Parallel Architecture Research Lab, Department of Electrical and Computer Engineering, Clemson University, Clemson 29634-0915;Parallel Architecture Research Lab, Department of Electrical and Computer Engineering, Clemson University, Clemson 29634-0915;High Performance Computing Center, Fulton School of Engineering, Arizona State University, Tempe 85287-5206

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2005

Abstract

In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocated jobs and the contention they often create in the network infrastructure of a dedicated computational multi-cluster.

We also present several bandwidth-aware co-allocating meta-schedulers. These schedulers take inter-cluster network utilization into account as a means by which to mitigate degraded job run-time performance. We make use of a bandwidth-centric parallel job communication model that captures the time-varying utilization of shared inter-cluster network resources. By doing so, we are able to evaluate the performance of multi-cluster scheduling algorithms that focus not only on node resource allocation, but also on shared inter-cluster network bandwidth.