Job co-allocation strategies for multiple high performance computing clusters

Authors:
Jinhui Qin;Michael A. Bauer
Affiliations:
Department of Computer Science, The University of Western Ontario, London, Canada N6A 5B7;Department of Computer Science, The University of Western Ontario, London, Canada N6A 5B7
Venue:
Cluster Computing
Year:
2009

Abstract

To use a network of high performance computing clusters more effectively, allocating multi-process jobs across multiple connected clusters becomes an attractive possibility. This allocation process, which we refer to as co-allocation, entails dividing the processes of a job among several clusters. Co-allocation offers the possibility of more efficient use of computing resources, reduced turnaround time, and computations that use more processes than are available on any single cluster. Whether these benefits are realized ultimately depends on the inter-cluster communication cost. In this paper, we introduce a scalable co-allocation strategy called the Maximum Bandwidth Adjacent cluster Set (MBAS) strategy. The strategy uses two thresholds to control allocation: one sets the bandwidth limit that determines which inter-cluster communication links are usable, and the other controls how jobs are split. We developed a simulator that models the dynamic behavior of jobs running across multiple clusters and used it to examine the performance of the MBAS co-allocation strategy. Our results indicate that, by adjusting the thresholds for link-level control and for chunk-size control in splitting jobs, the MBAS co-allocation strategy can significantly improve both user satisfaction and system utilization.
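
The abstract does not give the details of MBAS, so the following is only an illustrative sketch of how a two-threshold co-allocation could work, not the authors' algorithm. It assumes that the link threshold is a minimum bandwidth (`min_bandwidth`) below which a link is ignored, that the split threshold is a minimum chunk size (`min_chunk`) in processes, and that clusters are collected by a simple breadth-first walk over the usable links. All names, the data layout, and the greedy placement order are hypothetical.

```python
from collections import deque


def usable_neighbors(links, min_bandwidth):
    """Adjacency map that keeps only links at or above min_bandwidth (assumed rule)."""
    adj = {}
    for (a, b), bandwidth in links.items():
        if bandwidth >= min_bandwidth:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def co_allocate(job_size, free, links, min_bandwidth, min_chunk):
    """Return {cluster: processes} for the job, or None if no placement is found.

    free  -- free processors per cluster, e.g. {"A": 8}
    links -- bandwidth per undirected cluster pair, e.g. {("A", "B"): 10.0}
    """
    # No split is needed if a single cluster can host the whole job.
    for cluster in sorted(free):
        if free[cluster] >= job_size:
            return {cluster: job_size}

    adj = usable_neighbors(links, min_bandwidth)
    # Hypothetical heuristic: try the clusters with the most free processors first.
    for start in sorted(free, key=lambda c: (-free[c], c)):
        allocation, remaining = {}, job_size
        seen, queue = {start}, deque([start])
        while queue and remaining > 0:
            cluster = queue.popleft()
            chunk = min(free[cluster], remaining)
            # Chunk-size control: skip clusters that could only take a tiny piece.
            if chunk >= min_chunk:
                allocation[cluster] = chunk
                remaining -= chunk
            # Link-level control: expand only across usable links.
            for neighbor in sorted(adj.get(cluster, ())):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if remaining == 0:
            return allocation
    return None


free = {"A": 8, "B": 6, "C": 4}
links = {("A", "B"): 10.0, ("A", "C"): 1.0, ("B", "C"): 5.0}
placement = co_allocate(12, free, links, min_bandwidth=4.0, min_chunk=4)
```

In this toy setup, raising `min_bandwidth` removes links and can make a job unplaceable, while raising `min_chunk` forces fewer, larger pieces. The greedy walk is deliberately simple and can miss placements that a more careful search would find.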