Performance-aware load balancing for multiclusters

  • Authors:
  • Ligang He;Stephen A. Jarvis;David Bacigalupo;Daniel P. Spooner;Graham R. Nudd

  • Affiliations:
  • Department of Computer Science, University of Warwick, Coventry, United Kingdom;Department of Computer Science, University of Warwick, Coventry, United Kingdom;Department of Computer Science, University of Warwick, Coventry, United Kingdom;Department of Computer Science, University of Warwick, Coventry, United Kingdom;Department of Computer Science, University of Warwick, Coventry, United Kingdom

  • Venue:
  • ISPA'04 Proceedings of the Second International Conference on Parallel and Distributed Processing and Applications
  • Year:
  • 2004

Abstract

In a multicluster architecture, where jobs can be submitted through each constituent cluster, job arrival rates at individual clusters may be uneven, so the load needs to be balanced among clusters. In this paper we investigate load balancing for two types of jobs, non-QoS and QoS-demanding jobs, and develop two performance-specific load balancing strategies, called ORT and OMR. The ORT strategy is used to obtain the optimised mean response time for non-QoS jobs, and the OMR strategy is used to achieve the optimised mean miss rate for QoS-demanding jobs. Both strategies are modelled mathematically using queuing network theory to establish sets of optimisation equations. Numerical solutions to these optimisation equations are developed, and a so-called fair workload level is determined for each cluster. When the current workload in a cluster reaches this pre-calculated fair workload level, jobs subsequently submitted to that cluster are transferred to other clusters for execution. The effectiveness of both strategies is demonstrated through theoretical analysis and experimental verification. The results show that the proposed load balancing mechanisms bring about considerable performance gains for both job types, while the job transfer frequency among clusters is considerably reduced. Reducing transfers is particularly advantageous where scheduling jobs to remote resources involves the transfer of large executable and data files.
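The threshold rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fair workload levels are taken as given inputs (the abstract says they are obtained by solving the optimisation equations numerically, which is not reproduced here), and the names `Cluster` and `dispatch`, the workload units, and the choice of destination (the other cluster with the most remaining headroom) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    fair_level: float      # pre-calculated fair workload level (assumed given)
    workload: float = 0.0  # current workload, in the same (hypothetical) units


def dispatch(job_size, home, clusters):
    """Place a job submitted to `home` and return the cluster that runs it."""
    if home.workload < home.fair_level:
        # Below the fair workload level: execute locally, no transfer.
        target = home
    else:
        # Fair level reached: transfer to another cluster. Picking the one
        # with the most headroom is an assumption, not taken from the paper.
        others = [c for c in clusters if c is not home]
        target = max(others, key=lambda c: c.fair_level - c.workload, default=home)
    target.workload += job_size
    return target


# Five jobs of size 4 are all submitted through cluster A.
clusters = [Cluster("A", 10.0), Cluster("B", 6.0), Cluster("C", 4.0)]
placements = [dispatch(4.0, clusters[0], clusters).name for _ in range(5)]
print(placements)  # ['A', 'A', 'A', 'B', 'C']
```

In this sketch a job is transferred only once its home cluster has reached its fair workload level, which reflects the abstract's point that transfers, and the associated movement of executables and data files, are kept infrequent.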