Scheduling efficiently for irregular load distributions in a large-scale cluster

  • Authors:
  • Bao-Yin Zhang; Ze-Yao Mo; Guang-Wen Yang; Wei-Min Zheng

  • Affiliations:
  • Institute of Applied Physics and Computational Mathematics, Beijing, P.R. China; Institute of Applied Physics and Computational Mathematics, Beijing, P.R. China; Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China; Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China

  • Venue:
  • ISPA'05 Proceedings of the Third International Conference on Parallel and Distributed Processing and Applications
  • Year:
  • 2005

Abstract

Random stealing is a well-known dynamic scheduling algorithm. However, in a large-scale cluster, an idle node must make many random steal attempts before it obtains a task from another node. This problem severely degrades performance in systems where only a few nodes generate most of the workload. In this paper, we present Transitive Random Stealing (TRS), an efficient dynamic scheduling algorithm based on random stealing that lets any idle node rapidly obtain a task from another node under irregular load distributions in a large-scale cluster. Using the random baseline technique, we experimentally compare TRS with Shis, one of the load-balancing policies in the EARTH system, and with random stealing under different load distributions on the Tsinghua EastSun cluster, and show that TRS is a highly efficient scheduling algorithm for irregular load distributions in a large-scale cluster. Finally, TRS is implemented in Jcluster, a high-performance Java parallel environment, and experimental results on the HKU Gideon 300 cluster are reported.
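
The cost of plain random stealing under a skewed load can be illustrated with a small simulation. This is a minimal sketch of the baseline only, not the TRS algorithm and not the paper's experimental setup. It assumes an idle node picks victims uniformly at random among the other nodes and that the set of loaded nodes does not change while it searches. The function names, node counts, and trial count are hypothetical.

```python
import random


def steal_attempts(num_nodes, loaded, thief, rng):
    """Count uniform random steal attempts until the thief hits a loaded node."""
    attempts = 0
    while True:
        # Pick a victim uniformly among the other num_nodes - 1 nodes.
        victim = rng.randrange(num_nodes - 1)
        if victim >= thief:
            victim += 1  # shift past the thief so it never picks itself
        attempts += 1
        if victim in loaded:
            return attempts


def mean_attempts(num_nodes, num_loaded, trials=2000, seed=0):
    """Average attempts when only num_loaded nodes hold tasks (assumed static)."""
    if not 0 < num_loaded < num_nodes:
        raise ValueError("need at least one loaded node and one idle node")
    rng = random.Random(seed)
    loaded = set(range(num_loaded))  # nodes 0..num_loaded-1 hold all the work
    thief = num_nodes - 1            # the last node is idle
    total = sum(steal_attempts(num_nodes, loaded, thief, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for k in (128, 16, 2):
        print(f"256 nodes, {k} loaded: {mean_attempts(256, k):.1f} attempts on average")
```

Under these assumptions the number of attempts is geometrically distributed with mean (n - 1) / k for n nodes and k loaded nodes, so the search cost grows as the load concentrates on fewer nodes. That is the situation the abstract describes.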