Scheduling concurrent workflows in HPC cloud through exploiting schedule gaps

  • Authors:
  • He-Jhan Jiang;Kuo-Chan Huang;Hsi-Ya Chang;Di-Syuan Gu;Po-Jen Shih

  • Affiliations:
  • Department of Computer and Information Science, National Taichung University of Education, Taichung, Taiwan;Department of Computer and Information Science, National Taichung University of Education, Taichung, Taiwan;National Center for High-Performance Computing, National Applied Research Laboratories, Hsinchu, Taiwan;Department of Computer and Information Science, National Taichung University of Education, Taichung, Taiwan;Department of Computer and Information Science, National Taichung University of Education, Taichung, Taiwan

  • Venue:
  • ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I
  • Year:
  • 2011

Abstract

Large-scale scientific applications are often constructed as workflows because they involve large amounts of interrelated computation and communication. Workflow scheduling has long been a research topic in parallel and distributed computing, but most previous research focuses on scheduling a single workflow. With the emergence of cloud computing, users now have easy access to on-demand high-performance computing resources, usually called HPC cloud. Because an HPC cloud has to serve many users simultaneously, workflows submitted by different users commonly run concurrently. Scheduling concurrent workflows efficiently therefore becomes an important issue in HPC cloud environments. Because of the dependencies and communication costs between tasks in a workflow, gaps usually form in a workflow's schedule. In this paper, we propose a method that exploits such schedule gaps to schedule concurrent workflows efficiently in HPC cloud. The proposed scheduling method was evaluated in a series of simulation experiments and compared with an existing method from the literature. The results indicate that our method delivers good performance and significantly outperforms the existing method in terms of average makespan, with up to 18% improvement.
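The abstract does not describe the scheduling algorithm itself, so the following is only a minimal illustration of what a "schedule gap" is and how a task from another workflow might be placed into one. It is a generic insertion-style sketch, not the authors' method. The names `find_gaps` and `earliest_start`, and the representation of a processor's schedule as a list of busy `(start, end)` intervals, are assumptions made for this example.

```python
from typing import List, Tuple

Slot = Tuple[float, float]  # hypothetical representation: (start, end) of a busy interval


def find_gaps(busy: List[Slot], horizon: float) -> List[Slot]:
    """Return the idle intervals on one processor between time 0 and horizon."""
    gaps: List[Slot] = []
    cursor = 0.0
    for start, end in sorted(busy):
        if start > cursor:
            gaps.append((cursor, start))  # idle time before this busy slot
        cursor = max(cursor, end)
    if cursor < horizon:
        gaps.append((cursor, horizon))  # idle time after the last busy slot
    return gaps


def earliest_start(busy: List[Slot], ready_time: float, duration: float) -> float:
    """Earliest time >= ready_time at which a task of the given duration fits.

    The task is placed in the first idle gap that is long enough; if no gap
    fits, it starts after the last busy slot (or at ready_time, if later).
    """
    cursor = ready_time
    for start, end in sorted(busy):
        if start - cursor >= duration:
            return cursor  # the task fits in the gap before this busy slot
        cursor = max(cursor, end)
    return cursor


# One processor already running tasks of workflow A during [0, 2) and [5, 8).
busy_slots = [(0.0, 2.0), (5.0, 8.0)]
gaps = find_gaps(busy_slots, horizon=10.0)

# A task of workflow B, ready at time 1 and needing 3 time units,
# can be slotted into the idle gap instead of waiting until time 8.
start_in_gap = earliest_start(busy_slots, ready_time=1.0, duration=3.0)

# A longer task (4 time units) does not fit the gap and must wait.
start_after = earliest_start(busy_slots, ready_time=1.0, duration=4.0)
```

In this sketch the gap arises simply from the spacing of the busy intervals; in a real workflow schedule such gaps would come from task dependencies and communication delays, as the abstract notes.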