A scheduling framework for large-scale, parallel, and topology-aware applications

  • Authors:
  • Valentin Kravtsov; Pavel Bar; David Carmeli; Assaf Schuster; Martin Swain

  • Affiliations:
  • Computer Science Department, Technion-Israel Institute of Technology, Technion City, Haifa, Israel;Computer Science Department, Technion-Israel Institute of Technology, Technion City, Haifa, Israel;Computer Science Department, Technion-Israel Institute of Technology, Technion City, Haifa, Israel;Computer Science Department, Technion-Israel Institute of Technology, Technion City, Haifa, Israel;Systems Biology Research Group, University of Ulster, Cromore Road, Coleraine, BT52 1SA, Northern Ireland, UK

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2010

Abstract

Scheduling large-scale, distributed, topology-aware applications requires considering not only the properties of the requested machines but also the properties of their interconnections. This requirement severely complicates scheduling: even matching a single multi-processor task to the available machines in a single time slot becomes an NP-complete problem with no polynomial-time approximation. In this paper we propose a complete scheduling framework for multi-cluster, heterogeneous environments that provides, in practice, an efficient solution for scheduling topology-aware applications. The framework is flexible: it is composed of pluggable components and can be easily configured to support a variety of scheduling policies. We also describe three novel scheduling and co-allocation algorithms that were developed and plugged into the framework. The framework has been integrated into the QosCosGrid system, where it serves as the main decision-making module.
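
To make the matching problem concrete, the following is a minimal brute-force sketch in Python. It is not one of the paper's algorithms; it only illustrates why checking interconnection properties in addition to machine properties turns the matching into a combinatorial search. The function name, the data layout, and the use of CPU counts and bandwidth as the properties are all assumptions made for illustration.

```python
from itertools import permutations


def find_allocation(task_nodes, task_links, machines, links):
    """Exhaustively search for a topology-aware allocation (illustrative only).

    task_nodes: {process: minimum CPUs required}
    task_links: {(process, process): minimum bandwidth required}
    machines:   {machine: available CPUs}
    links:      {(machine, machine): bandwidth}, treated as undirected
    """
    def bandwidth(a, b):
        # Look the link up in either direction; a missing link has bandwidth 0.
        return links.get((a, b), links.get((b, a), 0))

    procs = list(task_nodes)
    # Try every ordered choice of distinct machines: factorial growth.
    for chosen in permutations(machines, len(procs)):
        mapping = dict(zip(procs, chosen))
        # Machine properties: each process needs enough CPUs.
        if any(machines[mapping[p]] < task_nodes[p] for p in procs):
            continue
        # Interconnection properties: each task link needs enough bandwidth.
        if all(bandwidth(mapping[p], mapping[q]) >= bw
               for (p, q), bw in task_links.items()):
            return mapping
    return None


# Hypothetical example data: two communicating processes, three machines.
task_nodes = {"a": 4, "b": 2}
task_links = {("a", "b"): 100}
machines = {"m1": 2, "m2": 4, "m3": 8}
links = {("m1", "m2"): 10, ("m2", "m3"): 1000, ("m1", "m3"): 100}

allocation = find_allocation(task_nodes, task_links, machines, links)
```

In this example, placing "a" on m2 and "b" on m1 satisfies the CPU requirements but fails on the 10-unit link between them, so the search moves on to m2 and m3. The exhaustive loop is only feasible for tiny inputs, which is the practical motivation for heuristic approaches such as the ones the abstract describes.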