Centralized versus distributed schedulers for multiple bag-of-task applications

Authors:
Olivier Beaumont;Larry Carter;Jeanne Ferrante;Arnaud Legrand;Loris Marchal;Yves Robert
Affiliations:
Laboratoire LaBRI, CNRS, INRIA Bordeaux, France;Dept. of Computer Science and Engineering, University of California, San Diego;Dept. of Computer Science and Engineering, University of California, San Diego;Laboratoire ID-IMAG, CNRS, INRIA, Grenoble, France;Laboratoire LIP, CNRS, INRIA, École Normale Supérieure de Lyon, France;Laboratoire LIP, CNRS, INRIA, École Normale Supérieure de Lyon, France
Venue:
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Year:
2006

Abstract

Multiple applications that execute concurrently on heterogeneous platforms compete for CPU and network resources. In this paper we consider the problem of scheduling applications to ensure fair and efficient execution on a distributed network of processors. We limit our study to the case where communication is restricted to a tree embedded in the network, and the applications consist of a large number of independent tasks that originate at the tree's root. The tasks of a given application all have the same computation and communication requirements, but these requirements can vary from one application to another. Each application is given a weight that quantifies its relative value. The goal of scheduling is to maximize throughput while executing tasks from the applications in the same ratio as their weights. We can find the optimal asymptotic rates by solving a linear program that expresses all necessary problem constraints, and we show how to construct a periodic schedule. For single-level trees, the solution is characterized by processing tasks with larger communication-to-computation ratios at children with larger bandwidths. For multi-level trees, this approach requires global knowledge of all application and platform parameters. For large-scale platforms, such global coordination by a centralized scheduler may be unrealistic. Thus, we also investigate decentralized schedulers that use only local information at each participating resource. We assess their performance via simulation and compare it to that of a centralized solution obtained via linear programming. The best of our decentralized heuristics matches the centralized solution's performance on about two-thirds of our test cases but is far worse in a few cases. While our results are based on simplistic assumptions and do not explore all parameters (such as buffer size), they provide insight into the important question of fairly and optimally co-scheduling heterogeneous applications on heterogeneous grids.
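To make the steady-state formulation in the abstract more concrete, the following is a minimal Python sketch for a single-level tree. It does not solve the linear program; it only checks whether a candidate allocation of task rates respects resource constraints of the kind such a program would express, and evaluates the weighted fairness objective for it. The resource model (a per-child link bandwidth, a per-child compute speed, and a bound on the root's total outgoing bandwidth), the names `App`, `Child`, `is_feasible`, and `fair_throughput`, and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class App:
    weight: float  # relative value of the application
    comm: float    # data units sent per task (assumed parameter)
    comp: float    # work units needed per task (assumed parameter)


@dataclass
class Child:
    bandwidth: float  # data units per time unit on the link from the root
    speed: float      # work units per time unit


def is_feasible(rates, apps, children, root_out_bw, eps=1e-9):
    """Check a candidate steady-state allocation.

    rates[i][k] is the number of tasks of application k processed
    per time unit on child i. The constraint set is an assumed,
    simplified model, not the paper's exact linear program.
    """
    total_comm = 0.0
    for child, row in zip(children, rates):
        if any(r < -eps for r in row):
            return False  # rates must be non-negative
        comm = sum(r * a.comm for r, a in zip(row, apps))
        comp = sum(r * a.comp for r, a in zip(row, apps))
        if comm > child.bandwidth + eps:
            return False  # link to this child is saturated
        if comp > child.speed + eps:
            return False  # this child's CPU is saturated
        total_comm += comm
    # Assumed bound on the root's aggregate outgoing traffic.
    return total_comm <= root_out_bw + eps


def fair_throughput(rates, apps):
    """Smallest weighted throughput over all applications."""
    return min(
        sum(row[k] for row in rates) / app.weight
        for k, app in enumerate(apps)
    )


# Hypothetical instance: two applications, two children.
apps = [App(weight=1, comm=4, comp=1), App(weight=2, comm=1, comp=2)]
children = [Child(bandwidth=10, speed=4), Child(bandwidth=2, speed=6)]
# The communication-heavy application runs only on the child with
# the larger bandwidth, in the spirit of the single-level result.
rates = [[2.0, 1.0], [0.0, 2.0]]

print(is_feasible(rates, apps, children, root_out_bw=12))
print(fair_throughput(rates, apps))
```

A linear programming solver would search over all such `rates` matrices for one that maximizes the value returned by `fair_throughput`; the sketch only evaluates a single hand-picked candidate.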