Performance analysis of dynamic workflow scheduling in multicluster grids

Authors:
Ozan Sonmez; Nezih Yigitbasi; Saeid Abrishami; Alexandru Iosup; Dick Epema
Affiliations:
Delft University of Technology, Delft, The Netherlands; Delft University of Technology, Delft, The Netherlands; Ferdowsi University, Mashhad, Iran; Delft University of Technology, Delft, The Netherlands; Delft University of Technology, Delft, The Netherlands
Venue:
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Year:
2010

Abstract

Scientists increasingly rely on the execution of workflows in grids to obtain results from complex mixtures of applications. However, the inherently dynamic nature of grid workflow scheduling, stemming from the unavailability of scheduling information and from resource contention among the (multiple) workflows and the non-workflow system load, may lead to poor or unpredictable performance. In this paper we present a comprehensive and realistic investigation of the performance of a wide range of dynamic workflow scheduling policies in multicluster grids. We first introduce a taxonomy of grid workflow scheduling policies that is based on the amount of dynamic information used in the scheduling process, and map to this taxonomy seven such policies across the full spectrum of information use. Then, we analyze the performance of these scheduling policies through simulations and experiments in a real multicluster grid. We find that there is no single grid workflow scheduling policy with good performance across all the investigated scenarios. We also find from our real system experiments that with demanding workloads, the limitations of the head-nodes of the grid clusters may lead to performance loss not expected from the simulation results. We show that task throttling, that is, limiting the per-workflow number of tasks dispatched to the system, prevents the head-nodes from becoming overloaded while largely preserving performance, at least for communication-intensive workflows.
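
The task throttling idea mentioned in the abstract, limiting how many tasks of one workflow are dispatched at a time, can be illustrated with a small sketch. This is not the paper's implementation: the function name `run_throttled`, the `max_in_flight` parameter, the dictionary-based DAG encoding, and the simplification that all tasks dispatched together also finish together are assumptions made purely for illustration.

```python
from collections import deque


def run_throttled(dag, max_in_flight):
    """Toy model of per-workflow task throttling (illustrative only).

    dag maps each task name to the set of tasks it depends on; every task,
    including every parent, must appear as a key. Tasks are dispatched in
    dependency order, with at most max_in_flight tasks dispatched per round.
    Assumption: all tasks in a round complete before the next round starts.
    Returns the list of dispatch rounds.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    # Copy the parent sets so the caller's DAG is not modified.
    pending_parents = {task: set(parents) for task, parents in dag.items()}

    # Build the reverse edges: parent -> list of children.
    children = {task: [] for task in dag}
    for task, parents in dag.items():
        for parent in parents:
            children[parent].append(task)

    # Tasks without unfinished parents are eligible for dispatch.
    ready = deque(sorted(t for t, p in pending_parents.items() if not p))
    rounds = []
    finished = 0

    while ready:
        # The throttle: never dispatch more than max_in_flight tasks at once.
        batch = [ready.popleft() for _ in range(min(max_in_flight, len(ready)))]
        rounds.append(batch)

        # Completing a task may make its children eligible.
        for task in batch:
            finished += 1
            for child in children[task]:
                pending_parents[child].discard(task)
                if not pending_parents[child]:
                    ready.append(child)

    if finished != len(dag):
        raise ValueError("dependency cycle detected")
    return rounds


# A small fork-join workflow: a -> {b, c, d} -> e
workflow = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"a"}, "e": {"b", "c", "d"}}
print(run_throttled(workflow, max_in_flight=2))
```

With a limit of 2, the three parallel tasks of the fork-join example are spread over two rounds instead of being dispatched at once. In this toy model, that is how a throttle bounds the number of tasks a single workflow hands to the system at any time, at the cost of extra rounds.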