DAGMap: efficient and dependable scheduling of DAG workflow job in Grid

Authors:
Haijun Cao;Hai Jin;Xiaoxin Wu;Song Wu;Xuanhua Shi
Affiliations:
Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Communication Technology Lab, Intel China Research Center, Beijing, China 100080;Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 430074;Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 430074
Venue:
The Journal of Supercomputing
Year:
2010



Abstract

Directed acyclic graphs (DAGs) have been used extensively in Grid workflow modeling. Because Grid resources tend to be heterogeneous and dynamic, efficient and dependable scheduling of workflow jobs is essential. Minimizing job completion time and achieving high resource utilization while also providing fault tolerance is challenging. In this paper, we propose DAGMap, a novel scheduling heuristic based on list scheduling and group scheduling. DAGMap consists of two phases: Static Mapping and Dependable Execution. Its four salient features are: (1) task grouping is based on dependency relationships and task upward priority; (2) critical tasks are scheduled first; (3) Min-Min and Max-Min selective scheduling are used for independent tasks; and (4) a checkpoint server with cooperative checkpointing is designed for dependable execution. Experimental results show that DAGMap achieves better performance than previous algorithms in terms of speedup, efficiency, and dependability.
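
The abstract names Min-Min and Max-Min as the heuristics applied to independent tasks but does not spell them out. The sketch below shows the textbook form of both heuristics for a set of independent tasks, as an illustration only: it is not DAGMap's selective rule, which the abstract does not describe. The function name `list_schedule`, the `etc` matrix layout (estimated time to compute per task and machine), and the `pick_max` flag are hypothetical names introduced here, and the sketch assumes every task has an estimate for every machine.

```python
from typing import Dict, Tuple


def list_schedule(
    etc: Dict[str, Dict[str, float]], pick_max: bool = False
) -> Tuple[Dict[str, str], float]:
    """Map independent tasks to machines with Min-Min (default) or Max-Min.

    etc[task][machine] is the estimated time to compute `task` on `machine`
    (an assumed input format). Returns (task -> machine mapping, makespan).
    """
    machines = sorted({m for row in etc.values() for m in row})
    ready = {m: 0.0 for m in machines}  # time at which each machine is free
    unmapped = set(etc)
    mapping: Dict[str, str] = {}

    while unmapped:
        # Step 1: for each unmapped task, find its minimum completion time
        # and the machine that achieves it.
        best = {}
        for t in sorted(unmapped):
            m = min(machines, key=lambda mach: ready[mach] + etc[t][mach])
            best[t] = (ready[m] + etc[t][m], m)

        # Step 2: Min-Min picks the task whose minimum completion time is
        # smallest; Max-Min picks the task whose minimum is largest.
        chooser = max if pick_max else min
        task = chooser(sorted(best), key=lambda t: best[t][0])

        # Step 3: commit the assignment and update the machine's ready time.
        finish, machine = best[task]
        mapping[task] = machine
        ready[machine] = finish
        unmapped.remove(task)

    return mapping, max(ready.values())


if __name__ == "__main__":
    # Toy estimates: one long task (t3) and two short ones.
    etc = {
        "t1": {"m1": 2, "m2": 4},
        "t2": {"m1": 3, "m2": 1},
        "t3": {"m1": 10, "m2": 12},
    }
    print(list_schedule(etc))                 # Min-Min: makespan 12.0
    print(list_schedule(etc, pick_max=True))  # Max-Min: makespan 10.0
```

On this toy input Max-Min finishes earlier because it places the long task first and packs the short tasks onto the other machine, whereas Min-Min leaves the long task for last. Which heuristic wins depends on the mix of task lengths, which is one plausible motivation for choosing between the two selectively.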