Improving cluster availability using workstation validation
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
A taxonomy and survey of grid resource management systems for distributed computing
Software—Practice & Experience
Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
GridFlow: Workflow Management for Grid Computing
CCGRID '03 Proceedings of the 3st International Symposium on Cluster Computing and the Grid
GridWorkflow: A Flexible Failure Handling Framework for the Grid
HPDC '03 Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing
ICPP '02 Proceedings of the 2002 International Conference on Parallel Processing
Critical event prediction for proactive management in large-scale computer clusters
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Failure Data Analysis of a Large-Scale Heterogeneous Server Environment
DSN '04 Proceedings of the 2004 International Conference on Dependable Systems and Networks
Automatic methods for predicting machine availability in desktop Grid and peer-to-peer systems
CCGRID '04 Proceedings of the 2004 IEEE International Symposium on Cluster Computing and the Grid
Software—Practice & Experience
GNARE: an environment for grid-based high-throughput genome analysis
CCGRID '05 Proceedings of the Fifth IEEE International Symposium on Cluster Computing and the Grid - Volume 01
Grid harvest service: a performance system of grid computing
Journal of Parallel and Distributed Computing
Quantifying Temporal and Spatial Correlation of Failure Events for Proactive Management
SRDS '07 Proceedings of the 26th IEEE International Symposium on Reliable Distributed Systems
Scheduling strategies for mapping application workflows onto the grid
HPDC '05 Proceedings of the High Performance Distributed Computing, 2005. HPDC-14. Proceedings. 14th IEEE International Symposium
Combining Futures and Spot Markets: A Hybrid Market Approach to Economic Grid Resource Management
Journal of Grid Computing
Journal of Grid Computing
ChinaGrid: making grid computing a reality
ICADL'04 Proceedings of the 7th international Conference on Digital Libraries: international collaboration and cross-fertilization
Performance implications of failures in large-scale cluster scheduling
JSSPP'04 Proceedings of the 10th international conference on Job Scheduling Strategies for Parallel Processing
Hi-index | 0.00 |
Due to the highly dynamic feature, dependable workflow scheduling is critical in the Grid environment. Various scheduling algorithms have been proposed, but seldom consider the resource reliability. Current Grid systems mainly exploit fault tolerance mechanism to guarantee the dependable workflow execution, which, however, wastes system resources. The paper proposes a dependable Grid workflow scheduling system (called DGWS). It introduces a Markov Chain-based resource availability prediction model. Based on the model, a reliability cost driven workflow scheduling algorithm is presented. The performance evaluation results, including the simulation on both parametric randomly generated DAGs and two real scientific workflow applications, demonstrate that compared to present workflow scheduling algorithms, DGWS improves the success ratio of tasks and diminishes the makespan of workflow, so improves the dependability of workflow execution in the dynamic Grid environments.