Diverse failures and error conditions in grid environments make node unavailability an increasingly severe problem and pose great challenges to reliable job scheduling. Current job management systems rely mainly on fault recovery mechanisms to guarantee job completion, at the cost of system efficiency. To address these challenges, this paper designs a node TTF (Time To Failure) prediction model and a job completion prediction model. Based on these models, it proposes a dependability-guided job scheduling system, called DGSS, which provides failure-avoidance job scheduling. Experimental results confirm improvements in the dependability of job execution and in the utilization of system resources.
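
The idea of failure-avoidance scheduling can be illustrated with a minimal sketch. This is not the DGSS algorithm, whose prediction models are not given in the abstract. It only assumes that some predictor supplies a TTF estimate per node and that a job's runtime can be estimated from its size. The names Node, predicted_ttf, speed, estimated_runtime, pick_node and safety_margin are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    predicted_ttf: float  # hypothetical: predicted seconds until the node next fails
    speed: float          # hypothetical: work units the node processes per second


def estimated_runtime(work: float, node: Node) -> float:
    """Estimate the seconds needed to run `work` units on `node` (assumed linear)."""
    return work / node.speed


def pick_node(work: float, nodes: Sequence[Node],
              safety_margin: float = 1.2) -> Optional[Node]:
    """Return the fastest node predicted to stay up for the whole job.

    A node qualifies only if its predicted TTF covers the estimated runtime
    scaled by `safety_margin`. Returns None when no node qualifies.
    """
    safe = [
        n for n in nodes
        if n.predicted_ttf >= safety_margin * estimated_runtime(work, n)
    ]
    if not safe:
        return None
    return min(safe, key=lambda n: estimated_runtime(work, n))


# Illustrative values only; they are not measurements from the paper.
nodes = [
    Node("fast-but-flaky", predicted_ttf=100.0, speed=10.0),
    Node("slow-but-stable", predicted_ttf=10_000.0, speed=2.0),
]
chosen = pick_node(work=2_000.0, nodes=nodes)
print(chosen.name if chosen else "no safe node")
```

In this sketch the faster node is skipped for the 2,000-unit job because it is predicted to fail before the job would finish, so the job goes to the slower node instead of relying on recovery after a failure. A short job that fits inside the faster node's predicted TTF would still be placed there.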