The computational grid, characterized by distributed load sharing, has evolved into a platform for large-scale problem solving. A grid is a collection of heterogeneous resources offering services of varying nature, and jobs may be submitted to any of the participating nodes. Scheduling jobs in such a complex and dynamic environment poses many challenges. Reliability analysis is of paramount importance because a grid involves a large number of resources, any of which may fail at any time. Such failures waste both computational power and the money spent on scarce grid resources. It is therefore desirable to schedule each job in the environment that offers the highest reliability for its execution. This work presents a reliability-based scheduling model for jobs on the computational grid. The model considers the failure rates of both software and hardware grid constituents: the application demanding execution, the nodes executing the job, and the network links supporting data exchange between the nodes. Job allocation under the proposed scheme can be trusted because each job is scheduled on the basis of an a priori reliability computation.
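The idea of scheduling on an a priori reliability estimate can be illustrated with a minimal sketch. The abstract does not give the model's formulas, so the sketch assumes (as an illustration only, not the authors' method) that application, node, and link failures are independent with constant rates, which gives an exponential reliability of the form exp(-(rate × exposure time)). The names `Node`, `job_reliability`, and `pick_most_reliable`, and every parameter, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    failure_rate: float       # assumed node failures per hour
    link_failure_rate: float  # assumed failures per hour of the link to this node
    speed: float              # assumed work units processed per hour


def job_reliability(node: Node, work: float, transfer_hours: float,
                    app_failure_rate: float) -> float:
    """Probability that the job finishes on `node` without any failure.

    Assumes independent, constant-rate (exponential) failures of the
    application, the node, and the network link.
    """
    exec_hours = work / node.speed
    # The application and the node are exposed for the execution time;
    # the link is exposed only while data is being transferred.
    exponent = ((app_failure_rate + node.failure_rate) * exec_hours
                + node.link_failure_rate * transfer_hours)
    return math.exp(-exponent)


def pick_most_reliable(nodes: list[Node], work: float, transfer_hours: float,
                       app_failure_rate: float) -> Node:
    """Return the node with the highest estimated job reliability."""
    return max(
        nodes,
        key=lambda n: job_reliability(n, work, transfer_hours, app_failure_rate),
    )
```

Under these assumptions a slower node can still be the better choice when its failure rate is low enough to offset the longer execution time, which is why the estimate combines failure rates with exposure times rather than ranking nodes by failure rate alone.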