This paper proposes a fault tolerance service that satisfies QoS requirements in grid computing. The probability of failure in a grid is higher than in traditional parallel computing, and because resource failures can be fatal to job execution, a fault tolerance service is essential. Grid services are also often expected to meet minimum levels of quality of service (QoS) for desirable operation. However, the Globus Toolkit does not provide a fault tolerance service that supports fault detection and fault management while satisfying QoS requirements. To provide such a service, we expand the definition of failure to include failures such as process failure, processor failure, and network failure. We then propose a fault detection service and a fault management service and present simulation results.
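
The expanded failure definition can be illustrated with a minimal Python sketch. The abstract does not describe how the detection service distinguishes the failure types, so everything below is an assumption made for illustration: the `Failure` enum, the `classify` function, and the three boolean probe results are hypothetical names, and the ordering of the checks (network first, then processor, then process) is one plausible choice rather than the paper's method.

```python
from enum import Enum


class Failure(Enum):
    """Failure categories named in the abstract, plus a 'no failure' state."""
    NONE = "none"
    PROCESS = "process failure"
    PROCESSOR = "processor failure"
    NETWORK = "network failure"


def classify(network_reachable: bool,
             host_responding: bool,
             process_alive: bool) -> Failure:
    """Map three hypothetical probe results to a failure category.

    Assumption: probes are checked from the outermost layer inward, so an
    unreachable network masks whatever is happening on the host behind it.
    """
    if not network_reachable:
        return Failure.NETWORK
    if not host_responding:
        return Failure.PROCESSOR
    if not process_alive:
        return Failure.PROCESS
    return Failure.NONE


if __name__ == "__main__":
    # A host that answers probes but whose job process has exited.
    print(classify(network_reachable=True,
                   host_responding=True,
                   process_alive=False).value)
```

Checking the outer layers first avoids reporting a process failure when the process is merely unreachable; a real detection service would also need timeouts and repeated probes before declaring any failure.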