This paper addresses the problem of tolerating both hardware and software faults in independent applications running in a distributed computing environment. Several hybrid fault-tolerant architectures are identified and proposed. Because the characteristics of the operating environment are highly variable and dynamic, the solutions rely mainly on adaptation: redundant programs are executed adaptively so as to minimise hardware resource consumption and shorten response time as far as possible for a required level of fault tolerance. A method is introduced for evaluating the proposed architectures with respect to reliability, resource utilisation and response time, and examples of quantitative evaluations are given.
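
The abstract does not spell out how the adaptive execution of redundant programs works, so the following is only a minimal sketch of one common form of the idea, not the paper's architecture. It assumes that redundant variants are run one at a time and that execution stops as soon as a majority of all available variants agree, so that extra variants consume resources only when a disagreement or crash occurs. The names `adaptive_vote`, `good`, `bad` and `crash` are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, Tuple


def adaptive_vote(
    variants: Sequence[Callable[[int], Hashable]], x: int
) -> Tuple[Optional[Hashable], int]:
    """Run variants in order and stop once a majority of ALL variants agree.

    Returns (agreed result or None, number of variants actually executed).
    Assumption: results are hashable and compared by exact equality.
    """
    needed = len(variants) // 2 + 1  # e.g. 2 out of 3
    votes: Counter = Counter()
    executed = 0
    for variant in variants:
        executed += 1
        try:
            votes[variant(x)] += 1
        except Exception:
            continue  # a crashed variant casts no vote
        value, count = votes.most_common(1)[0]
        if count >= needed:
            return value, executed  # early exit saves the remaining runs
    return None, executed  # no majority: the failure is detected, not masked


# Hypothetical variants used only to exercise the sketch.
def good(x: int) -> int:
    return x * x


def bad(x: int) -> int:
    return x * x + 1  # simulated software fault


def crash(x: int) -> int:
    raise RuntimeError("simulated hardware fault")


if __name__ == "__main__":
    print(adaptive_vote([good, good, good], 3))  # fault-free case
    print(adaptive_vote([good, bad, good], 3))   # one faulty variant
```

In the fault-free case only two of the three variants are executed, which illustrates the trade-off the abstract names: lower resource consumption and shorter response time when no fault occurs, at the price of a longer response when a third run is needed.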