(N, K) Concept Fault Tolerance
IEEE Transactions on Computers - The MIT Press scientific computation series
The MAFT Architecture for Distributed Fault Tolerance
IEEE Transactions on Computers - Fault-Tolerant Computing
The Byzantine Generals Problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
An Object-Oriented Fault-Tolerance Framework based on Specialization Techniques
WORDS '97 Proceedings of the 3rd Workshop on Object-Oriented Real-Time Dependable Systems - (WORDS '97)
OPODIS '09 Proceedings of the 13th International Conference on Principles of Distributed Systems
Computers are being used to achieve increasingly sophisticated control of large and complex systems. Many of these systems require a large shared state-space or database, so handling real-time concurrent accesses to a shared database is an essential feature of modern fault-tolerant systems. Many fault-tolerant systems, such as MAFT (Multicomputer Architecture for Fault Tolerance), FTP (Fault-Tolerant Processor), FTPP (Fault-Tolerant Parallel Processors) and Delta-4, have been implemented to tolerate various types of failures uniformly. However, most of these either lack the notion of a shared state-space or do not efficiently support parallel tasks that concurrently access a shared state-space. We use a processor-specialization approach to increase the effectiveness of replication and, consequently, achieve cost-effective fault tolerance in such systems. The SNMR (specialized N-modular redundancy) protocol has been developed based on these concepts. Compared to many existing Byzantine-resilient systems, the SNMR approach incurs less overhead and can be easily parameterized to fit various fault models.