We propose a novel architecture for a fault-tolerant multiprocessor environment. The multiprocessor organization is assumed to consist of a pool of active processing modules and either a small number of spare modules or active modules with some spare processing capacity. A fault-tolerance scheme is developed for duplex systems using checkpoints. Unlike traditional checkpointing schemes, our scheme requires no rollbacks to recover from single faults. The objective is to achieve the performance of a triple modular redundant (TMR) system using only duplex redundancy.
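To make the idea of recovering without rollback concrete, the following is a minimal sketch of one way such a scheme could behave. It is not the algorithm as specified in the paper. It assumes a single transient fault, a deterministic workload, and a spare that re-executes the disputed interval while the duplex pair keeps moving forward. The names `compute`, `diagnose`, and `run`, and the bit-flip fault model, are hypothetical and introduced only for illustration.

```python
from typing import List, Optional, Tuple


def compute(state: int, interval: int) -> int:
    """Toy deterministic workload for one checkpoint interval (assumption)."""
    return (state * 31 + interval) % 1_000_003


def diagnose(pending, a: int, b: int) -> Tuple[int, str]:
    """Spare re-executes the disputed interval from the last agreed state.

    Returns the current state of the module that was correct, plus the
    name of the module diagnosed as faulty.
    """
    start, j, a_j, b_j = pending
    s = compute(start, j)           # third opinion, as in a 2-of-3 vote
    if s == a_j:
        return a, "B"
    if s == b_j:
        return b, "A"
    raise RuntimeError("more than one fault; outside this sketch's scope")


def run(num_intervals: int, fault: Optional[Tuple[str, int]] = None):
    """Duplex pair (A, B) with checkpoint comparison and no rollback.

    fault = (module, interval) flips one bit in that module's result once.
    Returns (final_state, list of (faulty_module, interval) diagnoses).
    """
    a = b = 0
    agreed = 0                      # last checkpoint both modules agreed on
    pending = None                  # mismatch awaiting the spare's verdict
    diagnoses: List[Tuple[str, int]] = []
    for k in range(num_intervals):
        # A and B always execute the next interval; neither is rolled back.
        a, b = compute(a, k), compute(b, k)
        if fault == ("A", k):
            a ^= 1
        elif fault == ("B", k):
            b ^= 1
        if pending is not None:
            # The spare finished interval k-1 while A and B ran interval k.
            good, faulty = diagnose(pending, a, b)
            diagnoses.append((faulty, pending[1]))
            a = b = good            # faulty module copies the good state
            pending = None
        if a == b:
            agreed = a
        else:
            pending = (agreed, k, a, b)
    if pending is not None:         # mismatch detected at the final checkpoint
        good, faulty = diagnose(pending, a, b)
        diagnoses.append((faulty, pending[1]))
        a = b = good
    return a, diagnoses
```

In this sketch the duplex pair executes each interval exactly once even when a fault occurs. The only re-execution is done by the spare, which is how a duplex system could approach the throughput of a TMR system for single faults under the stated assumptions.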