As multiprocessor systems become more complex, their reliability will need to increase as well. In this paper we propose a novel technique that is applicable to a wide variety of distributed real-time systems, especially those exhibiting data parallelism. System-level fault tolerance comprises reliability techniques incorporated within the system hardware and software, whereas application-level fault tolerance comprises reliability techniques incorporated within the application software. We assert that, for high reliability, a combination of system-level and application-level fault tolerance works best. In many systems, application-level fault tolerance can bridge the gap when system-level fault tolerance alone does not provide the required reliability. We illustrate this with the RTHT target tracking benchmark and the ABF beamforming benchmark.