Performance Optimization of Checkpointing Schemes with Task Duplication
IEEE Transactions on Computers
A Case for Two-Level Recovery Schemes
IEEE Transactions on Computers
This paper considers the reliability of a microprocessor (μP) system in which errors are detected by comparing signatures. The system uses double modular redundancy (DMR); i.e., each task is executed on two processors. A job is divided into tasks, and the signatures are compared at the end of each task. If the signatures do not agree after a task completes, the task is executed again. Using the theory of Markov renewal processes, we derive the mean time and the total number of task executions until a job completes successfully. Moreover, we discuss an optimal policy that minimizes the mean processing time. Numerical examples show that using signatures is effective when jobs are large.
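The trade-off behind such an optimal policy can be illustrated with a minimal sketch. It is not the paper's Markov renewal model; it assumes a simplified setting in which a job of total length `S` is split into `n` equal tasks, each processor suffers errors as an independent Poisson process with rate `lam`, every error causes a signature mismatch, and each signature comparison costs a fixed overhead `C`. All names (`mean_job_time`, `mean_task_executions`, `best_task_count`) and parameters are hypothetical and chosen only for illustration.

```python
import math


def mean_job_time(S, n, lam, C):
    """Mean time to finish a job under the simplified assumptions above.

    A task of length T = S / n succeeds only if neither of the two
    processors has an error, which happens with probability exp(-2*lam*T).
    The number of executions of one task is then geometric with mean
    1 / p_ok, and each execution costs T + C.
    """
    T = S / n
    p_ok = math.exp(-2.0 * lam * T)
    return n * (T + C) / p_ok


def mean_task_executions(S, n, lam):
    """Mean total number of task executions until the job completes."""
    return n * math.exp(2.0 * lam * S / n)


def best_task_count(S, lam, C, n_max=1000):
    """Search 1..n_max for the task count that minimizes the mean job time."""
    return min(range(1, n_max + 1), key=lambda n: mean_job_time(S, n, lam, C))


if __name__ == "__main__":
    S, lam, C = 100.0, 0.01, 0.1  # assumed example values, not from the paper
    n_star = best_task_count(S, lam, C)
    print(n_star, mean_job_time(S, n_star, lam, C), mean_job_time(S, 1, lam, C))
```

Under these assumptions, many short tasks waste time on comparisons while a few long tasks are re-executed often, so the mean job time has an interior minimum. The gain from dividing the job grows with the job length `S`, which is consistent with the abstract's remark about large jobs.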