IEEE Transactions on Computers
Adapting to intermittent faults in multicore systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Dynamic heterogeneity and the need for multicore virtualization
ACM SIGOPS Operating Systems Review
Increasing chip density, combined with heightened reliability expectations, has spawned greater interest in fault-tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict bounds on error latency. In addition, it eliminates the bulky state storage hardware required in rollback designs. The result is a more efficient, more broadly applicable approach to fault-tolerant design.