We describe the difficulties of handling context switches, exceptions, and interrupts in dual modular redundancy (DMR) architectures. We propose ways to address these problems in a dynamic DMR (DDMR) architecture, providing methods that ensure both cores detect the event, synchronize it to the same instruction, perform a secure context switch, run the correct interrupt service routines, and avoid process termination. DDMR uses a time-division multiplexing (TDM) ring architecture to dynamically connect pairs of cores. We extend this protocol with the message types required to handle interrupts and exceptions. We also propose a more efficient ring architecture that is address-based rather than TDM-based.
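The requirement that both cores take an interrupt at the same instruction can be illustrated with a small sketch. It is not taken from the paper: the `Core` class, the `synchronize_interrupt` function, and the rule of picking the later retired-instruction count are all assumptions chosen only to show why the two cores must agree on a common point before servicing the event.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical model of one core in a DMR pair."""
    name: str
    retired: int  # instructions retired when this core observed the interrupt

    def run_until(self, target: int) -> None:
        # Advance to the agreed instruction count; a core cannot rewind.
        if target < self.retired:
            raise ValueError("cannot rewind a core")
        self.retired = target


def synchronize_interrupt(first: Core, second: Core) -> int:
    """Pick a common instruction at which both cores take the interrupt.

    Assumed policy: the cores exchange their retired-instruction counts
    and the later count wins, so the core that is behind runs forward.
    """
    sync_point = max(first.retired, second.retired)
    first.run_until(sync_point)
    second.run_until(sync_point)
    return sync_point


# The two cores observe the same interrupt a few instructions apart.
core0 = Core("core0", retired=1042)
core1 = Core("core1", retired=1037)
point = synchronize_interrupt(core0, core1)
print(point, core0.retired, core1.retired)
```

In this sketch the core that is behind simply executes up to the agreed count, after which both cores enter the interrupt service routine from identical architectural state. A real design would also have to exchange these counts over the ring, which is where the additional message types come in.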