One way of dealing with the transient faults that will affect the interconnection network of future large-scale Chip Multiprocessor (CMP) systems is to extend the cache coherence protocol. Fault tolerance at the level of the cache coherence protocol has been shown to achieve very low performance overhead in the absence of faults while still supporting very high fault rates. In this work, we compare two previously proposed fault-tolerant cache coherence protocols in a common framework and present a new one based on the cache coherence protocol used in AMD Opteron processors. We also thoroughly evaluate the performance of the three protocols, show how to adjust their fault tolerance parameters to achieve a desired level of fault tolerance, and measure the overhead incurred to support very high transient fault rates.