This work introduces Check-on-Write, a memory array error protection approach that enables a trade-off between a memory array's fault coverage and its energy consumption. The approach checks a stored value for errors before it is overwritten rather than, as is currently done, when it is read (check-on-read). The aim is to reduce the number and energy of error code checks. This lazy protection approach can be used for caches in systems that support failure-atomicity, which allows recovery from state corrupted by a fault. The paper proposes and evaluates an adaptive memory protection scheme that supports both check-on-read and check-on-write and switches between the two protection modes depending on the energy to be saved and the fault-coverage requirements. Experimental analysis shows that the technique reduces the average dynamic energy of the L1 instruction cache tag and data arrays by 18.6% and 17.7%, respectively. For the L1 data cache, the corresponding reductions are 17.2% and 2.9%, and for the L2 tag array the reduction is 13.4%. The paper also quantifies the implications of the proposed scheme for fault coverage by analyzing the mean-time-to-failure as a function of the transient failure rate.
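The difference between the two protection modes can be illustrated with a small software model. The following is a minimal sketch, not the hardware design evaluated in the paper: the class `ProtectedArray`, the helper `count_checks`, the access trace, and the use of a single even-parity bit as the error code are all assumptions made for illustration. The sketch only counts how many error code checks each mode performs and shows when a flipped bit is detected.

```python
from enum import Enum


class Mode(Enum):
    CHECK_ON_READ = "check-on-read"
    CHECK_ON_WRITE = "check-on-write"


def parity(value: int) -> int:
    """Even-parity bit of a non-negative integer (stand-in error code)."""
    return bin(value).count("1") % 2


class ProtectedArray:
    """Hypothetical toy model of an error-protected memory array."""

    def __init__(self, size: int, mode: Mode):
        self.mode = mode
        self.data = [0] * size
        self.code = [parity(0)] * size
        self.checks = 0            # error code checks performed so far
        self.errors_detected = 0   # checks that found a mismatch

    def _check(self, index: int) -> None:
        self.checks += 1
        if parity(self.data[index]) != self.code[index]:
            self.errors_detected += 1

    def read(self, index: int) -> int:
        # Check-on-read verifies the value every time it is consumed.
        if self.mode is Mode.CHECK_ON_READ:
            self._check(index)
        return self.data[index]

    def write(self, index: int, value: int) -> None:
        # Check-on-write verifies the old value once, just before it is lost.
        if self.mode is Mode.CHECK_ON_WRITE:
            self._check(index)
        self.data[index] = value
        self.code[index] = parity(value)

    def inject_bit_flip(self, index: int, bit: int) -> None:
        """Model a transient fault by flipping one stored data bit."""
        self.data[index] ^= 1 << bit


def count_checks(mode: Mode, trace) -> int:
    """Replay a trace of ("r", index) / ("w", index, value) accesses."""
    array = ProtectedArray(size=4, mode=mode)
    for op in trace:
        if op[0] == "r":
            array.read(op[1])
        else:
            array.write(op[1], op[2])
    return array.checks


# Assumed example trace: one entry written, read three times, then overwritten.
trace = [("w", 0, 5), ("r", 0), ("r", 0), ("r", 0), ("w", 0, 7)]
reads_mode_checks = count_checks(Mode.CHECK_ON_READ, trace)
writes_mode_checks = count_checks(Mode.CHECK_ON_WRITE, trace)
```

In this model a read-dominated trace triggers fewer checks under check-on-write, which is the source of the energy saving. The cost is that a corrupted value can be read before the fault is noticed at the next overwrite, which is why the approach relies on failure-atomicity to recover the corrupted state.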