Performance and reliability are both of great importance in microprocessor design. Recently, the replication cache has been proposed to enhance data cache reliability against soft errors. The replication cache is a small fully associative cache that stores a replica of every write to the L1 data cache. In addition to enhancing data reliability, this paper proposes several cost-effective techniques that exploit the replication cache to improve the performance of multiple-issue microprocessors. The idea is to use the replication cache to increase cache bandwidth through dual load and to reduce the L1 data cache miss rate through partial victim caching. Building on these two schemes, we also propose a hybrid approach that combines the benefits of dual load and partial victim caching to improve performance further. Our experimental results show that exploiting a replication cache with only 8 entries can improve performance by 13.0% on average without compromising the enhanced data integrity.
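The dual-load idea can be illustrated with a toy model. The sketch below is not the paper's simulator: the class and function names are hypothetical, the LRU replacement policy is an assumption, and the pairing rule (two adjacent loads complete in one cycle when at least one of them hits in the replication cache, so the other can use the single L1 port) is a simplified reading of the abstract.

```python
from collections import OrderedDict


class ReplicationCache:
    """Toy fully associative cache holding a replica of each written address.

    The LRU replacement policy is an assumption; the abstract does not name one.
    """

    def __init__(self, entries=8):
        self.entries = entries
        self.data = OrderedDict()  # address -> value, least recently written first

    def write(self, addr, value):
        # Every write to the L1 data cache also deposits a replica here.
        self.data[addr] = value
        self.data.move_to_end(addr)
        if len(self.data) > self.entries:
            self.data.popitem(last=False)  # evict the oldest replica

    def hit(self, addr):
        return addr in self.data


def cycles_for_loads(loads, rc):
    """Count cycles needed to serve `loads` with a single-ported L1.

    Assumed dual-load rule: two adjacent loads finish in the same cycle
    when at least one of them hits in the replication cache.
    """
    cycles = 0
    i = 0
    while i < len(loads):
        cycles += 1
        pair = loads[i:i + 2]
        if len(pair) == 2 and any(rc.hit(a) for a in pair):
            i += 2  # one load served by the replication cache, one by L1
        else:
            i += 1  # only the L1 port is usable this cycle
    return cycles


rc = ReplicationCache(entries=2)
for addr in (0x10, 0x20, 0x30):  # 0x10 is evicted by the third write
    rc.write(addr, value=0)

print(cycles_for_loads([0x20, 0x40, 0x50, 0x60], rc))
```

In this toy run the first two loads share a cycle because `0x20` has a replica, while the remaining two loads each need their own cycle, so four loads take three cycles instead of four. Partial victim caching is not modeled here.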