Fine-grain CAM-tag cache resizing using miss tags
Proceedings of the 2002 international symposium on Low power electronics and design
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Hybrid cache architecture with disparate memory technologies
Proceedings of the 36th annual international symposium on Computer architecture
Proceedings of the 47th Design Automation Conference
An energy-efficient adaptive hybrid cache
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Processor caches with multi-level spin-transfer torque ram cells
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Power-aware variable partitioning for DSPs with hybrid PRAM and DRAM main memory
Proceedings of the 48th Design Automation Conference
A reuse-aware prefetching scheme for scratchpad memory
Proceedings of the 48th Design Automation Conference
C1C: A configurable, compiler-guided STT-RAM L1 cache
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
In this paper, a combined static and dynamic scheme is proposed to optimize the block placement for endurance and energy-efficiency in a hybrid SRAM and STT-RAM cache. With the proposed scheme, STT-RAM endurance is maximized while performance is maintained. We use the compiler to provide static hints to guide initial data placement, and use the hardware to correct the hints based on the run-time cache behavior. Experimental results show that the combined scheme improves the endurance by 23.9x and 5.9x compared to pure static and pure dynamic optimizations respectively. Furthermore, the system energy can be reduced by 17% compared to pure dynamic optimization through minimizing STT-RAM writes.