Code placement techniques for cache miss rate reduction
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Data and memory optimization techniques for embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Cache decay: exploiting generational behavior to reduce cache leakage power
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Drowsy caches: simple techniques for reducing leakage power
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Assigning Program and Data Objects to Scratchpad for Energy Reduction
Proceedings of the conference on Design, automation and test in Europe
Low-leakage asymmetric-cell SRAM
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Compiler-Based Register Name Adjustment for Low-Power Embedded Processors
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Reducing data cache leakage energy using a compiler-based approach
ACM Transactions on Embedded Computing Systems (TECS)
A case for asymmetric-cell cache memories
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Leakage-aware intraprogram voltage scaling for embedded processors
Proceedings of the 43rd annual Design Automation Conference
Energy/power breakdown of pipelined nanometer caches (90nm/65nm/45nm/32nm)
Proceedings of the 2006 international symposium on Low power electronics and design
Cache-Aware Scratchpad-Allocation Algorithms for Energy-Constrained Embedded Systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
Share of leakage in cache memories is increasing with technology scaling. Studies show that most stored bits in instruction caches are zero, and hence, asymmetric SRAM cells which dissipate less leakage when storing 0, effectively reduce leakage with negligible performance penalty. We show that by carefully choosing register operands of instructions, it is possible to further increase the number of 0 bits, and hence, increase leakage savings in instruction cache. This compiler technique is performed off-line and introduces absolutely no delay penalty since processor registers are all the same. Experimental results of our benchmarks show up to 33% (averaging 30.35%) improvement in leakage.