The design and implementation of PowerMill
ISLPED '95 Proceedings of the 1995 international symposium on Low power design
Code placement techniques for cache miss rate reduction
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors
Proceedings of the 2002 international symposium on Low power electronics and design
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness
CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Compiler-Directed Array Interleaving for Reducing Energy in Multi-Bank Memories
ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
Reducing translation lookaside buffer active power
Proceedings of the 2003 international symposium on Low power electronics and design
A selective filter-bank TLB system
Proceedings of the 2003 international symposium on Low power electronics and design
Instruction Scheduling for Low Power
Journal of VLSI Signal Processing Systems
Optimizing instruction TLB energy using software and hardware techniques
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Compiler-Directed Code Restructuring for Reducing Data TLB Energy
CODES+ISSS '04 Proceedings of the international conference on Hardware/Software Codesign and System Synthesis: 2004
A Low Power TLB Structure for Embedded Systems
IEEE Computer Architecture Letters
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Dynamic code management: improving whole program code locality in managed runtimes
Proceedings of the 2nd international conference on Virtual execution environments
Energy-efficient instruction scheduling utilizing cache miss information
MEDEA '05 Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications , systems and architecture
Compiler-directed physical address generation for reducing dTLB power
ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Dynamic scratchpad memory management for code in portable systems with an MMU
ACM Transactions on Embedded Computing Systems (TECS)
Code Transformations for TLB Power Reduction
VLSID '09 Proceedings of the 2009 22nd International Conference on VLSI Design
Hi-index | 0.00 |
High performance embedded processors are equipped with the Translation Look-aside Buffer (TLB) which forms the key ingredient to efficient and speedy virtual memory management. The TLB though small, is frequently accessed, and therefore not only consumes significant energy, but also is one of the important thermal hot-spots in the processor. Among the many circuit and microarchitectural techniques proposed to reduce TLB power consumption, the Use-Last TLB is one very efficient technique in which power is consumed only when different pages are accessed in succession, i.e., when there is a page-switch [26]. Though the Use-Last technique is effective in reducing i-TLB power, there is scope to further improve its effectiveness by changing the relative code placement of the program. In this work, we formulate the code placement problem to minimize the page-switches in a program. We prove that this problem is NP-complete and propose an efficient Bounds Based Procedure Placement (B2P2) heuristic to efficiently reduce the program's page-switches. Our procedure placement technique delivers an average of 76% reduction in the instrucion-TLB power with negligible (Use-Last TLB architecture alone.