B2P2: bounds based procedure placement for instruction TLB power reduction in embedded systems

Authors:
Reiley Jeyapaul;Aviral Shrivastava
Affiliations:
Arizona State University, Tempe, AZ;Arizona State University, Tempe, AZ
Venue:
Proceedings of the 13th International Workshop on Software & Compilers for Embedded Systems
Year:
2010

Citing 21
Cited 0

The design and implementation of PowerMill

ISLPED '95 Proceedings of the 1995 international symposium on Low power design
Code placement techniques for cache miss rate reduction

ACM Transactions on Design Automation of Electronic Systems (TODAES)
The SimpleScalar tool set, version 2.0

ACM SIGARCH Computer Architecture News
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors

Proceedings of the 2002 international symposium on Low power electronics and design
Computers and Intractability: A Guide to the Theory of NP-Completeness

Computers and Intractability: A Guide to the Theory of NP-Completeness
Virtual-Address Caches Part 1: Problems and Solutions in Uniprocessors

IEEE Micro
A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness

CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Compiler-Directed Array Interleaving for Reducing Energy in Multi-Bank Memories

ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
Reducing translation lookaside buffer active power

Proceedings of the 2003 international symposium on Low power electronics and design
A selective filter-bank TLB system

Proceedings of the 2003 international symposium on Low power electronics and design
Instruction Scheduling for Low Power

Journal of VLSI Signal Processing Systems
Optimizing instruction TLB energy using software and hardware techniques

ACM Transactions on Design Automation of Electronic Systems (TODAES)
Compiler-Directed Code Restructuring for Reducing Data TLB Energy

CODES+ISSS '04 Proceedings of the international conference on Hardware/Software Codesign and System Synthesis: 2004
A Low Power TLB Structure for Embedded Systems

IEEE Computer Architecture Letters
MiBench: A free, commercially representative embedded benchmark suite

WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
An ultra low-power TLB design

Proceedings of the conference on Design, automation and test in Europe: Proceedings
Dynamic code management: improving whole program code locality in managed runtimes

Proceedings of the 2nd international conference on Virtual execution environments
Energy-efficient instruction scheduling utilizing cache miss information

MEDEA '05 Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications , systems and architecture
Compiler-directed physical address generation for reducing dTLB power

ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Dynamic scratchpad memory management for code in portable systems with an MMU

ACM Transactions on Embedded Computing Systems (TECS)
Code Transformations for TLB Power Reduction

VLSID '09 Proceedings of the 2009 22nd International Conference on VLSI Design

Abstract

High-performance embedded processors are equipped with a Translation Look-aside Buffer (TLB), which is a key ingredient of efficient and speedy virtual memory management. The TLB, though small, is accessed frequently; it therefore not only consumes significant energy but is also one of the important thermal hot-spots in the processor. Among the many circuit and microarchitectural techniques proposed to reduce TLB power consumption, the Use-Last TLB is a very efficient one, in which power is consumed only when different pages are accessed in succession, i.e., when there is a page-switch [26]. Though the Use-Last technique is effective in reducing i-TLB power, its effectiveness can be improved further by changing the relative code placement of the program. In this work, we formulate the code placement problem of minimizing the page-switches in a program. We prove that this problem is NP-complete and propose a Bounds Based Procedure Placement (B2P2) heuristic to efficiently reduce the program's page-switches. Our procedure placement technique delivers an average of 76% reduction in the instruction-TLB power with negligible [...] Use-Last TLB architecture alone.
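
To make the page-switch metric concrete, the following is a minimal sketch of how page-switches could be counted for a given procedure placement. It is not the B2P2 heuristic, which the abstract does not describe. The 4 KiB page size, the back-to-back layout, the procedure sizes, the trace, and the names `layout` and `count_page_switches` are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes, not taken from the paper


def layout(order, sizes, base=0):
    """Place procedures back to back in the given order (hypothetical helper).

    Returns a dict mapping each procedure name to its start address.
    """
    addr, starts = base, {}
    for name in order:
        starts[name] = addr
        addr += sizes[name]
    return starts


def count_page_switches(trace, starts, page_size=PAGE_SIZE):
    """Count successive instruction fetches that land on different pages.

    trace is a sequence of (procedure, byte offset) pairs, one per fetch.
    """
    switches = 0
    prev_page = None
    for proc, offset in trace:
        page = (starts[proc] + offset) // page_size
        if prev_page is not None and page != prev_page:
            switches += 1
        prev_page = page
    return switches


# Illustrative data: "main" repeatedly calls "leaf"; "helper" is never executed.
sizes = {"main": 3000, "helper": 3000, "leaf": 1000}
trace = [("main", 0), ("leaf", 0)] * 100

# Placement A puts "helper" between caller and callee, so "leaf" starts on page 1.
switches_a = count_page_switches(trace, layout(["main", "helper", "leaf"], sizes))
# Placement B puts "leaf" right after "main", so both fit on page 0.
switches_b = count_page_switches(trace, layout(["main", "leaf", "helper"], sizes))

print(switches_a, switches_b)  # 199 0
```

In this toy case, reordering the procedures removes every page-switch. Under the Use-Last model described above, where a lookup consumes power only on a page-switch, that is the quantity a placement technique would aim to reduce.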