Register files in modern embedded processors account for a substantial share of energy consumption because of their large switching capacitance and long working time. For some embedded processors, on average 25% of the registers account for 83% of the register file accessing time. This motivates us to partition the register file into hot and cold regions, with the most frequently used registers placed in the hot region and the rarely accessed ones in the cold region. We employ bit-line splitting and drowsy register cell techniques to reduce the overall register file accessing power. We propose a novel approach that partitions the registers so as to achieve the largest power saving. We formulate register file partitioning as a graph partitioning problem and apply an effective algorithm to obtain the optimal result. We evaluate our algorithm on MiBench and SPEC2000 applications on the SimpleScalar PISA system, achieving average savings of 58.3% and 54.4%, respectively, over the accessing power of the nonpartitioned register file. The area overhead is negligible, and the execution time overhead is acceptable (5.5% for MiBench and 2.4% for SPEC2000). Further evaluation of MiBench applications is performed on Alpha and x86 systems.
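The hot/cold idea can be illustrated with a minimal sketch. It is not the graph partitioning algorithm described above, whose details the abstract does not give. It is a simple frequency ranking that assumes a register access trace is available and uses access counts as a stand-in for accessing time. The function name `partition_registers`, the `hot_fraction` parameter, and the trace format are hypothetical and chosen only for illustration.

```python
from collections import Counter


def partition_registers(access_trace, hot_fraction=0.25):
    """Split registers into hot and cold sets by access count.

    access_trace: iterable of register names, one entry per access
                  (hypothetical format; registers that never appear
                  in the trace are not considered).
    hot_fraction: share of the observed registers placed in the hot
                  region (assumed value, echoing the 25% figure).

    Returns (hot, cold, coverage), where coverage is the fraction of
    all accesses that fall in the hot region.
    """
    counts = Counter(access_trace)
    if not counts:
        return [], [], 0.0

    # Rank by descending access count; break ties by name so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda reg: (-counts[reg], reg))

    # Keep at least one register in the hot region.
    hot_size = max(1, round(len(ranked) * hot_fraction))
    hot, cold = ranked[:hot_size], ranked[hot_size:]

    total = sum(counts.values())
    coverage = sum(counts[reg] for reg in hot) / total
    return hot, cold, coverage
```

A trace in which a few registers dominate yields a small hot set with high coverage, which is the skew that motivates the partitioning. A real partitioner would also have to weigh the cost of waking drowsy cells in the cold region, which this sketch ignores.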