As technology scales, leakage energy is expected to become the dominant source of energy consumption. Along with cache memories, branch predictors are among the largest on-chip array structures and consume nontrivial leakage energy. This paper proposes two cost-effective loop-based strategies that reduce branch predictor leakage without affecting prediction accuracy or performance. The loop-based approaches exploit the fact that loops usually contain only a small number of instructions, and hence even fewer branch instructions, while accounting for a significant fraction of execution time. Consequently, all inactive branch predictor entries can be placed in a low-leakage mode during loop execution to reduce leakage energy. The compiler and circuit support needed to implement the proposed leakage-reduction strategies is discussed. Compared with the recently proposed decay-based approach, our experimental results show that the loop-based approach extracts 16.2% more branch predictor dead time on average, yielding greater leakage energy savings without affecting branch prediction accuracy or performance.
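The intuition behind the loop-based idea can be illustrated with a small sketch. It is not the paper's mechanism, which the abstract does not detail. It assumes a hypothetical PC-indexed predictor table (index = word-aligned PC modulo table size) and made-up branch addresses, and it simply counts how many entries a loop's branches touch, treating the rest as candidates for low-leakage mode while the loop runs.

```python
# Illustrative sketch only: the indexing scheme, table size, and branch
# addresses below are assumptions, not values from the paper.

def loop_active_entries(branch_pcs, table_size):
    """Return the predictor entries indexed by the branches inside a loop.

    Assumes a simple PC-indexed table: drop the two low-order bits of the
    PC (word alignment) and take the result modulo the table size.
    """
    return {(pc >> 2) % table_size for pc in branch_pcs}


def dead_fraction(branch_pcs, table_size):
    """Fraction of entries that the loop never touches.

    Under the assumptions above, these entries could sit in a
    low-leakage mode for as long as execution stays inside the loop.
    """
    active = loop_active_entries(branch_pcs, table_size)
    return 1.0 - len(active) / table_size


# Hypothetical loop containing three branch instructions.
LOOP_BRANCHES = [0x4000, 0x4010, 0x4024]
TABLE_SIZE = 1024  # hypothetical number of predictor entries

if __name__ == "__main__":
    active = loop_active_entries(LOOP_BRANCHES, TABLE_SIZE)
    print(f"active entries: {sorted(active)}")
    print(f"dead fraction:  {dead_fraction(LOOP_BRANCHES, TABLE_SIZE):.4f}")
```

With these assumed values, only three of 1024 entries stay active. A predictor indexed with global history would typically touch more entries per branch, so the real saving depends on the predictor organization.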