Performance evaluation of the PowerPC 620 microarchitecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Alternative implementations of hybrid branch predictors
Proceedings of the 28th annual international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
The YAGS branch prediction scheme
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
The impact of delay on the design of branch predictors
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Design tradeoffs for the Alpha EV8 conditional branch predictor
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Branch Target Buffer Design and Optimization
IEEE Transactions on Computers
The Alpha 21264: A 500 MHz Out-of-Order Execution Microprocessor
COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference
A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Branch Predictor Prediction: A Power-Aware Branch Predictor for High-Performance Processors
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Applying Decay Strategies to Branch Predictors for Leakage Energy Savings
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Power Issues Related to Branch Prediction
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Improving Branch Prediction Accuracy by Reducing Pattern History Table Interference
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Power efficient branch prediction through early identification of branch addresses
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Computational and storage power optimizations for the O-GEHL branch predictor
Proceedings of the 4th international conference on Computing frontiers
Thrifty BTB: A comprehensive solution for dynamic power reduction in branch target buffers
Microprocessors & Microsystems
A new case for the TAGE branch predictor
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
A power-aware alternative for the perceptron branch predictor
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
Energy-efficient branch prediction with compiler-guided history stack
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Designers have invested much effort in developing accurate branch predictors with short learning periods. Such techniques rely on exploiting complex and relatively large structures. Although exploiting such structures is necessary to achieve high accuracy and fast learning, once the short learning phase is over, a simple structure can efficiently predict the branch outcome for the majority of branches. Moreover, for a large number of branches, once the branch reaches the steady state phase, updating the branch predictor unit is unnecessary since there is already enough information available to the predictor to predict the branch outcome accurately. Therefore, aggressive usage of complex large branch predictors appears to be inefficient since it results in unnecessary energy consumption.In this work we introduce Selective Predictor Access (SEPAS) to exploit this design inefficiency. SEPAS uses a simple power efficient structure to identify well behaved branch instructions that are in their steady state phase. Once such branches are identified, the predictor is no longer accessed to predict their outcome or to update the associated data. We show that it is possible to reduce the number of predictor accesses and energy consumption considerably with a negligible performance loss (worst case 0.25%).