Current superscalar processors access the branch target buffer (BTB) early to anticipate branch and jump target addresses. These accesses are frequent and aggressive, since the BTB is accessed every cycle for all instructions in the ICache line being fetched. This raises power density, which can create hot spots and thus increase packaging and cooling costs. Power consumption in the BTB comes mostly from its two main fields: the tag and the target address. Reducing the length of either field reduces power consumption, silicon area, and access time. This paper analyzes to what extent the tag and target address lengths can be reduced to benefit dynamic and static power consumption, silicon area, and access time while sustaining performance. Experimental results show that the tag length can be reduced by about half and the target address by one byte with no performance loss. BTB peak power savings reach about 35% when both reductions are combined, thus effectively attacking the hot spot.
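The two reductions described above can be illustrated with a minimal Python sketch. It is not the paper's design: the direct-mapped organization, the 32-bit address width, the 4-byte instruction alignment, the specific field widths, and the names `ReducedBTB`, `update`, `lookup`, and `bits_per_entry` are all assumptions chosen for illustration. The sketch stores only the low-order tag bits and the low-order target bits, and rebuilds the missing high-order target bits from the branch's own PC.

```python
class ReducedBTB:
    """Direct-mapped BTB with partial tags and truncated targets (illustrative only)."""

    def __init__(self, index_bits=9, tag_bits=10, target_bits=24):
        # All widths are hypothetical; a full tag here would be 32 - 2 - 9 = 21 bits.
        self.index_bits = index_bits
        self.tag_bits = tag_bits
        self.target_bits = target_bits
        self.entries = [None] * (1 << index_bits)

    def _split(self, pc):
        word = pc >> 2  # assumes 4-byte aligned instructions
        index = word & ((1 << self.index_bits) - 1)
        # Keep only the low-order tag bits; the dropped bits can cause false hits.
        tag = (word >> self.index_bits) & ((1 << self.tag_bits) - 1)
        return index, tag

    def update(self, pc, target):
        index, tag = self._split(pc)
        # Store only the low-order target bits.
        low = target & ((1 << self.target_bits) - 1)
        self.entries[index] = (tag, low)

    def lookup(self, pc):
        index, tag = self._split(pc)
        entry = self.entries[index]
        if entry is None or entry[0] != tag:
            return None  # miss
        # Rebuild the dropped high-order target bits from the branch's PC.
        high = (pc >> self.target_bits) << self.target_bits
        return high | entry[1]

    def bits_per_entry(self):
        return self.tag_bits + self.target_bits


btb = ReducedBTB()
btb.update(0x00401000, 0x00401F40)
hit = btb.lookup(0x00401000)         # nearby target is reconstructed exactly
miss = btb.lookup(0x00401004)        # different index, nothing stored there
false_hit = btb.lookup(0x00601000)   # same index and partial tag: aliasing
```

With these assumed widths an entry needs 34 bits instead of 53 (21-bit tag plus 32-bit target). The cost is visible in `false_hit`: a branch whose address differs only in the dropped tag bits is treated as a hit, and a target that lies outside the PC's 16 MB region would be reconstructed incorrectly. Whether such cases are rare enough to sustain performance is the kind of question the experiments described above address.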