Low-power Branch Target Buffer for Application-Specific Embedded Processors

Authors:
Peter Petrov;Alex Orailoglu
Affiliations:
-;-
Venue:
DSD '03 Proceedings of the Euromicro Symposium on Digital Systems Design
Year:
2003

Citing 0
Cited 8

Lazy BTB: reduce BTB energy consumption using dynamic profiling
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference

Branchless cycle prediction for embedded processors
Proceedings of the 2006 ACM symposium on Applied computing

Block-aware instruction set architecture
ACM Transactions on Architecture and Code Optimization (TACO)

Reducing the Number of Bits in the BTB to Attack the Branch Predictor Hot-Spot
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing

Thrifty BTB: A comprehensive solution for dynamic power reduction in branch target buffers
Microprocessors & Microsystems

Branch target buffer design for embedded processors
Microprocessors & Microsystems

Power-aware BTB for modern processors
Computers and Electrical Engineering

Power-aware branch logic: a hardware based technique for filtering access to branch logic
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation

Abstract

In this paper we present a methodology for a low-power branch identification mechanism, which enables the design of extremely power efficient branch predictors for embedded processors. The proposed technique utilizes application-specific information regarding the control-flow structure of the program major loops. Such information is used to completely eliminate the power hungry Branch Target Buffer (BTB) lookups which normally occur at every execution cycle. Exact application knowledge regarding the control-flow structure of the program obviates the power expensive BTB operations, thus enabling the utilization of contemporary branch predictors in high-end, yet power-sensitive embedded processors. The utilization of exact application knowledge results not only in the complete elimination of the power hungry BTB structure but also in a perfect branch and target address identification. A cost-efficient and programmable hardware architecture for capturing the control-flow structure of the program is presented thereafter. The hardware complexity of the proposed architecture is carefully analyzed in terms of power, performance and area overhead. The proposed technique delivers power reductions in excess of 90% for a set of embedded benchmarks.
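
The idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' hardware architecture, which the abstract does not detail. It only contrasts a conventional front end that probes the BTB on every fetch with a hypothetical preloaded table of the loop's branch addresses and targets. All names (run_loop, branch_table, and so on) and the loop layout are assumptions made for illustration, and the model counts lookups, not power.

```python
# Toy model only: the loop layout, names, and counts are illustrative assumptions.
INSTR_BYTES = 4  # assumed fixed-width instructions


def run_loop(start, branch_pc, iterations):
    """Return the fetch PCs of one loop whose backward branch sits at branch_pc."""
    pcs = []
    for _ in range(iterations):
        pc = start
        while pc <= branch_pc:
            pcs.append(pc)
            pc += INSTR_BYTES
    return pcs


def conventional_lookups(trace):
    """Baseline front end: the BTB is probed once per fetched instruction."""
    return len(trace)


def table_driven_identification(trace, branch_table):
    """Identify branches from application-specific knowledge instead of the BTB.

    branch_table maps each known branch address in the loop to its target.
    Returns (number of BTB lookups, list of (branch_pc, target) identified).
    """
    btb_lookups = 0  # the BTB is never probed in this model
    identified = [(pc, branch_table[pc]) for pc in trace if pc in branch_table]
    return btb_lookups, identified


# Hypothetical loop: 8 instructions at 0x100..0x11C, executed 10 times.
trace = run_loop(start=0x100, branch_pc=0x11C, iterations=10)
branch_table = {0x11C: 0x100}  # backward branch and its target, known ahead of time

baseline = conventional_lookups(trace)
lookups, identified = table_driven_identification(trace, branch_table)
print(baseline, lookups, len(identified))
```

In this model the baseline performs one BTB lookup per fetch, while the table-driven variant performs none and still identifies every branch and its target, which mirrors the "perfect branch and target address identification" claim only under the assumption that the loop's control flow is fully known in advance.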