Modern embedded processors access the Branch Target Buffer (BTB) every cycle to speculate branch target addresses. Quite often these accesses are unnecessary, because there is no branch instruction among those fetched. In this work we introduce Branchless Cycle Prediction (BLCP) to exploit this design inefficiency. BLCP uses a simple, power-efficient structure to predict, at least one cycle in advance, the cycles in which there is no branch instruction among those fetched. We avoid accessing the BTB during such cycles. We show that, by using BLCP, it is possible to reduce BTB power dissipation by 32% while paying a negligible performance cost (0.2% on average).
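
The abstract does not describe the internal organization of the BLCP structure, so the following is only a minimal sketch of the general idea under stated assumptions: a small table of 2-bit saturating counters, indexed by the current fetch address, that learns whether the next fetch group contained a branch. The class name, table size, indexing function, and confidence threshold are all hypothetical and are not taken from the paper.

```python
class BranchlessCyclePredictor:
    """Hypothetical BLCP-style table (assumed design, not the paper's)."""

    def __init__(self, entries=64, max_count=3):
        self.entries = entries
        self.max_count = max_count          # 2-bit saturating counter
        self.table = [0] * entries

    def _index(self, pc):
        # Assumed indexing: drop the byte offset, keep the low-order bits.
        return (pc >> 2) % self.entries

    def predict_next_branchless(self, pc):
        # Predict "no branch next cycle" only when the counter is saturated.
        return self.table[self._index(pc)] == self.max_count

    def update(self, pc, next_had_branch):
        i = self._index(pc)
        if next_had_branch:
            self.table[i] = 0               # lose confidence immediately
        else:
            self.table[i] = min(self.table[i] + 1, self.max_count)


def simulate(trace, predictor):
    """trace: list of (fetch_pc, has_branch), one entry per fetch cycle.

    Returns (btb_accesses, missed_branches), where a missed branch is a
    cycle in which the BTB was gated although a branch was fetched.
    """
    btb_accesses = 0
    missed_branches = 0
    skip_btb = False                        # decision made one cycle earlier
    prev_pc = None
    for pc, has_branch in trace:
        if skip_btb:
            if has_branch:
                missed_branches += 1        # would cost performance
        else:
            btb_accesses += 1
        if prev_pc is not None:
            # Train the entry that predicted (or could have predicted) this cycle.
            predictor.update(prev_pc, has_branch)
        skip_btb = predictor.predict_next_branchless(pc)
        prev_pc = pc
    return btb_accesses, missed_branches


# A made-up loop of four fetch groups; only the last one holds a branch.
trace = [(0x100, False), (0x110, False), (0x120, False), (0x130, True)] * 10
accesses, missed = simulate(trace, BranchlessCyclePredictor())
print(f"BTB accesses: {accesses} of {len(trace)} cycles, missed branches: {missed}")
```

In this sketch the prediction for a cycle is made during the preceding cycle, which mirrors the "at least one cycle in advance" requirement. Resetting the counter on any observed branch is a conservative choice meant to keep gated-but-needed BTB accesses rare; it says nothing about the power or performance figures reported above, which come from the authors' own evaluation.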