Partial resolution in branch target buffers
Proceedings of the 28th annual international symposium on Microarchitecture
Assigning confidence to conditional branch predictions
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Influence of compiler optimizations on system power
Proceedings of the 37th Annual Design Automation Conference
Summary cache: a scalable wide-area web cache sharing protocol
IEEE/ACM Transactions on Networking (TON)
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
Bloom filtering cache misses for accurate data speculation and prefetching
ICS '02 Proceedings of the 16th international conference on Supercomputing
Branch Target Buffer Design and Optimization
IEEE Transactions on Computers
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Energy and Performance Improvements in Microprocessor Design Using a Loop Cache
ICCD '99 Proceedings of the 1999 IEEE International Conference on Computer Design
Branch Predictor Prediction: A Power-Aware Branch Predictor for High-Performance Processors
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Branch prediction on demand: an energy-efficient solution
Proceedings of the 2003 international symposium on Low power electronics and design
Power Savings in Embedded Processors through Decode Filer Cache
Proceedings of the conference on Design, automation and test in Europe
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Power Issues Related to Branch Prediction
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Design of a Predictive Filter Cache for Energy Savings in High Performance Processor Architectures
ICCD '01 Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors
Low-power Branch Target Buffer for Application-Specific Embedded Processors
DSD '03 Proceedings of the Euromicro Symposium on Digital Systems Design
Power-Aware Branch Prediction: Characterization and Design
IEEE Transactions on Computers
SEPAS: a highly accurate energy-efficient branch predictor
Proceedings of the 2004 international symposium on Low power electronics and design
A static and dynamic energy reduction technique for I-cache and BTB in embedded processors
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Using a serial cache for energy efficient instruction fetching
Journal of Systems Architecture: the EUROMICRO Journal
Lazy BTB: reduce BTB energy consumption using dynamic profiling
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Branchless cycle prediction for embedded processors
Proceedings of the 2006 ACM symposium on Applied computing
POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
Power efficient branch prediction through early identification of branch addresses
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Asymmetrically Banked Value-Aware Register Files
ISVLSI '07 Proceedings of the IEEE Computer Society Annual Symposium on VLSI
Branch target buffer design for embedded processors
Microprocessors & Microsystems
Low power branch prediction for embedded application processors
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
Enabling large decoded instruction loop caching for energy-aware embedded processors
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Extending the cell SPE with energy efficient branch prediction
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
DLIC: Decoded loop instructions caching for energy-aware embedded processors
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
We propose Thrifty BTB, a mechanism to reduce the dynamic power dissipated by the BTB. We studied two mechanisms that reduce dynamic power dissipation. The first one is a serial-BTB configuration. The second mechanism is the filter-BTB, a combination of a low power counting Bloom filter placed in front of a conventional BTB. We also studied the effect of placing a small 32 entry direct-mapped BTB, functioning as a bypass, in parallel with the first two mechanisms. The filter-BTB reduces the number of lookups relative to a conventional BTB and the dynamic power dissipated. The serial-BTB variant only accesses the data array of the BTB upon a hit, therefore for most of the accesses the actual power dissipated is only what is dissipated by accessing the tag array. The bypass is used in parallel to either the filter- or the serial-BTB and reduces the performance cost by providing a low latency response in case of a hit. By integrating these mechanisms into a BTB design we achieve an average reduction of 51% in the dynamic power dissipation of the BTB. These benefits come at a small performance cost that is on average slightly less than 1.2%. The energy delay product was reduced by an average of 50%.