In this paper, we propose an alternative branch target buffer (BTB) design, called the lazy BTB, that reduces BTB energy consumption by filtering out redundant lookups. The most distinctive feature of the lazy BTB is that it dynamically profiles taken traces during program execution. A traditional BTB must be looked up on every instruction fetch; by introducing an additional field that records trace information, our design achieves one BTB lookup per taken trace. Experimental results show that, with negligible performance degradation, the lazy BTB reduces BTB energy consumption by about 77% on average for the MediaBench applications.
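The lookup-filtering idea can be illustrated with a small counting model. This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say what the additional field contains, so here it is assumed to hold the number of instructions between a taken branch's target and the next taken branch. The names `Entry`, `trace_len`, and `count_lookups` are hypothetical, and the model counts lookups only (no capacity limits, mispredictions, or energy).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    target: int
    # Hypothetical extra field (an assumption, not from the paper):
    # number of instructions fetched after this taken branch and
    # before the next taken branch.
    trace_len: Optional[int] = None


def count_lookups(stream: List[Tuple[int, Optional[int]]], lazy: bool) -> int:
    """Count BTB lookups over a fetch stream of (pc, taken_target_or_None)."""
    btb: Dict[int, Entry] = {}
    lookups = 0
    skip = 0                    # fetches that may still bypass the BTB
    prev: Optional[int] = None  # pc of the taken branch that began this trace
    run = 0                     # instructions fetched since that branch

    for pc, target in stream:
        if lazy and skip > 0:
            skip -= 1           # inside a profiled trace: no BTB access
        else:
            lookups += 1        # conventional behaviour: look up on fetch

        if target is None:      # not a taken branch: the trace continues
            run += 1
            continue

        # A taken branch ends the current trace; record its length.
        if prev is not None:
            btb[prev].trace_len = run
        entry = btb.setdefault(pc, Entry(target))
        entry.target = target
        # If the following trace is already profiled, skip that many lookups.
        skip = entry.trace_len or 0
        prev, run = pc, 0

    return lookups


# Toy loop: pcs 0..2 fall through, pc 3 branches back to 0; ten iterations.
loop = [(0, None), (1, None), (2, None), (3, 0)] * 10
conventional = count_lookups(loop, lazy=False)
filtered = count_lookups(loop, lazy=True)
```

On this toy loop the conventional count is one lookup per fetch, while the lazy variant needs two iterations to profile the trace and then performs a single lookup per iteration. These counts describe only the toy model and say nothing about the energy figures reported for MediaBench.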