A branch target buffer (BTB) can reduce the performance penalty of branches in pipelined processors by predicting the path of the branch and caching information used by the branch. Two major issues in designing a BTB that achieves maximum performance with a limited number of bits allocated to its implementation are discussed. The first is BTB management. A method for discarding branches from the BTB is examined. This method discards the branch with the smallest expected value for improving performance; it outperforms the least recently used (LRU) strategy by a small margin, at the cost of additional complexity. The second issue is what information to store in the BTB. A BTB entry can consist of one or more of the following: a branch tag, prediction information, the branch target address, and instructions at the branch target. Various BTB designs, with one or more of these fields, are evaluated and compared.
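To make the two design issues concrete, the following is a minimal Python sketch of a small fully associative BTB whose entries hold a tag, prediction information, and a target address, with a pluggable discard policy. It is an illustration under stated assumptions, not the design evaluated in the paper: the names `BTBEntry`, `BTB`, and `taken_hits` are hypothetical, the 2-bit saturating counter is one common choice of prediction information, and the "value" policy uses a simple count of useful hits as a stand-in, since the abstract does not specify how the expected value is estimated.

```python
from dataclasses import dataclass


@dataclass
class BTBEntry:
    tag: int             # branch address (full address used as tag, for simplicity)
    target: int          # cached branch target address
    counter: int = 2     # 2-bit saturating counter; >= 2 predicts taken
    last_used: int = 0   # timestamp of last update, used by LRU
    taken_hits: int = 0  # assumed value estimate: correct taken predictions supplied


class BTB:
    """Small fully associative BTB with a pluggable discard policy."""

    def __init__(self, capacity, policy="lru"):
        self.capacity = capacity
        self.policy = policy  # "lru" or "value" (hypothetical names)
        self.entries = {}     # tag -> BTBEntry
        self.clock = 0

    def predict(self, pc):
        """Return the predicted target, or None to predict fall-through."""
        entry = self.entries.get(pc)
        if entry is not None and entry.counter >= 2:
            return entry.target
        return None

    def update(self, pc, taken, target):
        """Record the resolved outcome of the branch at pc."""
        self.clock += 1
        entry = self.entries.get(pc)
        if entry is None:
            if not taken:
                return  # assumption: allocate entries only for taken branches
            if len(self.entries) >= self.capacity:
                self._discard()
            entry = BTBEntry(tag=pc, target=target)
            self.entries[pc] = entry
        else:
            # Credit the entry if it would have predicted this outcome correctly.
            if taken and entry.counter >= 2 and entry.target == target:
                entry.taken_hits += 1
            if taken:
                entry.counter = min(3, entry.counter + 1)
                entry.target = target
            else:
                entry.counter = max(0, entry.counter - 1)
        entry.last_used = self.clock

    def _discard(self):
        """Remove one entry according to the configured policy."""
        if self.policy == "lru":
            key = lambda e: e.last_used
        else:
            # Stand-in for "smallest expected value": fewest useful hits,
            # with ties broken by least recent use.
            key = lambda e: (e.taken_hits, e.last_used)
        victim = min(self.entries.values(), key=key)
        del self.entries[victim.tag]


def run(policy):
    """Replay a short branch sequence on a two-entry BTB."""
    btb = BTB(capacity=2, policy=policy)
    btb.update(0x100, True, 0x200)  # allocate branch A
    btb.update(0x100, True, 0x200)  # A predicted correctly once
    btb.update(0x104, True, 0x300)  # allocate branch B
    btb.update(0x108, True, 0x400)  # allocate branch C, forcing a discard
    return btb
```

Calling `run("lru")` discards the branch at `0x100` because it was touched least recently, while `run("value")` keeps it and discards the branch at `0x104`, which has not yet supplied a correct prediction. The extra per-entry bookkeeping (`taken_hits`) hints at the additional complexity the abstract attributes to the value-based method.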