As the issue rate and pipeline depth of high-performance superscalar processors increase, an excellent branch predictor becomes more vital to delivering the potential performance of a wide-issue, deeply pipelined microarchitecture. We propose a new dynamic branch predictor, Two-Level Adaptive Branch Prediction, that achieves substantially higher accuracy than any other scheme reported in the literature. The mechanism uses two levels of branch history information to make predictions: the history of the last k branches encountered, and the branch behavior for the last s occurrences of the specific pattern of these k branches. We have identified three variations of Two-Level Adaptive Branch Prediction, depending on how finely we resolve the history information gathered. We compute the hardware cost of implementing each of the three variations and use these costs in evaluating their relative effectiveness. We measure the branch prediction accuracy of the three variations, along with several other popular proposed dynamic and static prediction schemes, on the SPEC benchmarks. We show that the average prediction accuracy for Two-Level Adaptive Branch Prediction is 97 percent, while the other known schemes achieve at most 94.4 percent average prediction accuracy. We also measure the effectiveness of different prediction algorithms and of different amounts of history and pattern information, and we measure the cost each variation incurs to reach the same prediction accuracy.
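The two levels of history described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a single global k-bit history register shared by all branches, a 2-bit saturating counter standing in for "the behavior for the last s occurrences" of each pattern, and the hypothetical names `TwoLevelPredictor` and `accuracy`.

```python
class TwoLevelPredictor:
    """Illustrative two-level predictor (assumed design, not the paper's exact one)."""

    def __init__(self, k=4):
        self.k = k
        self.mask = (1 << k) - 1
        # First level: outcomes of the last k branches, packed into k bits.
        self.history = 0
        # Second level: one 2-bit saturating counter per history pattern
        # (assumption), initialised to 2, i.e. "weakly taken".
        self.counters = [2] * (1 << k)

    def predict(self):
        # Predict taken when the counter for the current pattern is 2 or 3.
        return self.counters[self.history] >= 2

    def update(self, taken):
        # Train the counter selected by the current history pattern...
        c = self.counters[self.history]
        self.counters[self.history] = min(c + 1, 3) if taken else max(c - 1, 0)
        # ...then shift the actual outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, outcomes):
    """Fraction of outcomes predicted correctly, training after each branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    # Hypothetical workload: a loop branch taken three times, then not taken.
    loop_branch = [True, True, True, False] * 50
    print(accuracy(TwoLevelPredictor(k=4), loop_branch))
```

On this synthetic loop pattern the history register distinguishes the iteration that exits the loop from the ones that continue, so the sketch mispredicts only while the counters warm up. A real design would also have to choose between global and per-branch history and account for table size, which is where the hardware-cost comparison in the abstract comes in.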