The insatiable demands of both old and new applications call for improved capabilities. To realize gains, developers must exploit parallelism in all types of programs. Multiprocessor, multithreaded, vector, and dataflow computers achieve speedups into the thousands for programs with large amounts of data parallelism or independent control flow. General-purpose code, however, has many conditional branches, irregular control flow, and much less data parallelism. These code characteristics and their detrimental consequences, in the form of branch effects, have severely limited the parallelism that can be exploited. Branch effects result from uncertainty about how branches will execute. This article surveys techniques for reducing branch effects and describes their relative merits, with examples from commercial machines. Branch effect reduction techniques can be implemented in hardware, software, or both to free up more parallelism and speed up the execution of general-purpose code. Research is bearing fruit: speedups of 10 or more are being demonstrated in research simulations and may be realized in hardware within a few years.