Balance scheduling: weighting branch tradeoffs in superblocks
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Since instruction-level parallelism in basic blocks is often limited, compilers increase performance by creating superblocks that allow operations to be issued speculatively. Scheduling superblocks is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance trade-offs between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling trade-offs between branches. This paper's first contribution is a set of new, tighter lower bounds on the execution times of superblocks that specifically account for the dependence and resource conflicts between pairs of branches. This paper's second contribution is a novel superblock scheduling heuristic that finds high-performance schedules by determining which operations each branch needs scheduled early, and then selecting branches with compatible needs so that beneficial branch trade-offs are favored. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well-known superblock scheduling algorithms.
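To make the notion of a lower bound on superblock execution time concrete, the following is a minimal sketch of a naive bound that treats each branch in isolation. It is not the paper's pairwise bound; it only illustrates the kind of dependence and resource reasoning that such bounds build on. The assumptions are ours: every operation has unit latency, the machine issues at most `width` operations per cycle, cycles are numbered from 1, and the names `preds`, `exit_prob`, and `naive_lower_bound` are hypothetical.

```python
from math import ceil


def ancestors(op, preds):
    """Return every operation that must issue before `op`."""
    seen, stack = set(), list(preds.get(op, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(preds.get(p, ()))
    return seen


def chain_length(op, preds, memo=None):
    """Number of operations on the longest dependence chain ending at `op`."""
    if memo is None:
        memo = {}
    if op not in memo:
        memo[op] = 1 + max(
            (chain_length(p, preds, memo) for p in preds.get(op, ())),
            default=0,
        )
    return memo[op]


def naive_lower_bound(preds, exit_prob, width):
    """Lower bound on the expected exit cycle of a superblock.

    Assumptions (ours, for illustration): unit latencies, `width` issue
    slots per cycle, cycles numbered from 1. Each branch is bounded
    independently, so conflicts between branches are ignored.
    """
    total = 0.0
    for branch, prob in exit_prob.items():
        # Dependence bound: one cycle per operation on the longest chain.
        dep = chain_length(branch, preds)
        # Resource bound: all ancestors must fit in the cycles before the branch.
        res = ceil(len(ancestors(branch, preds)) / width) + 1
        total += prob * max(dep, res)
    return total


# Hypothetical superblock: br1 needs a and b; br2 needs br1, c and d.
preds = {"br1": ["a", "b"], "br2": ["br1", "c", "d"]}
exit_prob = {"br1": 0.3, "br2": 0.7}
bound = naive_lower_bound(preds, exit_prob, width=2)
```

Because each branch is bounded on its own, this sketch cannot see that scheduling one branch early may delay another. That gap is what bounds accounting for conflicts between pairs of branches are meant to close.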