Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Optimally profiling and tracing programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Critical path reduction for scalar programs
Proceedings of the 28th annual international symposium on Microarchitecture
A recursive technique for computing lower-bound performance of schedules
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Profile-driven instruction level parallel scheduling with application to super blocks
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Speculative hedge: regulating compile-time speculation against profile variations
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
The program decision logic approach to predicated execution
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
An experimental study of LP-based approximation algorithms for scheduling problems
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Algorithms for total weighted completion time scheduling
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Improving Static Branch Prediction in a Compiler
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Efficient Edge Profiling for ILP-Processors
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Scheduling Superblocks with Bound-Based Branch Trade-Offs
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Lower bounds on precedence-constrained scheduling for parallel processors
Information Processing Letters
Backtracking-Based Instruction Scheduling to Fill Branch Delay Slots
International Journal of Parallel Programming
Lower Bounds on Precedence-Constrained Scheduling for Parallel Processors
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Optimal Superblock Scheduling Using Enumeration
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Data-Dependency Graph Transformations for Superblock Scheduling
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Instruction scheduling using evolutionary programming
ACC'08 Proceedings of the WSEAS International Conference on Applied Computing Conference
An Application of Constraint Programming to Superblock Instruction Scheduling
CP '08 Proceedings of the 14th international conference on Principles and Practice of Constraint Programming
Hi-index | 0.00 |
Since there is generally insufficient instruction level parallelism within a single basic block, higher performance is achieved by speculatively scheduling operations in superblocks. This is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance tradeoffs that exist between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling tradeoffs between branches. The first contribution of this paper is a set of new, tighter lower bounds on the execution times of superblocks that specifically accounts for the dependence and resource conflicts between pairs of branches. The second contribution of this paper is a novel superblock scheduling heuristic that finds high performance schedules by determining the operations that each branch needs to be scheduled early and selecting branches with compatible needs that favor beneficial branch tradeoffs. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well known superblock scheduling algorithms.