The superblock is a scheduling region that compilers use to exploit instruction-level parallelism across basic blocks. Many heuristic techniques have been proposed for this difficult scheduling problem, but none accurately approximates the optimal solution. This paper presents a new technique that finds provably optimal solutions to superblock scheduling problems. The technique reduces the problem of finding branch combinations that yield incrementally increasing weighted execution times to a subset-sum problem, which is solved by dynamic programming. For each branch combination, an enumerative approach that employs a number of powerful pruning techniques to explore the solution space efficiently then searches for a feasible schedule. Experimental evaluation using the SPEC CPU fp2000 and int2000 benchmarks shows that, within a per-problem time limit of one second, this combination of dynamic programming and enumeration optimally solves about 99% of the hard superblock scheduling problems, with an average solution time of 9 milliseconds per problem. For 80% of the hard problems, the optimal schedule improves on the schedule produced by an established heuristic technique.
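The abstract does not spell out the reduction, so the following is only a generic sketch of the subset-sum dynamic program it refers to, not the paper's actual formulation. It assumes, purely for illustration, that branch weights have been scaled to non-negative integers and that each achievable total is one candidate weighted execution time; the function name `achievable_sums` and the sample weights are hypothetical.

```python
def achievable_sums(weights, limit):
    """Return every subset sum of `weights` that is <= limit, in increasing order.

    Classic subset-sum dynamic programming over a boolean table.
    `weights` are assumed to be non-negative integers (hypothetical
    integer-scaled branch weights, not data from the paper).
    """
    reachable = [False] * (limit + 1)
    reachable[0] = True  # the empty subset always sums to 0
    for w in weights:
        # Walk downward so each weight is used at most once per subset.
        for s in range(limit, w - 1, -1):
            if reachable[s - w]:
                reachable[s] = True
    return [s for s, ok in enumerate(reachable) if ok]


if __name__ == "__main__":
    # Candidate totals come out in increasing order, so a search could
    # try them one by one and stop at the first feasible one.
    print(achievable_sums([3, 5, 7], 15))
```

Under these assumptions, listing the totals in increasing order mirrors the idea of visiting branch combinations by incrementally increasing weighted execution time: the first total for which a feasible schedule exists would be optimal. The feasibility check itself (the enumeration with pruning) is not shown.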