Scheduling expressions on a pipelined processor with a maximal delay of one cycle
ACM Transactions on Programming Languages and Systems (TOPLAS)
On the Complexity of Scheduling Problems for Parallel/Pipelined Machines
IEEE Transactions on Computers
Introduction to algorithms
The Omega test: a fast and practical integer programming algorithm for dependence analysis
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Scheduling time-critical instructions on RISC machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
An Optimal Instruction Scheduler for Superscalar Processor
IEEE Transactions on Parallel and Distributed Systems
Time-constrained code compaction for DSP's
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Optimal control dependence computation and the Roman chariots problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
Optimal and near-optimal global register allocations using 0–1 integer programming
Software—Practice & Experience
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Advanced compiler design and implementation
Advanced compiler design and implementation
Automatic Data Layout Using 0-1 Integer Programming
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Local code generation and compaction in optimizing microcode compilers
Local code generation and compaction in optimizing microcode compilers
Scheduling expression DAGs for minimal register need
Computer Languages
ILP-based Instruction Scheduling for IA-64
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
A Dynamic Programming Approach to Optimal Integrated Code Generation
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Scheduling expression trees for delayed-load architectures
Journal of Systems Architecture: the EUROMICRO Journal
Effective Compilation Support for Variable Instruction Set Architecture
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Fast Optimal Instruction Scheduling for Single-Issue Processors with Arbitrary Latencies
CP '01 Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming
Variable Instruction Set Architecture and Its Compiler Support
IEEE Transactions on Computers
Efficient instruction scheduling for a pipelined architecture
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Instruction scheduling for a tiled dataflow architecture
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Reducing code size in VLIW instruction scheduling
Journal of Embedded Computing - Low-power Embedded Systems
Proceedings of the 34th annual international symposium on Computer architecture
VLIW instruction scheduling for minimal power variation
ACM Transactions on Architecture and Code Optimization (TACO)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Optimal vs. heuristic integrated code generation for clustered VLIW architectures
SCOPES '08 Proceedings of the 11th international workshop on Software & compilers for embedded systems
An Application of Constraint Programming to Superblock Instruction Scheduling
CP '08 Proceedings of the 14th international conference on Principles and Practice of Constraint Programming
Optimal trace scheduling using enumeration
ACM Transactions on Architecture and Code Optimization (TACO)
ILP optimal scheduling for multi-module memory
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
CC'07 Proceedings of the 16th international conference on Compiler construction
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special issue on the 2009 ACM/IEEE international symposium on networks-on-chip
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Variable assignment and instruction scheduling for processor with multi-module memory
Microprocessors & Microsystems
An exact rational mixed-integer programming solver
IPCO'11 Proceedings of the 15th international conference on Integer programming and combinatoral optimization
Code generation for STA architecture
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Optimal integrated VLIW code generation with integer linear programming
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Integrated Code Generation for Loops
ACM Transactions on Embedded Computing Systems (TECS)
Constraint-Based register allocation and instruction scheduling
CP'12 Proceedings of the 18th international conference on Principles and Practice of Constraint Programming
Optimal and heuristic global code motion for minimal spilling
CC'13 Proceedings of the 22nd international conference on Compiler Construction
International Journal of High Performance Computing Applications
Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture
Proceedings of the 11th Workshop on Optimizations for DSP and Embedded Systems
Hi-index | 0.00 |
This paper presents a new approach to local instruction scheduling based on integer programming that produces optimal instruction schedules in a reasonable time, even for very large basic blocks. The new approach first uses a set of graph transformations to simplify the data-dependency graph while preserving the optimality of the final schedule. The simplified graph results in a simplified integer program which can be solved much faster. A new integer-programming formulation is then applied to the simplified graph. Various techniques are used to simplify the formulation, resulting in fewer integer-program variables, fewer integer-program constraints and fewer terms in some of the remaining constraints, thus reducing integer-program solution time. The new formulation also uses certain adaptively added constraints (cuts) to reduce solution time. The proposed optimal instruction scheduler is built within the Gnu Compiler Collection (GCC) and is evaluated experimentally using the SPEC95 floating point benchmarks. Although optimal scheduling for the target processor is considered intractable, all of the benchmarks' basic blocks are optimally scheduled, including blocks with up to 1000 instructions, while total compile time increases by only 14%.