Integrated Code Generation for Loops

Authors:
Mattias Eriksson;Christoph Kessler
Affiliations:
Linköping University;Linköping University
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2012

Citing 43
Cited 2

Software pipelining: an effective scheduling technique for VLIW machines

PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A dynamic-programming technique for compacting loops

MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Lifetime-sensitive modulo scheduling

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
A novel framework of register allocation for software pipelining

POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Iterative modulo scheduling: an algorithm for software pipelining loops

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Scheduling and mapping: software pipelining in the presence of structural hazards

PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Hypernode reduction modulo scheduling

Proceedings of the 28th annual international symposium on Microarchitecture
Minimizing register requirements of a modulo schedule via optimum stage scheduling

International Journal of Parallel Programming
Software pipelining showdown: optimal vs. heuristic methods in a production compiler

PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
MediaBench: a tool for evaluating and synthesizing multimedia and communications systems

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Instruction selection, resource allocation, and scheduling in the AVIV retargetable code generator

DAC '98 Proceedings of the 35th annual Design Automation Conference
Effective cluster assignment for modulo scheduling

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Unified assign and schedule: a new approach to scheduling for clustered register file microarchitectures

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Modulo scheduling for the TMS320C6x VLIW DSP architecture

Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Optimal instruction scheduling using integer programming

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Register pressure responsive software pipelining

Proceedings of the 2001 ACM symposium on Applied computing
An ILP Solution for Simultaneous Scheduling, Allocation, and Binding in Multiple Block Synthesis

ICCS '94 Proceedings of the 1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
A Unified Modulo Scheduling and Register Allocation Technique for Clustered Processors

Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Optimal Software Pipelining with Rational Initiation Interval

PDPTA '02 Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications - Volume 2
PROPAN: A Retargetable System for Postpass Optimisations and Analyses

LCTES '00 Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems
Very Long Instruction Word architectures and the ELI-512

ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing

MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
Distributed Modulo Scheduling

HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Instruction Scheduling for Clustered VLIW DSPs

PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Power-Performance Trade-Offs for Energy-Efficient Architectures: A Quantitative Study

ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
CARS: A New Code Generation Framework for Clustered ILP Processors

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Swing Modulo Scheduling: A Lifetime-Sensitive Approach

PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Phase Coupled Code Generation for DSPs Using a Genetic Algorithm

Proceedings of the conference on Design, automation and test in Europe - Volume 2
Integrated temporal and spatial scheduling for extended operand clustered VLIW processors

Proceedings of the 1st conference on Computing frontiers
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Generic software pipelining at the assembly level

SCOPES '05 Proceedings of the 2005 workshop on Software and compilers for embedded systems
Optimal integrated code generation for VLIW architectures

Concurrency and Computation: Practice & Experience - 10th International Workshop on Compilers for Parallel Computers (CPC 2003)
Classification and generation of schedules for VLIW processors

Concurrency and Computation: Practice & Experience - Current Trends in Compilers for Parallel Computers (CPC2006)
On Periodic Register Need in Software Pipelining

IEEE Transactions on Computers
An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family

Computer
Optimal versus Heuristic Global Code Scheduling

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Optimal vs. heuristic integrated code generation for clustered VLIW architectures

SCOPES '08 Proceedings of the 11th international workshop on Software & compilers for embedded systems
An Application of Constraint Programming to Superblock Instruction Scheduling

CP '08 Proceedings of the 14th international conference on Principles and Practice of Constraint Programming
Integrated Modulo Scheduling for Clustered VLIW Architectures

HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Register allocation and optimal spill code scheduling in software pipelined loops using 0-1 integer linear programming formulation

CC'07 Proceedings of the 16th international conference on Compiler construction
SCAN: a heuristic for near-optimal software pipelining

Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Optimal integrated VLIW code generation with integer linear programming

Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Global optimization approach for architectural synthesis

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Optimal and heuristic global code motion for minimal spilling

CC'13 Proceedings of the 22nd international conference on Compiler Construction
Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture

Proceedings of the 11th Workshop on Optimizations for DSP and Embedded Systems

Abstract

Code generation in a compiler is commonly divided into several phases: instruction selection, scheduling, register allocation, spill code generation, and, in the case of clustered architectures, cluster assignment. These phases are interdependent; for instance, a decision in the instruction selection phase affects how an operation can be scheduled. We examine the effect of this separation of phases on the quality of the generated code. To study this, we have formulated optimal methods for code generation with integer linear programming, first for acyclic code and then extended to modulo scheduling of loops. In our experiments we compare optimal modulo scheduling, where all phases are integrated, to modulo scheduling where instruction selection and cluster assignment are done in a separate phase. The results show that, for an architecture with two clusters, the integrated method finds a better solution than the nonintegrated method for 27% of the instances.