Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A portable global optimizer and linker
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A dynamic-programming technique for compacting loops
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Lifetime-sensitive modulo scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Instruction-level parallel processing: history, overview, and perspective
The Journal of Supercomputing - Special issue on instruction-level parallelism
Iterative modulo scheduling: an algorithm for software pipelining loops
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
ACM Computing Surveys (CSUR)
Instruction selection, resource allocation, and scheduling in the AVIV retargetable code generator
DAC '98 Proceedings of the 35th annual Design Automation Conference
EXPRESSION: a language for architecture exploration through compiler/simulator retargetability
DATE '99 Proceedings of the conference on Design, automation and test in Europe
Modulo scheduling for the TMS320C6x VLIW DSP architecture
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
The Design and Application of a Retargetable Peephole Optimizer
ACM Transactions on Programming Languages and Systems (TOPLAS)
Generic control flow reconstruction from assembly code
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
A comparative study of modulo scheduling techniques
ICS '02 Proceedings of the 16th international conference on Supercomputing
Retargetable Code Generation for Digital Signal Processors
Retargetable Code Generation for Digital Signal Processors
Code Generation for Embedded Processors
Code Generation for Embedded Processors
Principles of Program Analysis
Principles of Program Analysis
A Retargetable C Compiler: Design and Implementation
A Retargetable C Compiler: Design and Implementation
Sifting out the mud: low level C++ code reuse
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Perfect Pipelining: A New Loop Parallelization Technique
ESOP '88 Proceedings of the 2nd European Symposium on Programming
Post-pass compaction techniques
Communications of the ACM - Program compaction
Register-Sensitive Software Pipelining
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Local code generation and compaction in optimizing microcode compilers
Local code generation and compaction in optimizing microcode compilers
Code optimization libraries for retargetable compilation for embedded digital signal processors
Code optimization libraries for retargetable compilation for embedded digital signal processors
TDL: a hardware description language for retargetable postpass optimizations and analyses
Proceedings of the 2nd international conference on Generative programming and component engineering
Link-time optimization of ARM binaries
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Timing optimization via nest-loop pipelining considering code size
Microprocessors & Microsystems
Integrated Modulo Scheduling for Clustered VLIW Architectures
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Integrated Code Generation for Loops
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Software used in embedded systems is subject to strict timing and space constraints. The growing software complexity creates an urgent need for fast program execution under the constraint of very limited code size. However, even modern compilers produce code whose quality often is far away from the optimum. The PROPAN system is a postpass optimization framework that enables high-quality machine-dependent postpass optimizers to be generated from a concise hardware specification. The postpass approach allows to enhance the code quality of existing compilers and offers a smooth integration into existing development tool chains. In this article we present an adaptation of the modulo scheduling software pipelining algorithm to the postpass level. The implementation is fully retargetable and has been incorporated in the PROPAN system. The differences of postpass modulo scheduling compared to the standard version of the algorithm are outlined. Experimental results conducted on the Philips TriMedia TM1000 processor demonstrate that modulo scheduling can be applied at the postpass level and allows to achieve a significant code speedup with moderate code size increase.