Optimal vs. heuristic integrated code generation for clustered VLIW architectures
SCOPES '08 Proceedings of the 11th international workshop on Software & compilers for embedded systems
Embedded DSP Processor Design: Application Specific Instruction Set Processors
Embedded DSP Processor Design: Application Specific Instruction Set Processors
Optimal integrated VLIW code generation with integer linear programming
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Integrated Code Generation for Loops
ACM Transactions on Embedded Computing Systems (TECS)
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
We present a dynamic programming method for optimal integrated code generation for basic blocks that minimizes execution time. It can be applied to single-issue pipelined processors, in-order-issue superscalar processors, VLIW architectures with a single homogeneous register set, and clustered VLIW architectures with multiple register sets. For the case of a single register set, our method simultaneously copes with instruction selection, instruction scheduling, and register allocation. For clustered VLIW architectures, we also integrate the optimal partitioning of instructions, allocation of registers for temporary variables, and scheduling of data transfer operations between clusters. Our method is implemented in the prototype of a retargetable code generation framework for digital signal processors (DSPs), called OPTIMIST. We present results for the processors ARM9E, TI C62x, and a single-cluster variant of C62x. Our results show that the method can produce optimal solutions for small and (in the case of a single register set) medium-sized problem instances with a reasonable amount of time and space. For larger problem instances, our method can be seamlessly changed into a heuristic. Copyright © 2006 John Wiley & Sons, Ltd.