Journal of Signal Processing Systems
High-performance, low-power VLIW DSP processors are increasingly deployed in embedded devices to process video and multimedia applications. To reduce power and cost in VLIW DSP designs, distributed register files and multi-bank register architectures are being adopted to reduce the number of read/write ports in register files. This presents new challenges for devising compiler optimization schemes for such architectures. In this work, we address compiler optimization issues for the PAC architecture, a 5-way-issue DSP processor with distributed register files. We show how to support an important class of compiler optimizations, known as copy propagation, on such an architecture. We illustrate that a naive deployment of copy propagation on embedded VLIW DSP processors with distributed register files can result in performance anomalies. In our proposed scheme, we derive a communication cost model from cluster distance, register port pressure, and the movement type of register sets. This cost model is used to guide the data flow analysis that supports copy propagation on the PAC architecture. Experimental results show that our schemes are effective at preventing performance anomalies with copy propagation on embedded VLIW DSP processors with distributed register files.
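The idea of letting a communication cost guide copy propagation can be illustrated with a minimal sketch. This is not the paper's actual model: the `Reg` type, the weights in `MOVE_TYPE_COST`, the additive cost formula, and the function names are all hypothetical, chosen only to show how cluster distance, port pressure, and movement type might be combined into a single decision.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical weights per movement type; the source gives no numeric values.
MOVE_TYPE_COST = {"intra_cluster": 0, "inter_cluster": 2}


@dataclass(frozen=True)
class Reg:
    name: str
    cluster: int  # index of the cluster whose register file holds this register


def communication_cost(reg: Reg, user_cluster: int, port_pressure: int) -> int:
    """Assumed additive cost of reading `reg` from a unit in `user_cluster`."""
    distance = abs(reg.cluster - user_cluster)
    move_type = "intra_cluster" if distance == 0 else "inter_cluster"
    return distance + port_pressure + MOVE_TYPE_COST[move_type]


def should_propagate(copy_src: Reg, copy_dst: Reg, user_cluster: int,
                     pressure: Mapping[str, int]) -> bool:
    """For a copy `copy_dst = copy_src`, decide whether a use of `copy_dst`
    in `user_cluster` may be rewritten to read `copy_src` directly.
    The rewrite is allowed only if it does not raise the estimated cost."""
    cost_after = communication_cost(
        copy_src, user_cluster, pressure.get(copy_src.name, 0))
    cost_before = communication_cost(
        copy_dst, user_cluster, pressure.get(copy_dst.name, 0))
    return cost_after <= cost_before


# Copy d1 = a0 crosses clusters; the use of d1 executes in cluster 1.
a0, d1, d0 = Reg("a0", 0), Reg("d1", 1), Reg("d0", 1)
print(should_propagate(a0, d1, user_cluster=1, pressure={}))  # False
print(should_propagate(d0, d1, user_cluster=1, pressure={}))  # True
```

In the first call, propagating `a0` would force the use in cluster 1 to read across clusters, so the sketch rejects it; this is the kind of naive propagation that the abstract warns can cause a performance anomaly. In the second call both registers live in the same cluster, so propagation is harmless under these assumed weights.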