Instruction Scheduling for Clustered VLIW DSPs

Authors:
Rainer Leupers
Venue:
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Year:
2000

Abstract

Recent digital signal processors (DSPs) feature a homogeneous, VLIW-like data path architecture, which allows C compilers to generate efficient code. However, code generation for VLIW DSPs must still obey some special restrictions. To reduce the number of register file ports needed to supply data to multiple functional units working in parallel, the DSP data path may be clustered into several sub-paths, with very limited capabilities for exchanging values between clusters. A well-known example is the Texas Instruments C6201 DSP. For such architectures, the tasks of scheduling instructions and partitioning them between the clusters are highly interdependent. This paper presents a new instruction scheduling approach which, in contrast to earlier work, integrates partitioning and scheduling into a single technique in order to achieve high code quality. We show experimentally that the proposed technique is capable of generating more efficient code than a commercial code generator for the TI C6201.
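
The interdependence of partitioning and scheduling can be illustrated with a small sketch. This is not the algorithm proposed in the paper, and it does not model the C6201. It assumes a toy machine: two clusters, one functional unit per cluster, unit operation latency, and a one-cycle penalty whenever an operand was produced on the other cluster. All names (`schedule_length`, `deps`, `copy_latency`, and so on) are hypothetical. For a fixed cluster assignment, a simple greedy list scheduler reports the schedule length, and the same dependence graph yields different lengths under different partitions.

```python
from itertools import product


def schedule_length(deps, partition, units_per_cluster=1, copy_latency=1):
    """Greedy list scheduling for a fixed cluster assignment (toy model).

    deps:      dict mapping each operation to its list of predecessors;
               the keys are assumed to be in topological order.
    partition: dict mapping each operation to a cluster id.
    Returns the number of cycles until the last result is available.
    """
    finish = {}  # op -> cycle at which its result is ready on its own cluster
    used = {}    # (cluster, cycle) -> number of operations issued there
    for op in deps:
        cluster = partition[op]
        # An operand produced on another cluster arrives copy_latency later.
        ready = 0
        for pred in deps[op]:
            penalty = copy_latency if partition[pred] != cluster else 0
            ready = max(ready, finish[pred] + penalty)
        # Take the first cycle at or after `ready` with a free unit.
        cycle = ready
        while used.get((cluster, cycle), 0) >= units_per_cluster:
            cycle += 1
        used[(cluster, cycle)] = used.get((cluster, cycle), 0) + 1
        finish[op] = cycle + 1  # unit latency assumed
    return max(finish.values())


# Dependence graph for g = (a + b) + (c + d), where a..d are loads.
deps = {"a": [], "b": [], "c": [], "d": [],
        "e": ["a", "b"], "f": ["c", "d"], "g": ["e", "f"]}

one_cluster = {op: 0 for op in deps}
balanced = {"a": 0, "b": 0, "e": 0, "c": 1, "d": 1, "f": 1, "g": 1}
alternating = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1, "g": 0}

print(schedule_length(deps, one_cluster))   # 7: no parallelism is exploited
print(schedule_length(deps, balanced))      # 5: one copy on the critical path
print(schedule_length(deps, alternating))   # 6: copies delay e, f and g

# Exhaustive search over all 2^7 assignments, feasible only for tiny graphs.
best = min(schedule_length(deps, dict(zip(deps, assignment)))
           for assignment in product((0, 1), repeat=len(deps)))
print(best)                                 # 5
```

In this toy model, a partition chosen without regard to the schedule (such as `alternating`) pays for copies on the critical path, while keeping everything on one cluster wastes the second functional unit. This is the kind of trade-off that motivates treating the two tasks together rather than in separate phases.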