CALiBeR: a software pipelining algorithm for clustered embedded VLIW processors

Authors:
Cagdas Akturan; Margarida F. Jacome
Affiliations:
The University of Texas at Austin; The University of Texas at Austin
Venue:
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Year:
2001

Abstract

This paper describes CALiBeR (Cluster Aware Load Balancing Retiming Algorithm), a software pipelining framework suitable for compilers targeting clustered embedded VLIW processors. Embedded system designers can use CALiBeR to explore different code optimization alternatives: it assists in generating high-quality, customized retiming solutions that meet desired program memory size and throughput requirements while minimizing register pressure. An extensive set of experimental results, covering several representative benchmark loop kernels and a wide variety of clustered datapath configurations, demonstrates that the algorithm compares favorably with one of the best state-of-the-art algorithms, achieving up to 50% improvement in performance and up to 47% improvement in register requirements.