Effective exploitation of a zero overhead loop buffer
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
A Zero Overhead Loop Buffer (ZOLB) is an architectural feature commonly found in digital signal processors (DSPs). The buffer can be viewed as a compiler-managed cache holding a sequence of instructions that is executed a specified number of times without incurring any loop overhead. Unlike loop unrolling, a loop buffer minimizes loop overhead without the penalty of increased code size. In addition, a ZOLB requires relatively little space and power, both of which are important considerations for most DSP applications. This paper describes strategies for generating code that uses a ZOLB effectively. We have found that many common code-improving transformations used by optimizing compilers on conventional architectures can easily be applied to (1) allow more loops to be placed in a ZOLB, (2) further reduce the loop overhead of loops placed in a ZOLB, and (3) avoid redundant loading of ZOLB loops. The results given in this paper demonstrate that this architectural feature can often be exploited to obtain substantial improvements in execution time and slight reductions in code size for various signal processing applications.
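The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not taken from the paper: the per-iteration overhead of 3 instructions (increment, compare, branch), the ZOLB setup cost of 2 instructions, the buffer capacity of 32 instructions, and all function names are assumptions chosen only to make the comparison concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopCost:
    executed: int   # dynamic count: instructions executed at run time
    code_size: int  # static count: instructions occupying program memory


def conventional(body: int, trips: int, overhead: int = 3) -> LoopCost:
    """Ordinary loop: every iteration pays the loop-control overhead."""
    return LoopCost(executed=trips * (body + overhead),
                    code_size=body + overhead)


def unrolled(body: int, trips: int, factor: int, overhead: int = 3) -> LoopCost:
    """Unrolled loop: overhead is paid once per `factor` iterations,
    but the body is replicated `factor` times in program memory."""
    if trips % factor != 0:
        raise ValueError("sketch assumes the trip count is a multiple of factor")
    return LoopCost(executed=(trips // factor) * (body * factor + overhead),
                    code_size=body * factor + overhead)


def zolb(body: int, trips: int, setup: int = 2, capacity: int = 32) -> LoopCost:
    """Loop placed in a hypothetical zero overhead loop buffer: a one-time
    setup cost (assumed), then no per-iteration loop-control instructions."""
    if body > capacity:
        raise ValueError("loop body does not fit in the assumed buffer capacity")
    return LoopCost(executed=setup + trips * body,
                    code_size=setup + body)


if __name__ == "__main__":
    for name, cost in [("conventional", conventional(4, 100)),
                       ("unrolled x4", unrolled(4, 100, 4)),
                       ("zolb", zolb(4, 100))]:
        print(f"{name:13s} executed={cost.executed:4d} code_size={cost.code_size:3d}")
```

Under these assumed costs, a 4-instruction body run 100 times executes 700 instructions as a conventional loop, 475 when unrolled by a factor of 4 (at the price of 19 rather than 7 instructions of code), and 402 in the buffer with 6 instructions of code. Real figures depend on the target processor's setup instructions and buffer size.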