Compiler transformations for effectively exploiting a zero overhead loop buffer

Authors:
Gang-Ryung Uh;Yuhong Wang;David Whalley;Sanjay Jinturkar;Yunheung Paek;Vincent Cao;Chris Burns
Affiliations:
Computer Science Department, Boise State University, Boise, ID;LA Times, Los Angeles, CA;CS Department of Florida State University, Florida;Sandbridge Technologies, White Plains, NY;Seoul National University, Seoul, South Korea;Agere Systems, Allentown, PA;Agere Systems, Allentown, PA
Venue:
Software—Practice & Experience
Year:
2005

Abstract

A Zero Overhead Loop Buffer (ZOLB) is an architectural feature commonly found in DSP processors. This buffer can be viewed as a compiler-managed cache that holds a sequence of instructions to be executed a specified number of times without incurring any loop overhead. Unlike loop unrolling, a loop buffer minimizes loop overhead without the penalty of increased code size. In addition, a ZOLB requires relatively little space and power, both important considerations for most DSP applications. This paper describes strategies for generating code that uses a ZOLB effectively. We have found that many common code-improving transformations used by optimizing compilers on conventional architectures can readily be used to (1) allow more loops to be placed in a ZOLB, (2) further reduce the loop overhead of the loops placed in a ZOLB, and (3) avoid redundant loading of ZOLB loops. The results given in this paper demonstrate that this architectural feature can often be exploited to obtain substantial improvements in execution time and slight reductions in code size for various signal processing applications.
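
The trade-off the abstract describes can be illustrated with a rough instruction-count model. The sketch below is not taken from the paper: the function names, the per-iteration overhead of three instructions (increment, compare, branch), the two-instruction buffer setup cost, and the 32-instruction buffer capacity are all assumptions chosen only for illustration.

```python
# Toy cost model (hypothetical numbers, not measurements from the paper).

def conventional_loop(body_len, trip_count, overhead=3):
    """Return (executed instructions, static code size) for an ordinary loop.

    `overhead` is an assumed per-iteration cost: increment, compare, branch.
    """
    executed = trip_count * (body_len + overhead)
    static_size = body_len + overhead
    return executed, static_size


def unrolled_loop(body_len, trip_count, factor, overhead=3):
    """Same loop unrolled by `factor`: less overhead executed, more code."""
    if trip_count % factor != 0:
        raise ValueError("this sketch assumes factor divides trip_count")
    executed = trip_count * body_len + (trip_count // factor) * overhead
    static_size = factor * body_len + overhead
    return executed, static_size


def zolb_loop(body_len, trip_count, setup=2, capacity=32):
    """Loop run from a loop buffer: a one-time setup, then body only.

    `setup` and `capacity` are assumed values. Returns None when the body
    does not fit in the buffer.
    """
    if body_len > capacity:
        return None
    executed = setup + trip_count * body_len
    static_size = body_len + setup
    return executed, static_size


if __name__ == "__main__":
    print(conventional_loop(4, 100))        # (700, 7)
    print(unrolled_loop(4, 100, factor=4))  # (475, 19)
    print(zolb_loop(4, 100))                # (402, 6)
```

Under these assumed costs, unrolling removes most of the executed overhead but nearly triples the static code size, while the buffered loop removes the per-iteration overhead and keeps the code about as small as the original loop. Real figures depend on the target DSP and its buffer instructions.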