Enhancing loop buffering of media and telecommunications applications using low-overhead predication

  • Authors: John W. Sias; Hillery C. Hunter; Wen-mei W. Hwu

  • Affiliations: University of Illinois, Urbana-Champaign (all authors)

  • Venue: Proceedings of the 34th Annual ACM/IEEE International Symposium on Microarchitecture
  • Year: 2001

Abstract

Media- and telecommunications-focused processors, increasingly designed as deeply pipelined, statically scheduled VLIWs, rely on loop buffers for low-overhead execution of simple loops. Key loops containing control flow pose a substantial problem: full predication has a high encoding overhead, and partial predication techniques do not support if-conversion, the transformation of general acyclic control flow into predicated blocks. Using a set of significant media processing benchmarks, drawn from MediaBench and contemporary telecommunications standards, we explore a compromise approach. We demonstrate a compiler using if-conversion and specialized loop transformations to arrange for 70-99% of fetched operations to come from a simple, statically managed 256-instruction loop buffer, saving instruction fetch power and eliminating branch penalties. To complement this, we introduce a "niche" form of predication, specialized to permit general if-conversion with only a single bit in the encoding of each operation and to eliminate much of the hardware overhead of a predicate register-based approach.
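
As a rough illustration of what if-conversion means for a loop body, the sketch below contrasts a branching saturation loop with a straight-line version in which each operation is guarded by a predicate. This is a generic textbook-style example, not the paper's compiler or its single-bit encoding; the function names and the saturation kernel are assumptions chosen for illustration, and Python conditional expressions stand in for predicated machine operations.

```python
def saturate_branching(samples, limit):
    """Loop body with acyclic control flow (branches inside the loop)."""
    out = []
    for x in samples:
        if x > limit:
            y = limit
        elif x < -limit:
            y = -limit
        else:
            y = x
        out.append(y)
    return out


def saturate_if_converted(samples, limit):
    """Same loop after a hand-written if-conversion (illustrative only).

    The body is a single straight-line block: predicates are computed
    first, then every assignment is issued in order and takes effect
    only when its guarding predicate is true.
    """
    out = []
    for x in samples:
        p_hi = x > limit                   # predicate define
        p_lo = (not p_hi) and x < -limit   # predicate define
        y = x                              # unguarded operation
        y = limit if p_hi else y           # operation guarded by p_hi
        y = -limit if p_lo else y          # operation guarded by p_lo
        out.append(y)
    return out


if __name__ == "__main__":
    data = [-9, -3, 0, 4, 12]
    print(saturate_branching(data, 5))
    print(saturate_if_converted(data, 5))
```

Because the converted body contains no internal branches, it is the kind of simple loop that a statically managed loop buffer can hold and replay, which is the connection the abstract draws between if-conversion and loop buffering.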