A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs

  • Authors:
  • William Thies;Vikram Chandrasekhar;Saman Amarasinghe

  • Affiliations:
  • -;-;-

  • Venue:
  • Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The emergence of multicore processors has heightened the need for effective parallel programming practices. In addition to writing new parallel programs, the next gener- ation of programmers will be faced with the overwhelming task of migrating decades' worth of legacy C code into a parallel representation. Addressing this problem requires a toolset of parallel programming primitives that can broadly apply to both new and existing programs. While tools such as threads and OpenMP allow programmers to express task and data parallelism, support for pipeline parallelism is distinctly lacking. In this paper, we offer a new and pragmatic approach to leveraging coarse-grained pipeline parallelism in C pro- grams. We target the domain of streaming applications, such as audio, video, and digital signal processing, which exhibit regular flows of data. To exploit pipeline paral- lelism, we equip the programmer with a simple set of an- notations (indicating pipeline boundaries) and a dynamic analysis that tracks all communication across those bound- aries. Our analysis outputs a stream graph of the applica- tion as well as a set of macros for parallelizing the program and communicating the data needed. We apply our method- ology to six case studies, including MPEG-2 decoding, MP3 decoding, GMTI radar processing, and three SPEC bench- marks. Our analysis extracts a useful block diagram for each application, and the parallelized versions offer a 2.78x mean speedup on a 4-core machine.