Decoupled software pipelining creates parallelization opportunities

  • Authors:
  • Jialu Huang;Arun Raman;Thomas B. Jablin;Yun Zhang;Tzu-Han Hung;David I. August

  • Affiliations:
  • Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA

  • Venue:
  • Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Decoupled Software Pipelining (DSWP) is one approach to automatically extract threads from loops. It partitions loops into long-running threads that communicate in a pipelined manner via inter-core queues. This work recognizes that DSWP can also be an enabling transformation for other loop parallelization techniques. This use of DSWP, called DSWP+, splits a loop into new loops with dependence patterns amenable to parallelization using techniques that were originally either inapplicable or poorly-performing. By parallelizing each stage of the DSWP+ pipeline using (potentially) different techniques, not only is the benefit of DSWP increased, but the applicability and performance of other parallelization techniques are enhanced. This paper evaluates DSWP+ as an enabling framework for other transformations by applying it in conjunction with DOALL, LOCALWRITE, and SpecDOALL to individual stages of the pipeline. This paper demonstrates significant performance gains on a commodity 8-core multicore machine running a variety of codes transformed with DSWP+.