Communication-centric optimizations by dynamically detecting collective operations

  • Authors:
  • Torsten Hoefler;Timo Schneider

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL, USA;University of Technology Chemnitz, Chemnitz, Germany

  • Venue:
  • Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The steady increase of parallelism in high-performance computing platforms implies that communication will be most important in large-scale applications. In this work, we tackle the problem of transparent optimization of large-scale communication patterns using online compilation techniques. We utilize the Group Operation Assembly Language (GOAL), an abstract parallel dataflow definition language, to specify our transformations in a device-independent manner. We develop fast schemes that analyze dataflow and synchronization semantics in GOAL and detect if parts of the (or the whole) communication pattern express a known collective communication operation. The detection of collective operations allows us to replace the detected patterns with highly optimized algorithms or low-level hardware calls and thus improve performance significantly. Benchmark results suggest that our technique can lead to a performance improvement of orders of magnitude compared with various optimized algorithms written in Co-Array Fortran. Detecting collective operations also improves the programmability of parallel languages in that the user does not have to understand the detailed semantics of high-level communication operations in order to generate efficient and scalable code.