Automatic communication coalescing for irregular computations in UPC language

  • Authors:
  • Michail Alvanos; Montse Farreras; Ettore Tiotto; Xavier Martorell

  • Affiliations:
  • Programming Models, Barcelona Supercomputing Center; Universitat Politècnica de Catalunya; Static Compilation Technology, IBM Canada Software Lab, Canada; Universitat Politècnica de Catalunya

  • Venue:
  • CASCON '12 Proceedings of the 2012 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2012

Abstract

Partitioned Global Address Space (PGAS) languages emerged to improve programmer productivity on large-scale parallel machines. However, fine-grained accesses to shared data structures have been identified as one of the main bottlenecks of PGAS languages. Avoiding them requires either manual or compiler-assisted code optimization. Manual code transformations increase program complexity and hinder programmer productivity. Compiler optimizations of fine-grained accesses, on the other hand, require knowledge of the physical data mapping and the use of parallel loop constructs. This paper presents an optimization that prefetches and coalesces shared accesses at runtime. Larger messages decrease the impact of remote access latency and increase the efficiency of network communication. We have implemented our optimization for the Unified Parallel C (UPC) language. An experimental evaluation in a distributed-memory environment on a Power7 cluster demonstrates the benefits of our optimization.
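
To make the idea of coalescing concrete, the following is a minimal sketch in Python, not the paper's runtime. It assumes a block-cyclic layout in which element i of a shared array has affinity to thread (i / B) mod THREADS, as UPC defines for a blocking factor B. The function names, the sample index list, and the policy of one message per owning thread are all illustrative assumptions. The sketch only counts messages: fine-grained access sends one message per remote element, while coalescing groups the remote indices of an irregular access pattern by owner and sends one larger message per owner.

```python
from collections import defaultdict


def owner_of(index, threads, block_size):
    """Thread with affinity to element `index` under a block-cyclic layout."""
    return (index // block_size) % threads


def fine_grained_messages(indices, my_thread, threads, block_size):
    """Count messages when every remote element access is sent on its own."""
    return sum(
        1 for i in indices if owner_of(i, threads, block_size) != my_thread
    )


def coalesce(indices, my_thread, threads, block_size):
    """Group remote indices by owning thread (hypothetical policy).

    Each dictionary entry stands for one larger message; duplicate
    indices are requested only once.
    """
    batches = defaultdict(set)
    for i in indices:
        owner = owner_of(i, threads, block_size)
        if owner != my_thread:
            batches[owner].add(i)
    return {owner: sorted(idx) for owner, idx in batches.items()}


# Irregular access pattern, e.g. a[idx[k]] inside a loop (made-up values).
indices = [3, 17, 4, 9, 17, 22, 5, 30]
THREADS, BLOCK_SIZE, MY_THREAD = 4, 4, 0

batches = coalesce(indices, MY_THREAD, THREADS, BLOCK_SIZE)
print("fine-grained messages:",
      fine_grained_messages(indices, MY_THREAD, THREADS, BLOCK_SIZE))
print("coalesced messages:", len(batches))
for owner, idx in sorted(batches.items()):
    print(f"  thread {owner}: elements {idx}")
```

In this made-up pattern, indices 3 and 17 are local to thread 0, so only the remaining accesses generate traffic. How a real runtime collects the indices ahead of time (the prefetching part) and how it bounds message size are outside the scope of this sketch.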