Enhanced loop coalescing: a compiler technique for transforming non-uniform iteration spaces

  • Authors: Arun Kejariwal; Alexandru Nicolau; Constantine D. Polychronopoulos

  • Affiliations: Center for Embedded Computer Systems, University of California at Irvine, Irvine, CA (Kejariwal, Nicolau); Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, IL (Polychronopoulos)

  • Venue: ISHPC'05/ALPS'06, Proceedings of the 6th International Symposium on High-Performance Computing and 1st International Conference on Advanced Low Power Systems
  • Year: 2005

Abstract

Parallel nested loops are the largest potential source of parallelism in numerical and scientific applications, so executing them with low run-time overhead is essential for achieving high performance on parallel computers. Guided self-scheduling (GSS) has long been used for dynamic scheduling of parallel loops on shared-memory parallel machines and for efficient utilization of dynamically allocated processors. To minimize the synchronization (or scheduling) overhead in GSS, loop coalescing has been proposed as a restructuring technique that transforms nested loops into a single loop. In other words, coalescing "flattens" the iteration space in lexicographic order of the indices of the original loop. Although coalescing reduces the run-time scheduling overhead, it does not necessarily minimize the makespan, i.e., the maximum finishing time, especially when the execution time (workload) of iterations is non-uniform, as is often the case in practice, e.g., in control-intensive applications. The reason is that the makespan depends directly on the workload distribution across the flattened iteration space, which in turn depends on the order in which the loop indices are coalesced. We show that coalescing as originally proposed can result in large makespans. In this paper, we present a loop permutation-based approach to loop coalescing, referred to as enhanced loop coalescing, to achieve near-optimal schedules. Several examples are presented and the general technique is discussed in detail.
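
The dependence of the makespan on the coalescing order can be illustrated with a small simulation. The sketch below is not the algorithm from the paper: it simply flattens a two-level loop under two index orders and schedules the result with the standard GSS chunk rule, in which an idle processor takes ⌈R/P⌉ of the R remaining iterations. The function names, the 4×4 iteration space, the cost function, and the simplifications (zero scheduling overhead, all processors free at time 0) are assumptions made for illustration only.

```python
import heapq


def flattened_costs(cost, extents, order):
    """Per-iteration costs of the coalesced loop.

    `order` lists the original loop dimensions from outermost to innermost,
    so order=(0, 1) is the lexicographic flattening and order=(1, 0) is the
    flattening after permuting the two loops.
    """
    total = 1
    for n in extents:
        total *= n
    costs = []
    for k in range(total):
        index = [0] * len(extents)
        rest = k
        # Recover the original indices from the flat index, innermost first.
        for dim in reversed(order):
            rest, index[dim] = divmod(rest, extents[dim])
        costs.append(cost(*index))
    return costs


def gss_makespan(costs, processors):
    """Simulate guided self-scheduling and return the maximum finishing time.

    Assumes zero scheduling overhead and all processors idle at time 0.
    """
    free_at = [0] * processors  # min-heap of times at which processors go idle
    start, n = 0, len(costs)
    while start < n:
        remaining = n - start
        chunk = (remaining + processors - 1) // processors  # ceil(R / P)
        t = heapq.heappop(free_at)  # the earliest idle processor takes the chunk
        heapq.heappush(free_at, t + sum(costs[start:start + chunk]))
        start += chunk
    return max(free_at)


def cost(i, j):
    """Hypothetical non-uniform workload: the row i == 0 is ten times heavier."""
    return 10 if i == 0 else 1


lexicographic = flattened_costs(cost, (4, 4), (0, 1))  # i outer, j inner
permuted = flattened_costs(cost, (4, 4), (1, 0))       # j outer, i inner

print(gss_makespan(lexicographic, 2))  # 44
print(gss_makespan(permuted, 2))       # 26
```

In this toy case the lexicographic flattening places all heavy iterations in the first, largest GSS chunk, so one processor finishes at time 44 while the other is idle from time 8. Permuting the loops before coalescing spreads the heavy iterations across chunks and yields a makespan of 26, which equals the total work (52) divided by the two processors.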