Important applications in computational chemistry, computational fluid dynamics, structural analysis, and sparse matrix computation usually contain a mixture of regular and irregular accesses. While state-of-the-art run-time library support handles the irregular accesses in such applications reasonably well, the efficacy of run-time optimizations for the regular accesses has yet to be proven. This paper seeks a better approach to handling such applications in a unified compiler and run-time framework. Specifically, it considers only regular applications and evaluates the performance of two approaches: a run-time approach using PILAR and a compile-time approach using a commercial HPF compiler. The study shows that, with a particular representation of regular accesses, the performance of regular code using run-time libraries can come close to that of compiler-generated code. We also determine the operations that usually contribute most to the run-time overhead for regular accesses. Experimental results are reported for three regular applications on a 16-processor IBM SP-2.