The optimization techniques for hierarchical O(N) N-body algorithms described here focus on managing the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in data-parallel languages, such as High Performance Fortran (HPF) and Connection Machine Fortran (CMF). The effectiveness of our techniques is demonstrated on an implementation of Anderson's hierarchical O(N) N-body method for the Connection Machine system CM-5/5E. Communication accounts for about 10-20% of the total execution time, the average efficiency for arithmetic operations is about 40%, and the total efficiency (including communication) is about 35%. For the CM-5E, a performance in excess of 60 Mflop/s per node (peak 160 Mflop/s per node) has been measured.
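The reported figures can be cross-checked with a little arithmetic. The sketch below is only a consistency check, not the authors' own accounting: it assumes that total efficiency is roughly the arithmetic efficiency scaled by the fraction of time not spent in communication. That model and the function name `total_efficiency` are illustrative assumptions.

```python
def total_efficiency(arith_eff: float, comm_fraction: float) -> float:
    """Assumed model: arithmetic efficiency scaled by the non-communication share of run time."""
    return arith_eff * (1.0 - comm_fraction)

# Figures quoted in the abstract.
ARITH_EFF = 0.40            # average efficiency for arithmetic operations
COMM_LOW, COMM_HIGH = 0.10, 0.20  # communication share of total time
MEASURED_MFLOPS = 60.0      # per node, CM-5E (reported as "in excess of")
PEAK_MFLOPS = 160.0         # per node, CM-5E

# Range of total efficiency implied by the assumed model.
eff_best = total_efficiency(ARITH_EFF, COMM_LOW)    # least communication
eff_worst = total_efficiency(ARITH_EFF, COMM_HIGH)  # most communication

# Fraction of peak implied by the measured per-node rate.
peak_fraction = MEASURED_MFLOPS / PEAK_MFLOPS

print(f"modelled total efficiency: {eff_worst:.2f} to {eff_best:.2f}")
print(f"measured fraction of peak: {peak_fraction:.3f}")
```

Under this assumed model the total efficiency falls between 0.32 and 0.36, which brackets the quoted figure of about 35%, and 60 of 160 Mflop/s corresponds to 37.5% of peak.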