An evaluation of computing paradigms for N-body simulations on distributed memory architectures

Authors:
Collin McCurdy;John Mellor-Crummey
Affiliations:
Department of Computer Science, University of Wisconsin, Madison;Department of Computer Science, Rice University
Venue:
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1999

Citing 11

A fast algorithm for particle simulations
Journal of Computational Physics

Run-time scheduling and execution of loops on message passing machines
Journal of Parallel and Distributed Computing - Special issue: algorithms for hypercube computers

An implementation of the fast multipole method without multipoles
SIAM Journal on Scientific and Statistical Computing

The high performance Fortran handbook

A parallel hashed Oct-Tree N-body algorithm
Proceedings of the 1993 ACM/IEEE conference on Supercomputing

Communication optimizations for irregular scientific computations on distributed memory architectures
Journal of Parallel and Distributed Computing - Special issue on scalability of parallel algorithms and architectures

The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture

Implementing O(N) N-body algorithms efficiently in data-parallel languages
Scientific Programming

High performance Fortran for highly irregular problems
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

Compiler support for machine-independent parallelization of irregular problems

High Performance Fortran: Language Specification (PART II)
ACM SIGPLAN Fortran Forum - Special issue: high performance Fortran language specification, part 2

Cited 2

A Data Parallel Formulation of the Barnes-Hut Method for N-Body Simulations
PARA '00 Proceedings of the 5th International Workshop on Applied Parallel Computing, New Paradigms for HPC in Industry and Academia

User-controllable coherence for high performance shared memory multiprocessors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming

Abstract

The efficiency of High Performance Fortran (HPF) for irregular applications is still largely unproven. While recent work has shown that a highly irregular hierarchical n-body force calculation method can be implemented in HPF, we have found that the implementation contains inefficiencies that cause it to run up to three times slower than our hand-coded, explicitly parallel implementation. Our work examines these inefficiencies, determines that most of the extra overhead is due to a single aspect of the communication strategy, and demonstrates that fixing the communication strategy can bring the overheads of the HPF application to within 25% of those of the hand-coded version.