This paper describes our experience developing high-performance code for astrophysical N-body simulations. Recent N-body methods are based on an adaptive tree structure. The tree must be built and maintained across physically distributed memory; moreover, the communication requirements are irregular and adaptive. Together with the need to balance the computational workload among processors, these issues pose interesting challenges and tradeoffs for high-performance implementation. Our implementation was guided by the need to keep solutions simple and general. We use a technique for implicitly representing a dynamic global tree across multiple processors, which substantially reduces both the programming complexity and the performance overheads of distributed-memory architectures. Our contributions include methods for vectorizing the computation and minimizing communication time, both of which are justified theoretically and experimentally. The code has been tested by varying the number and distribution of bodies on different configurations of the Connection Machine CM-5. On instances with 10 million bodies, overall performance is typically over 48 percent of the peak machine rate, which compares favorably with other approaches.
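To make the phrase "adaptive tree structure" concrete, the following is a minimal serial sketch in the style of the Barnes-Hut method, one well-known tree-based N-body scheme. It is not the distributed CM-5 implementation described above: it uses a 2D quadtree as a stand-in for a 3D oct-tree, sets G = 1, and assumes all body positions are distinct. The names `Node`, `build_tree`, `tree_accel`, and `direct_accel`, and the parameters `theta` and `eps`, are hypothetical and chosen only for illustration.

```python
import math
import random


class Node:
    """A square cell of an adaptive quadtree (2D stand-in for an oct-tree)."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # cell center and half-width
        self.mass = 0.0            # total mass of bodies inside the cell
        self.mx = self.my = 0.0    # mass-weighted position sums
        self.body = None           # (x, y, m) when the cell is a one-body leaf
        self.children = None       # four sub-cells once the cell has been split

    def _child_for(self, x, y):
        return self.children[(x >= self.cx) + 2 * (y >= self.cy)]

    def _split(self):
        h = self.half / 2
        self.children = [Node(self.cx + sx * h, self.cy + sy * h, h)
                         for sy in (-1, 1) for sx in (-1, 1)]

    def insert(self, x, y, m):
        # Assumes distinct positions; coincident bodies would recurse forever.
        if self.children is None and self.body is None:
            self.body = (x, y, m)            # empty leaf: store the body here
        else:
            if self.children is None:        # occupied leaf: refine adaptively
                old, self.body = self.body, None
                self._split()
                self._child_for(old[0], old[1]).insert(*old)
            self._child_for(x, y).insert(x, y, m)
        self.mass += m
        self.mx += m * x
        self.my += m * y


def build_tree(bodies):
    """Build a quadtree over bodies (x, y, m) lying in the unit square."""
    root = Node(0.5, 0.5, 0.5)
    for x, y, m in bodies:
        root.insert(x, y, m)
    return root


def tree_accel(node, x, y, theta=0.5, eps=1e-3):
    """Softened acceleration at (x, y); a cell is accepted if size < theta * distance."""
    if node.mass == 0.0:
        return 0.0, 0.0
    if node.children is None:
        px, py, _ = node.body                # leaf: use the body itself
    else:
        px, py = node.mx / node.mass, node.my / node.mass  # center of mass
    dx, dy = px - x, py - y
    r2 = dx * dx + dy * dy
    if node.children is None or (2 * node.half) ** 2 < theta * theta * r2:
        d2 = r2 + eps * eps
        inv = node.mass / (d2 * math.sqrt(d2))
        return dx * inv, dy * inv
    ax = ay = 0.0
    for child in node.children:              # cell too close: open it
        cax, cay = tree_accel(child, x, y, theta, eps)
        ax += cax
        ay += cay
    return ax, ay


def direct_accel(bodies, x, y, eps=1e-3):
    """Reference O(N) direct sum at one point, with the same softening."""
    ax = ay = 0.0
    for bx, by, m in bodies:
        dx, dy = bx - x, by - y
        d2 = dx * dx + dy * dy + eps * eps
        inv = m / (d2 * math.sqrt(d2))
        ax += dx * inv
        ay += dy * inv
    return ax, ay


if __name__ == "__main__":
    random.seed(0)
    bodies = [(random.random(), random.random(), 1.0) for _ in range(1000)]
    root = build_tree(bodies)
    print(tree_accel(root, *bodies[0][:2]), direct_accel(bodies, *bodies[0][:2]))
```

With `theta=0` no internal cell is ever accepted, so the traversal degenerates to the direct sum; larger values trade accuracy for fewer interactions. The tree is adaptive in that cells are subdivided only where bodies cluster. In a distributed setting, this is the structure that would have to be partitioned and kept consistent across processors.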