We present an efficient and provably good partitioning and load balancing algorithm for parallel adaptive N-body simulation. The main ingredient of our method is a novel geometric characterization of a class of communication graphs that can be used to support hierarchical N-body methods such as the fast multipole method (FMM) and the Barnes--Hut method (BH). We show that the communication graphs of these methods have a good partition that can be found efficiently both sequentially and in parallel. In particular, we show that an N-body communication graph (either for BH or for FMM) can be partitioned into two subgraphs with equal computation load by removing only $O(\sqrt{n\log n})$ nodes in two dimensions and $O(n^{2/3}(\log n)^{1/3})$ nodes in three dimensions. These bounds on node-partition imply bounds on edge-partition of $O(\sqrt{n}(\log n)^{3/2})$ and $O(n^{2/3}(\log n)^{4/3})$, respectively, for two and three dimensions. To the best of our knowledge, this is the first theoretical result on the quality of partitioning N-body communication graphs for nonuniformly distributed particles. Our results imply that parallel adaptive N-body simulation can be made as scalable as computation on regular grids and as efficient as parallel N-body simulation on uniformly distributed particles.
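To get a feel for how the separator bounds above scale, the following minimal Python sketch evaluates their growth terms for a few problem sizes. It is illustrative only: the function names are hypothetical, the constant factors hidden by the O-notation are omitted, and the natural logarithm is an assumption (a different base changes only the constant). It does not implement the partitioning algorithm itself.

```python
import math


def node_separator_bound(n, dim):
    """Growth term of the node-separator bound, constants omitted.

    dim=2: sqrt(n log n); dim=3: n^(2/3) (log n)^(1/3).
    """
    log_n = math.log(n)  # natural log is an assumption; the base only affects constants
    if dim == 2:
        return math.sqrt(n * log_n)
    if dim == 3:
        return n ** (2 / 3) * log_n ** (1 / 3)
    raise ValueError("bounds are stated only for 2 and 3 dimensions")


def edge_separator_bound(n, dim):
    """Growth term of the edge-separator bound, constants omitted.

    dim=2: sqrt(n) (log n)^(3/2); dim=3: n^(2/3) (log n)^(4/3).
    """
    log_n = math.log(n)
    if dim == 2:
        return math.sqrt(n) * log_n ** 1.5
    if dim == 3:
        return n ** (2 / 3) * log_n ** (4 / 3)
    raise ValueError("bounds are stated only for 2 and 3 dimensions")


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        for dim in (2, 3):
            nodes = node_separator_bound(n, dim)
            edges = edge_separator_bound(n, dim)
            # Fraction of the n nodes that the node-separator term represents.
            print(f"n={n:>9} dim={dim} node term={nodes:12.1f} "
                  f"edge term={edges:14.1f} node term / n={nodes / n:.5f}")
```

Two things are visible from the formulas and from the printed values: in both dimensions the edge term equals the node term multiplied by $\log n$, and the node term shrinks relative to $n$ as the problem grows, which is the sense in which the separators are small.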