We present an error-controlled, highly scalable FMM implementation for long-range interactions in particle systems with open boundaries as well as 1D, 2D, and 3D periodic boundary conditions. We highlight three aspects of fast summation codes that most articles do not fully address: memory consumption, error control, and runtime minimization. This poster aims to contribute to all three in the context of modern large-scale parallel machines. In particular, we discuss the data structures used, the parallelization approach, and the precision-dependent parameter optimization. The current code computes all mutual long-range interactions of more than three trillion particles on 294,912 BG/P cores within a few minutes for an expansion up to quadrupoles. The maximum memory footprint of such a computation has been reduced to less than 45 bytes per particle. The code employs a one-sided, non-blocking parallelization approach with a small communication overhead.
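To put the quoted figures in proportion, the following back-of-the-envelope sketch works out the per-core load implied by the abstract. It is only a rough estimate: the abstract states bounds ("more than three trillion", "less than 45 bytes"), so the inputs below are the quoted thresholds rather than measured values, and the helper name `footprint` is introduced here purely for illustration.

```python
# Rough estimate only: the abstract gives bounds ("more than", "less than"),
# so these inputs are the quoted thresholds, not measured values.
N_PARTICLES = 3 * 10**12     # "more than three trillion particles"
N_CORES = 294_912            # BG/P cores quoted in the abstract
BYTES_PER_PARTICLE = 45      # "less than 45 bytes per particle"


def footprint(n_particles, n_cores, bytes_per_particle):
    """Return (particles per core, total bytes, bytes per core).

    Hypothetical helper: assumes particles and memory are spread
    evenly over all cores, which the abstract does not state.
    """
    total_bytes = n_particles * bytes_per_particle
    return n_particles / n_cores, total_bytes, total_bytes / n_cores


per_core, total, per_core_bytes = footprint(
    N_PARTICLES, N_CORES, BYTES_PER_PARTICLE
)

print(f"particles per core: {per_core:,.0f}")
print(f"total footprint:    {total / 1e12:.0f} TB")
print(f"per-core footprint: {per_core_bytes / 2**20:.0f} MiB")
```

Under these assumptions, each core handles roughly ten million particles and the whole run occupies about 135 TB, which illustrates why the per-particle memory footprint is treated as a first-class concern alongside runtime and error control.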