In molecular dynamics, the fast multipole method (FMM) is an attractive alternative to Ewald summation for calculating electrostatic interactions because of its favorable operation count. However, when applied to small particle systems on many processors, it places a high demand on interprocessor communication. In a distributed-memory environment, this demand severely limits the applicability of the FMM to systems of O(10 K) atoms. We present an algorithm that allows fine-grained overlap of communication and computation without sacrificing synchronization or determinism in the equations of motion. The method avoids contention in the communication subsystem, making it feasible to use the FMM for smaller systems on larger numbers of processors. Our algorithm also facilitates the application of multiple time stepping techniques within the FMM. We present scaling results at a reasonably high level of accuracy and compare them with optimized Ewald methods.
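The idea of overlapping communication with computation while keeping results deterministic can be illustrated with a minimal Python sketch. This is not the algorithm of the paper: the function names (`fetch_remote_block`, `local_work`, `overlapped_step`) are hypothetical, the "communication" is simulated with a thread that sleeps, and a real distributed-memory code would more likely use non-blocking message passing. The sketch shows only the general pattern: post all transfers first, compute on local data while they are in flight, then consume the remote data in a fixed order so that the floating-point summation order does not depend on arrival order.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch_remote_block(block_id, delay=0.01):
    """Hypothetical stand-in for receiving remote data from another processor."""
    time.sleep(delay)  # simulated network latency (assumption, not a real transfer)
    return [float(block_id + k) for k in range(4)]


def local_work(values):
    """Hypothetical stand-in for computation on data already held locally."""
    return sum(v * v for v in values)


def overlapped_step(local_values, remote_ids):
    """Overlap simulated communication with local computation."""
    workers = max(1, len(remote_ids))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Post every "receive" up front so the transfers run in the background.
        pending = [pool.submit(fetch_remote_block, b) for b in remote_ids]
        # Do the local computation while the transfers are in flight.
        total = local_work(local_values)
        # Consume results in the fixed order of remote_ids, not arrival order,
        # so the summation order is the same on every run.
        for future in pending:
            total += sum(future.result())
    return total


if __name__ == "__main__":
    print(overlapped_step([1.0, 2.0], [0, 10]))
```

Consuming the futures in submission order costs a little waiting time if a later transfer finishes first, but it keeps the accumulated result reproducible from run to run, which is the property the abstract refers to as determinism.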