Fast Fourier Transform Accelerated Fast Multipole Algorithm

Authors:
William D. Elliott;John A. Board, Jr.
Venue:
SIAM Journal on Scientific Computing
Year:
1996

Abstract

This paper describes an ${\cal O}(p^2 \log_2(p) N)$ implementation of the fast multipole algorithm (FMA) for $N$-body simulations, where $p$ is the number of terms retained in the truncated multipole expansion that represents the potential field of a collection of charged particles. The value of $p$ determines the accuracy of the calculation. The original FMA is ${\cal O}(p^4 N)$; its limiting ${\cal O}(p^4)$ computation is a convolution-like operation on a matrix of multipole coefficients. This paper describes the implementation details of converting this limiting computation to a linear convolution, which is then computed in the Fourier domain using the fast Fourier transform (FFT), based on a method originally outlined by Greengard and Rokhlin. In addition, the paper describes a new block decomposition of the multipole expansion data that provides numerical stability and efficient computation. The resulting ${\cal O}(p^2 \log_2(p))$ subroutine achieves a speedup of 2 over the original method on a sequential processor for $p=8$, and a speedup of 4 for $p=16$. It also vectorizes well, achieving a speedup of 3 on a vector processor at $p=8$ and a speedup of 6 at $p=16$.
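
To illustrate the core idea in the abstract, the sketch below compares a direct ${\cal O}(p^4)$ linear convolution of two $p \times p$ coefficient matrices with the same convolution computed in the Fourier domain at ${\cal O}(p^2 \log_2(p))$ cost. It is a generic illustration only, not the paper's implementation: it does not model the spherical-harmonic translation operators or the block decomposition, and the function names (`fft`, `fft2`, `direct_conv2`, `fft_conv2`) are hypothetical. It assumes square inputs of equal size and zero-pads to the next power of two.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2(a, inverse=False):
    """2-D FFT: transform every row, then every column."""
    rows = [fft(r, inverse) for r in a]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def direct_conv2(a, b):
    """Direct linear convolution of two p x p matrices: O(p^4) work."""
    p = len(a)
    out = [[0j] * (2 * p - 1) for _ in range(2 * p - 1)]
    for i in range(p):
        for j in range(p):
            for k in range(p):
                for l in range(p):
                    out[i + k][j + l] += a[i][j] * b[k][l]
    return out


def fft_conv2(a, b):
    """Same linear convolution via zero-padded FFTs: O(p^2 log p) work."""
    p = len(a)
    size = 1
    while size < 2 * p - 1:  # pad so circular convolution equals linear
        size *= 2

    def pad(m):
        return [[complex(m[i][j]) if i < p and j < p else 0j
                 for j in range(size)] for i in range(size)]

    fa, fb = fft2(pad(a)), fft2(pad(b))
    prod = [[fa[i][j] * fb[i][j] for j in range(size)] for i in range(size)]
    full = fft2(prod, inverse=True)
    scale = size * size  # normalization for the unnormalized inverse FFT
    return [[full[i][j] / scale for j in range(2 * p - 1)]
            for i in range(2 * p - 1)]
```

Calling `fft_conv2(a, b)` and `direct_conv2(a, b)` on the same inputs should agree up to floating-point rounding. The zero-padding step is what turns the circular convolution computed by the FFT into the linear convolution the abstract refers to.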