The multipole-to-local (M2L) operator is the most time-consuming part of the far-field computation in the fast multipole method for the Laplace equation. Its natural expression, though commonly used, does not respect a sharp error bound; here we first prove the correctness of a second expression. We then propose a matrix formulation, implemented with basic linear algebra subprograms (BLAS) routines, that speeds up the computation of both expressions. We also introduce special data storage schemes in memory to gain further computational efficiency. Finally, this BLAS scheme is compared, for uniform distributions, with other M2L improvements such as block FFT, FFT with polynomial scaling, rotations, and plane wave expansions. Considering runtime, extra memory storage, numerical stability, and the precisions commonly required for the Laplace equation, the BLAS version appears to be the best choice.
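The abstract does not spell out the matrix formulation, so the following is only a rough sketch of the general idea, not the paper's actual scheme. It assumes that one M2L translation can be written as a transfer matrix applied to a vector of multipole coefficients, and that several cell pairs share the same transfer matrix. Under those assumptions, the expansions can be stacked as columns so that many matrix-vector products (level 2 BLAS style) become one matrix-matrix product (level 3 BLAS style). The names `matvec`, `matmat`, `T`, and `expansions` are hypothetical, and the numerical values are arbitrary placeholders rather than real M2L coefficients.

```python
# Hypothetical sketch: T and the expansion vectors are placeholder values,
# not real M2L transfer coefficients.

def matvec(T, m):
    """Level-2 style product: one transfer matrix times one expansion vector."""
    return [sum(T[i][k] * m[k] for k in range(len(m))) for i in range(len(T))]


def matmat(T, M):
    """Level-3 style product: one transfer matrix times stacked expansions.

    M stores one multipole expansion per column.
    """
    rows, inner, cols = len(T), len(M), len(M[0])
    return [[sum(T[i][k] * M[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


# Placeholder 3x3 transfer matrix, assumed shared by every pair in the batch.
T = [[1 + 1j, 0.5, 0],
     [0, 2, -1j],
     [0.25, 0, 1]]

# Three placeholder multipole expansions with three coefficients each.
expansions = [[1, 2j, 3],
              [0, 1, 1],
              [2, -1, 0.5j]]

# Unbatched: one matrix-vector product per expansion.
one_by_one = [matvec(T, m) for m in expansions]

# Batched: stack the expansions as columns, then do a single product.
M = [[m[k] for m in expansions] for k in range(len(expansions[0]))]
L = matmat(T, M)

# Column j of L is the local expansion produced from expansions[j].
batched = [[row[j] for row in L] for j in range(len(expansions))]
```

Both paths produce the same local expansions. In a real implementation the pure-Python loops would be replaced by calls to an optimized BLAS library, where the batched matrix-matrix form typically reuses cached data far better than repeated matrix-vector calls.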