A parallel hashed Oct-Tree N-body algorithm
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
MPI: The Complete Reference
Scalable parallel formulations of the Barnes-Hut method for n-body simulations
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Guest Editors' Introduction: Fast Multipole Methods
IEEE Computational Science & Engineering
A Massively Parallel Fast Multipole Algorithm in Three Dimensions
HPDC '96 Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
HPCA'09 Proceedings of the Second international conference on High Performance Computing and Applications
The load balancing algorithm based on the parallel implementation of IPO and FMM
HPCA'09 Proceedings of the Second international conference on High Performance Computing and Applications
In recent years, the Multilevel Fast Multipole Method (MLFMA) [14, 17] has become one of the most powerful techniques for accelerating the iterative solution of integral equations in electromagnetics. It has been shown that the MLFMA reduces the computational complexity of a matrix-vector multiply from O(N^2) to O(N log N), where N is the number of unknowns. To extend the range of problems that can be solved with this technique, we have recently developed an application-independent, parallel MLFMA kernel, called ScaleME, for distributed-memory computers using MPI. In this paper, we discuss the characteristic features that distinguish the dynamic MLFMA from its static counterpart, such as the work required at each level, the size of the multipole expansions, and the interpolation/filtering operations, and their influence on the parallel algorithm design. We then discuss the major parallelization issues that are unique to the dynamic MLFMA, such as reducing the memory required for translation operators and reducing replicated geometric data structures. We also briefly discuss load balancing strategies. Finally, we present representative numerical results from ScaleME-accelerated electromagnetic scattering codes, including a simulation involving 4 million unknowns and a radar cross-section computation for a full-scale aircraft on a Beowulf-class cluster.