This paper describes a parallel strategy that extends the scalability of a small 3D FFT to thousands of Blue Gene/L processors. The approach executes the intermediate phases of the 3D FFT on smaller subsets of processors. Performance measurements of the standalone 3D FFT are presented for two communication protocols, MPI and BG/L ADE. Although the MPI-based and BG/L ADE-based implementations exhibit qualitatively similar behavior, the BG/L ADE-based version has a lower communication cost than the MPI-based version for small message sizes. Measurements also show that the proposed approach significantly improves the performance of Particle-Mesh-based N-body simulation at the limits of scalability.
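The "phases" mentioned above can be illustrated with the standard row-column decomposition, in which a 3D transform is computed as three successive passes of independent 1D transforms, one pass per axis. The following is a minimal serial sketch in pure Python, not the paper's implementation: the function names (`dft_1d`, `fft3d_by_phases`, `dft3d_direct`) are hypothetical, a naive O(n^2) DFT stands in for a real FFT kernel, and no processor mapping or communication is modeled.

```python
import cmath

def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence (stand-in for an FFT)."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def fft3d_by_phases(grid):
    """Transform grid[x][y][z] one axis at a time (row-column decomposition)."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[complex(v) for v in row] for row in plane] for plane in grid]
    # Phase 1: nx*ny independent 1D transforms along z.
    for x in range(nx):
        for y in range(ny):
            out[x][y] = dft_1d(out[x][y])
    # Phase 2: nx*nz independent 1D transforms along y.
    for x in range(nx):
        for z in range(nz):
            line = dft_1d([out[x][y][z] for y in range(ny)])
            for y in range(ny):
                out[x][y][z] = line[y]
    # Phase 3: ny*nz independent 1D transforms along x.
    for y in range(ny):
        for z in range(nz):
            line = dft_1d([out[x][y][z] for x in range(nx)])
            for x in range(nx):
                out[x][y][z] = line[x]
    return out

def dft3d_direct(grid):
    """Reference result computed straight from the 3D DFT definition."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    result = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    for kx in range(nx):
        for ky in range(ny):
            for kz in range(nz):
                total = 0j
                for x in range(nx):
                    for y in range(ny):
                        for z in range(nz):
                            angle = kx * x / nx + ky * y / ny + kz * z / nz
                            total += grid[x][y][z] * cmath.exp(-2j * cmath.pi * angle)
                result[kx][ky][kz] = total
    return result

# Small 2 x 3 x 2 example grid with arbitrary values.
grid = [[[float(x + 2 * y + 3 * z) for z in range(2)] for y in range(3)] for x in range(2)]
phased = fft3d_by_phases(grid)
direct = dft3d_direct(grid)
```

Each phase consists only of mutually independent 1D transforms, so the amount of work that can run concurrently in a phase is bounded by the number of lines along that axis. For a small grid, that bound is one plausible reason to run a phase on fewer processors than the full machine, although the abstract does not spell out the paper's actual mapping.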