Large-scale simulations and computational modeling using molecular dynamics (MD) continue to make significant impacts in the field of biology. It is well known that simulating biological events at native time and length scales requires computing power several orders of magnitude beyond today's commonly available systems. Supercomputers such as the IBM Blue Gene/L and Cray XT3 will soon make tens to hundreds of teraFLOP/s of computing power available by utilizing thousands of processors. The popular algorithms and MD applications, however, were not initially designed to run on thousands of processors. In this paper, we present detailed investigations of the performance issues that are crucial for improving the scalability of MD-related algorithms and applications on massively parallel processing (MPP) architectures. Because the characteristics of biological input problems vary, we study two prototypical biological complexes that use the MD algorithm: one with an explicit solvent and one with an implicit solvent. In particular, we study the AMBER application, which supports a variety of these types of input problems. For the explicit solvent problem, we focused on the particle mesh Ewald (PME) method for calculating the electrostatic energy, and for the implicit solvent model, we targeted the Generalized Born (GB) calculation. We uncovered and subsequently addressed a limitation in AMBER that restricted scaling beyond 128 processors. We collected performance data for experiments on up to 2048 Blue Gene/L and XT3 processors and found that scaling is largely limited by the underlying algorithmic characteristics and by the implementation of the algorithms. Furthermore, we found that the input problem size of a biological system is constrained by the memory available per node. In conclusion, our results indicate that MD codes can significantly benefit from the current generation architectures with relatively modest optimization efforts.
Nevertheless, the key for enabling scientific breakthroughs lies in exploiting the full potential of these new architectures.