Recent advances in reactive molecular dynamics (MD) simulations based on many-body interatomic potentials necessitate efficient dynamic n-tuple computation, in which a set of atomic n-tuples within a given spatial range is constructed at every time step. Here, we develop a computation-pattern algebraic framework to mathematically formulate general n-tuple computation. Based on the translation- and reflection-invariant properties of computation patterns within this framework, we design a shift-collapse (SC) algorithm for cell-based parallel MD. Theoretical analysis quantifies the compact n-tuple search space and small communication cost of SC-MD for arbitrary n, which reduce to those of the best pair-computation approaches (e.g., the eighth-shell method) for n = 2. Benchmark tests show that SC-MD outperforms our production MD code at the finest grain, with 9.7- and 5.1-fold speedups on Intel-Xeon and BlueGene/Q clusters, respectively. SC-MD also exhibits excellent strong scalability.
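To make the notion of cell-based n-tuple computation concrete, the following is a minimal serial sketch of a plain full-shell search, not the SC algorithm itself. It assumes open (non-periodic) boundaries, a cubic cell of side equal to the cutoff, and an n-tuple defined as a chain of distinct atoms whose consecutive members lie within the cutoff; the function names `build_cells`, `neighbors_within`, and `enumerate_tuples` are hypothetical and chosen only for illustration. A chain and its reverse are counted once, which is the simplest use of reflection invariance.

```python
from itertools import product
from math import dist, floor


def build_cells(positions, rc):
    """Bin atom indices into cubic cells of side rc (open boundaries assumed)."""
    cells = {}
    for i, p in enumerate(positions):
        key = tuple(floor(x / rc) for x in p)
        cells.setdefault(key, []).append(i)
    return cells


def neighbors_within(i, positions, cells, rc):
    """Return atoms within rc of atom i, searching only the 27 surrounding cells."""
    ci = tuple(floor(x / rc) for x in positions[i])
    found = []
    for offset in product((-1, 0, 1), repeat=3):
        key = tuple(c + o for c, o in zip(ci, offset))
        for j in cells.get(key, ()):
            if j != i and dist(positions[i], positions[j]) <= rc:
                found.append(j)
    return found


def enumerate_tuples(positions, rc, n):
    """Enumerate chains of n distinct atoms (n >= 2) with consecutive atoms within rc.

    A chain and its reverse describe the same n-tuple, so only the
    orientation with first index < last index is kept.
    """
    cells = build_cells(positions, rc)
    result = set()

    def extend(chain):
        if len(chain) == n:
            if chain[0] < chain[-1]:
                result.add(tuple(chain))
            return
        for j in neighbors_within(chain[-1], positions, cells, rc):
            if j not in chain:
                extend(chain + [j])

    for i in range(len(positions)):
        extend([i])
    return sorted(result)


# Three atoms on a line plus one distant atom (illustrative data only).
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
pairs = enumerate_tuples(atoms, rc=1.5, n=2)
triplets = enumerate_tuples(atoms, rc=1.5, n=3)
```

This sketch visits all 27 neighboring cells for every chain extension and discards the duplicate orientation only at the end. That redundancy in the search space is the kind of cost that the shift-collapse approach described above is designed to reduce.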