Fast analysis of molecular dynamics trajectories with graphics processing units: Radial distribution function histogramming

Authors:
Benjamin G. Levine;John E. Stone;Axel Kohlmeyer
Affiliations:
Institute for Computational Molecular Science and Department of Chemistry, Temple University, Philadelphia, PA, United States;Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL, United States;Institute for Computational Molecular Science and Department of Chemistry, Temple University, Philadelphia, PA, United States
Venue:
Journal of Computational Physics
Year:
2011

Abstract

The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate-limiting step in the calculation of the RDF is building a histogram of the distances between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU's memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, each utilizing the specific hardware features found on a different generation of GPUs. We take advantage of the larger shared memory and the atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The final version of the algorithm, running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs, was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 s per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis.
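
To make the histogramming step concrete, here is a minimal CPU-side sketch in Python of a tiled pair-distance histogram for a single frame. It is an illustration under stated assumptions, not the GPU kernel implemented in VMD: the function name `pair_distance_histogram`, its parameters, and the default tile size are hypothetical, and periodic boundary conditions and the final normalization of the histogram into an RDF are omitted. The tile loops only mimic the idea of reusing a small block of coordinates many times. The increment of a histogram bin marks the spot where concurrent GPU threads would need an atomic update.

```python
import math


def pair_distance_histogram(sel_a, sel_b, r_max, n_bins, tile=64):
    """Count distances between every atom in sel_a and every atom in sel_b.

    sel_a, sel_b: lists of (x, y, z) tuples (hypothetical input layout).
    r_max: distances at or beyond this cutoff are ignored.
    n_bins: number of equal-width bins covering [0, r_max).
    tile: block size used to walk both selections tile by tile.
    """
    hist = [0] * n_bins
    bin_width = r_max / n_bins
    for a0 in range(0, len(sel_a), tile):
        tile_a = sel_a[a0:a0 + tile]  # block of the first selection
        for b0 in range(0, len(sel_b), tile):
            tile_b = sel_b[b0:b0 + tile]  # block reused for all of tile_a
            for ax, ay, az in tile_a:
                for bx, by, bz in tile_b:
                    r = math.sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
                    if r < r_max:
                        # Clamp to guard against floating-point round-up at the edge.
                        b = min(int(r / bin_width), n_bins - 1)
                        hist[b] += 1  # on a GPU this would be an atomic add
    return hist


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0)]
    b = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 5.0)]
    print(pair_distance_histogram(a, b, r_max=4.0, n_bins=4))  # prints [0, 1, 1, 0]
```

The result does not depend on the tile size. In this sketch, tiling changes only the order in which pairs are visited, which is the property that lets a GPU implementation choose tiles to fit its fastest memory.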