IPDPSW '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum
Heterogeneous systems containing accelerators such as GPUs or co-processors such as Intel MIC are becoming more prevalent because they can exploit large-scale parallelism in applications. In this paper, we develop a hierarchical parallelization scheme for molecular dynamics (MD) simulations on CPU-MIC heterogeneous systems. The scheme exploits multi-level parallelism by combining (1) task-level parallelism using a tightly coupled division method, (2) thread-level parallelism employing spatial decomposition through dynamically scheduled multi-threading, and (3) data-level parallelism via SIMD technology. By employing this hierarchy of parallelism together with several optimization methods such as memory latency hiding and data pre-fetching, our MD code running on a CPU-MIC heterogeneous system (one 2.60 GHz eight-core Intel Xeon E5-2670 CPU and one 57-core Intel Knights Corner co-processor) achieves (1) a multi-thread parallel efficiency of 72.4% for 57 threads on the co-processor, with up to 7.62 times SIMD speedup on each core for the force computation task, and (2) up to 2.25 times speedup on the CPU-MIC system over the pure CPU system, which outperforms our previous work on a CPU-GPU platform (one NVIDIA Tesla M2050). Our work shows that MD simulations can benefit substantially from CPU-MIC heterogeneous platforms.
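As a rough illustration of the thread-level layer described above, the following Python sketch bins atoms into cells (spatial decomposition) and hands each cell to a thread pool, where idle workers pull the next pending cell, loosely mimicking dynamic scheduling. It is not the paper's implementation: the Lennard-Jones potential in reduced units, the cutoff of 2.5, the absence of periodic boundaries, and the names `build_cells`, `forces_for_cell`, and `compute_forces` are all assumptions made for this example. The task-level CPU/co-processor division is not modelled, and the innermost pair loop is only marked as the place where SIMD vectorization would apply in a compiled code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_cells(positions, cutoff):
    """Spatial decomposition: bin atom indices into cubic cells of edge `cutoff`."""
    cells = {}
    for i, (x, y, z) in enumerate(positions):
        key = (int(x // cutoff), int(y // cutoff), int(z // cutoff))
        cells.setdefault(key, []).append(i)
    return cells


def forces_for_cell(key, cells, positions, cutoff):
    """Forces on the atoms of one cell from atoms in it and its 26 neighbor cells.

    Assumes a Lennard-Jones potential in reduced units (epsilon = sigma = 1).
    """
    cutoff_sq = cutoff * cutoff
    result = {}
    for i in cells[key]:
        xi, yi, zi = positions[i]
        fx = fy = fz = 0.0
        for ox, oy, oz in product((-1, 0, 1), repeat=3):
            neighbor = (key[0] + ox, key[1] + oy, key[2] + oz)
            # This pair loop is the part a compiled code would vectorize (SIMD).
            for j in cells.get(neighbor, ()):
                if j == i:
                    continue
                xj, yj, zj = positions[j]
                dx, dy, dz = xi - xj, yi - yj, zi - zj
                r2 = dx * dx + dy * dy + dz * dz
                if r2 >= cutoff_sq:
                    continue
                inv_r2 = 1.0 / r2
                inv_r6 = inv_r2 ** 3
                # F_vec = 24 * (2 r^-14 - r^-8) * r_vec
                coeff = 24.0 * inv_r2 * inv_r6 * (2.0 * inv_r6 - 1.0)
                fx += coeff * dx
                fy += coeff * dy
                fz += coeff * dz
        result[i] = (fx, fy, fz)
    return result


def compute_forces(positions, cutoff=2.5, workers=4):
    """Compute per-atom forces, one task per cell, pulled by idle worker threads."""
    cells = build_cells(positions, cutoff)
    forces = [(0.0, 0.0, 0.0)] * len(positions)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tasks = pool.map(
            lambda k: forces_for_cell(k, cells, positions, cutoff), cells
        )
        for partial in tasks:
            for i, f in partial.items():
                forces[i] = f
    return forces
```

Because each cell writes only the forces of its own atoms, the workers need no locking; the price is that every pair is evaluated twice. In CPython the global interpreter lock prevents any real speedup here, so the sketch conveys only the structure of the decomposition, not its performance.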