The gyrokinetic Particle-in-Cell (PIC) method is a critical computational tool enabling petascale fusion simulation research. In this work, we present novel multi- and manycore-centric optimizations to enhance the performance of GTC, a PIC-based production code for studying plasma microturbulence in tokamak devices. Our optimizations encompass all six GTC subroutines and include multi-level particle and grid decompositions designed to improve multi-node parallel scaling, particle binning for improved load balance, GPU acceleration of key subroutines, and memory-centric optimizations to improve single-node scaling and reduce memory utilization. The new hybrid MPI-OpenMP and MPI-OpenMP-CUDA GTC versions achieve up to a 2× speedup over the production Fortran code on four parallel systems: clusters based on the AMD Magny-Cours, Intel Nehalem-EP, IBM BlueGene/P, and NVIDIA Fermi architectures. Finally, strong scaling experiments provide insight into parallel scalability, memory utilization, and programmability trade-offs for large-scale gyrokinetic PIC simulations, while attaining a 1.6× speedup on 49,152 XE6 cores.
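The particle binning mentioned above can be illustrated with a generic counting sort that groups particle indices by the grid cell they occupy. This is a minimal sketch of the general technique, not GTC's implementation: the function name `bin_particles`, the flat `cell_ids` array, and the returned `(order, offsets)` layout are assumptions made for illustration only.

```python
def bin_particles(cell_ids, num_cells):
    """Group particle indices by cell using a counting sort.

    cell_ids[p] is the (hypothetical) grid-cell index of particle p.
    Returns (order, offsets): order lists particle indices grouped by cell,
    and the particles of cell c are order[offsets[c]:offsets[c + 1]].
    """
    # Pass 1: count how many particles fall in each cell.
    counts = [0] * num_cells
    for c in cell_ids:
        counts[c] += 1

    # Exclusive prefix sum gives the start offset of every bin.
    offsets = [0] * (num_cells + 1)
    for c in range(num_cells):
        offsets[c + 1] = offsets[c] + counts[c]

    # Pass 2: scatter each particle index into the next free slot of its bin.
    cursor = offsets[:-1].copy()
    order = [0] * len(cell_ids)
    for p, c in enumerate(cell_ids):
        order[cursor[c]] = p
        cursor[c] += 1
    return order, offsets


# Illustrative input: seven particles spread over three cells.
cell_ids = [2, 0, 1, 2, 0, 1, 1]
order, offsets = bin_particles(cell_ids, 3)
print(order)    # particle indices grouped by cell
print(offsets)  # bin boundaries
```

With particles grouped this way, those that touch the same grid cells sit next to each other in memory, and the per-bin counts give a direct measure of how evenly work is spread, which is the kind of information a load-balancing scheme can act on.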