Additive semi-implicit Runge-Kutta methods for computing high-speed nonequilibrium reactive flows
Journal of Computational Physics
Roofline: an insightful visual performance model for multicore architectures
Communications of the ACM - A Direct Path to Dependable Software
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Parallel filtering in global gyrokinetic simulations
Journal of Computational Physics
A gyrokinetic toroidal five-dimensional Eulerian code, GT5D [Y. Idomura et al., Comput. Phys. Commun. 179, 391 (2008)], is ported to five advanced massively parallel platforms, and comprehensive benchmark tests are performed. The sustained performance of the GT5D kernel and its dependence on memory bandwidth are discussed. By using a novel multi-layer hybrid parallelization model, the size of MPI communicators can be kept below ~100 up to ~10^7 cores, and scalability is improved on multi-core platforms. In strong scaling tests, good scalability is confirmed up to several thousand cores on every platform, and a maximum sustained performance of ~19.4 Tflops (a peak ratio of ~10.1%) is achieved using 16384 cores of BX900.
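As a rough consistency check on the quoted figures, the sustained performance and peak ratio can be turned into an implied machine peak and per-core rates. The sketch below uses only the three numbers stated in the abstract. The variable names are illustrative, and the derived values are back-of-the-envelope estimates, not figures reported by the authors.

```python
# Figures quoted in the abstract (all approximate, as stated there).
sustained_tflops = 19.4   # maximum sustained performance on BX900
peak_ratio = 0.101        # sustained / theoretical peak
cores = 16384             # cores used in that run

# Derived estimates (assumptions of this sketch, not values from the paper).
implied_peak_tflops = sustained_tflops / peak_ratio
implied_peak_gflops_per_core = implied_peak_tflops * 1000 / cores
sustained_gflops_per_core = sustained_tflops * 1000 / cores

print(f"implied peak:        {implied_peak_tflops:.1f} Tflops")
print(f"implied peak / core: {implied_peak_gflops_per_core:.2f} Gflops")
print(f"sustained / core:    {sustained_gflops_per_core:.2f} Gflops")
```

Under these inputs the implied peak is about 192 Tflops, or roughly 11.7 Gflops per core, against about 1.2 Gflops per core sustained.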