As an entry for the 2012 Gordon-Bell performance prize, we report performance results of astrophysical N-body simulations of one trillion particles performed on the full system of the K computer. This is the first gravitational trillion-body simulation in the world. We describe the scientific motivation, the numerical algorithm, the parallelization strategy, and the performance analysis. Unlike many previous Gordon-Bell prize winners that used the tree algorithm for astrophysical N-body simulations, we used the hybrid TreePM method, which reaches a similar level of accuracy: the short-range force is calculated by the tree algorithm, and the long-range force is solved by the particle-mesh algorithm. We developed a highly tuned gravity kernel for short-range forces and a novel communication algorithm for long-range forces. The average performance on 24576 and 82944 nodes of the K computer is 1.53 and 4.45 Pflops, respectively, corresponding to 49% and 42% of the peak speed.
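The short-range/long-range decomposition behind TreePM can be illustrated with a minimal sketch. The abstract does not state which cutoff function is used, so the example below assumes a Gaussian (Ewald-style) split of the kind found in some other TreePM codes; it is not necessarily the split used in this work. The function names and the split scale `r_s` are hypothetical, chosen only for illustration.

```python
import math

def split_factors(r, r_s):
    """Return (short, long) multipliers of the Newtonian pair force 1/r^2.

    Assumed Gaussian split with scale r_s (hypothetical parameter);
    the two factors sum to 1 at every separation r.
    """
    x = r / (2.0 * r_s)
    gauss = (r / (r_s * math.sqrt(math.pi))) * math.exp(-x * x)
    short = math.erfc(x) + gauss   # part that would go to the tree
    long_ = math.erf(x) - gauss    # part that would go to the particle mesh
    return short, long_

def pair_force(r, r_s, G=1.0, m1=1.0, m2=1.0):
    """Split the magnitude of the Newtonian force between two point masses."""
    newton = G * m1 * m2 / (r * r)
    short, long_ = split_factors(r, r_s)
    return newton * short, newton * long_

if __name__ == "__main__":
    r_s = 1.0
    for r in (0.01, 1.0, 10.0):
        f_short, f_long = pair_force(r, r_s)
        print(f"r={r:6.2f}  short={f_short:.3e}  long={f_long:.3e}")
```

Under this assumed split, the short-range part carries essentially all of the force at separations well below `r_s` and becomes negligible beyond a few `r_s`, which is why a tree walk can be truncated at a finite cutoff radius while the smooth remainder is handled on a mesh.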