Real-space DFT (RSDFT) is a simulation technique for first-principles electronic-structure calculations based on density functional theory, and it is well suited to massively parallel architectures. We report unprecedented simulations of the electron states of silicon nanowires with up to 107,292 atoms, carried out during the initial performance evaluation phase of the K computer being developed at RIKEN. The RSDFT code has been parallelized and optimized to make effective use of the various capabilities of the K computer. The self-consistent electron states of a silicon nanowire with 10,000 atoms were obtained in a run lasting about 24 hours on 6,144 cores of the K computer. A sustained performance of 3.08 petaflops was measured for one iteration of the SCF calculation on a 107,292-atom Si nanowire using 442,368 cores, which is 43.63% of the peak performance of 7.07 petaflops.
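The efficiency figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `fraction_of_peak` is hypothetical, and the inputs are the rounded values quoted in the abstract, so the result matches the reported 43.63% only to about one decimal place (the reported figure was presumably computed from unrounded measurements).

```python
def fraction_of_peak(sustained_pflops: float, peak_pflops: float) -> float:
    """Return sustained performance as a percentage of peak performance."""
    if peak_pflops <= 0:
        raise ValueError("peak performance must be positive")
    return 100.0 * sustained_pflops / peak_pflops


# Rounded figures quoted in the abstract (not the unrounded measurements).
SUSTAINED_PFLOPS = 3.08
PEAK_PFLOPS = 7.07
CORES = 442_368

efficiency = fraction_of_peak(SUSTAINED_PFLOPS, PEAK_PFLOPS)

# Peak per core implied by the quoted totals (1 petaflops = 1e6 gigaflops).
peak_per_core_gflops = PEAK_PFLOPS * 1e6 / CORES

print(f"{efficiency:.1f}% of peak")
print(f"{peak_per_core_gflops:.1f} GFLOPS peak per core")
```

With the rounded inputs the ratio comes out slightly below the reported value, which is the expected effect of rounding 3.08 and 7.07 to three significant figures.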