First-principles simulations of high-Z metallic systems using the Qbox code on the BlueGene/L supercomputer demonstrate unprecedented performance and scaling for a quantum simulation code. Designed specifically to take advantage of massively parallel systems such as BlueGene/L, Qbox achieves excellent parallel efficiency and peak performance. A sustained peak performance of 207.3 TFlop/s was measured on 65,536 nodes, corresponding to 56.5% of the theoretical full-machine peak when using all 128k (131,072) CPUs.
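The relationship between the reported figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: the sustained rate, node count, CPU count, and 56.5% fraction come from the abstract, while the 700 MHz clock and 4 floating-point operations per cycle per CPU are assumptions about BlueGene/L hardware that the abstract does not state. All function and variable names are hypothetical.

```python
# Figures taken from the abstract.
SUSTAINED_TFLOPS = 207.3      # measured sustained rate, TFlop/s
FRACTION_OF_PEAK = 0.565      # 56.5% of theoretical peak
NODES = 65_536
CPUS = 2 * NODES              # "128k CPUs" = 131,072

# Assumptions NOT stated in the abstract (commonly quoted BlueGene/L values).
ASSUMED_CLOCK_GHZ = 0.7       # 700 MHz per CPU
ASSUMED_FLOPS_PER_CYCLE = 4   # double FPU with fused multiply-add


def implied_peak_tflops(sustained_tflops: float, fraction: float) -> float:
    """Peak rate implied by a sustained rate and its fraction of peak."""
    return sustained_tflops / fraction


def hardware_peak_tflops(cpus: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak from CPU count, clock rate, and flops per cycle."""
    return cpus * clock_ghz * flops_per_cycle / 1e3  # GFlop/s -> TFlop/s


implied_peak = implied_peak_tflops(SUSTAINED_TFLOPS, FRACTION_OF_PEAK)
assumed_peak = hardware_peak_tflops(CPUS, ASSUMED_CLOCK_GHZ, ASSUMED_FLOPS_PER_CYCLE)
efficiency_pct = 100.0 * SUSTAINED_TFLOPS / assumed_peak

print(f"peak implied by the abstract: {implied_peak:.1f} TFlop/s")
print(f"peak under assumed hardware:  {assumed_peak:.1f} TFlop/s")
print(f"sustained / assumed peak:     {efficiency_pct:.1f}%")
```

Under these assumptions the two peak estimates agree to within about 0.1 TFlop/s, which suggests the 56.5% figure is measured against a full-machine peak of roughly 367 TFlop/s.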