Early experience with scientific applications on the Blue Gene/L supercomputer
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Blue Gene/L represents a new way to build supercomputers, using a large number of low-power processors together with multiple integrated interconnection networks. Whether real applications can scale to tens of thousands of processors on a machine like Blue Gene/L has been an open question. In this paper, we describe early experience with several physics and materials science applications on a 32,768-node Blue Gene/L system recently installed at Lawrence Livermore National Laboratory. Our study reveals some problems in the applications and in the current software implementation, but overall it shows excellent scaling of these applications to 32K nodes on the current Blue Gene/L system. While there is clearly room for improvement, these results represent the first proof point that MPI applications can effectively scale to over ten thousand processors. They also validate the scalability of the hardware and software architecture of Blue Gene/L.