The Blue Gene®/L (BG/L) supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of the Message Passing Interface (MPI), which leverages the hardware features of BG/L. MPI for BG/L is implemented on top of a more basic message-passing infrastructure called the message layer. This message layer can be used both to implement other higher-level libraries and directly by applications. MPI and the message layer are used in the two BG/L modes of operation: the coprocessor mode and the virtual node mode. Performance measurements show that our message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one of the processors of a node to communication functions (coprocessor mode) greatly improves the message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can have a positive impact on application performance.