Design and implementation of message-passing services for the Blue Gene/L supercomputer

  • Authors:
  • G. Almási;C. Archer;J. G. Castaños;J. A. Gunnels;C. C. Erway;P. Heidelberger;X. Martorell;J. E. Moreira;K. Pinnow;J. Ratterman;B. D. Steinmacher-Burow;W. Gropp;B. Toonen

  • Affiliations:
  • IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York;IBM Systems and Technology Group, Rochester, Minnesota;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York;Computer Science Department, Brown University, Providence, Rhode Island;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York;Technical University of Catalonia, Barcelona, Spain;IBM Systems and Technology Group, Rochester, Minnesota;IBM Engineering and Technology Services, Rochester, Minnesota;IBM Engineering and Technology Services, Rochester, Minnesota;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York;Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois;Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2005

Abstract

The Blue Gene®/L (BG/L) supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of the Message Passing Interface (MPI), which leverages the hardware features of BG/L. MPI for BG/L is implemented on top of a more basic message-passing infrastructure called the message layer. This message layer can be used both to implement other higher-level libraries and to serve applications directly. MPI and the message layer are used in the two BG/L modes of operation: the coprocessor mode and the virtual node mode. Performance measurements show that our message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one of the processors of a node to communication functions (coprocessor mode) greatly improves the message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can have a positive impact on application performance.
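
For context, the sketch below shows a minimal ping-pong exchange written against the standard MPI API, the kind of message-passing pattern the BG/L MPI implementation must support efficiently. It uses only portable MPI calls; it does not depict BG/L's internal message-layer API, which is specific to the implementation described in the paper, and the buffer size and tag values are illustrative.

    /* Minimal standard-MPI ping-pong between ranks 0 and 1. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        char buf[64] = "ping";

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size >= 2) {
            if (rank == 0) {
                /* Rank 0 sends a small message to rank 1 and waits for the echo. */
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("rank 0 received: %s\n", buf);
            } else if (rank == 1) {
                /* Rank 1 echoes the message back to rank 0. */
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }

        MPI_Finalize();
        return 0;
    }

On BG/L, such a program would run either with one MPI process per node and the second processor assisting communication (coprocessor mode) or with two MPI processes per node (virtual node mode), as discussed in the abstract.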