Scaling MPI to short-memory MPPs such as BG/L

  • Authors:
  • M. Farreras; T. Cortes; J. Labarta; G. Almasi

  • Affiliations:
  • Universitat Politecnica de Catalunya (UPC), Barcelona, Spain; Universitat Politecnica de Catalunya (UPC), Barcelona, Spain; Universitat Politecnica de Catalunya (UPC), Barcelona, Spain; IBM T. J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • Proceedings of the 20th annual international conference on Supercomputing
  • Year:
  • 2006

Abstract

Scalability to a large number of processes is one of the weaknesses of current MPI implementations. Standard implementations scale to hundreds of nodes, but not beyond. The main problem is that these implementations assume that some resources (for both data and control-data) will always be available to receive and process unexpected messages. As we will show, this is not always true, especially on short-memory machines such as the BG/L, which has 64K nodes but only 512 MB of memory per node. The objective of this paper is to present an algorithm that improves the robustness of MPI implementations for short-memory MPPs. By taking care of both data and control-data reception, it allows the system to scale up to any number of nodes. The proposed solution achieves this goal without any observable overhead when there are no memory problems. Furthermore, in the worst case, when memory resources are extremely scarce, the overhead never doubles the execution time (and in this extreme situation, traditional MPI implementations would fail to execute at all).
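
To make the memory pressure concrete, the following back-of-the-envelope sketch estimates how much buffer space one node would need if many peers sent it unexpected messages at once. It is not the algorithm from the paper. The node count and per-node memory come from the abstract, while the 16 KiB message payload, the 64-byte control header, and the function names are assumptions chosen only for illustration.

```python
# Figures taken from the abstract.
NODES = 64 * 1024              # 64K nodes
NODE_MEMORY = 512 * 1024**2    # 512 MB of memory per node, in bytes


def unexpected_buffer_bytes(senders, msgs_per_sender, payload_bytes, header_bytes):
    """Bytes needed to hold every unexpected message (data plus control-data)."""
    return senders * msgs_per_sender * (payload_bytes + header_bytes)


def fits_in_memory(required_bytes, memory_bytes=NODE_MEMORY):
    """True if the buffers fit, ignoring the memory the application itself uses."""
    return required_bytes <= memory_bytes


# Hypothetical workload: each peer sends one 16 KiB message with a 64-byte header.
PAYLOAD = 16 * 1024   # assumed message size, not from the paper
HEADER = 64           # assumed control-data size, not from the paper

small_job = unexpected_buffer_bytes(128, 1, PAYLOAD, HEADER)
full_machine = unexpected_buffer_bytes(NODES - 1, 1, PAYLOAD, HEADER)

print(fits_in_memory(small_job))      # a few hundred senders fit comfortably
print(fits_in_memory(full_machine))   # all other nodes sending at once do not
```

Under these assumed sizes, a job with a few hundred senders stays far below the per-node limit, whereas one message from every other node already exceeds it, even before counting the memory the application needs for itself.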