Optimizing a conjugate gradient solver with non-blocking collective operations

  • Authors:
  • Torsten Hoefler; Peter Gottschling; Wolfgang Rehm; Andrew Lumsdaine

  • Affiliations:
  • Open Systems Lab, Indiana University, Bloomington, IN; Open Systems Lab, Indiana University, Bloomington, IN; Department of Computer Science, Technical University of Chemnitz, Chemnitz, Germany; Open Systems Lab, Indiana University, Bloomington, IN

  • Venue:
  • EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
  • Year:
  • 2006

Abstract

This paper presents a case study on the applicability and usage of non-blocking collective operations. These operations make it possible to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable, low-overhead implementation of non-blocking collectives on top of MPI-1. We demonstrate how easy the NBC library is to use by optimizing a conjugate gradient solver with only minor changes to the traditional parallel implementation of the program. The optimized solver runs up to 34% faster and is able to overlap most of the communication. We show that, because of this overlap, there is no performance difference between Gigabit Ethernet and InfiniBand™ for our calculation.
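
The overlap idea described in the abstract can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: it uses neither MPI nor the NBC library, a background thread with a simulated delay stands in for a non-blocking reduction, and all names (`slow_allreduce_sum`, `local_matvec`, `overlapped_step`, `latency_s`) are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_allreduce_sum(partial_sums, latency_s=0.05):
    """Stand-in for a collective reduction: sum the per-process
    partial results after a simulated network delay (assumption)."""
    time.sleep(latency_s)
    return sum(partial_sums)


def local_matvec(diag, x):
    """Purely local work (diagonal matrix-vector product) that
    needs no communication and can run while the reduction is pending."""
    return [d * xi for d, xi in zip(diag, x)]


def overlapped_step(partial_sums, diag, x, latency_s=0.05):
    """Start the reduction, compute locally, then wait for the result."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Analogue of starting a non-blocking collective: returns a handle at once.
        handle = pool.submit(slow_allreduce_sum, partial_sums, latency_s)
        # Independent computation proceeds while the "communication" is in flight.
        y = local_matvec(diag, x)
        # Analogue of waiting on the handle: block only when the value is needed.
        total = handle.result()
    return total, y


if __name__ == "__main__":
    total, y = overlapped_step([1.0, 2.0, 3.0], [2.0, 2.0], [1.0, 4.0])
    print(total, y)
```

In this sketch the cost of the simulated reduction is hidden behind the local computation whenever that computation takes at least as long as the delay, which is the effect the abstract attributes to non-blocking collectives.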