Fast barrier synchronization for InfiniBand™

  • Authors:
  • Torsten Hoefler;Torsten Mehlan;Frank Mietke;Wolfgang Rehm

  • Affiliations:
  • Chemnitz University of Technology, Dept. of Computer Science, Chemnitz, Germany;Chemnitz University of Technology, Dept. of Computer Science, Chemnitz, Germany;Chemnitz University of Technology, Dept. of Computer Science, Chemnitz, Germany;Chemnitz University of Technology, Dept. of Computer Science, Chemnitz, Germany

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

The MPI_Barrier() call can be crucial for several applications and has been the target of various optimizations for several decades. The best solution to the barrier problem scales with O(log₂ N) and uses the dissemination principle. This paper demonstrates a new method that uses an enhanced dissemination principle and inherent network parallelism. The new approach sped up barrier performance by 40% relative to the best published algorithm. It is shown that the inherent hardware parallelism inside the InfiniBand™ network can be leveraged to lower the latency of the MPI_Barrier() operation at no additional cost. The principle of sending multiple messages in (pseudo-)parallel can be incorporated into a well-known algorithm to decrease the number of rounds and speed up the overall operation.
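
To illustrate why sending several messages per round reduces the number of rounds, here is a small simulation sketch. It is not the authors' implementation, and the abstract does not give the exact communication pattern. It assumes one natural generalization of the dissemination barrier: in round r, each rank sends to the ranks at distances i * (fanout + 1)^r for i = 1..fanout. The names `num_rounds`, `send_targets`, `simulate_barrier`, and `fanout` are hypothetical and chosen only for this example. With `fanout=1` the pattern reduces to the classic dissemination barrier.

```python
def num_rounds(num_procs, fanout=1):
    """Smallest r such that (fanout + 1) ** r >= num_procs.

    Uses integer arithmetic to avoid floating-point log rounding issues.
    """
    rounds, reach = 0, 1
    while reach < num_procs:
        reach *= fanout + 1
        rounds += 1
    return rounds


def send_targets(rank, round_idx, num_procs, fanout=1):
    """Ranks that `rank` notifies in round `round_idx` (assumed pattern)."""
    stride = (fanout + 1) ** round_idx
    return [(rank + i * stride) % num_procs for i in range(1, fanout + 1)]


def simulate_barrier(num_procs, fanout=1):
    """Return True if every rank has (transitively) heard from all ranks.

    known[p] is the set of ranks whose arrival rank p has learned about.
    """
    known = [{p} for p in range(num_procs)]
    for r in range(num_rounds(num_procs, fanout)):
        # Messages carry what the sender knew at the start of the round.
        snapshot = [set(s) for s in known]
        for p in range(num_procs):
            for q in send_targets(p, r, num_procs, fanout):
                known[q] |= snapshot[p]
    return all(len(s) == num_procs for s in known)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, num_rounds(16, n), simulate_barrier(16, n))
```

Under these assumptions, 16 processes need 4 rounds with one message per round and 2 rounds with three messages per round. Fewer rounds only lower latency if the network can actually overlap the messages within a round, which is the hardware parallelism the abstract refers to; the round count alone does not predict the reported 40% improvement.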