RDMA-based and SMP-aware Multi-port All-Gather on Multi-rail QsNet^II SMP Clusters

  • Authors:
  • Ying Qian;Ahmad Afsahi

  • Affiliations:
  • Queen's University, Canada;Queen's University, Canada

  • Venue:
  • ICPP '07 Proceedings of the 2007 International Conference on Parallel Processing
  • Year:
  • 2007

Abstract

Clusters of Symmetric Multiprocessors (SMP) are more commonplace than ever for achieving high performance. Scientific applications running on such clusters use collective communications extensively. A proven technique for boosting the performance of collective operations is to use shared-memory communication among colocated processes on SMP nodes together with Remote Direct Memory Access (RDMA) operations for internode communication, and to overlap the two. The effect is much more pronounced when efficient multi-port collectives are devised and implemented on multi-rail networks. In this work, we design and implement multi-port RDMA-based and SMP-aware all-gather algorithms with message striping over multi-rail QsNet^II directly at the Elan level. We compare our algorithms against traditional RDMA-only algorithms and the native elan_gather(). Our performance results indicate that the proposed SMP-aware Bruck all-gather achieves an improvement factor of up to 1.96 for 4KB messages over the native elan_gather(). Meanwhile, the Direct algorithm achieves an improvement factor of up to 1.49 for 32KB messages.
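
For readers unfamiliar with the Bruck all-gather named in the abstract, the following is a minimal sketch of the textbook single-port variant, simulated in plain Python with lists standing in for process buffers. It is not the authors' multi-port, RDMA-based, SMP-aware implementation and uses no Elan calls; the function name `bruck_allgather` and the in-memory "network" are assumptions made purely for illustration.

```python
def bruck_allgather(blocks):
    """Simulate a single-port Bruck all-gather (illustrative sketch only).

    blocks[i] is the block initially owned by simulated process i.
    Returns (result, steps): result[i] is the ordered list of all blocks
    as seen by process i, and steps is the number of communication rounds.
    """
    p = len(blocks)
    # Each process starts with only its own block at position 0.
    buffers = [[b] for b in blocks]
    steps = 0
    dist = 1
    while dist < p:
        # In the last round only the still-missing blocks are sent.
        count = min(dist, p - dist)
        # Snapshot outgoing data so all "sends" of a round happen at once.
        outgoing = [buf[:count] for buf in buffers]
        for rank in range(p):
            # Process `rank` receives from the process `dist` ranks ahead.
            src = (rank + dist) % p
            buffers[rank].extend(outgoing[src])
        dist *= 2
        steps += 1
    # buffers[rank][j] now holds block (rank + j) % p; rotate locally
    # so that block k ends up at index k on every process.
    result = []
    for rank, buf in enumerate(buffers):
        ordered = [None] * p
        for j, block in enumerate(buf):
            ordered[(rank + j) % p] = block
        result.append(ordered)
    return result, steps


if __name__ == "__main__":
    gathered, rounds = bruck_allgather(["a", "b", "c", "d", "e"])
    print(rounds)
    print(gathered[0])
```

The sketch finishes in ceil(log2 P) rounds for P processes, which is why Bruck-style algorithms are usually favored for short messages. A multi-port version, as the abstract describes, would exchange with several partners per round.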