Efficient RDMA-based multi-port collectives on multi-rail QsNetII clusters

  • Authors:
  • Ying Qian; Ahmad Afsahi

  • Affiliations:
  • Department of Electrical and Computer Engineering, Queen's University, Kingston, ON, Canada; Department of Electrical and Computer Engineering, Queen's University, Kingston, ON, Canada

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

Many scientific applications use MPI collective communications intensively, so efficient and scalable implementations of collective operations are critical to the performance of such applications on clusters. Quadrics QsNetII is a high-performance cluster interconnect that implements some collectives at the Elan level; the corresponding MPI collectives use these directly. Quadrics software supports point-to-point striping over multi-rail QsNetII networks, but multi-rail collectives have not been supported. In this work, we propose a number of RDMA-based multi-port collectives for multi-rail QsNetII clusters, implemented directly at the Elan level. Our performance results indicate that the proposed multi-port gather achieves an improvement of up to 6.35 over the native elan_gather for 1 MB messages, and that the proposed multi-port all-to-all outperforms the native elan_alltoall by a factor of 2.19 for 16 KB messages. We also propose two algorithms for the scatter operation.