Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM

  • Authors:
  • Weikuan Yu; D. K. Panda; D. Buntinas

  • Affiliations:
  • Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA; Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA; LRI, Univ. de Paris Sud, Orsay, France

  • Venue:
  • CLUSTER '04 Proceedings of the 2004 IEEE International Conference on Cluster Computing
  • Year:
  • 2004

Abstract

All-to-all broadcast is a common collective operation that involves dense communication among all processes in a parallel program. Programmable network interface cards (NICs) have previously been leveraged to efficiently support collective operations, including barrier, broadcast, and reduce. This work explores the characteristics of all-to-all broadcast and proposes new algorithms that exploit the potential advantages of NIC programmability. Along with these algorithms, several strategies are used to provide scalable topology management, global buffer management, efficient communication processing, and message reliability. The algorithms have been incorporated into a NIC-based collective protocol over Myrinet/GM. On 16 nodes, the NIC-based all-to-all broadcast operations improve all-to-all broadcast bandwidth by a factor of 3 compared to the host-based all-to-all broadcast operation. Furthermore, the NIC-based operations are shown to scale better to large systems and to incur very low host CPU utilization.
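
For readers unfamiliar with the operation, the sketch below simulates all-to-all broadcast (each process contributes one block and ends up holding every block) using a textbook ring exchange. It is only an illustration of the collective's semantics: the abstract does not describe the paper's NIC-based algorithms in detail, and the function name `ring_allgather` and the in-memory "processes" are assumptions made for this example, not part of Myrinet/GM.

```python
def ring_allgather(blocks):
    """Simulate a ring-based all-to-all broadcast among len(blocks) processes.

    blocks[i] is the data contributed by (hypothetical) process i.
    Returns a list whose entry i is the full set of blocks held by
    process i once the exchange has finished.
    """
    n = len(blocks)
    # Each process starts out knowing only its own block.
    gathered = [[None] * n for _ in range(n)]
    for rank in range(n):
        gathered[rank][rank] = blocks[rank]

    for step in range(n - 1):
        # In step s, rank r forwards the block that originated at
        # (r - s) mod n to its right-hand neighbour (r + 1) mod n.
        in_flight = []
        for rank in range(n):
            origin = (rank - step) % n
            in_flight.append(((rank + 1) % n, origin, gathered[rank][origin]))
        # Deliver only after every sender has read its buffer, which
        # mimics all sends of one step happening simultaneously.
        for dest, origin, payload in in_flight:
            gathered[dest][origin] = payload
    return gathered


if __name__ == "__main__":
    for rank, view in enumerate(ring_allgather(["a", "b", "c", "d"])):
        print(rank, view)
```

With n processes the ring needs n - 1 steps, and every process sends and receives one block per step. In a host-based implementation each of those steps involves the host CPU; performing the forwarding on the NIC, as the abstract proposes, is what allows host CPU utilization to stay low.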