Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes

  • Authors:
  • H. Subramoni; S. Potluri; K. Kandalla; B. Barth; J. Vienne; J. Keasler; K. Tomko; K. Schulz; A. Moody; D. K. Panda

  • Affiliations:
  • The Ohio State University; The Ohio State University; The Ohio State University; Texas Advanced Computing Center, Austin, Texas; The Ohio State University; Lawrence Livermore National Laboratory, Livermore, California; Ohio Supercomputer Center, Columbus, Ohio; Texas Advanced Computing Center, Austin, Texas; Lawrence Livermore National Laboratory, Livermore, California; The Ohio State University

  • Venue:
  • SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2012

Abstract

Over the last decade, InfiniBand has become an increasingly popular interconnect for modern supercomputing systems. However, no existing detection service can discover the underlying network topology in a scalable manner and expose it conveniently to runtime libraries and users of high-performance computing systems. In this paper, we design a novel and scalable method to detect the InfiniBand network topology using Neighbor-Joining (NJ) techniques. To the best of our knowledge, this is the first time the neighbor-joining algorithm has been applied to the problem of detecting InfiniBand network topology. We also design a network-topology-aware MPI library that takes advantage of the network topology service. The library places the processes of an MPI job in a network-topology-aware manner, with the dual aim of increasing intra-node communication and reducing long-distance inter-node communication across the InfiniBand fabric.
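
The abstract does not describe how neighbor joining is applied to the fabric, so the following is only a minimal sketch of the textbook neighbor-joining procedure, not the authors' implementation. It assumes a hypothetical matrix of pairwise hop counts between hosts is already available; the function name `neighbor_joining`, the host labels, and the inferred node names (`sw0`, `sw1`, ...) are illustrative. Neighbor joining always produces a binary tree, which a real switch fabric need not be.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Textbook neighbor joining (illustrative sketch, not the paper's code).

    names: list of leaf labels (hypothetical host names).
    dist:  dict mapping a pair (a, b) to a symmetric distance,
           assumed here to be a hop count between hosts.
    Returns a list of tree edges (node_a, node_b, length).
    """
    d = {frozenset(pair): float(v) for pair, v in dist.items()}
    active = list(names)
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        # Row sums of the current distance matrix.
        r = {a: sum(d[frozenset((a, b))] for b in active if b != a)
             for a in active}
        # Join the pair that minimizes the Q criterion.
        i, j = min(
            combinations(active, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"sw{next_id}"  # hypothetical name for the inferred node
        next_id += 1
        edges.append((i, u, li))
        edges.append((j, u, lj))
        # Distances from the new node to every remaining node.
        for k in active:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        active = [k for k in active if k not in (i, j)] + [u]
    a, b = active
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical example: h0 and h1 share one leaf switch, h2 and h3 another.
hosts = ["h0", "h1", "h2", "h3"]
hops = {
    ("h0", "h1"): 2, ("h2", "h3"): 2,
    ("h0", "h2"): 3, ("h0", "h3"): 3,
    ("h1", "h2"): 3, ("h1", "h3"): 3,
}
tree = neighbor_joining(hosts, hops)
```

In this toy input, hosts that are two hops apart end up attached to the same inferred node. That grouping is the kind of information a placement layer could use to keep communicating processes close together.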