Automatic generation and tuning of MPI collective communication routines

  • Authors:
  • Ahmad Faraj;Xin Yuan

  • Affiliations:
  • Florida State University, Tallahassee, FL;Florida State University, Tallahassee, FL

  • Venue:
  • Proceedings of the 19th annual international conference on Supercomputing
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In order for collective communication routines to achieve high performance on different platforms, they must be able to adapt to the system architecture and use different algorithms for different situations. Current Message Passing Interface (MPI) implementations, such as MPICH and LAM/MPI, are not fully adaptable to the system architecture and are not able to achieve high performance on many platforms. In this paper, we present a system that produces efficient MPI collective communication routines. By automatically generating topology specific routines and using an empirical approach to select the best implementations, our system adapts to a given platform and constructs routines that are customized for the platform. The experimental results show that the tuned routines consistently achieve high performance on clusters with different network topologies.