An architecture to perform NIC based MPI matching

  • Authors:
  • K. Scott Hemmert;Keith D. Underwood;Arun Rodrigues

  • Affiliations:
  • Sandia National Laboratories, P.O. Box 5800, MS-1319, Albuquerque, NM 87185-1319, USA (all three authors)

  • Venue:
  • CLUSTER '07 Proceedings of the 2007 IEEE International Conference on Cluster Computing
  • Year:
  • 2007

Abstract

Modern supercomputers aggregate thousands of microprocessors through a high-performance network. Many of these systems place a processor on the network interface controller (NIC) to handle some portion of the MPI processing. This processing involves traversing a linked list and invoking a matching function for each item. Although this task is critical to system performance, microprocessors perform it extremely poorly. Furthermore, the traditional network-processor approaches of multicore and multithreading map poorly to the problem because the list is a shared data structure. While match processing can be implemented directly in hardware, such implementations can be highly inflexible and carry very high risk. This paper presents a novel, programmable architecture for a processor that handles the matching function. The matching engine approaches the performance of a direct hardware implementation while maintaining a high degree of flexibility and programmability. More importantly, it requires dramatically less area than a conventional processor.
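
The list traversal the abstract describes can be illustrated with a minimal software sketch. It is not the paper's architecture: the names `PostedReceive`, `Header`, `PostedQueue`, and `matches` are hypothetical, the wildcard constants merely stand in for `MPI_ANY_SOURCE` and `MPI_ANY_TAG`, and the match criteria (communicator context, source rank, tag) are assumed from standard MPI semantics rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wildcard values standing in for MPI_ANY_SOURCE / MPI_ANY_TAG.
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass
class PostedReceive:
    """One entry in the posted-receive list (illustrative layout only)."""
    context: int                      # communicator context id
    source: int                       # expected sender rank, or ANY_SOURCE
    tag: int                          # expected tag, or ANY_TAG
    next: Optional["PostedReceive"] = None


@dataclass
class Header:
    """Match fields carried by an incoming message."""
    context: int
    source: int
    tag: int


def matches(recv: PostedReceive, hdr: Header) -> bool:
    """The per-item matching function: exact context, wildcardable source and tag."""
    return (recv.context == hdr.context
            and recv.source in (ANY_SOURCE, hdr.source)
            and recv.tag in (ANY_TAG, hdr.tag))


class PostedQueue:
    """Singly linked list of posted receives, kept in posting order."""

    def __init__(self) -> None:
        self.head: Optional[PostedReceive] = None
        self.tail: Optional[PostedReceive] = None

    def post(self, recv: PostedReceive) -> None:
        """Append a receive at the tail so older receives are searched first."""
        recv.next = None
        if self.tail is None:
            self.head = recv
        else:
            self.tail.next = recv
        self.tail = recv

    def match_and_remove(self, hdr: Header) -> Optional[PostedReceive]:
        """Walk the list from the head; unlink and return the first match."""
        prev = None
        node = self.head
        while node is not None:
            if matches(node, hdr):
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                if node is self.tail:
                    self.tail = prev
                node.next = None
                return node
            prev, node = node, node.next
        return None  # no posted receive matches; the message is unexpected
```

Because MPI requires the oldest matching receive to win, each incoming header has to be checked against the entries in order, and a successful match modifies the list. That serial dependence on one shared structure is one way to read the abstract's remark that multicore and multithreaded approaches map poorly to the problem.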