NIC-based reduction algorithms for large-scale clusters

  • Authors:
  • Fabrizio Petrini; Adam Moody; Juan Fernandez; Eitan Frachtenberg; Dhabaleswar K. Panda

  • Affiliations:
  • Applied Computer Science Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA; Integrated Computing and Communications Department, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA; Computer Engineering Department, University of Murcia, 30071 Murcia, Spain; Computer and Computational Sciences (CCS) Division, Los Alamos National Laboratory, NM 87545, USA; Department of Computer and Information Science, The Ohio State University, Columbus, OH 43210, USA

  • Venue:
  • International Journal of High Performance Computing and Networking
  • Year:
  • 2006

Abstract

Efficient reduction algorithms are crucial to many large-scale, parallel scientific applications. While previous algorithms constrain processing to the host CPU, we explore and utilise the processors in modern cluster Network Interface Cards (NICs). We present the design issues, solutions, analytical models, and experimental evaluations of a family of NIC-based reduction algorithms. Through experiments on the ALC cluster at Lawrence Livermore National Laboratory, which connects 960 dual-CPU nodes with the Quadrics QsNet interconnect, we find NIC-based reductions to be more efficient than host-based implementations. At large scale, our NIC-based reductions are more than twice as fast as the host-based, production-level MPI implementation.