A contention-aware performance model for HPC-based networks: a case study of the InfiniBand network

  • Authors:
  • Maxime Martinasso;Jean-François Méhaut

  • Affiliations:
  • University of Grenoble, Computer Science Laboratory LIG, Grenoble, France;University of Grenoble, Computer Science Laboratory LIG, Grenoble, France

  • Venue:
  • Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Multi-core clusters are cost-effective clusters largely used in high-performance computing. Parallel applications using message passing as a communication mechanism may introduce complex communication behaviours on such clusters. By sending and receiving data simultaneously from and to several nodes, parallel applications create concurrent accesses to the resources of the network. In this paper, we present a general model that expresses network resource sharing characterised by a dynamic contention graph. The model is based on a linear system weighted by bandwidth distribution factors called penalty coefficients that are specific to a network technology. We propose a method to solve the linear system and present an analysis to determine penalty coefficients on InfiniBand technology. We use complex network conflicts to assess the ability of the model to predict with low errors.