Performance analysis and prediction for distributed homogeneous clusters

  • Authors:
  • Heinz Kredel; Hans Günther Kruse; Sabine Richling; Erich Strohmaier

  • Affiliations:
  • IT-Center, University of Mannheim, Mannheim, Germany; IT-Center, University of Mannheim, Mannheim, Germany; IT-Center, University of Heidelberg, Heidelberg, Germany; Future Technology Group, Lawrence Berkeley National Laboratory, Berkeley, USA

  • Venue:
  • Computer Science - Research and Development
  • Year:
  • 2013

Abstract

We present a new performance model, based on the roofline concept, for the analysis and performance prediction of distributed computing clusters. The background for our performance modeling is the 28 km InfiniBand interconnection between two bwGRiD clusters, each consisting of 140 compute nodes, which is in day-to-day production use. The model is used to analyze the MPI performance of intra-cluster communication compared to inter-cluster communication. We compare the new modeling results to our earlier stochastic model (Richling et al. in Proc. of 3PGCIC-2010. IEEE, New York 2010), with which we could estimate the bandwidth required to double the performance of an application (LinPack as the simplest example). We also derive bounds for the size of regions in a cluster and for the scaling of the maximal speed-up in the region-to-region interconnected network.
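
For readers unfamiliar with the roofline concept the abstract refers to, the following is a minimal sketch of the generic roofline bound only: attainable performance is the smaller of the compute peak and the product of bandwidth and operational intensity. It is not the model developed in the paper. The function names and all numbers are hypothetical, chosen only to show how a slower link (for example, an inter-cluster connection) lowers the bound for a communication-heavy code.

```python
def roofline_bound(peak_flops, bandwidth_bytes_per_s, intensity_flops_per_byte):
    """Generic roofline bound: performance is capped either by the
    compute peak or by data movement (bandwidth * operational intensity)."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity_flops_per_byte)


def ridge_point(peak_flops, bandwidth_bytes_per_s):
    """Operational intensity (flop/byte) above which the compute peak,
    not the bandwidth, limits performance."""
    return peak_flops / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical values, NOT measurements from the paper.
    peak = 1e12          # flop/s
    fast_link = 1e10     # byte/s, stands in for a fast local link
    slow_link = 1e9      # byte/s, stands in for a slower long-distance link
    intensity = 10.0     # flop per byte moved

    print("bound, fast link:", roofline_bound(peak, fast_link, intensity))
    print("bound, slow link:", roofline_bound(peak, slow_link, intensity))
    print("ridge point, fast link:", ridge_point(peak, fast_link))
```

In this sketch the same code is bandwidth-bound on both links, and the bound scales with the link bandwidth until the operational intensity exceeds the ridge point.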