Quantifying the potential benefit of overlapping communication and computation in large-scale scientific applications

  • Authors:
  • José Carlos Sancho;Kevin J. Barker;Darren J. Kerbyson;Kei Davis

  • Affiliations:
  • Performance and Architecture Laboratory (PAL), Los Alamos National Laboratory, NM;Performance and Architecture Laboratory (PAL), Los Alamos National Laboratory, NM;Performance and Architecture Laboratory (PAL), Los Alamos National Laboratory, NM;Performance and Architecture Laboratory (PAL), Los Alamos National Laboratory, NM

  • Venue:
  • Proceedings of the 2006 ACM/IEEE conference on Supercomputing
  • Year:
  • 2006

Abstract

The design and implementation of a high-performance communication network are critical factors in determining the performance and cost-effectiveness of a large-scale computing system. The major issues center on the trade-off between network cost and the impact of latency and bandwidth on application performance. One promising technique for extracting maximum application performance from limited network resources is to overlap computation with communication, which partially or entirely hides communication delays. While this approach is not new, few studies quantify the potential benefit of such overlap for large-scale production scientific codes. We address this gap with an empirical method combined with a network model to quantify the potential overlap in several codes and examine the possible performance benefit. Our results demonstrate, for the codes examined, a high potential tolerance to network latency and bandwidth, owing to a high degree of potential overlap. Moreover, our results indicate that fine-grained communication mechanisms are often unnecessary to achieve this benefit, since the major source of potential overlap is independent work, that is, computation on which pending messages do not depend. This allows a potentially significant relaxation of network requirements without a consequent degradation of application performance.
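The idea of hiding communication behind independent work can be illustrated with a minimal sketch. This is not the network model used in the paper; it assumes a simple latency-plus-bandwidth cost for one message and assumes that each iteration splits into independent work (which can proceed while the message is in flight) and dependent work (which must wait for it). The function names and all parameter values are hypothetical.

```python
def comm_time(latency_s, bandwidth_bytes_per_s, message_bytes):
    """Assumed cost of one message: fixed latency plus size over bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def iteration_time(independent_s, dependent_s, comm_s, overlap=True):
    """Assumed time for one iteration.

    With overlap, the message travels while the independent work runs, so
    only the longer of the two is paid before the dependent work starts.
    Without overlap, all three phases run back to back.
    """
    if overlap:
        return max(independent_s, comm_s) + dependent_s
    return independent_s + comm_s + dependent_s


if __name__ == "__main__":
    # Hypothetical values: 4 ms independent work, 1 ms dependent work, 1 MB message.
    independent_s, dependent_s, message_bytes = 4e-3, 1e-3, 1_000_000

    # Compare a faster and a slower hypothetical network.
    for label, latency_s, bandwidth in [("fast", 1e-6, 1e9), ("slow", 1e-5, 1e8)]:
        c = comm_time(latency_s, bandwidth, message_bytes)
        blocking = iteration_time(independent_s, dependent_s, c, overlap=False)
        overlapped = iteration_time(independent_s, dependent_s, c, overlap=True)
        print(f"{label}: comm={c:.6f}s blocking={blocking:.6f}s overlapped={overlapped:.6f}s")
```

Under these assumptions, communication is fully hidden whenever it takes no longer than the independent work, so the overlapped iteration time is insensitive to network latency and bandwidth until that threshold is crossed.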