Improving Application Performance on the HP/Convex Exemplar

  • Authors:
  • Thomas Sterling;Phillip Merkey;Daniel Savarese

  • Venue:
  • Computer
  • Year:
  • 1996

Quantified Score

Hi-index 4.10

Abstract

The Earth and space sciences community faces rich computational challenges ranging from static, regular, and embarrassingly parallel to dynamic, unstructured, and tightly coupled. This problem domain requires highly scalable systems exhibiting broad generality, efficiency, and programmability. These capabilities are appearing in emerging scalable, cache-coherent shared-memory architectures such as that of the HP/Convex Exemplar SPP-1000. The goal of this class of architecture is to make scientific programming as easy and efficient as it is on vector supercomputers. The authors describe the Exemplar's architecture, whose global system organization comprises up to 16 multiprocessors interconnected by four SCI (Scalable Coherent Interface) ring networks. They then present findings from four applications: the piecewise parabolic method, a finite-element method for unstructured meshes, a tree code for the n-body problem, and a particle-in-cell code. The application performance data were derived after the Exemplar at Goddard Space Flight Center went into production use. These studies expose the operational properties of the Exemplar and determine its suitability for Earth and space sciences applications. The testing reveals that global cache coherence can be used effectively to simplify programming and data migration. However, the basic problem of locality sensitivity still demands direct programmer involvement to achieve effective system behavior. Whether the message-passing or the shared-memory programming model is better remains an open question.