The Earth and space sciences community faces rich computational challenges ranging from static, regular, and embarrassingly parallel to dynamic, unstructured, and tightly coupled. This problem domain requires highly scalable systems exhibiting broad generality, efficiency, and programmability. These capabilities are appearing in the emerging scalable shared memory cache-coherent architectures like that of the HP/Convex Exemplar SPP-1000. The goal of this class of architecture is to make scientific programming as easy and efficient as it is on vector supercomputers. The authors describe the Exemplar's architecture, whose global system organization comprises up to 16 multiprocessors interconnected by four SCI (Scalable Coherent Interface) ring networks. They then present the findings from four applications: the piecewise parabolic method, a finite-element method for unstructured meshes, a tree code for the n-body problem, and a particle-in-cell code. The authors present application performance data derived after the Exemplar at Goddard Space Flight Center went into production use. These studies expose the operational properties of the Exemplar and determine its suitability for Earth and space sciences applications. The testing reveals that global cache coherence can be used effectively to simplify programming and data migration. However, the basic problem of locality sensitivity still demands direct programmer involvement to achieve effective system behavior. The question of whether message-passing or shared memory programming models are better remains open.