Scalability Analysis of Multidimensional Wavefront Algorithms on Large-Scale SMP Clusters

Authors:
Adolfy Hoisie;Olaf Lubeck;Harvey Wasserman
Venue:
FRONTIERS '99 Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation
Year:
1999

Citing 0
Cited 16

A general performance model for parallel sweeps on orthogonal grids for particle transport calculations

Proceedings of the 14th international conference on Supercomputing
Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements

IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
A General Predictive Performance Model for Wavefront Algorithms on Clusters of SMPs

ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Out-of-Core and Pipeline Techniques for Wavefront Algorithms

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Adaptive Parallel Job Scheduling with Flexible Coscheduling

IEEE Transactions on Parallel and Distributed Systems
How Well Can Simple Metrics Represent the Performance of HPC Applications?

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
A performance prediction framework for scientific applications

Future Generation Computer Systems
The Design and Implementation of a Domain-Specific Language for Network Performance Testing

IEEE Transactions on Parallel and Distributed Systems
A genetic algorithms approach to modeling the performance of memory-bound computations

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Implementation and performance modeling of deterministic particle transport (Sweep3D) on the IBM Cell/B.E.

Scientific Programming - High Performance Computing with the Cell Broadband Engine
A performance prediction framework for scientific applications

Future Generation Computer Systems
STAPL: an adaptive, generic parallel C++ library

LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
The reverse-acceleration model for programming petascale hybrid systems

IBM Journal of Research and Development
GPU accelerated simulations of 3D deterministic particle transport using discrete ordinates method

Journal of Computational Physics
Optimizing sweep3d for graphic processor unit

ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
Adapting wave-front algorithms to efficiently utilize systems with deep communication hierarchies

Parallel Computing

Abstract

We develop a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. We validate the model on three supercomputer systems, with up to 500 processors, using data from an ASCI deterministic particle transport application, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. We also use the model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be in existence within the next decade. Our model shows that on a 1-billion-cell problem, single-node computation speed (not inter-processor communication performance, as is widely believed) is the bottleneck. Finally, we present preliminary considerations that reveal the additional complexity associated with modeling wavefront algorithms on reduced-connectivity network topologies, such as clusters of SMPs.
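
The abstract does not give the model's equations, so the following is only an illustrative sketch of the general pipeline behavior of a 2-D wavefront, not the authors' model. It assumes a single sweep that starts at one corner of a px-by-py processor grid, that each processor handles the same number of equal-cost work blocks, and that a processor can start a block only after its two upstream neighbors and its own previous block have finished. All function and parameter names are hypothetical.

```python
def wavefront_steps(px: int, py: int, blocks: int) -> int:
    """Closed-form step count for one corner-started sweep (assumed toy model).

    The far-corner processor must wait (px - 1) + (py - 1) steps for the
    wavefront to reach it, then processes its own `blocks` work blocks.
    """
    if px < 1 or py < 1 or blocks < 1:
        raise ValueError("px, py and blocks must all be positive")
    return blocks + (px - 1) + (py - 1)


def simulate_steps(px: int, py: int, blocks: int) -> int:
    """Step count obtained by explicitly following the dependencies.

    Block k on processor (i, j) waits for block k on the west and north
    neighbors and for block k - 1 on the same processor; each block
    takes one unit step.
    """
    finish = {}
    for k in range(blocks):
        for i in range(px):
            for j in range(py):
                ready = max(
                    finish.get((i - 1, j, k), 0),  # west neighbor, same block
                    finish.get((i, j - 1, k), 0),  # north neighbor, same block
                    finish.get((i, j, k - 1), 0),  # own previous block
                )
                finish[(i, j, k)] = ready + 1
    return finish[(px - 1, py - 1, blocks - 1)]


def pipeline_efficiency(px: int, py: int, blocks: int) -> float:
    """Fraction of steps in which the far-corner processor does useful work."""
    return blocks / wavefront_steps(px, py, blocks)


if __name__ == "__main__":
    for px, py, blocks in [(4, 4, 10), (20, 25, 100)]:
        steps = wavefront_steps(px, py, blocks)
        assert steps == simulate_steps(px, py, blocks)
        print(px, py, blocks, steps, round(pipeline_efficiency(px, py, blocks), 3))
```

Under these assumptions the pipeline fill cost grows with px + py while the useful work per processor stays fixed, which is one reason the relative weight of computation and communication has to be modeled rather than guessed. A fuller model would also attach separate per-step costs to computation and to messages.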