Three parallel programming paradigms: comparisons on an archetypal PDE computation

Authors:
M. Ehtesham Hayder;Constantinos S. Ierotheou;David E. Keyes
Affiliations:
Center for Research on High Performance Software, Rice University, Houston, TX;Parallel Processing Research Group, University of Greenwich, London SE18 6PF, UK;Computer Science Department, Old Dominion University and ICASE, Norfolk, VA
Venue:
Progress in computer research
Year:
2001

Abstract

Three paradigms for distributed-memory parallel computation that free the application programmer from the details of message passing are compared for an archetypal structured scientific computation -- a nonlinear, structured-grid partial differential equation boundary value problem -- using the same algorithm on the same hardware. All of the paradigms -- parallel languages represented by the Portland Group's HPF, (semi-)automated serial-to-parallel source-to-source translation represented by CAPTools from the University of Greenwich, and parallel libraries represented by Argonne's PETSc -- are found to be easy to use for this problem class, and all are reasonably effective in exploiting concurrency after a short learning curve. The level of involvement required of the application programmer under any paradigm includes specification of the data partitioning, corresponding to a geometrically simple decomposition of the domain of the PDE. Programming in SPMD style for the PETSc library requires writing only the routines that discretize the PDE and its Jacobian, managing subdomain-to-processor mappings (affine global-to-local index mappings), and interfacing to library solver routines. Programming for HPF requires a complete sequential implementation of the same algorithm as a starting point, introduction of concurrency through subdomain blocking (a task similar to the index mapping), and modest experimentation with rewriting loops to expose the latent concurrency to the compiler. Programming with CAPTools involves feeding the same sequential implementation to the CAPTools interactive parallelization system, and guiding the source-to-source code transformation by responding to various queries about quantities knowable only at runtime.
Results representative of "the state of the practice" for a scaled sequence of structured grid problems are given on three of the most important contemporary high-performance platforms: the IBM SP, the SGI Origin 2000, and the Cray T3E.
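
The "affine global-to-local index mappings" mentioned in the abstract can be illustrated with a minimal sketch for a one-dimensional block decomposition. The function names, the 1D setting, and the rule for distributing leftover points are assumptions made for illustration only; this is not code from the paper, nor the API of PETSc, HPF, or CAPTools.

```python
def block_range(n, nprocs, rank):
    """Half-open global index range [start, stop) owned by `rank` when
    n grid points are split into nprocs contiguous blocks.
    Assumption: the first (n % nprocs) ranks each get one extra point."""
    base, extra = divmod(n, nprocs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def global_to_local(g, start):
    """Affine map from a global index to the owning block's local index."""
    return g - start


def local_to_global(l, start):
    """Inverse affine map from a local index back to the global index."""
    return l + start


def owner(g, n, nprocs):
    """Rank that owns global index g under the same block layout."""
    base, extra = divmod(n, nprocs)
    cutoff = extra * (base + 1)  # first index past the larger blocks
    if g < cutoff:
        return g // (base + 1)
    return extra + (g - cutoff) // base


if __name__ == "__main__":
    n, nprocs = 10, 3
    for rank in range(nprocs):
        start, stop = block_range(n, nprocs, rank)
        print(rank, start, stop)
```

In two or three dimensions the same offset subtraction would apply independently along each coordinate direction, which is why a geometrically simple decomposition keeps the mapping affine.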