Using Accurate Arithmetics to Improve Numerical Reproducibility and Stability in Parallel Applications

Authors:
Yun He; Chris H. Q. Ding
Affiliations:
NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA (YHe@lbl.gov); NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA (CHQDing@lbl.gov)
Venue:
The Journal of Supercomputing
Year:
2001



Abstract

Numerical reproducibility and stability of large-scale scientific simulations, especially climate modeling, on distributed-memory parallel computers are becoming critical issues. In particular, global summation of distributed arrays is most susceptible to rounding errors, whose propagation and accumulation cause uncertainty in the final simulation results. We analyzed several accurate summation methods and found two to be particularly effective at improving (or ensuring) reproducibility and stability: Kahan's self-compensated summation and Bailey's double-double precision summation. We provide an MPI operator, MPI_SUMDD, that works with MPI collective operations to ensure a scalable implementation on a large number of processors. The final methods are simple to adopt in practical codes and apply not only to global summations but also to vector-vector dot products and matrix-vector or matrix-matrix operations.
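
The two summation methods named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (naive_sum, kahan_sum, two_sum, dd_add, dd_sum, reduce_dd) are hypothetical, the error-free addition step uses Knuth's two-sum, and the combine step that an operator such as MPI_SUMDD would perform inside an MPI collective is only simulated by a plain loop over per-process partial sums, with no real MPI calls. The example values are chosen so that the order-dependence of ordinary summation is visible by hand.

```python
def naive_sum(values):
    """Plain left-to-right floating-point summation (explicit loop on purpose)."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan compensated summation: carry the rounding error of each add forward."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - comp            # apply the correction to the incoming term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total


def two_sum(a, b):
    """Knuth's error-free addition: s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double values, each stored as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)  # renormalize so that hi carries the leading bits


def dd_sum(values):
    """Local (per-process) sum accumulated in double-double precision."""
    acc = (0.0, 0.0)
    for x in values:
        acc = dd_add(acc, (x, 0.0))
    return acc


def reduce_dd(partials):
    """Stand-in for a collective reduction: combine per-process (hi, lo) sums."""
    acc = (0.0, 0.0)
    for p in partials:
        acc = dd_add(acc, p)
    return acc


if __name__ == "__main__":
    data = [1e16, 1.0, 1.0, -1e16]  # exact sum is 2.0
    print(naive_sum(data))  # 0.0: both 1.0 terms are rounded away
    print(kahan_sum(data))  # 2.0
    # Two hypothetical ways of distributing the same data over two processes.
    split_a = [[1e16, 1.0], [1.0, -1e16]]
    split_b = [[1e16, -1e16], [1.0, 1.0]]
    for split in (split_a, split_b):
        naive = naive_sum([naive_sum(part) for part in split])
        dd = reduce_dd([dd_sum(part) for part in split])
        print(naive, dd[0])  # naive: 0.0 then 2.0; double-double: 2.0 both times
```

In this small case the ordinary sum changes with the data distribution while the double-double result does not. Double-double arithmetic carries roughly twice the precision of a double rather than exact arithmetic, so the sketch shows a sharp reduction in order sensitivity, not a proof of bitwise reproducibility for arbitrary inputs.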