Minimizing synchronizations in sparse iterative solvers for distributed supercomputers

Authors:
Sheng-Xin Zhu;Tong-Xiang Gu;Xing-Ping Liu
Venue:
Computers & Mathematics with Applications
Year:
2014


Abstract

Eliminating synchronizations is an important technique for minimizing communication in modern high-performance computing. This paper discusses principles for reducing the communication caused by global synchronizations in sparse iterative solvers on distributed supercomputers. We demonstrate how to minimize global synchronizations by rescheduling a typical Krylov subspace method. The benefit of minimizing synchronizations is shown by theoretical analysis and verified by numerical experiments. The experiments also show that, on the underlying supercomputers, the local communications for some structured sparse matrix-vector multiplications and the global communications grow on the order of $P^{1/2.5}$ and $P^{4/5}$, respectively, where $P$ is the number of processors.
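
To illustrate the general idea of removing global synchronizations, the sketch below contrasts reducing several inner products one at a time with packing their local partial sums and reducing them once. It is a toy simulation in plain Python, not the rescheduled method from the paper: the "ranks" are slices of a list, and `CountingComm`, `allreduce_sum`, `dots_unfused`, and `dots_fused` are hypothetical names introduced here. On a real machine the reduction would be a collective such as MPI's all-reduce.

```python
import random


class CountingComm:
    """Toy stand-in for a communicator (hypothetical); counts global reductions."""

    def __init__(self):
        self.reductions = 0

    def allreduce_sum(self, per_rank_values):
        # per_rank_values holds one equal-length list of partial sums per rank.
        # Each call models one global synchronization point.
        self.reductions += 1
        return [sum(column) for column in zip(*per_rank_values)]


def split(vector, nranks):
    """Distribute a vector over simulated ranks in contiguous blocks."""
    size = (len(vector) + nranks - 1) // nranks
    return [vector[r * size:(r + 1) * size] for r in range(nranks)]


def local_dot(a, b):
    """Inner product of the locally owned pieces; needs no communication."""
    return sum(x * y for x, y in zip(a, b))


def dots_unfused(comm, pairs, nranks):
    """One global reduction per inner product."""
    results = []
    for a, b in pairs:
        partials = [[local_dot(pa, pb)]
                    for pa, pb in zip(split(a, nranks), split(b, nranks))]
        results.append(comm.allreduce_sum(partials)[0])
    return results


def dots_fused(comm, pairs, nranks):
    """Pack every local partial inner product, then reduce once."""
    pieces = [(split(a, nranks), split(b, nranks)) for a, b in pairs]
    partials = [[local_dot(pa[r], pb[r]) for pa, pb in pieces]
                for r in range(nranks)]
    return comm.allreduce_sum(partials)


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, z = ([rng.random() for _ in range(1000)] for _ in range(3))
    pairs = [(x, y), (y, z), (x, z)]
    for fn in (dots_unfused, dots_fused):
        comm = CountingComm()
        fn(comm, pairs, nranks=4)
        print(fn.__name__, "global reductions:", comm.reductions)
```

Fusing is only possible when the inner products do not depend on one another's results, which is why a solver generally has to be rescheduled before its reductions can be merged.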