Parallel ADI solver based on processor scheduling

Authors:
Alex Povitsky
Affiliations:
Department of Mechanical Engineering, Concordia University, 1455 de Maisonneuve Blvd, H-549, Montreal, Quebec, Canada H3G 1M8
Venue:
Applied Mathematics and Computation
Year:
2002

Abstract

Gaussian elimination is used for the direct solution of the banded linear systems that typically arise in implicit numerical methods for PDEs. For narrow-banded systems, Gaussian elimination, also known as the Thomas algorithm (TA), consists of forward and backward recurrences along lines of the numerical grid. Multi-domain decomposition, essential for parallelizing implicit solvers, spreads these recurrences across processors in one or more directions. Processor idle time and inter-processor communication time are two interdependent causes of the poor parallelization efficiency of the TA. In this research, an efficient parallel algorithm for 3D directionally split problems is developed. The proposed solver is based on static scheduling of processors, in which local and non-local, data-dependent and data-independent computations are scheduled for the periods when processors would otherwise be idle. The proposed algorithm uses a reformulated version of the pipelined Thomas algorithm that starts the backward-step computations immediately after the forward-step computations are completed for the first portion of lines. As a result, data are available for other computational tasks while processors are idle from the TA. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty, and to obtain an optimal cover of the global domain with subdomains. Computational experiments and the theoretical model both show that the proposed algorithm considerably reduces communication cost and processor idle time relative to the basic algorithm over the ranges of processor (subdomain) counts and grid nodes per subdomain considered.
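
For readers unfamiliar with the forward and backward recurrences mentioned in the abstract, the following is a minimal serial sketch of the standard Thomas algorithm for a single tridiagonal line. It is not the paper's parallel, pipelined, or scheduled solver. The function name `thomas_solve` and the argument layout are assumptions made here for illustration only.

```python
def thomas_solve(a, b, c, d):
    """Solve one tridiagonal system with the textbook Thomas algorithm.

    Illustrative serial sketch only (hypothetical helper, not the paper's code).
    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    Assumes the system needs no pivoting (e.g. it is diagonally dominant).
    """
    n = len(d)
    cp = [0.0] * n  # modified super-diagonal coefficients
    dp = [0.0] * n  # modified right-hand side

    # Forward recurrence: eliminate the sub-diagonal, sweeping along the line.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Backward recurrence: back-substitute in the opposite direction.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Usage: a 4-point system with diagonals (-1, 2, -1) whose solution is 1, 2, 3, 4.
solution = thomas_solve(
    [0.0, -1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 5.0],
)
```

Each step of either sweep depends on the result of the previous step. When a grid line is split across subdomains, a processor therefore cannot start its part of a sweep until its neighbor has finished, which is the source of the idle time and communication cost the abstract describes.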