Numerical Linear Algebra for High Performance Computers

Authors:
Jack J. Dongarra;Iain S. Duff;Danny C. Sorensen;Henk A. van der Vorst
Affiliations:
-;-;-;-
Venue:
Numerical Linear Algebra for High Performance Computers
Year:
1998

Citing 0
Cited 108

Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
How to vectorize the algebraic multilevel iteration

ACM Transactions on Mathematical Software (TOMS) - Special issue in honor of John Rice's 65th birthday
Landing CG on EARTH: a case study of fine-grained multithreading on an evolutionary path

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
A recursive formulation of Cholesky factorization of a matrix in packed storage

ACM Transactions on Mathematical Software (TOMS)
A numerical method to simulate radio-frequency plasma discharges

Journal of Computational Physics
Parallel two level block ILU Preconditioning techniques for solving large sparse linear systems

Parallel Computing
Numerical experiments to optimize the use of (I)LU preconditioning in the iterative linear solver package LINSOL

Applied Numerical Mathematics - Developments and trends in iterative methods for large systems of equations—in memoriam Rüdiger Weiss
Estimation of VAR Models: Computational Aspects

Computational Economics
Preconditioning techniques for large linear systems: a survey

Journal of Computational Physics
ParIC: A Family of Parallel Incomplete Cholesky Preconditioners

HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Object-Oriented Approach to Finite Element Modeling on Clusters

PARA '00 Proceedings of the 5th International Workshop on Applied Parallel Computing, New Paradigms for HPC in Industry and Academia
Parallel Displacement Decomposition Solvers for Elasticity Problems

PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
An Efficient Parallel Linear Solver with a Cascadic Conjugate Gradient Method: Experience with Reality

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Parallel Algorithm for Fast Cloth Simulation

VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Recursive Version of LU Decomposition

NAA '00 Revised Papers from the Second International Conference on Numerical Analysis and Its Applications
Parallel Computation of Pseudospectra Using Transfer Functions on a MATLAB-MPI Cluster Platform

Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Fast Cloth Simulation with Parallel Computers

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

Parallel Computing
Two numerical methods for an inverse problem for the 2-D Helmholtz equation

Journal of Computational Physics
Massive data set issues in air pollution modelling

Handbook of massive data sets
Multigrain Parallelism for Eigenvalue Computations on Networks of Clusters

HPDC '02 Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing
How to vectorize the algebraic multi-level iteration

Computational science, mathematics and software
An efficient 3D grid based scheduling for heterogeneous systems

Journal of Parallel and Distributed Computing - Special section best papers from the 2002 international parallel and distributed processing symposium
Parallel, multigrain iterative solvers for hiding network latencies on MPPs and networks of clusters

Parallel Computing - Parallel matrix algorithms and applications (PMAA '02)
Local supercomputing training in the computational sciences using remote national centers

Future Generation Computer Systems - Special issue: Selected papers from the workshop on education in computational sciences held at the ICCS 2002
Parallel scheduling of the PCG method for banded matrices rising from FDM/FEM

Journal of Parallel and Distributed Computing
Adapting a parallel sparse direct solver to architectures with clusters of SMPs

Parallel Computing - Special issue: Parallel and distributed scientific and engineering computing
Self-adapting software for numerical linear algebra and LAPACK for clusters

Parallel Computing - Special issue: Parallel and distributed scientific and engineering computing
A portable parallel implementation of a boundary element elastostatic code for shared and distributed memory systems

Advances in Engineering Software
Parallel Iterative Solvers of GeoFEM with Selective Blocking Preconditioning for Nonlinear Contact Problems on the Earth Simulator

Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Evolving interfaces via gradients of geometry-dependent interior Poisson problems: application to tumor growth

Journal of Computational Physics
Approaches Based on Permutations for Partitioning Sparse Matrices on Multiprocessors

The Journal of Supercomputing
Multilevel preconditioned iterative eigensolvers for Maxwell eigenvalue problems

Applied Numerical Mathematics - 6th IMACS International symposium on iterative methods in scientific computing
LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Hybrid scheduling for the parallel solution of linear systems

Parallel Computing - Parallel matrix algorithms and applications (PMAA'04)
The design and implementation of the MRRR algorithm

ACM Transactions on Mathematical Software (TOMS)
A new compact scheme for parallel computing using domain decomposition

Journal of Computational Physics
BiCGStab, VPAStab and an adaptation to mildly nonlinear systems

Journal of Computational and Applied Mathematics
A numerical evaluation of sparse direct solvers for the solution of large sparse symmetric linear systems of equations

ACM Transactions on Mathematical Software (TOMS)
Parallel implementation of efficient preconditioned linear solver for grid-based applications in chemical physics: II: QMR linear solver

Journal of Computational Physics
Parallel preconditioned conjugate gradient square method based on normalized approximate inverses

Scientific Programming - International Symposium of Parallel and Distributed Computing & International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogenous Networks
Experiences of sparse direct symmetric solvers

ACM Transactions on Mathematical Software (TOMS)
An operation stacking framework for large ensemble computations

Proceedings of the 21st annual international conference on Supercomputing
The Schur aggregation for solving linear systems of equations

Proceedings of the 2007 international workshop on Symbolic-numeric computation
On the design of interfaces to sparse direct solvers

ACM Transactions on Mathematical Software (TOMS)
High performance BLAS formulation of the multipole-to-local operator in the fast multipole method

Journal of Computational Physics
Scheduling malleable tasks with interdependent processing rates: Comments and observations

Discrete Applied Mathematics
Additive preconditioning and aggregation in matrix computations

Computers & Mathematics with Applications
Cache efficient bidiagonalization using BLAS 2.5 operators

ACM Transactions on Mathematical Software (TOMS)
Displacement decomposition and parallelisation of the PCG method for elasticity problems

International Journal of Computational Science and Engineering
Performance analysis of distributed iterative linear solvers

MMACTE'05 Proceedings of the 7th WSEAS International Conference on Mathematical Methods and Computational Techniques In Electrical Engineering
Benchmarking GPUs to tune dense linear algebra

Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Parallel Approximate Finite Element Inverses on Symmetric Multiprocessor Systems

ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the Synergistic Processing Element of the CELL Processor

ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
OpenMP based parallel normalized direct methods for sparse finite element linear systems

The Journal of Supercomputing
A class of parallel tiled linear algebra algorithms for multicore architectures

Parallel Computing
Distributed SBP Cholesky factorization algorithms with near-optimal scheduling

ACM Transactions on Mathematical Software (TOMS)
Improving the Performance of a Verified Linear System Solver Using Optimized Libraries and Parallel Computation

High Performance Computing for Computational Science - VECPAR 2008
QR factorization for the Cell Broadband Engine

Scientific Programming - High Performance Computing with the Cell Broadband Engine
Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor

Parallel Computing
Accelerating linpack with CUDA on heterogenous clusters

Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units
Direct and iterative solution of the generalized Dirichlet-Neumann map for elliptic PDEs on square domains

Journal of Computational and Applied Mathematics
A comparison of projective and direct solvers for finite elements in elastostatics

Advances in Engineering Software
VPAStab(J,L): An iterative method with look-ahead for the solution of large sparse linear systems

Journal of Computational and Applied Mathematics
Fast Implicit Simulation of Oscillatory Flow in Human Abdominal Bifurcation Using a Schur Complement Preconditioner

Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Low cost high performance uncertainty quantification

Proceedings of the 2nd Workshop on High Performance Computational Finance
A study on quaternion block quasi-tridiagonal systems

Computers & Mathematics with Applications
Rectangular full packed format for Cholesky's algorithm: factorization, solution, and inversion

ACM Transactions on Mathematical Software (TOMS)
Evaluation of linear solvers for astrophysics transfer problems

VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
Application of the multi-level parallelism (MLP) software to a finite element groundwater program using iterative solvers with comparison to MPI

ICCS'03 Proceedings of the 2003 international conference on Computational science: Part III
Performance evaluation of parallel Gram-Schmidt re-orthogonalization methods

VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Operation Stacking for Ensemble Computations With Variable Convergence

International Journal of High Performance Computing Applications
Prospectus for the next LAPACK and ScaLAPACK libraries

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Implementing linear algebra routines on multi-core processors with pipelining and a look ahead

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Cholesky factorization of band matrices using multithreaded BLAS

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
A new domain decomposition approach suited for grid computing

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Performance evaluation of basic linear algebra subroutines on a matrix co-processor

PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Additive preconditioning for matrix computations

CSR'08 Proceedings of the 3rd international conference on Computer science: theory and applications
Comparison study of performance of parallel steady state solver on different computer architectures

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Parallel Dichotomy Algorithm for solving tridiagonal system of linear equations with multiple right-hand sides

Parallel Computing
Polynomial homotopies on multicore workstations

Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
Direct multi-grid methods for linear systems with harmonic aliasing patterns

IEEE Transactions on Signal Processing
Algebraic and numerical algorithms

Algorithms and theory of computation handbook
On the performance of parallel normalized explicit preconditioned conjugate gradient type methods

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Partitioned Triangular Tridiagonalization

ACM Transactions on Mathematical Software (TOMS)
High-performance modeling acoustic and elastic waves using the parallel Dichotomy Algorithm

Journal of Computational Physics
Parallel Hybrid Preconditioning: Incomplete Factorization with Selective Sparse Approximate Inversion

SIAM Journal on Scientific Computing
An error correction solver for linear systems: evaluation of mixed precision implementations

VECPAR'10 Proceedings of the 9th international conference on High performance computing for computational science
A block IDR(s) method for nonsymmetric linear systems with multiple right-hand sides

Journal of Computational and Applied Mathematics
Designing and dynamically load balancing hybrid LU for multi/many-core

Computer Science - Research and Development
Two implementations of the preconditioned conjugate gradient method on heterogeneous computing grids

International Journal of Applied Mathematics and Computer Science - Computational Intelligence in Modern Control Systems
Schur Complement Preconditioners for Surface Integral-Equation Formulations of Dielectric Problems Solved with the Multilevel Fast Multipole Algorithm

SIAM Journal on Scientific Computing
Parallel exact and approximate arrow-type inverses on symmetric multiprocessor systems

ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part I
Sparse systems solving on GPUs with GMRES

The Journal of Supercomputing
Design and implementation of parallelized Cholesky factorization

HPCA'09 Proceedings of the Second international conference on High Performance Computing and Applications
Portable and scalable FPGA-based acceleration of a direct linear system solver

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Parallelising matrix operations on clusters for an optimal control-based quantum compiler

Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
JuliusC: a practical approach for the analysis of divide-and-conquer algorithms

LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Performance evaluation of a parallel algorithm for a radiative transfer problem

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Vectorized sparse matrix multiply for compressed row storage format

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
A static parallel multifrontal solver for finite element meshes

ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications
Mixed precision iterative refinement methods for linear systems: convergence analysis based on Krylov subspace methods

PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2
Low-cost data uncertainty quantification

Concurrency and Computation: Practice & Experience
Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures

Concurrency and Computation: Practice & Experience
Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modeling

PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors

Computer Science - Research and Development
Simultaneous computation of the row and column rank profiles

Proceedings of the 38th international symposium on symbolic and algebraic computation

Quantified Score

Hi-index	0.04

Abstract

From the Publisher: This book presents a unified treatment of recently developed techniques and current understanding about solving systems of linear equations and large-scale eigenvalue problems on high-performance computers. It provides a rapid introduction to the world of vector and parallel processing for these linear algebra applications. Topics include major elements of advanced-architecture computers and their performance, recent algorithmic development, and software for direct solution of dense matrix problems, direct solution of sparse systems of equations, iterative solution of sparse systems of equations, and solution of large sparse eigenvalue problems. This book supersedes the SIAM publication Solving Linear Systems on Vector and Shared Memory Computers, which appeared in 1990. The new book includes a considerable amount of new material in addition to incorporating a substantial revision of existing text.

About the Authors: Jack J. Dongarra is a Distinguished Professor of Computer Science at the University of Tennessee and a Distinguished Scientist at Oak Ridge National Laboratory. Iain S. Duff is Group Leader of Numerical Analysis at the CCLRC Rutherford Appleton Laboratory, the Project Leader for the Parallel Algorithms Group at CERFACS in Toulouse, and a Visiting Professor of Mathematics at the University of Strathclyde. Danny C. Sorensen is a Professor of Computational and Applied Mathematics at Rice University. Henk A. van der Vorst is a Professor in Numerical Analysis at Utrecht University in the Netherlands.