Solving Linear Systems on Vector and Shared Memory Computers

Authors:
Jack J. Dongarra;Iain S. Duff;Danny C. Sorensen;Henk Van Der Vorst
Affiliations:
-;-;-;-
Venue:
Solving Linear Systems on Vector and Shared Memory Computers
Year:
1990

Citing 0
Cited 78

Performance of various computers using standard linear equations software

ACM SIGARCH Computer Architecture News
Automatic software cache coherence through vectorization

ICS '92 Proceedings of the 6th international conference on Supercomputing
Compiler blockability of numerical algorithms

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Vectorized algorithm for B-spline curve fitting on Cray X-MP EA/16se

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Sparse matrix methods for chemical process separation calculations on supercomputers

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Electromagnetic scattering calculations on the Intel Touchstone Delta

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Experience with fine-grain synchronization in MIMD machines for preconditioned conjugate gradient

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Compilation techniques for sparse matrix computations

ICS '93 Proceedings of the 7th international conference on Supercomputing
Distributed data access in AC

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The design of a new frontal code for solving sparse, unsymmetric systems

ACM Transactions on Mathematical Software (TOMS)
Automatic Data Structure Selection and Transformation for Sparse Matrix Computations

IEEE Transactions on Parallel and Distributed Systems
A performance model for Krylov subspace methods on mesh-based parallel computers

Parallel Computing
Compiler blockability of dense matrix factorizations

ACM Transactions on Mathematical Software (TOMS)
Recursion leads to automatic variable blocking for dense linear-algebra algorithms

IBM Journal of Research and Development
The automatic generation of sparse primitives

ACM Transactions on Mathematical Software (TOMS)
Parallel Performance Analysis of the Improved Quasi-Minimal Residual Method on Bulk Synchronous Parallel Architectures

The Journal of Supercomputing
Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications

IEEE Transactions on Parallel and Distributed Systems
High performance computing with the Array package for Java: a case study using data mining

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Design and evaluation of a linear algebra package for Java

Proceedings of the ACM 2000 conference on Java Grande
From flop to megaflops: Java for technical computing

ACM Transactions on Programming Languages and Systems (TOPLAS)
The NINJA project

Communications of the ACM
FLAME: Formal Linear Algebra Methods Environment

ACM Transactions on Mathematical Software (TOMS)
Automatic intra-register vectorization for the Intel architecture

International Journal of Parallel Programming
Block Red-Black Ordering: A New Ordering Strategy for Parallelization of ICCG Method

International Journal of Parallel Programming
Linear Algebra Libraries for High-Performance Computers: A Personal Perspective

IEEE Parallel & Distributed Technology: Systems & Technology
Performance Considerations of Shared Virtual Memory Machines

IEEE Transactions on Parallel and Distributed Systems
On the numerical evaluation of linear recurrences

Journal of Computational and Applied Mathematics
The Combined Effectiveness of Unimodular Transformations, Tiling, and Software Prefetching

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
The Improved BiCG Method for Large and Sparse Linear Systems on Parallel Distributed Memory Architectures

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
From Flop to MegaFlops: Java for Technical Computing

LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
High Performance Numerical Computing in Java: Language and Compiler Issues

LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
A New Message Passing Algorithm for Solving Linear Recurrence Systems

PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
Data Flow Computing and the Conjugate Gradient Method

PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Parallel Pivots LU Algorithm on the Cray T3E

ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice

DSN '01 Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS)
A Performance Study on a Single Processing Node of the HITACHI SR8000

NAA '00 Revised Papers from the Second International Conference on Numerical Analysis and Its Applications
Parallel execution time analysis for least squares problems on distributed memory architectures

Practical parallel computing
A parallel finite element program on a Beowulf cluster

Advances in Engineering Software - Engineering computational technology
Fault tolerant matrix operations using checksum and reverse computation

FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
Performance Improvement for Matrix Calculation on CP-PACS Node Processor

HPC-ASIA '97 Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
High Performance Fortran and Possible Extensions to Support Conjugate Gradient Algorithms

HPDC '96 Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
Algorithm-Based Diskless Checkpointing for Fault-Tolerant Matrix Operations

FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
The Implicit Pipeline Method

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Matrix bidiagonalization: implementation and evaluation on the Trident processor

Neural, Parallel & Scientific Computations
Parallel scheduling of the PCG method for banded matrices rising from FDM/FEM

Journal of Parallel and Distributed Computing
Parallel MCGLS and ICGLS Methods for Least Squares Problems on Distributed Memory Architectures

The Journal of Supercomputing
Java programming for high-performance numerical computing

IBM Systems Journal
Representing linear algebra algorithms in code: the FLAME application program interfaces

ACM Transactions on Mathematical Software (TOMS)
Parallel out-of-core computation and updating of the QR factorization

ACM Transactions on Mathematical Software (TOMS)
Accelerating the SVD Block-Jacobi Method

Computing - Editorial: Special issue on GAMM – Workshop on Guaranteed Error-bounds for the Solution of Nonlinear Problems in Applied Mathematics
Three-level hybrid vs. flat MPI on the Earth Simulator: parallel iterative solvers for finite-element method

Applied Numerical Mathematics - 6th IMACS International symposium on iterative methods in scientific computing
Parallel and systolic solution of normalized explicit approximate inverse preconditioning

The Journal of Supercomputing - Special issue: Parallel and distributed processing and applications
Parallel algorithms development for programmable logic devices

Advances in Engineering Software
Random Walks for Image Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence
NINJA: Java for high performance numerical computing

Scientific Programming
A highly efficient implementation of back propagation algorithm using matrix instruction set architecture

Neural, Parallel & Scientific Computations
On the estimation of a large sparse Bayesian system: The Snaer program

Computational Statistics & Data Analysis
A highly efficient implementation of a backpropagation learning algorithm using matrix ISA

Journal of Parallel and Distributed Computing
Families of algorithms related to the inversion of a Symmetric Positive Definite matrix

ACM Transactions on Mathematical Software (TOMS)
An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

High Performance Computing for Computational Science - VECPAR 2008
Parallel image processing with the block data parallel architecture

IBM Journal of Research and Development
POWER3: the next generation of PowerPC processors

IBM Journal of Research and Development
Parallel MCGLS and ICGLS methods for least squares problems on distributed memory architectures

ISPA'03 Proceedings of the 2003 international conference on Parallel and distributed processing and applications
New data distribution for solving triangular systems on distributed memory machines

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Evaluating linear recursive filters using novel data formats for dense matrices

PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Faster graph-theoretic image processing via small-world and quadtree topologies

CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Implementation and performance analysis of parallel conjugate gradient on the cell broadband engine

IBM Journal of Research and Development
A parallel block LU decomposition method for distributed finite element matrices

Parallel Computing
Goal-Oriented and Modular Stability Analysis

SIAM Journal on Matrix Analysis and Applications
A note on the numerical inversion of the Laplace transform

PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Performance study of LU decomposition on the programmable GPU

HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Mobile pipelines: parallelizing left-looking algorithms using navigational programming

HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Vectorized sparse matrix multiply for compressed row storage format

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Parallelization and performance comparison of the conjugate gradient equation solver on multicore Cell and Xeon computers

Concurrency and Computation: Practice & Experience
Families of Algorithms for Reducing a Matrix to Condensed Form

ACM Transactions on Mathematical Software (TOMS)
On the parallel technologies of conjugate and semi-conjugate gradient methods for solving very large sparse SLAEs

PaCT'07 Proceedings of the 9th international conference on Parallel Computing Technologies

Quantified Score

Hi-index	0.02