Families of Algorithms for Reducing a Matrix to Condensed Form

Authors:
Field G. Van Zee;Robert A. van de Geijn;Gregorio Quintana-Ortí;G. Joseph Elizondo
Affiliations:
The University of Texas at Austin;The University of Texas at Austin;Universidad Jaume I;The University of Texas at Austin
Venue:
ACM Transactions on Mathematical Software (TOMS)
Year:
2012

Abstract

In a recent paper it was shown how memory traffic can be diminished by reformulating the classic algorithm for reducing a matrix to bidiagonal form, a preprocessing step when computing the singular values of a dense matrix. The key is a reordering of the computation so that the most memory-intensive operations can be “fused.” In this article, we show that other operations that reduce matrices to condensed form (reduction to upper Hessenberg form and reduction to tridiagonal form) can be similarly reorganized, yielding different sets of operations that can be fused. By developing the algorithms with a common framework and notation, we make it easier to compare and contrast the different algorithms and their opportunities for optimization on sequential architectures. We discuss the algorithms, develop a simple model to estimate the speedup potential from fusing, and showcase performance improvements consistent with what the model predicts.
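To make the idea of fusing memory-intensive operations concrete, the following is a minimal sketch, not the article's algorithms. It assumes a generic pair of matrix-vector products, y = A x and z = Aᵀ w, and compares computing them in two passes over A with computing both in a single pass. The function names (`unfused`, `fused`, `matrix_reads`) and the read-count estimate are hypothetical and chosen for illustration only; they are not the performance model developed in the article.

```python
def unfused(A, x, w):
    """Compute y = A x and z = A^T w in two separate passes over A."""
    m, n = len(A), len(A[0])
    y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]  # pass 1 over A
    z = [sum(A[i][j] * w[i] for i in range(m)) for j in range(n)]  # pass 2 over A
    return y, z


def fused(A, x, w):
    """Compute the same y and z, reading each element of A only once."""
    m, n = len(A), len(A[0])
    y = [0.0] * m
    z = [0.0] * n
    for i in range(m):
        row = A[i]
        acc = 0.0
        for j in range(n):
            a = row[j]          # single read of A[i][j]
            acc += a * x[j]     # contribution to y[i]
            z[j] += a * w[i]    # contribution to z[j]
        y[i] = acc
    return y, z


def matrix_reads(m, n, fused_pass):
    """Illustrative count of elements of A read (an assumption, not the article's model)."""
    return m * n if fused_pass else 2 * m * n


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]
    x = [1, 1]
    w = [1, 0, 2]
    print(unfused(A, x, w))
    print(fused(A, x, w))
    print(matrix_reads(3, 2, False), matrix_reads(3, 2, True))
```

Under this simplified count, the fused variant reads A half as often. If the operation is limited by memory bandwidth rather than arithmetic, that reduction is where a speedup would be expected to come from.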