A storage-efficient WY representation for products of Householder transformations

Authors:
Robert Schreiber;Charles Van Loan
Affiliations:
-;Cornell Univ., Ithaca, NY
Venue:
SIAM Journal on Scientific and Statistical Computing
Year:
1989

Citing 0
Cited 46

LAPACK: a portable linear algebra library for high-performance computers

Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Stability of block algorithms with fast level-3 BLAS

ACM Transactions on Mathematical Software (TOMS)
RISC microprocessors and scientific computing

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Efficient Householder QR factorization for superscalar processors

ACM Transactions on Mathematical Software (TOMS)
Computing rank-revealing QR factorizations of dense matrices

ACM Transactions on Mathematical Software (TOMS)
Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form

ACM Transactions on Mathematical Software (TOMS)
Parallel Strategies for Solving SURE Models with Variance Inequalities and Positivity of Correlations Constraints

Computational Economics - Computational Studies at Stanford
A framework for symmetric band reduction

ACM Transactions on Mathematical Software (TOMS)
High-Performance Library Software for QR Factorization

PARA '00 Proceedings of the 5th International Workshop on Applied Parallel Computing, New Paradigms for HPC in Industry and Academia
Parallel Two-Stage Reduction of a Regular Matrix Pair to Hessenberg-Triangular Form

PARA '00 Proceedings of the 5th International Workshop on Applied Parallel Computing, New Paradigms for HPC in Industry and Academia
Using Pentangular Factorizations for the Reduction to Banded Form

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
An Efficient Parallel Algorithm to Solve Block-Toeplitz Systems

The Journal of Supercomputing
Parallel out-of-core computation and updating of the QR factorization

ACM Transactions on Mathematical Software (TOMS)
Accumulating Householder transformations, revisited

ACM Transactions on Mathematical Software (TOMS)
Improving the performance of reduction to Hessenberg form

ACM Transactions on Mathematical Software (TOMS)
Algorithm 854: Fortran 77 subroutines for computing the eigenvalues of Hamiltonian matrices II

ACM Transactions on Mathematical Software (TOMS)
Cache efficient bidiagonalization using BLAS 2.5 operators

ACM Transactions on Mathematical Software (TOMS)
Block size selection of parallel LU and QR on PVP-based and RISC-based supercomputers

CHINA HPC '07 Proceedings of the 2007 Asian technology information program's (ATIP's) 3rd workshop on High performance computing in China: solution approaches to impediments for high performance computing
Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines

Scientific Programming
A class of parallel tiled linear algebra algorithms for multicore architectures

Parallel Computing
QR factorization for the Cell Broadband Engine

Scientific Programming - High Performance Computing with the Cell Broadband Engine
Applying recursion to serial and parallel QR factorization leads to better performance

IBM Journal of Research and Development
Scaling LAPACK panel operations using parallel cache assignment

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
LAPACK-style codes for pivoted Cholesky and QR updating

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Implementing linear algebra routines on multi-core processors with pipelining and a look ahead

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Parallel tiled QR factorization for multicore architectures

PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Scheduling two-sided transformations using tile algorithms on multicore architectures

Scientific Programming
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing

Parallel Computing
QCG-OMPI: MPI applications on grids

Future Generation Computer Systems
High-performance up-and-downdating via Householder-like transformations

ACM Transactions on Mathematical Software (TOMS)
Algorithm 915, SuiteSparseQR: Multifrontal multithreaded rank-revealing sparse QR factorization

ACM Transactions on Mathematical Software (TOMS)
Parallel two-stage reduction to Hessenberg form using dynamic scheduling on shared-memory architectures

Parallel Computing
Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations

Parallel Computing
DAGuE: A generic distributed DAG engine for High Performance Computing

Parallel Computing
Soft error resilient QR factorization for hybrid system with GPGPU

Proceedings of the second workshop on Scalable algorithms for large-scale systems
A matrix-type for performance–portability

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Parallel algorithms for the determination of Lyapunov characteristics of large nonlinear dynamical systems

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Fine granularity sparse QR factorization for multicore based systems

PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2
Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures

Concurrency and Computation: Practice & Experience
Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems

SIAM Journal on Scientific Computing
Communication-optimal Parallel and Sequential QR and LU Factorizations

SIAM Journal on Scientific Computing
Families of Algorithms for Reducing a Matrix to Condensed Form

ACM Transactions on Mathematical Software (TOMS)
Accelerating the singular value decomposition of rectangular matrices with the CSX600 and the integrable SVD

PaCT'07 Proceedings of the 9th international conference on Parallel Computing Technologies
Efficient generalized Hessenberg form and applications

ACM Transactions on Mathematical Software (TOMS)
Parallel reduction to Hessenberg form with algorithm-based fault tolerance

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Scalable matrix decompositions with multiple cores on FPGAs

Microprocessors & Microsystems
