Recent progress in using recursion as a general technique for producing dense linear algebra library software for today's memory-tiered computer systems is presented. To use a memory hierarchy efficiently, our approach applies the technique of hierarchical blocking. The approach succeeds through novel recursive blocked algorithms, hybrid data formats, and superscalar kernels.
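To make the idea of a recursive blocked algorithm concrete, the following is a minimal sketch of a recursive Cholesky factorization (A = L Lᵀ) in pure Python. It is an illustration under stated assumptions, not the authors' implementation: the names `cholesky_recursive`, `cholesky_kernel`, `forward_substitute`, and the `leaf` parameter are hypothetical, the matrix is assumed to be symmetric positive definite and stored as a list of rows, and plain Python loops stand in for the tuned kernels and hybrid data formats that the abstract mentions. Splitting the matrix in half at every level produces blocks of decreasing size, which is how recursion yields hierarchical blocking without an explicit block-size parameter per memory level.

```python
import math


def cholesky_kernel(A):
    """Unblocked Cholesky for a small leaf block (stand-in for a tuned kernel)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = A[j][j] - sum(L[j][p] ** 2 for p in range(j))
        L[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            off = A[i][j] - sum(L[i][p] * L[j][p] for p in range(j))
            L[i][j] = off / L[j][j]
    return L


def forward_substitute(L, b):
    """Solve L x = b for lower-triangular L."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        x[i] = (b[i] - sum(L[i][p] * x[p] for p in range(i))) / L[i][i]
    return x


def cholesky_recursive(A, leaf=2):
    """Recursive Cholesky: split A into 2x2 blocks until a block fits the leaf size.

    `leaf` is a hypothetical cutoff at which recursion hands over to the kernel.
    """
    n = len(A)
    if n <= max(leaf, 1):
        return cholesky_kernel(A)
    k = n // 2
    A11 = [row[:k] for row in A[:k]]
    A21 = [row[:k] for row in A[k:]]
    A22 = [row[k:] for row in A[k:]]

    # Factor the leading block: A11 = L11 * L11^T.
    L11 = cholesky_recursive(A11, leaf)
    # Triangular solve: L21 * L11^T = A21, one row of L21 at a time.
    L21 = [forward_substitute(L11, row) for row in A21]
    # Symmetric update (Schur complement): S = A22 - L21 * L21^T.
    m = n - k
    S = [[A22[i][j] - sum(L21[i][p] * L21[j][p] for p in range(k))
          for j in range(m)] for i in range(m)]
    # Factor the trailing block recursively.
    L22 = cholesky_recursive(S, leaf)

    # Assemble the lower-triangular factor from its blocks.
    top = [L11[i] + [0.0] * m for i in range(k)]
    bottom = [L21[i] + L22[i] for i in range(m)]
    return top + bottom


if __name__ == "__main__":
    A = [[4.0, 2.0, 2.0, 2.0],
         [2.0, 5.0, 3.0, 3.0],
         [2.0, 3.0, 6.0, 4.0],
         [2.0, 3.0, 4.0, 7.0]]
    for row in cholesky_recursive(A, leaf=1):
        print(row)
```

In a library setting the triangular solve and the symmetric update would typically be delegated to level 3 BLAS-style routines, and the leaf case to a kernel tuned for the processor; the recursion itself only decides how the matrix is partitioned.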