In this article, we present a systematic approach to deriving families of high-performance algorithms for a large set of frequently encountered dense linear algebra operations. As part of the derivation, a constructive proof of the algorithm's correctness is generated. The article is structured so that it can be used as a tutorial for novices. However, the method has also been shown to yield new high-performance algorithms for well-studied linear algebra operations, so it should be of interest to those who wish to produce best-in-class high-performance codes.