In this article, we present a number of Application Program Interfaces (APIs) for coding linear algebra algorithms. On the surface, these APIs for the MATLAB M-script and C programming languages appear to be simple, almost trivial, extensions of those languages. Yet they greatly simplify the task of programming and maintaining families of algorithms for a broad spectrum of linear algebra operations. In combination with our Formal Linear Algebra Methods Environment (FLAME) approach to deriving such families of algorithms, dozens of algorithms for a single linear algebra operation can be derived, verified to be correct, implemented, and tested, often in a matter of minutes per algorithm. Because the algorithms are expressed in code much as they are explained in a classroom setting, these APIs are not just a tool for implementing libraries but also a valuable tool for teaching the algorithms that those libraries incorporate. In combination with an extension of the Parallel Linear Algebra Package (PLAPACK) API, the approach provides a migration path from algorithm to MATLAB implementation to high-performance sequential implementation to parallel implementation. Finally, the APIs are being used to create a repository of algorithms and implementations for linear algebra operations, the FLAME Interface REpository (FIRE), which already features hundreds of algorithms for dozens of commonly encountered linear algebra operations.
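To give a rough sense of what "expressed in code much like they are explained in a classroom setting" can mean, the following is a minimal C sketch of forward substitution (solving L x = b for lower-triangular L) written around an explicit top/bottom partitioning of the vector instead of raw index arithmetic. It is an illustration only and does not use the FLAME API: the `vec_view` type and the `part_2x1` and `trsv_lower` functions are hypothetical names introduced here, and row-major storage with a nonzero diagonal is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view type (not part of FLAME): a contiguous slice of a vector. */
typedef struct {
    double *data;
    size_t  len;
} vec_view;

/* Split x into a top part of length k and a bottom part holding the rest. */
static void part_2x1(vec_view x, size_t k, vec_view *top, vec_view *bottom)
{
    assert(k <= x.len);
    top->data    = x.data;
    top->len     = k;
    bottom->data = x.data + k;
    bottom->len  = x.len - k;
}

/* Solve L x = b by forward substitution, overwriting b with x.
 * Assumptions: L is n x n, lower triangular, row-major, nonzero diagonal. */
void trsv_lower(size_t n, const double *L, double *b)
{
    vec_view all = { b, n }, bT, bB;

    /* Loop invariant: bT holds solved entries of x,
     * bB still holds the original right-hand-side entries. */
    for (size_t k = 0; k < n; ++k) {
        part_2x1(all, k, &bT, &bB);

        const double *l10t     = L + k * n;    /* row k of L, left of the diagonal */
        double        lambda11 = L[k * n + k]; /* diagonal entry */
        double       *beta1    = bB.data;      /* first entry of the bottom part */

        /* beta1 := (beta1 - l10^T * bT) / lambda11 */
        for (size_t j = 0; j < bT.len; ++j)
            *beta1 -= l10t[j] * bT.data[j];
        *beta1 /= lambda11;
    }
}
```

The point of the sketch is that the loop body refers to named parts (`bT`, `beta1`, `l10t`, `lambda11`) that mirror how such an algorithm is typically drawn on a whiteboard, so the loop invariant can be stated directly in terms of those parts. For example, calling `trsv_lower(3, L, b)` with `L = {2,0,0, 1,1,0, 4,2,4}` and `b = {2,3,12}` overwrites `b` with the solution `{1, 2, 1}`.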