We investigate the use of functional programming to develop a numerical linear algebra runtime, i.e., a framework in which solvers can easily be adapted to different contexts and task parallelism can be attained (semi-)automatically. We follow a bottom-up strategy whose first step is the design and implementation of a framework layer composed of functional versions of the BLAS (Basic Linear Algebra Subprograms) routines. The framework supports arbitrary representations of matrices and vectors, and it allows multiple implementations of each BLAS operation, based on different algorithms and parallelism strategies, to be written and combined. Using this framework, we implement a functional version of the Cholesky factorization, which serves as a proof of concept for evaluating the flexibility and performance of our approach.
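To make the framework layer concrete, below is a minimal Haskell sketch of what a representation-agnostic BLAS interface, and a Cholesky factorization written against it, could look like. This is an illustration under stated assumptions, not the paper's actual API: the names (Matrix, Dense, gemm, cholesky) are hypothetical, and the code assumes the boxed vectors of the standard vector package.

    -- Hypothetical sketch only: the class, type, and function names
    -- below are illustrative, not the paper's actual API.
    module FunctionalBlasSketch where

    import qualified Data.Vector as V

    -- A matrix representation is anything that can report its shape,
    -- be indexed, and be built from an index function.
    class Matrix m where
      dims  :: m -> (Int, Int)
      at    :: m -> Int -> Int -> Double
      build :: (Int, Int) -> (Int -> Int -> Double) -> m

    -- One concrete representation: a dense, row-major, lazily
    -- evaluated boxed vector of elements.
    data Dense = Dense Int Int (V.Vector Double)

    instance Matrix Dense where
      dims (Dense r c _)    = (r, c)
      at (Dense _ c xs) i j = xs V.! (i * c + j)
      build (r, c) f        =
        Dense r c (V.generate (r * c) (\k -> f (k `div` c) (k `mod` c)))

    -- A representation-agnostic matrix multiply (Level-3 BLAS GEMM,
    -- C = A * B), written once against the interface.  Sketch only:
    -- no conformability checks.
    gemm :: Matrix m => m -> m -> m
    gemm a b =
      let (ra, ca) = dims a
          (_,  cb) = dims b
      in  build (ra, cb) $ \i j ->
            sum [ at a i k * at b k j | k <- [0 .. ca - 1] ]

    -- An unblocked Cholesky factorization (A = L * L^T for a
    -- symmetric positive definite A), also written against the
    -- interface.  The recursive definition of l ties a lazy knot:
    -- element (i, j) forces only elements in earlier columns, so a
    -- lazy representation such as Dense evaluates the factor in a
    -- valid order.  Sketch only: no pivoting or error handling, and a
    -- strict representation would diverge here.
    cholesky :: Matrix m => m -> m
    cholesky a = l
      where
        (n, _) = dims a
        l = build (n, n) el
        el i j
          | j > i     = 0
          | j == i    = sqrt (at a i i
                              - sum [ at l i k * at l i k | k <- [0 .. j - 1] ])
          | otherwise = (at a i j
                         - sum [ at l i k * at l j k | k <- [0 .. j - 1] ])
                        / at l j j

For example, with a = build (2, 2) (\i j -> [[4, 2], [2, 3]] !! i !! j) :: Dense, cholesky a yields the lower-triangular factor [[2, 0], [1, sqrt 2]]. The design choice mirrors the abstract: a solver such as cholesky is written once against the Matrix interface, so alternative representations, or parallel implementations of the BLAS operations, can be swapped in without touching the solver.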