PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
ICS '97 Proceedings of the 11th international conference on Supercomputing
ScaLAPACK user's guide
PARA '96 Proceedings of the Third International Workshop on Applied Parallel Computing, Industrial Computation and Optimization
LAPACK Working Note 20: A Portable Linear Algebra Library For High-Performance Computers
Architecture of an automatically tuned linear algebra library
Parallel Computing
Designing polylibraries to speed up linear algebra computations
International Journal of High Performance Computing and Networking
In this work we propose the architecture of an automatically tuned linear algebra library, which consists of a set of linear algebra routines together with their installation routines. When the library is installed on a system, the linear algebra routines are tuned automatically to the system conditions: the hardware characteristics and the basic libraries on which the routines rely. The design methodology is illustrated with a block LU factorisation, for which we consider both a sequential version and a parallel version on a logical rectangular mesh of processors. An analytical model of the algorithm's execution time forms the basis of the methodology. The behaviour of the algorithm is analysed with message passing using MPI on several platforms (a network of Sun workstations, an SGI Origin 2000 and an IBM SP2) and with different basic linear algebra libraries (reference BLAS, machine-specific BLAS and ATLAS). The experiments show that it is possible to make a good automatic choice of the configurable parameters of the linear algebra routines during the installation process. The average execution time of the linear algebra routine is reduced by about 15% with respect to the non-tuned version.
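The idea of choosing configurable parameters from an analytical model can be sketched in a few lines of Python. The sketch below is illustrative only and is not the model from the paper: the cost formula in `model_time`, the parameter names (`k3`, `k2`, `ts`, `tw`) and the helper functions are all assumptions made for this example. It assumes that an installation step has already measured a per-flop cost for each candidate block size and the message start-up and per-word transfer times, and it then picks the block size and mesh shape that minimise the modelled time.

```python
from dataclasses import dataclass


@dataclass
class SystemParams:
    """Hypothetical values an installation routine might measure."""
    k3: dict   # block size -> seconds per flop for matrix-matrix kernels
    k2: float  # seconds per flop for matrix-vector kernels
    ts: float  # message start-up time in seconds
    tw: float  # per-word transfer time in seconds


def mesh_shapes(p):
    """All r x c logical meshes that use exactly p processors."""
    return [(r, p // r) for r in range(1, p + 1) if p % r == 0]


def model_time(n, b, r, c, sp):
    """Assumed (not the paper's) cost model for a block LU of order n."""
    p = r * c
    # Arithmetic: bulk of the work in matrix-matrix updates, plus panel work.
    arith = (2.0 / 3.0) * n**3 * sp.k3[b] / p + n**2 * b * sp.k2 / r
    if p == 1:
        return arith  # sequential version: no communication term
    # Communication: one round of row and column broadcasts per block step.
    steps = n / b
    comm = steps * (r + c) * sp.ts + n**2 * (1.0 / r + 1.0 / c) * sp.tw
    return arith + comm


def choose_parameters(n, p, sp):
    """Return the (block size, rows, cols) with the lowest modelled time."""
    candidates = [(b, r, c) for b in sp.k3 for (r, c) in mesh_shapes(p)]
    return min(candidates, key=lambda cfg: model_time(n, *cfg, sp))


if __name__ == "__main__":
    # Made-up measurements, for demonstration only.
    sp = SystemParams(k3={16: 2e-9, 32: 1e-9, 64: 1.5e-9},
                      k2=5e-9, ts=1e-4, tw=1e-7)
    print(choose_parameters(n=1024, p=4, sp=sp))
```

Because the search only evaluates a closed-form model, it is cheap enough to run at installation time or even at call time for a given problem size; the accuracy of the choice depends entirely on how well the assumed model and the measured parameters reflect the real system.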