Matrix computations (3rd ed.)
LAPACK Users' guide (third ed.)
LAPACK Users' guide (third ed.)
Algorithm 807: The SBR Toolbox—software for successive band reduction
ACM Transactions on Mathematical Software (TOMS)
Reducing Power with Performance Constraints for Parallel Sparse Applications
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 11 - Volume 12
Using multiple energy gears in MPI programs on a power-scalable cluster
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems
International Journal of High Performance Computing Applications
Comparative study of one-sided factorizations with multiple software packages on multi-core hardware
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications
IEEE Transactions on Parallel and Distributed Systems
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures
ACM Transactions on Mathematical Software (TOMS)
Hi-index | 0.00 |
This paper presents the power profile of two high performance dense linear algebra libraries i.e., LAPACK and PLASMA. The former is based on block algorithms that use the fork-join paradigm to achieve parallel performance. The latter uses fine-grained task parallelism that recasts the computation to operate on submatrices called tiles. In this way tile algorithms are formed. We show results from the power profiling of the most common routines, which permits us to clearly identify the different phases of the computations. This allows us to isolate the bottlenecks in terms of energy efficiency. Our results show that PLASMA surpasses LAPACK not only in terms of performance but also in terms of energy efficiency.