In this paper, we analyze the power consumption of different GPU-accelerated iterative solver implementations enhanced with energy-saving techniques. Specifically, while kernel calls execute on the graphics accelerator, we manually set the host system to a power-efficient idle-wait state so as to leverage dynamic voltage and frequency control. While iterative refinement combined with mixed-precision arithmetic often improves the execution time of an iterative solver on a graphics processor, the same does not necessarily hold for power consumption. To analyze the trade-off between computation time and power consumption, we compare a plain GMRES solver and its preconditioned variant to the mixed-precision iterative refinement implementations based on the respective solvers. Benchmark experiments show that using idle-wait during GPU kernel calls effectively exploits the power-management tools provided by the hardware and improves the energy efficiency of the algorithm.
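As a rough illustration of the mixed-precision iterative refinement idea mentioned above, the following minimal sketch keeps the solution and residual in double precision and delegates each correction solve to a low-precision inner solver. It is not the implementation evaluated in the paper: the inner solver here is a plain Jacobi iteration standing in for GMRES, single precision is emulated on the CPU by rounding through `struct`, and the names `f32`, `jacobi_f32`, `refine`, the test matrix, and the tolerances are all assumptions chosen for the example.

```python
import struct


def f32(x):
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def jacobi_f32(A, r, sweeps=20):
    """Low-precision inner solver (assumed stand-in for GMRES).

    Approximately solves A c = r with Jacobi sweeps, rounding every
    intermediate result to float32. Assumes A is strictly diagonally dominant.
    """
    n = len(r)
    A32 = [[f32(a) for a in row] for row in A]
    r32 = [f32(v) for v in r]
    c = [0.0] * n
    for _ in range(sweeps):
        new = []
        for i in range(n):
            s = r32[i]
            for j in range(n):
                if j != i:
                    s = f32(s - f32(A32[i][j] * c[j]))
            new.append(f32(s / A32[i][i]))
        c = new
    return c


def residual(A, x, b):
    """Residual b - A x, evaluated in double precision."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def norm(v):
    """Euclidean norm of a vector."""
    return sum(t * t for t in v) ** 0.5


def refine(A, b, tol=1e-12, max_outer=50):
    """Outer refinement loop: double-precision residual, low-precision correction."""
    x = [0.0] * len(b)
    for k in range(max_outer):
        r = residual(A, x, b)            # high precision
        if norm(r) <= tol * norm(b):
            return x, k
        c = jacobi_f32(A, r)             # low precision
        x = [xi + ci for xi, ci in zip(x, c)]
    return x, max_outer


# Small diagonally dominant test system (hypothetical data).
A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
b = [1.0, 2.0, 3.0]
x, outer_iters = refine(A, b)
print(outer_iters, norm(residual(A, x, b)))
```

The point of the split is that the expensive inner solve runs in the cheaper precision, while the double-precision residual update lets the outer loop recover accuracy beyond what single precision alone provides. Whether this also lowers energy use depends on the hardware and is exactly the trade-off the paper measures.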