The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system $Ax=b$ based on solving the block linear system $AX=B$. Algorithm performance, i.e., time to solution, is improved by using the matrix $A$ in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
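The data-movement argument can be illustrated with a small sketch. It is not the authors' implementation: the function names (`laplacian_csr`, `csr_matvec`, `csr_matblock`), the CSR storage format, and the tridiagonal test matrix are assumptions chosen for illustration, and the count of matrix-entry loads is only a rough proxy for memory traffic. The sketch compares multiplying $A$ by $k$ vectors one at a time with multiplying $A$ by the same vectors as a block.

```python
def laplacian_csr(n):
    """Build a 1D Laplacian (tridiagonal 2, -1) in CSR form. Illustrative test matrix only."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j in (i - 1, i, i + 1):
            if 0 <= j < n:
                indices.append(j)
                data.append(2.0 if j == i else -1.0)
        indptr.append(len(indices))
    return indptr, indices, data


def csr_matvec(indptr, indices, data, x):
    """Compute y = A x. Returns (y, number of matrix entries loaded)."""
    n = len(indptr) - 1
    y = [0.0] * n
    reads = 0
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]  # one load of a matrix entry
            reads += 1
            y[i] += a * x[indices[p]]
    return y, reads


def csr_matblock(indptr, indices, data, X):
    """Compute Y = A X, where X is a list of n rows with k entries each.

    Each matrix entry is loaded once and reused for all k columns.
    """
    n = len(indptr) - 1
    k = len(X[0])
    Y = [[0.0] * k for _ in range(n)]
    reads = 0
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]  # one load, reused k times below
            reads += 1
            xrow = X[indices[p]]
            yrow = Y[i]
            for j in range(k):
                yrow[j] += a * xrow[j]
    return Y, reads


n, k = 4, 3
indptr, indices, data = laplacian_csr(n)
nnz = len(data)

# Block of k vectors stored row-wise (arbitrary illustrative values).
X = [[float(i + j) for j in range(k)] for i in range(n)]

# One pass over A for the whole block.
Y, block_reads = csr_matblock(indptr, indices, data, X)

# k separate passes over A, one per vector.
columns = []
separate_reads = 0
for j in range(k):
    y, r = csr_matvec(indptr, indices, data, [X[i][j] for i in range(n)])
    columns.append(y)
    separate_reads += r

print(block_reads, separate_reads)  # prints: 10 30
```

Both approaches produce the same products, but the block version reads each stored entry of $A$ once rather than $k$ times. On real hardware the actual saving depends on cache behavior and on how the vectors are laid out in memory, which is the kind of implementation choice the abstract refers to.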