Autotuning Stencil-Based Computations on GPUs

Authors:
Azamat Mametjanov;Daniel Lowell;Ching-Chen Ma;Boyana Norris
Venue:
CLUSTER '12 Proceedings of the 2012 IEEE International Conference on Cluster Computing
Year:
2012

Citing 0
Cited 1

Towards making autotuning mainstream

International Journal of High Performance Computing Applications

Abstract

Finite-difference, stencil-based discretization approaches are widely used in the solution of partial differential equations describing physical phenomena. Newton-Krylov iterative methods commonly used in stencil-based solutions generate matrices that exhibit diagonal sparsity patterns. To exploit these structures on modern GPUs, we extend the standard diagonal sparse matrix representation and define new matrix and vector data types in the PETSc parallel numerical toolkit. We create tunable CUDA implementations of the operations associated with these types after identifying a number of GPU-specific optimizations and tuning parameters for these operations. We discuss our implementation of GPU autotuning capabilities in the Orio framework and present performance results for several kernels, comparing them with vendor-tuned library implementations.
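
As an illustration of the diagonal sparsity pattern the abstract refers to, the following minimal sketch stores a 1D three-point stencil matrix in standard diagonal (DIA) form and multiplies it by a vector. It is not the extended representation, the PETSc data types, or the CUDA kernels described in the paper. The function names `dia_matvec` and `laplacian_1d_dia`, and the row-aligned layout (entry `i` of a diagonal belongs to row `i`), are assumptions chosen here for clarity.

```python
def dia_matvec(offsets, data, x):
    """Compute y = A @ x for a square matrix A held in diagonal (DIA) storage.

    offsets[d] is the offset of diagonal d (0 = main, +k above, -k below).
    data[d][i] holds A[i][i + offsets[d]]; slots whose column index falls
    outside the matrix are padding and are never read.
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, data):
        # Restrict rows so that the column index i + off stays in [0, n).
        lo = max(0, -off)
        hi = min(n, n - off)
        for i in range(lo, hi):
            y[i] += diag[i] * x[i + off]
    return y


def laplacian_1d_dia(n):
    """Hypothetical helper: the 1D stencil [-1, 2, -1] as three diagonals."""
    offsets = [-1, 0, 1]
    data = [[-1.0] * n, [2.0] * n, [-1.0] * n]
    return offsets, data


offsets, data = laplacian_1d_dia(5)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = dia_matvec(offsets, data, x)
print(y)  # [0.0, 0.0, 0.0, 0.0, 6.0]
```

Because each diagonal is a dense array, no column indices need to be stored, and consecutive rows read consecutive elements of each diagonal. That regularity is one plausible reason such layouts map well to GPU threads; the choice of block size and similar launch parameters is the kind of setting an autotuner would search over.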