Exploring the Optimization Space of Dense Linear Algebra Kernels

Authors:
Qing Yi;Apan Qasem
Affiliations:
University of Texas at San Antonio;Texas State University
Venue:
Languages and Compilers for Parallel Computing
Year:
2008



Abstract

Dense linear algebra kernels such as matrix multiplication have been used as benchmarks to evaluate the effectiveness of many automated compiler optimizations. However, few studies have looked at applying these transformations collectively and parameterizing them for external search. In this paper, we take a detailed look at the optimization space of three dense linear algebra kernels. We use a transformation scripting language (POET) to implement each kernel-level optimization as applied by ATLAS. We then extensively parameterize these optimizations from the perspective of a general-purpose compiler and use a stand-alone empirical search engine to explore the optimization space using several different search strategies. Our exploration of the search space reveals key interactions among several transformations that must be considered by compilers to approach the level of efficiency obtained through manual tuning of kernels.
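As an illustration of what parameterizing a transformation for external search can look like, the following is a minimal sketch, not the authors' POET scripts or their search engine. It assumes a single tunable parameter (a square tile size for loop blocking of matrix multiplication) and an exhaustive search strategy. The function names `blocked_matmul`, `time_config`, and `exhaustive_search` are hypothetical and chosen only for this example.

```python
import time


def blocked_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) using square tiles.

    `tile` is the tunable blocking factor; any positive value gives the
    same result, only the loop traversal order changes.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one tile; min() handles tiles that overrun n.
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += aik * row_b[j]
    return c


def time_config(n, tile, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        blocked_matmul(a, b, n, tile)
        best = min(best, time.perf_counter() - start)
    return best


def exhaustive_search(n, tile_sizes):
    """Time every candidate tile size and return (fastest_tile, all_timings)."""
    timings = {tile: time_config(n, tile) for tile in tile_sizes}
    best_tile = min(timings, key=timings.get)
    return best_tile, timings


if __name__ == "__main__":
    best_tile, timings = exhaustive_search(64, [4, 8, 16, 32, 64])
    for tile, seconds in sorted(timings.items()):
        print(f"tile={tile:3d}  time={seconds:.4f}s")
    print("fastest tile size:", best_tile)
```

A realistic optimization space would combine several interacting parameters (for example blocking, unrolling, and scalar replacement), which is where smarter search strategies than exhaustive enumeration become necessary. Timings from an interpreted sketch like this one say nothing about the behavior of compiled kernels.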