Proceedings of the 1989 ACM/IEEE conference on Supercomputing
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
ICS '97 Proceedings of the 11th international conference on Supercomputing
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Precise miss analysis for program transformations with caches of arbitrary associativity
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A fast Fourier transform compiler
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
A tile selection algorithm for data locality and cache interference
ICS '99 Proceedings of the 13th international conference on Supercomputing
Exact analysis of the cache behavior of nested loops
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
SPL: a language and compiler for DSP algorithms
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
High-level adaptive program optimization with ADAPT
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Language and Compiler Support for Adaptive Distributed Applications
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Automatically tuned linear algebra software
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Automatic Performance Tuning in the UHFFT Library
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Asserting performance expectations
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
ECO: An Empirical-Based Compilation and Optimization System
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Combining Models and Guided Empirical Search to Optimize for Multiple Levels of the Memory Hierarchy
Proceedings of the international symposium on Code generation and optimization
Interprocedural parallelization analysis in SUIF
ACM Transactions on Programming Languages and Systems (TOPLAS)
A code isolator: isolating code fragments from large programs
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
The next generation software program
International Journal of Parallel Programming - Special issue: The next generation software program
An overview of the ECO project
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Loop transformation recipes for code generation and auto-tuning
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |
This paper describes initial experiences with semi-automated performance tuning of a sparse linear solver in LS-DYNA, a large, widely used engineering application. Through a collection of tools supporting empirical optimization, we alleviate the burden of performance tuning for mapping today's sophisticated engineering software to increasingly complex hardware platforms. We describe a tool that automatically isolates code segments to create benchmark subsets for the purposes of performance tuning. We present a collection of automatically generated empirical results that demonstrate the sensitivity of the application's performance to optimization parameters. Through this case study, we demonstrate the importance of developing automatic performance tuning support for performance-sensitive applications.