Mesa: automatic generation of lookup table optimizations

Authors:
Chris Wilcox; Michelle Mills Strout; James M. Bieman
Affiliations:
Colorado State University, Fort Collins, CO, USA;Colorado State University, Fort Collins, CO, USA;Colorado State University, Fort Collins, CO, USA
Venue:
Proceedings of the 4th International Workshop on Multicore Software Engineering
Year:
2011

Abstract

Scientific programmers constantly strive to meet performance demands. Tuning is often done manually, despite the significant development time and effort required. One example is lookup table (LUT) optimization, a technique that is generally applied by hand because methodology and tools are lacking. LUT methods reduce execution time by replacing computations with memory accesses to precomputed tables of results. LUT optimizations improve performance when the memory access is faster than the original computation and the level of reuse is sufficient to amortize LUT initialization. Current practice requires programmers to inspect program source to identify candidate expressions, then develop specific LUT code for each optimization. Measurement of LUT accuracy is usually ad hoc, and the interaction with multicore parallelization has not been explored. In this paper we present Mesa, a standalone tool that implements error analysis and code generation to improve the process of LUT optimization. We evaluate Mesa on a multicore system using a molecular biology application and other scientific expressions. Our LUT optimizations realize a performance improvement of 5X for the application and up to 45X for the expressions, while tightly controlling error. We also show that the serial optimization is just as effective on a parallel version of the application. Our research provides a methodology and tool for incorporating LUT optimizations into existing scientific code.
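To make the technique concrete, the following is a minimal sketch of a hand-written LUT optimization with an explicit error measurement. It is not Mesa's generated code: the function names (`build_lut`, `lut_eval`, `max_abs_error`), the choice of `math.sin` as the candidate expression, the table size, and the use of direct lookup without interpolation are all assumptions made for illustration.

```python
import math


def build_lut(func, lower, upper, size):
    """Precompute func at the midpoint of `size` equal intervals over [lower, upper)."""
    step = (upper - lower) / size
    # Sampling at interval midpoints keeps any input within step/2 of its sample point.
    table = [func(lower + (i + 0.5) * step) for i in range(size)]
    return table, step


def lut_eval(table, lower, step, x):
    """Replace the original computation with a single table access (direct lookup)."""
    index = int((x - lower) / step)
    # Clamp so inputs on or slightly outside the domain bounds stay inside the table.
    index = min(max(index, 0), len(table) - 1)
    return table[index]


def max_abs_error(func, table, lower, upper, step, probes=10_000):
    """Estimate the worst absolute error by comparing the LUT with func at evenly spaced probes."""
    worst = 0.0
    for k in range(probes):
        x = lower + (upper - lower) * k / probes
        worst = max(worst, abs(func(x) - lut_eval(table, lower, step, x)))
    return worst


# Hypothetical parameters: one period of sine split into 4096 table entries.
LOWER, UPPER, SIZE = 0.0, 2.0 * math.pi, 4096
sin_table, sin_step = build_lut(math.sin, LOWER, UPPER, SIZE)
error = max_abs_error(math.sin, sin_table, LOWER, UPPER, sin_step)
print(f"table entries: {SIZE}, estimated max abs error: {error:.2e}")
```

Because the slope of sine never exceeds 1 in magnitude, the error of this direct lookup is bounded by roughly half the interval width, so enlarging the table trades memory for accuracy. Whether the lookup actually beats the original computation depends on the memory access cost and on enough reuse to amortize `build_lut`, as the abstract notes.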