Mesa: automatic generation of lookup table optimizations
Proceedings of the 4th International Workshop on Multicore Software Engineering
A number of scientific applications are performance-limited by expressions that repeatedly call costly elementary functions. Lookup table (LUT) optimization accelerates the evaluation of such functions by reusing previously computed results. LUT methods can speed up applications that tolerate approximate function results, because such tolerance permits a high level of fuzzy reuse. One problem with LUT optimization is the difficulty of controlling the tradeoff between performance and accuracy. The current practice of manual LUT optimization adds programming effort, since making this tradeoff requires extensive experimentation, and such hand tuning can obfuscate algorithms.

In this paper we describe a methodology and tool implementation to improve the application of software LUT optimization. Our Mesa tool implements source-to-source transformations for C or C++ code to automate the tedious and error-prone aspects of LUT generation, such as domain profiling, error analysis, and code generation. We evaluate Mesa with five scientific applications. Our results show performance improvements of 3.0× and 6.9× for two molecular biology algorithms, 1.4× for a molecular dynamics program, 2.1× to 2.8× for a neural network application, and 4.6× for a hydrology calculation. We find that Mesa enables LUT optimization with more control over accuracy and less effort than manual approaches.