This paper evaluates the effect of an auto-tuning facility that uses the user's knowledge in numerical software. We propose a new software architecture framework, named FIBER, to generalize auto-tuning facilities and obtain highly accurate estimated parameters. The FIBER framework also provides a loop-unrolling function and an algorithm selection function to support code development by library developers who need code generation and parameter registration processes. FIBER offers three kinds of parameter optimization layers: install-time, before execute-time, and run-time. The user's knowledge is needed in the before execute-time optimization layer. In this paper, eigensolver parameters to which the FIBER framework is applied are described and evaluated on three kinds of parallel computers: the HITACHI SR8000/MPP, the Fujitsu VPP800/63, and a Pentium4 PC cluster. Our evaluation of the before execute-time layer indicated a maximum speedup of 3.4 times for the eigensolver parameters, and a maximum speedup of 17.1 times for the algorithm selection of orthogonalization in the computation kernel of the eigensolver.
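The layered tuning idea described in the abstract can be illustrated with a minimal sketch. This is not FIBER's interface: the names `measure`, `tune`, and `ParameterStore`, the layer keys, and the rule that a later layer overrides an earlier one are all assumptions made for illustration. The sketch shows an exhaustive search over candidate parameter values (for example, loop-unrolling depths or algorithm variants) and a store that keeps one value per optimization layer.

```python
import time


def measure(fn, repeats=3):
    """Return the best wall-clock time (seconds) of fn over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def tune(candidates, cost):
    """Exhaustively evaluate cost(c) and return (best_candidate, best_cost).

    `cost` may wrap `measure` on a real kernel or be a cost model.
    Ties are resolved in favor of the earlier candidate.
    """
    best, best_cost = None, float("inf")
    for candidate in candidates:
        c = cost(candidate)
        if c < best_cost:
            best, best_cost = candidate, c
    if best is None:
        raise ValueError("no candidates to tune over")
    return best, best_cost


class ParameterStore:
    """Hypothetical registry of tuned parameters, one slot per layer.

    Assumption (not from the paper): a value registered in a later
    layer overrides the value from an earlier layer.
    """

    LAYERS = ("install", "before_execute", "run")

    def __init__(self):
        self._values = {layer: {} for layer in self.LAYERS}

    def register(self, layer, name, value):
        if layer not in self._values:
            raise KeyError(f"unknown layer: {layer}")
        self._values[layer][name] = value

    def lookup(self, name):
        # Search from the latest layer back to the earliest.
        for layer in reversed(self.LAYERS):
            if name in self._values[layer]:
                return self._values[layer][name]
        raise KeyError(name)
```

In use, an install-time pass might call `tune` with a cost measured on a default problem size and register the result under `"install"`. Once the user supplies the actual problem size, a before execute-time pass can rerun `tune` with a cost specific to that size and register the refined value under `"before_execute"`, which `lookup` then prefers.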