An efficient time-step-based self-adaptive algorithm for predictor-corrector methods of Runge-Kutta type

Authors:
Natalia Kalinnik;Matthias Korch;Thomas Rauber
Venue:
Journal of Computational and Applied Mathematics
Year:
2011

Abstract

Finding an efficient implementation variant for the numerical solution of problems from computational science and engineering involves many implementation decisions that are strongly influenced by the specific hardware architecture. The complexity of these architectures makes it difficult to find the best implementation variant by manual tuning. For numerical solution methods from linear algebra, auto-tuning techniques based on a global search engine, such as those used in ATLAS or FFTW, can be applied successfully. These techniques generate different implementation variants at installation time and select one of them either at installation time or at runtime, before the computation starts. For some numerical methods, auto-tuning at installation time cannot be applied directly, since the best implementation variant may depend strongly on the specific numerical problem to be solved. One example is solution methods for initial value problems (IVPs) of ordinary differential equations (ODEs), where the coupling structure of the ODE system to be solved has a large influence on how efficiently the memory hierarchy of the hardware architecture can be used. In this context, it is important to use auto-tuning techniques at runtime, which is possible because of the time-stepping nature of ODE solvers. In this article, we present a sequential self-adaptive ODE solver that selects the best implementation variant from a candidate pool at runtime during the first time steps, i.e., the auto-tuning phase already contributes to the progress of the computation. The implementation variants differ in the loop structure and the data structures used to realize the numerical algorithm, a predictor-corrector (PC) iteration scheme with a Runge-Kutta (RK) corrector, considered here as an example. For those implementation variants in the candidate pool that use loop tiling to exploit the memory hierarchy of a given hardware platform, we investigate the selection of tile sizes.
The self-adaptive ODE solver combines empirical search with a model-based approach in order to reduce the search space of possible tile sizes. Runtime experiments demonstrate the efficiency of the self-adaptive solver for different IVPs across a range of problem sizes and on different hardware architectures.
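
The runtime selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the names `rhs`, `step_loop`, `step_zip`, and `solve_self_adaptive`, the test problem y' = -y, the fixed step size, and the use of Heun's method (a simple predictor-corrector pair) as a stand-in for the iterated RK corrector. The point it shows is that every timed step in the tuning phase also advances the solution, so no work is discarded.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple

State = List[float]
StepFn = Callable[[float, State, float], State]


def rhs(t: float, y: State) -> State:
    """Hypothetical test problem y' = -y (not taken from the paper)."""
    return [-v for v in y]


def step_loop(t: float, y: State, h: float) -> State:
    """Heun step (Euler predictor, trapezoidal corrector) with index loops."""
    f0 = rhs(t, y)
    pred = [y[i] + h * f0[i] for i in range(len(y))]
    f1 = rhs(t + h, pred)
    return [y[i] + 0.5 * h * (f0[i] + f1[i]) for i in range(len(y))]


def step_zip(t: float, y: State, h: float) -> State:
    """Same arithmetic as step_loop, but with a different loop structure."""
    f0 = rhs(t, y)
    pred = [yi + h * fi for yi, fi in zip(y, f0)]
    f1 = rhs(t + h, pred)
    return [yi + 0.5 * h * (a + b) for yi, a, b in zip(y, f0, f1)]


def solve_self_adaptive(
    variants: Sequence[Tuple[str, StepFn]],
    t0: float,
    y0: State,
    h: float,
    n_steps: int,
    trials: int = 3,
) -> Tuple[float, State, Optional[str]]:
    """Time each variant on real time steps, then continue with the fastest."""
    t, y = t0, list(y0)
    timings = {name: [] for name, _ in variants}
    step = 0

    # Tuning phase: each variant performs `trials` genuine time steps.
    for name, fn in variants:
        for _ in range(trials):
            if step >= n_steps:
                break
            start = time.perf_counter()
            y = fn(t, y, h)
            timings[name].append(time.perf_counter() - start)
            t += h
            step += 1

    # Selection: pick the variant with the smallest observed step time.
    measured = {name: min(ts) for name, ts in timings.items() if ts}
    if not measured:
        return t, y, None
    best = min(measured, key=measured.get)
    best_fn = dict(variants)[best]

    # Production phase: the remaining steps use the selected variant only.
    while step < n_steps:
        y = best_fn(t, y, h)
        t += h
        step += 1
    return t, y, best


if __name__ == "__main__":
    pool = [("loop", step_loop), ("zip", step_zip)]
    t_end, y_end, chosen = solve_self_adaptive(pool, 0.0, [1.0] * 1000, 0.01, 100)
    print(chosen, t_end, y_end[0])
```

In this sketch the candidate pool holds only two variants and the step size is fixed. A solver with stepsize control would presumably need to normalize the timings, for example per accepted step, and a pool containing tiled variants would add tile size as a further search dimension.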