Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
We consider the solution of initial value problems (IVPs) of large systems of ordinary differential equations (ODEs) for which memory space requirements determine the choice of the integration method. In particular, we discuss the space-efficient sequential and parallel implementation of embedded Runge-Kutta (RK) methods. Our focus is on the exploitation of a special structure of commonly appearing ODE systems, referred to as "limited access distance," to improve scalability and memory usage. Such systems may arise, for example, from the semi-discretization of partial differential equations (PDEs). The storage space required by classical RK methods is directly proportional to the dimension n of the ODE system and the number of stages s of the method. We propose an implementation strategy based on a pipelined processing of the stages of the RK method and show how the memory usage of this computation scheme can be reduced to less than three storage registers by an overlapping of vectors without compromising the choice of method coefficients or the potential for efficient stepsize control. We analyze and compare the scalability of different parallel implementation strategies in detailed runtime experiments on different modern parallel architectures.
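The two quantities the abstract relies on can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the pipelined scheme proposed here. The abstract does not define access distance; the sketch assumes it means the largest index offset |i - j| such that component f_j of the right-hand side reads y_i. The names `heat_rhs`, `access_distance`, and `embedded_heun_euler_step` are hypothetical, the 1D heat equation is just one example of a semi-discretized PDE, and the Heun-Euler 2(1) pair stands in for an arbitrary embedded RK method.

```python
import numpy as np


def heat_rhs(t, y):
    """Semi-discretized 1D heat equation u_t = u_xx with zero boundary values.

    Illustrative test problem: f_j reads only y[j-1], y[j], y[j+1].
    """
    n = y.size
    h = 1.0 / (n + 1)
    padded = np.concatenate(([0.0], y, [0.0]))
    return (padded[:-2] - 2.0 * padded[1:-1] + padded[2:]) / h**2


def access_distance(f, n):
    """Estimate max |i - j| such that f_j depends on y_i (assumed definition).

    Probes f with unit vectors around y = 0; this is a heuristic that is
    exact for linear right-hand sides such as heat_rhs.
    """
    base = f(0.0, np.zeros(n))
    d = 0
    for i in range(n):
        y = np.zeros(n)
        y[i] = 1.0
        changed = np.nonzero(f(0.0, y) != base)[0]
        if changed.size:
            d = max(d, int(np.max(np.abs(changed - i))))
    return d


def embedded_heun_euler_step(f, t, y, h):
    """One step of the Heun-Euler 2(1) pair in the classical layout.

    All s = 2 stage vectors are held in full, so stage storage is s * n.
    Returns the new approximation, the error estimate, and the stage storage.
    """
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_new = y + 0.5 * h * (k1 + k2)   # order-2 solution
    err = 0.5 * h * (k2 - k1)         # difference to the order-1 (Euler) solution
    stage_storage = k1.size + k2.size
    return y_new, err, stage_storage


if __name__ == "__main__":
    n = 8
    x = np.linspace(0.0, 1.0, n + 2)[1:-1]
    y0 = np.sin(np.pi * x)
    print("access distance:", access_distance(heat_rhs, n))
    y1, err, storage = embedded_heun_euler_step(heat_rhs, 0.0, y0, 1e-4)
    print("stage storage (floats):", storage)
```

For this tridiagonal stencil the probe reports a distance that is independent of n, which is the property a stage-pipelined implementation would rely on: a block of a later stage needs only a bounded neighborhood of the previous stage rather than the whole vector.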