Runge-Kutta methods are widely used for the numerical solution of ordinary differential equations (ODEs), and many scientific libraries provide implementations. Their performance depends not only on the specific application problem to be solved but also on the characteristics of the target machine. On processors with a memory hierarchy, the locality of the data reference pattern has a large impact on the efficiency of a program. In this paper, we describe program transformations for Runge-Kutta methods that result in implementations with improved locality behavior for systems of ODEs. The transformations are based on properties of the solution method but are independent of the specific application problem and of the specific target machine, so the resulting implementation is suitable as a library function. We show that the locality improvement leads to performance gains on different recent microprocessors.
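
To make the idea of a locality-oriented program transformation concrete, the following minimal Python sketch shows one generic loop restructuring, a loop interchange, applied to an explicit Runge-Kutta step. It is an illustration only and is not taken from the paper: the function names, the classical four-stage tableau, and the test problem y' = -y are all assumptions chosen for this example. Both functions compute the same step. The first sweeps over the whole vector once per stage, while the second finishes each vector component in a single visit, which is the kind of access order that tends to favor cache reuse in a compiled implementation.

```python
# Illustrative sketch only: names, tableau, and test problem are assumptions.

# Classical 4-stage Runge-Kutta tableau (assumed here for the example).
RK4_A = [[0.0, 0.0, 0.0, 0.0],
         [0.5, 0.0, 0.0, 0.0],
         [0.0, 0.5, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
RK4_B = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]
RK4_C = [0.0, 0.5, 0.5, 1.0]


def rk_step_stagewise(f, t, y, h, A, b, c):
    """One explicit RK step; vector updates run stage by stage."""
    n, s = len(y), len(b)
    k = []
    for l in range(s):
        w = list(y)
        for i in range(l):            # one full sweep over w per earlier stage
            for j in range(n):
                w[j] += h * A[l][i] * k[i][j]
        k.append(f(t + c[l] * h, w))
    y_new = list(y)
    for l in range(s):                # s full sweeps over y_new
        for j in range(n):
            y_new[j] += h * b[l] * k[l][j]
    return y_new


def rk_step_componentwise(f, t, y, h, A, b, c):
    """Same step with the loops interchanged: each component is finished in one visit."""
    n, s = len(y), len(b)
    k = []
    for l in range(s):
        w = [0.0] * n
        for j in range(n):            # single sweep over w
            acc = 0.0
            for i in range(l):
                acc += A[l][i] * k[i][j]
            w[j] = y[j] + h * acc
        k.append(f(t + c[l] * h, w))
    y_new = [0.0] * n
    for j in range(n):                # single sweep over y_new
        acc = 0.0
        for l in range(s):
            acc += b[l] * k[l][j]
        y_new[j] = y[j] + h * acc
    return y_new


def decay(t, y):
    """Right-hand side of the test system y' = -y (hypothetical example problem)."""
    return [-v for v in y]


y_stage = rk_step_stagewise(decay, 0.0, [1.0, 2.0], 0.1, RK4_A, RK4_B, RK4_C)
y_comp = rk_step_componentwise(decay, 0.0, [1.0, 2.0], 0.1, RK4_A, RK4_B, RK4_C)
print(y_stage, y_comp)
```

The two results agree up to floating-point rounding, since only the order of the additions changes. In pure Python the interpreter overhead hides any cache effect, so the sketch shows the change in access order rather than a measurable speedup.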