Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures

Authors:
Federico Bassetti; Kei Davis; Daniel J. Quinlan
Venue:
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
Year:
1998

Citing (4):
Expression templates. In: C++ gems
Advanced compiler design and implementation
The C++ Programming Language, Third Edition
Overture: An Object-Oriented Framework for Solving Partial Differential Equations. In: ISCOPE '97 Proceedings of the Scientific Computing in Object-Oriented Parallel Environments

Cited by (6):
Just When You Thought Your Little Language Was Safe: "Expression Templates" in Java. In: GCSE '00 Proceedings of the Second International Symposium on Generative and Component-Based Software Engineering-Revised Papers
Cache-Efficient Multigrid Algorithms. In: ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Treating a User-Defined Parallel Library as a Domain-Specific Language. In: IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Performance Optimization of 3D Multigrid on Hierarchical Memory Architectures. In: PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
Smashing: Folding Space to Tile through Time. In: Languages and Compilers for Parallel Computing
Combining performance aspects of irregular Gauss-Seidel via sparse tiling. In: LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing

Abstract

High-performance scientific computing relies increasingly on high-level, large-scale, object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks--a parallel or serial array class library--provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes two optimizing transformations suitable for certain classes of numerical algorithms, one for reducing the cost of inter-processor communication, and one for improving cache utilization; demonstrates and analyzes the resulting performance gains; and indicates how these transformations are being automated.