Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Memory characteristics of iterative methods
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
High-performacne parallel implicit CFD
Parallel Computing - Special issue on parallel computing in aerospace
Automatically tuned linear algebra software
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Cache-Efficient Multigrid Algorithms
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Iterative Algorithms on High Performance Architectures
Euro-Par '97 Proceedings of the Third International Euro-Par Conference on Parallel Processing
Performance Optimization of 3D Multigrid on Hierarchical Memory Architectures
PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
Enhancing the performance of multigrid smoothers in simultaneous multithreading architectures
VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
Hi-index | 0.00 |
Efficient program execution can only be achieved if the codes respect the hierarchical memory design of the underlying architectures; programs must exploit caches to avoid high latencies involved with main memory accesses. However, iterative methods like multigrid are characterized by successive sweeps over data sets, which are commonly too large to fit in cache.This paper is based on our previous work on data access transformations for multigrid methods for constant coefficient problems. However, the case of variable coefficients, which we consider here, requires more complex data structures.We focus on data layout techniques to enhance the cache efficiency of multigrid codes for variable coefficient problems on regular meshes. We provide performance results which illustrate the effectiveness of our layout optimizations in conjunction with data access transformations.