Highly scalable parallel computers, e.g. SCI-coupled workstation clusters, are NUMA architectures. Good static locality is therefore essential for high performance and scalability of parallel programs on these machines. This paper describes novel techniques that optimize static locality at compile time by applying data transformations and data distributions. The metric that guides the optimizations is based on Ehrhart polynomials and allows the amount of static locality to be calculated precisely. Experiments conducted on the SCI-coupled workstation cluster of the PC2 at the University of Paderborn confirm the effectiveness of these techniques.
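To give a feel for how counting lattice points can quantify locality, the following minimal sketch may help. It is not the paper's metric or implementation: the triangular loop nest, the block distribution, and all function names are assumptions chosen for illustration. It counts the iterations of a loop nest by brute force, checks the result against the Ehrhart polynomial of the corresponding triangle, and then counts how many accesses would stay local under a hypothetical block distribution.

```python
def count_iterations(n):
    """Brute-force count of integer points in {(i, j) : 0 <= j <= i <= n}."""
    return sum(1 for i in range(n + 1) for j in range(i + 1))


def ehrhart_triangle(n):
    """Ehrhart polynomial of the triangle with vertices (0,0), (1,0), (1,1),
    evaluated at dilation factor n: (n + 1)(n + 2) / 2."""
    return (n + 1) * (n + 2) // 2


def local_accesses(n, block):
    """Count iterations (i, j) in which element j lives on the same node as
    iteration i, assuming (hypothetically) that index k is owned by node
    k // block."""
    return sum(
        1
        for i in range(n + 1)
        for j in range(i + 1)
        if i // block == j // block
    )


def local_accesses_closed_form(n, block):
    """Closed form for the count above, valid only when block divides n + 1:
    each of the (n + 1) / block blocks contributes block * (block + 1) / 2
    local pairs."""
    assert (n + 1) % block == 0
    return (n + 1) * (block + 1) // 2


if __name__ == "__main__":
    n, block = 7, 4
    total = count_iterations(n)
    local = local_accesses(n, block)
    print(total, ehrhart_triangle(n))                       # 36 36
    print(local, local_accesses_closed_form(n, block))      # 20 20
    print(f"local fraction: {local / total:.3f}")
```

The point of the closed forms is that a compiler can evaluate them symbolically in the size parameter instead of enumerating iterations, which is what makes a precise locality count usable at compile time. In general such counts are quasi-polynomials rather than plain polynomials, which is why the closed form here is restricted to block sizes that divide n + 1.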