An integer linear programming approach for optimizing cache locality
ICS '99 Proceedings of the 13th international conference on Supercomputing
A compiler technique for improving whole-program locality
POPL '01 Proceedings of the 28th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Compiler-directed selection of dynamic memory layouts
Proceedings of the ninth international symposium on Hardware/software codesign
An empirical evaluation of high level transformations for embedded processors
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Static and Dynamic Locality Optimizations Using Integer Linear Programming
IEEE Transactions on Parallel and Distributed Systems
Data Relation Vectors: A New Abstraction for Data Optimizations
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Efficient Representation Scheme for Multidimensional Array Operations
IEEE Transactions on Computers
A Layout-Conscious Iteration Space Transformation Technique
IEEE Transactions on Computers
Array recovery and high-level transformations for DSP applications
ACM Transactions on Embedded Computing Systems (TECS)
Influence of Array Allocation Mechanisms on Memory System Energy
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Compiler Transformation of Pointers to Explicit Array Accesses in DSP Applications
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Influence of Loop Optimizations on Energy Consumption of Multi-bank Memory Systems
CC '02 Proceedings of the 11th International Conference on Compiler Construction
MARS: A Distributed Memory Approach to Shared Memory Compilation
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Array Unification: A Locality Optimization Technique
CC '01 Proceedings of the 10th International Conference on Compiler Construction
A compiler approach for reducing data cache energy
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Strategies for Improving Data Locality in Embedded Applications
ASP-DAC '02 Proceedings of the 2002 Asia and South Pacific Design Automation Conference
IEEE Transactions on Parallel and Distributed Systems
Exploiting bank locality in multi-bank memories
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Array Regrouping and Its Use in Compiling Data-Intensive Embedded Applications
IEEE Transactions on Computers
Impact of Data Transformations on Memory Bank Locality
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Quasidynamic Layout Optimizations for Improving Data Locality
IEEE Transactions on Parallel and Distributed Systems
New Complexity Results on Array Contraction and Related Problems
Journal of VLSI Signal Processing Systems
Improving whole-program locality using intra-procedural and inter-procedural transformations
Journal of Parallel and Distributed Computing
Reducing data cache leakage energy using a compiler-based approach
ACM Transactions on Embedded Computing Systems (TECS)
Multi-compilation: capturing interactions among concurrently-executing applications
Proceedings of the 3rd conference on Computing frontiers
Integrating loop and data optimizations for locality within a constraint network based framework
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Reducing energy consumption of multiprocessor SoC architectures by exploiting memory bank locality
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Cache miss clustering for banked memory systems
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
A memory-conscious code parallelization scheme
Proceedings of the 44th annual Design Automation Conference
Using Padding to Optimize Locality in Scientific Applications
ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
Applying data copy to improve memory performance of general array computations
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.01 |
This paper is concerned with integrating global data transformations and local loop transformations in order to minimise overhead on distributed shared memory machines such as the SGi Origin 2000. By first developing an extended algebraic transformation framework, a new technique to allow the static application of global data transformations, such as partitioning, to reshaped arrays is presented, eliminating the need for expensive temporary copies and hence eliminating any communication and synchronisation. In addition, by integrating loop and data transformations, any introduced poor spatial locality and expensive array subscripts can be eliminated. A specific optimisation algorithm is derived and applied to well-known benchmarks, where it is shown to give a significant improvement in execution time over existing approaches