Microprocessors & Microsystems
Compiler-directed locality optimization techniques are effective in reducing the number of cycles spent in off-chip memory accesses. Recently, methods have been developed that transform the memory layouts of data structures at compile time to improve the spatial locality of nested loops beyond what current control-centric (loop nest-based) optimizations achieve. Most of these data-centric transformations use a single static (program-wide) memory layout for each array. A disadvantage of these static layout-based locality enhancement strategies is that they may fail to optimize codes whose arrays demand different layouts in different parts of the code. In this paper, we introduce a new approach that extends current static layout optimization techniques by associating different memory layouts with the same array in different parts of the code. We call this strategy "quasidynamic layout optimization." In this strategy, the compiler determines the memory layouts (for different parts of the code) at compile time, but the layout conversions occur at runtime. We show that the possibility of dynamically changing memory layouts during the course of execution adds a new dimension to the data locality optimization problem. Our strategy employs a static layout optimizer module as a building block and, by invoking it repeatedly for different parts of the code, checks whether runtime layout modifications bring additional benefits beyond static optimization. Our experiments indicate significant improvements in execution time over static layout-based locality-enhancing techniques.
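To make the trade-off concrete, the following is a minimal sketch of the idea, not the paper's algorithm. It assumes a two-dimensional array with only two candidate layouts (row-major and column-major), a program split into phases that each sweep the array along rows or along columns, and a toy miss-count cost model. All names (`plan_layouts`, `phase_cost`, `convert`, the `line` parameter, and the conversion cost passed in by the caller) are hypothetical and chosen for illustration only. The planner picks one layout per phase ahead of time, charging a conversion cost whenever consecutive phases use different layouts; `convert` stands in for the copy that would run at runtime.

```python
ROW, COL = "row-major", "column-major"

def index(i, j, n, m, layout):
    """Linear offset of element (i, j) of an n x m array in the given layout."""
    return i * m + j if layout == ROW else j * n + i

def to_flat(matrix, layout):
    """Linearize a list-of-lists matrix in the given layout."""
    n, m = len(matrix), len(matrix[0])
    out = [None] * (n * m)
    for i in range(n):
        for j in range(m):
            out[index(i, j, n, m, layout)] = matrix[i][j]
    return out

def convert(flat, n, m, src, dst):
    """Runtime layout conversion: copy every element to its new offset."""
    out = [None] * (n * m)
    for i in range(n):
        for j in range(m):
            out[index(i, j, n, m, dst)] = flat[index(i, j, n, m, src)]
    return out

def stride(access, n, m, layout):
    """Offset gap between consecutive inner-loop accesses.
    access == "row": inner loop varies j; access == "col": inner loop varies i."""
    if access == "row":
        return index(0, 1, n, m, layout) - index(0, 0, n, m, layout)
    return index(1, 0, n, m, layout) - index(0, 0, n, m, layout)

def phase_cost(access, sweeps, n, m, layout, line=8):
    """Toy model (assumption): unit-stride access misses once per cache line
    of `line` elements; any larger stride misses on every access."""
    per_access = 1.0 / line if stride(access, n, m, layout) == 1 else 1.0
    return sweeps * n * m * per_access

def plan_layouts(phases, n, m, convert_cost):
    """Pick a layout per phase by dynamic programming over (phase, layout).
    phases is a list of (access, sweeps). Returns (total_cost, layouts)."""
    layouts = (ROW, COL)
    access, sweeps = phases[0]
    best = {l: (phase_cost(access, sweeps, n, m, l), [l]) for l in layouts}
    for access, sweeps in phases[1:]:
        nxt = {}
        for l in layouts:
            candidates = []
            for prev in layouts:
                cost, path = best[prev]
                switch = 0 if prev == l else convert_cost
                total = cost + switch + phase_cost(access, sweeps, n, m, l)
                candidates.append((total, path + [l]))
            nxt[l] = min(candidates, key=lambda t: t[0])
        best = nxt
    return min(best.values(), key=lambda t: t[0])
```

Under this toy model, a conversion is scheduled only when the phase that follows reuses the array often enough to amortize the copy; otherwise the plan degenerates to a single static layout, which mirrors the check for "additional benefits beyond static optimization" described in the abstract.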