Data locality has long been a central theme in compiler optimization. Most prior compiler techniques optimize data locality in a one-dimensional linear address space. However, in many problems the domain for data locality has two or more dimensions. For example, in a 2D mesh network, each node is connected to its four neighbors, so from a given processor's viewpoint data locality can potentially be exploited in two dimensions. Consequently, maximizing the share of communication that goes to one of the four neighbors, rather than to more distant nodes, helps improve performance. Similar examples can be found in embedded sensor processing and 3D systems. In this setting, we make two specific contributions. First, we show how the array data of a loop-intensive application can be mapped onto a 2D mesh so that the communication distances between nodes are reduced. Second, we discuss how code restructuring through loop transformations can help achieve better data locality in the 2D space.
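To make the first contribution concrete, the following minimal sketch compares two ways of placing an n x n array on a p x p mesh and sums the hop (Manhattan) distance of every remote read in a four-neighbor stencil sweep. It is an illustration only, not the mapping algorithm of the paper: the function names, the two mappings, and the stencil access pattern are all assumptions chosen for the example.

```python
# Illustrative sketch only; names and mappings are assumptions, not the paper's method.

def strip_owner(i, j, n, p):
    """1D-style mapping: rows are cut into p*p strips; strip k goes to node (k // p, k % p)."""
    strip = i * (p * p) // n
    return (strip // p, strip % p)

def block_owner(i, j, n, p):
    """2D block mapping: element (i, j) goes to mesh node (i*p // n, j*p // n)."""
    return (i * p // n, j * p // n)

def manhattan(a, b):
    """Hop count between two mesh nodes given as (row, col) pairs."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def stencil_hops(owner, n, p):
    """Total hops when every element reads its four in-bounds array neighbors."""
    total = 0
    for i in range(n):
        for j in range(n):
            here = owner(i, j, n, p)
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n and 0 <= nj < n:
                    total += manhattan(here, owner(ni, nj, n, p))
    return total

if __name__ == "__main__":
    n, p = 16, 4  # hypothetical array and mesh sizes
    print("strip mapping hops:", stencil_hops(strip_owner, n, p))
    print("block mapping hops:", stencil_hops(block_owner, n, p))
```

Under these assumptions, the block mapping keeps every remote read within one hop, whereas the strip mapping occasionally sends a read from one edge of the mesh to the other, which is the kind of distance the abstract aims to reduce.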