Data optimization: allocation of arrays to reduce communication on SIMD machines
Journal of Parallel and Distributed Computing - Massively parallel computation
Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
Automatic data mapping for distributed-memory parallel computers
ICS '92 Proceedings of the 6th international conference on Supercomputing
Automatic array alignment in data-parallel programs
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Automatic data partitioning on distributed memory multicomputers
Automatic data partitioning on distributed memory multicomputers
Mobile and replicated alignment of arrays in data-parallel programs
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
An optimizing Fortran D compiler for MIMD distributed-memory machines
An optimizing Fortran D compiler for MIMD distributed-memory machines
Compiling Communication-Efficient Programs for Massively Parallel Machines
IEEE Transactions on Parallel and Distributed Systems
Compile-Time Techniques for Data Distribution in Distributed Memory Machines
IEEE Transactions on Parallel and Distributed Systems
Communication-Free Hyperplane Partitioning of Nested Loops
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Array Distribution in Data-Parallel Programs
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Detecting and Using Affinity in an Automatic Data Distribution Tool
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Automatic Data Layout Using 0-1 Integer Programming
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
IEEE Transactions on Parallel and Distributed Systems
An integer linear programming approach for optimizing cache locality
ICS '99 Proceedings of the 13th international conference on Supercomputing
Deriving Array Distributions by Optimization Techniques
The Journal of Supercomputing
A compiler technique for improving whole-program locality
POPL '01 Proceedings of the 28th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Dynamic data distribution with control flow analysis
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Static and Dynamic Locality Optimizations Using Integer Linear Programming
IEEE Transactions on Parallel and Distributed Systems
Data Relation Vectors: A New Abstraction for Data Optimizations
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
An Advanced Compiler Framework for Non-Cache-Coherent Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
ICPP '97 Proceedings of the international Conference on Parallel Processing
Quasidynamic Layout Optimizations for Improving Data Locality
IEEE Transactions on Parallel and Distributed Systems
Improving whole-program locality using intra-procedural and inter-procedural transformations
Journal of Parallel and Distributed Computing
Using data replication to reduce communication energy on chip multiprocessors
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
A data layout optimization framework for NUCA-based multicores
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, in order to get efficiency from the system. The cost of accessing local and remote data can be one or several orders of magnitude different, and this can dramatically affect performance. In this paper, we present a novel approach to automatically perform static data distribution. All the constraints related to parallelism and data movement are contained in a single data structure, the Communication-Parallelism Graph (CPG). The problem is solved using a linear 0-1 integer programming model and solver. In this paper we present the solution for one-dimensional array distributions, although its extension to multi-dimensional array distributions is also outlined. The solution is static in the sense that the layout of the arrays does not change during the execution of the program. We also show the feasibility of using this approach to solve the problem in terms of compilation time and quality of the solutions generated.