This paper describes the design of a data distribution tool that automatically derives the data mapping for the arrays and the parallelization strategy for the loops in a Fortran 77 program. The generated layout can be static or dynamic, and the distribution is one-dimensional, either BLOCK or CYCLIC. The tool takes the control-flow statements in the code into account in order to better estimate the behavior of the program. All the information regarding data movement and parallelism is contained in a single data structure named the Communication-Parallelism Graph (CPG). The CPG is used to model a minimal path problem in which execution time is the objective function to minimize. This problem is solved with a general-purpose linear programming solver, which finds the optimal solution for the whole problem. The experimental results illustrate the quality of the solutions generated and the feasibility of the approach in terms of compilation time.
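
The minimal path idea can be illustrated with a small sketch. It is not the tool's CPG or its linear programming formulation: it assumes a simplified layered graph in which each program phase has one node per candidate layout (BLOCK or CYCLIC), node weights are estimated phase times, and an edge between consecutive phases carries a remapping cost when the layout changes. The phase costs, the `REMAP_COST` constant, and the function names are all hypothetical, and the path is found by plain dynamic programming rather than by a linear programming solver.

```python
from typing import Dict, List, Tuple

LAYOUTS = ["BLOCK", "CYCLIC"]

# Hypothetical estimated time of each phase under each layout (arbitrary units).
PHASE_COST: List[Dict[str, float]] = [
    {"BLOCK": 4.0, "CYCLIC": 9.0},
    {"BLOCK": 8.0, "CYCLIC": 3.0},
    {"BLOCK": 5.0, "CYCLIC": 6.0},
]

# Hypothetical cost of redistributing the array between two phases.
REMAP_COST = 2.0


def remap_cost(src: str, dst: str) -> float:
    """Edge weight: free if the layout is kept, REMAP_COST if it changes."""
    return 0.0 if src == dst else REMAP_COST


def cheapest_layout_sequence(
    phase_cost: List[Dict[str, float]],
) -> Tuple[float, List[str]]:
    """Return (total time, layout per phase) of the minimal path."""
    # best[layout] = cheapest (time, path) that ends the current phase in `layout`.
    best = {lay: (phase_cost[0][lay], [lay]) for lay in LAYOUTS}
    for costs in phase_cost[1:]:
        nxt = {}
        for dst in LAYOUTS:
            # Pick the cheapest predecessor, including the remapping edge.
            total, path = min(
                ((best[src][0] + remap_cost(src, dst), best[src][1])
                 for src in LAYOUTS),
                key=lambda t: t[0],
            )
            nxt[dst] = (total + costs[dst], path + [dst])
        best = nxt
    return min(best.values(), key=lambda t: t[0])


if __name__ == "__main__":
    print(cheapest_layout_sequence(PHASE_COST))
```

With these made-up costs, keeping BLOCK throughout takes 17 units and keeping CYCLIC throughout takes 18, while the minimal path switches layout once (BLOCK, then CYCLIC, then CYCLIC) for a total of 15. This is the kind of trade-off between static and dynamic layouts that a time-minimizing path formulation captures.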