High Performance Fortran (HPF) is rapidly gaining acceptance as a language for parallel programming. The goal of HPF is to provide a simple yet efficient machine-independent parallel programming model. Besides algorithm selection, the choice of data layout is the key intellectual step in writing an efficient HPF program. The developers of HPF did not believe that data layouts can be determined automatically in all cases. Therefore, HPF requires the user to specify the data layout, and it is the task of the HPF compiler to generate efficient code for the user-supplied data layout. The choice of a good data layout depends on the HPF compiler used, the target architecture, the problem size, and the number of available processors. Allowing remapping of arrays at specific points in the program makes the selection of an efficient data layout even harder. Although finding an efficient data layout fully automatically may not be possible in all cases, HPF users will need support during the data layout selection process. In particular, this support is necessary if the user is not familiar with the characteristics of the target HPF compiler and target architecture, or even with HPF itself. Therefore, tools for automatic data layout and performance estimation will be crucial if HPF is to find general acceptance in the scientific community. This paper discusses a framework for automatic data layout for use in a data layout assistant tool for a data-parallel language such as HPF. The envisioned tool can be used to generate a first data layout for a sequential Fortran program without data layout statements, or to extend a partially specified data layout in an HPF program to a totally specified data layout. Since the data layout assistant is not embedded in a compiler and will run only a few times during the tuning process of an application program, the framework can use techniques that may be too computationally expensive to be included in a compiler.
A prototype data layout assistant tool based on our framework has been implemented as part of the D system currently under development at Rice University. The paper reports preliminary experimental results. The results indicate that the framework is efficient and generates data layouts of high quality.
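To make the notion of a data layout concrete, the following minimal Python sketch models the two standard HPF distribution formats, BLOCK and CYCLIC, for a one-dimensional array, and counts off-processor reads for a simple three-point stencil under the owner-computes rule. The function names, the stencil, and the cost measure are illustrative assumptions; they are not the framework or the performance estimator described in the paper.

```python
from math import ceil


def block_owner(i, n, p):
    """Processor owning element i (0-based) of an n-element array
    distributed BLOCK over p processors (block size = ceil(n / p))."""
    block_size = ceil(n / p)
    return i // block_size


def cyclic_owner(i, n, p):
    """Processor owning element i under a CYCLIC distribution
    (elements are dealt out round-robin); n is unused but kept
    so both functions share one signature."""
    return i % p


def remote_references(owner, n, p):
    """Count off-processor reads for a(i) = b(i-1) + b(i+1),
    assuming a and b share the same layout and the owner of a(i)
    performs the assignment (owner-computes). Hypothetical cost
    measure used only for illustration."""
    remote = 0
    for i in range(1, n - 1):
        home = owner(i, n, p)
        for j in (i - 1, i + 1):
            if owner(j, n, p) != home:
                remote += 1
    return remote


if __name__ == "__main__":
    n, p = 10, 4
    print("BLOCK owners: ", [block_owner(i, n, p) for i in range(n)])
    print("CYCLIC owners:", [cyclic_owner(i, n, p) for i in range(n)])
    print("remote reads, BLOCK: ", remote_references(block_owner, n, p))
    print("remote reads, CYCLIC:", remote_references(cyclic_owner, n, p))
```

For this nearest-neighbor access pattern the BLOCK layout needs far fewer remote reads than CYCLIC, which illustrates why the layout choice depends on the program's access patterns as well as on the problem size and processor count.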