This paper presents a framework, based on global array data-flow analysis, to reduce communication costs in a program being compiled for a distributed memory machine. We introduce the available section descriptor, a novel representation of communication involving array sections. This representation allows us to apply techniques for partial redundancy elimination to obtain powerful communication optimizations. With a single framework, we are able to capture optimizations such as (1) vectorizing communication, (2) eliminating communication that is redundant on any control flow path, (3) reducing the amount of data being communicated, (4) reducing the number of processors to which data must be communicated, and (5) moving communication earlier to hide latency and to subsume previous communication. We show that the bidirectional problem of eliminating partial redundancies can be decomposed into simpler unidirectional problems even in the context of an array section representation, which makes the analysis procedure more efficient. Results from a preliminary implementation of this framework are encouraging and demonstrate the effectiveness of this analysis in improving the performance of programs.
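As a rough illustration of how array-section availability can expose redundant communication, the sketch below models a one-dimensional, unit-stride section and checks whether a needed section is already available at a control flow join. It is not the paper's available section descriptor: the names `Section`, `meet`, and `is_redundant` are hypothetical, the processor-mapping component of the descriptor is omitted, and writes that would invalidate available data are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """Hypothetical 1-D array section A[lo:hi], bounds inclusive, stride 1."""
    array: str
    lo: int
    hi: int

    def covers(self, other):
        # True when every element of `other` lies inside this section.
        return (self.array == other.array
                and self.lo <= other.lo
                and other.hi <= self.hi)

    def intersect(self, other):
        # Overlap of two sections of the same array, or None if disjoint.
        if self.array != other.array:
            return None
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Section(self.array, lo, hi) if lo <= hi else None


def meet(avail_a, avail_b):
    """Sections available on both incoming paths (pairwise intersection)."""
    result = set()
    for s in avail_a:
        for t in avail_b:
            common = s.intersect(t)
            if common is not None:
                result.add(common)
    return result


def is_redundant(needed, available):
    """A communication is redundant if one available section covers it."""
    return any(s.covers(needed) for s in available)


# Assumed scenario: the 'then' branch has received A[1:100],
# the 'else' branch has received A[50:150].
then_avail = {Section("A", 1, 100)}
else_avail = {Section("A", 50, 150)}
avail_at_join = meet(then_avail, else_avail)

fully_covered = is_redundant(Section("A", 60, 90), avail_at_join)
only_partly_covered = is_redundant(Section("A", 1, 60), avail_at_join)
```

In this scenario only `A[50:100]` is available on both paths, so a later fetch of `A[60:90]` can be dropped, while a fetch of `A[1:60]` is redundant on only one path. The second case is the kind that partial redundancy elimination targets, by placing communication so that it becomes fully redundant.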