In this paper, we describe our experience using an abstract integer-set framework to develop the Rice dHPF compiler, a compiler for High Performance Fortran (HPF). We present simple yet general formulations of the major computation partitioning and communication analysis tasks, as well as a number of important optimizations, in terms of abstract operations on sets of integer tuples. This approach has made it possible to implement a comprehensive collection of advanced optimizations in dHPF, and to do so in the context of a more general computation partitioning model than that of previous compilers. One potential limitation of the approach is that the underlying class of integer-set problems is fundamentally unable to represent HPF data distributions on a symbolic number of processors. We describe how we extend the approach to compile codes for a symbolic number of processors without requiring any changes to the set formulations for the above optimizations. We show experimentally that the set representation is not a dominant factor in compile times for either small or large codes. Finally, we present preliminary performance measurements showing that the generated code achieves good speedups on a few benchmarks. Overall, we believe we are the first to demonstrate through implementation experience that it is practical to build a compiler for HPF using a general and powerful integer-set framework.
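To make the idea of formulating partitioning and communication analysis as operations on sets of integer tuples more concrete, here is a minimal sketch in Python. It is not the dHPF implementation: it enumerates small explicit sets where a compiler would manipulate symbolic ones, it assumes a one-dimensional BLOCK distribution and the owner-computes rule for the loop `do i = 2, N: A(i) = B(i-1)`, and every name in it (`block_owned`, `iterations_on`, `nonlocal_reads`, `comm_sets`) is hypothetical.

```python
from math import ceil


def block_owned(n, nprocs, p):
    """Indices 1..n owned by processor p (0-based) under an assumed BLOCK distribution."""
    block = ceil(n / nprocs)
    return {i for i in range(1, n + 1) if (i - 1) // block == p}


def iterations_on(loop_iters, lhs_ref, owned):
    """Owner-computes rule: keep the iterations whose written element is local."""
    return {i for i in loop_iters if lhs_ref(i) in owned}


def nonlocal_reads(iters, rhs_ref, owned):
    """Elements read by the local iterations that the processor does not own."""
    return {rhs_ref(i) for i in iters} - owned


def comm_sets(n, nprocs):
    """For each processor p, map each sender q to the sorted elements p must receive."""
    loop = set(range(2, n + 1))      # do i = 2, n
    lhs = lambda i: i                # written reference A(i)
    rhs = lambda i: i - 1            # read reference B(i-1)
    owned = [block_owned(n, nprocs, p) for p in range(nprocs)]
    result = {}
    for p in range(nprocs):
        iters = iterations_on(loop, lhs, owned[p])
        needed = nonlocal_reads(iters, rhs, owned[p])
        # Intersect the needed elements with each other processor's ownership set.
        result[p] = {
            q: sorted(needed & owned[q])
            for q in range(nprocs)
            if q != p and needed & owned[q]
        }
    return result


if __name__ == "__main__":
    for p, recv in comm_sets(16, 4).items():
        print(f"processor {p} receives: {recv}")
```

Each analysis step here is a plain set operation (restriction, image under a reference, difference, intersection), which is the property the set-based formulation relies on. Explicit enumeration only works when the array size and processor count are known constants, so this sketch does not address the symbolic-processor-count case discussed above.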