This paper addresses the problem of compiling concurrent loop nests in the presence of complicated array references and irregularly distributed arrays. Arrays accessed within loops may be referenced in ways that make it impossible to determine the reference pattern precisely at compile time. This paper proposes a runtime support mechanism that a compiler can use to generate efficient code in these situations. The compiler accepts as input a Fortran 77 program enhanced with specifications for distributing data, and outputs a message-passing program that runs on the nodes of a distributed-memory machine. The runtime support for the compiler consists of a library of primitives designed to support irregular patterns of distributed array accesses and irregularly distributed array partitions. A variety of performance results on the Intel iPSC/860 are presented.
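To make the idea of runtime support for irregular accesses concrete, the following is a minimal sketch of one common way to handle a loop such as `y(i) = x(ia(i))` when `x` is irregularly distributed: scan the indirection array once at run time to build a communication schedule, then reuse that schedule to gather off-processor values before the loop body runs. This is an illustration under stated assumptions, not the library described in the abstract. All names (`build_translation_table`, `inspector`, `gather`, `run_example`) are hypothetical, the two "processes" are simulated inside a single Python process, and plain list copies stand in for messages.

```python
from collections import defaultdict

def build_translation_table(owner_of):
    """Map each global index to (owner, local offset) for an irregular distribution."""
    next_offset = defaultdict(int)
    table = []
    for owner in owner_of:
        table.append((owner, next_offset[owner]))
        next_offset[owner] += 1
    return table

def inspector(me, global_refs, table):
    """Scan the indirection array once (hypothetical preprocessing step).

    Returns the references rewritten as indices into a local buffer,
    a schedule {owner: [(remote offset, ghost slot), ...]}, and the buffer size.
    """
    n_local = sum(1 for owner, _ in table if owner == me)
    ghost_slot = {}                 # global index -> slot after the owned data
    schedule = defaultdict(list)
    local_refs = []
    for g in global_refs:
        owner, offset = table[g]
        if owner == me:
            local_refs.append(offset)
            continue
        if g not in ghost_slot:     # fetch each remote element only once
            ghost_slot[g] = n_local + len(ghost_slot)
            schedule[owner].append((offset, ghost_slot[g]))
        local_refs.append(ghost_slot[g])
    return local_refs, dict(schedule), n_local + len(ghost_slot)

def gather(me, schedule, local_x, buffer_size):
    """Copy owned data, then fill ghost slots; each inner loop stands in for one message."""
    owned = local_x[me]
    buf = list(owned) + [None] * (buffer_size - len(owned))
    for owner, pairs in schedule.items():
        for offset, slot in pairs:
            buf[slot] = local_x[owner][offset]
    return buf

def run_example():
    x = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
    owner_of = [0, 1, 0, 1, 1, 0]   # assumed irregular distribution over 2 processes
    table = build_translation_table(owner_of)
    local_x = defaultdict(list)
    for g, (owner, _) in enumerate(table):
        local_x[owner].append(x[g])
    # Indirection array for the iterations assigned to each process.
    ia = {0: [1, 0, 4, 1], 1: [5, 3, 2, 5]}
    result = {}
    for me in (0, 1):
        refs, schedule, size = inspector(me, ia[me], table)
        buf = gather(me, schedule, local_x, size)
        result[me] = [buf[r] for r in refs]   # loop body now uses only local indices
    return result

if __name__ == "__main__":
    print(run_example())
```

The design point the sketch tries to show is that the cost of analysing the indirection array is paid once; if the same loop executes many times with an unchanged `ia`, the schedule can be reused for every gather.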