Current compilers for distributed-memory multiprocessors parallelize irregular reductions either by generating calls to sophisticated run-time systems (CHAOS) or by relying on replicated buffers and the shared-memory interface supported by software DSMs (TreadMarks). We introduce LocalWrite, a new technique for parallelizing irregular reductions based on the owner-computes rule. It eliminates the need for buffers or synchronized writes, but may replicate computation. We investigate how connectivity (node/edge ratio), locality (accesses to local data) and adaptivity (edge modifications) affect the relative performance of these techniques. LocalWrite improves performance by 50-150% compared to using replicated buffers, and can match or exceed gather/scatter for applications with low locality or high adaptivity.
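The contrast between replicated buffers and an owner-computes scheme can be illustrated with a small sketch. It is not the paper's implementation: the processors are simulated sequentially, and the block partition in `owner`, the per-edge function `edge_weight`, and the function names are all hypothetical, chosen only for illustration. It shows the trade-off the abstract describes: the owner-computes variant writes only to nodes a processor owns, so it needs no private buffers or combine step, but an edge whose endpoints have different owners is evaluated once per owner.

```python
def owner(node, num_nodes, num_procs):
    """Hypothetical block partition: which processor owns a node."""
    block = (num_nodes + num_procs - 1) // num_procs
    return node // block


def edge_weight(u, v):
    """Stand-in for the real per-edge computation (assumption)."""
    return (u + 1) * (v + 1)


def replicated_buffers(edges, num_nodes, num_procs):
    """Each processor accumulates into a private full-size buffer;
    the buffers are then combined in a global reduction."""
    buffers = [[0] * num_nodes for _ in range(num_procs)]
    for i, (u, v) in enumerate(edges):
        p = i % num_procs              # edges dealt out round-robin
        w = edge_weight(u, v)
        buffers[p][u] += w
        buffers[p][v] -= w
    # Combine step: sum the private buffers element by element.
    return [sum(buf[n] for buf in buffers) for n in range(num_nodes)]


def local_write(edges, num_nodes, num_procs):
    """Owner-computes sketch: a processor evaluates every edge that
    touches a node it owns and writes only to its own nodes."""
    result = [0] * num_nodes
    evaluations = 0                    # counts replicated work
    for p in range(num_procs):         # simulated processors
        for u, v in edges:
            owns_u = owner(u, num_nodes, num_procs) == p
            owns_v = owner(v, num_nodes, num_procs) == p
            if not (owns_u or owns_v):
                continue               # edge is irrelevant to p
            w = edge_weight(u, v)
            evaluations += 1
            if owns_u:
                result[u] += w         # local write only
            if owns_v:
                result[v] -= w         # local write only
    return result, evaluations


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    result, evaluations = local_write(edges, num_nodes=4, num_procs=2)
    print(result == replicated_buffers(edges, 4, 2))
    print(evaluations, "edge evaluations for", len(edges), "edges")
```

In this toy mesh, three of the five edges cross the partition boundary, so the owner-computes version performs eight edge evaluations instead of five. How often edges cross partitions is what the locality and connectivity factors in the abstract capture.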