Multipartitioning is a strategy for parallelizing computations that require solving 1D recurrences along each dimension of a multi-dimensional array. Previous techniques for multipartitioning yield efficient parallelizations over 3D domains only when the number of processors is a perfect square. This paper considers the general problem of computing multipartitionings for d-dimensional data volumes on an arbitrary number of processors. We describe an algorithm that computes an optimal multipartitioning onto all of the processors for this general case. Finally, we describe how we extended the Rice dHPF compiler for High Performance Fortran to generate code that exploits generalized multipartitioning and show that the compiler's generated code for the NAS SP computational fluid dynamics benchmark achieves scalable high performance.
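To make the balance requirement concrete, the sketch below builds a diagonal multipartitioning for the restricted 3D case the abstract attributes to earlier techniques, where the processor count p is a perfect square. It is an illustration only, not the generalized algorithm the paper describes. The function names and the specific mapping formula are assumptions chosen for this example. The domain is cut into q x q x q tiles with q = sqrt(p), and the check confirms that every slab of tiles perpendicular to any axis contains exactly one tile per processor, so all processors stay busy during a line sweep along any dimension.

```python
from collections import Counter
from itertools import product


def diagonal_multipartition(q):
    """Assign each tile (i, j, k) of a q x q x q tiling to one of p = q*q processors.

    Assumed illustrative mapping: shift the (i, j) coordinates by k, modulo q,
    so that tiles owned by one processor lie along a wrapped diagonal.
    """
    return {
        (i, j, k): ((i - k) % q) * q + ((j - k) % q)
        for i, j, k in product(range(q), repeat=3)
    }


def is_balanced(owner, q):
    """Return True if every slab along every axis holds exactly one tile per processor."""
    p = q * q
    for axis in range(3):
        for s in range(q):
            counts = Counter(
                proc for tile, proc in owner.items() if tile[axis] == s
            )
            if len(counts) != p or any(c != 1 for c in counts.values()):
                return False
    return True


q = 3  # 9 processors, 27 tiles
owner = diagonal_multipartition(q)
print(is_balanced(owner, q))
```

By contrast, a plain block mapping such as `owner[(i, j, k)] = i * q + j` fails the same check, because a slab with fixed `i` touches only q of the p processors and the rest sit idle while that slab is swept.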