In this article we investigate the trade-off between time and space efficiency in scheduling and executing parallel irregular computations on distributed-memory machines. We employ acyclic task dependence graphs to model irregular parallelism with mixed granularity, and we use direct remote memory access to support fast communication. We propose new scheduling techniques and a run-time active memory management scheme to improve memory utilization while retaining good time efficiency, and we provide a theoretical analysis of correctness and performance. This work is implemented in the context of the RAPID system, which uses an inspector/executor approach to parallelize irregular computations at run-time. We demonstrate the effectiveness of the proposed techniques on several irregular applications, such as sparse matrix code and the fast multipole method for particle simulation. Our experimental results on the Cray T3E show that problems of large sizes can be solved under limited space capacity, and that the loss of execution efficiency caused by the extra memory management overhead is reasonable.
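To make the time/space trade-off concrete, the following is a minimal single-processor sketch of how the execution order of a task dependence graph affects peak memory use. It is not the RAPID scheduler or its memory management scheme. The function name `peak_memory`, the example graph, and the object sizes are all hypothetical, and the model assumes each task produces one data object that can be freed once every task reading it has finished.

```python
from collections import defaultdict


def peak_memory(order, deps, size):
    """Return the peak memory used when tasks run in the given order.

    Assumed model (illustrative only): each task allocates one output
    object of size[task]; an object is freed as soon as all tasks that
    read it have completed. Outputs with no readers stay resident.
    """
    # Count how many consumer tasks still need each producer's output.
    remaining = defaultdict(int)
    for task, parents in deps.items():
        for parent in parents:
            remaining[parent] += 1

    done = set()
    in_use = 0
    peak = 0
    for task in order:
        parents = deps.get(task, ())
        # The order must respect the dependence edges of the DAG.
        if not all(p in done for p in parents):
            raise ValueError(f"task {task!r} scheduled before its inputs")
        in_use += size[task]          # allocate this task's output
        peak = max(peak, in_use)
        done.add(task)
        for parent in parents:        # release inputs no longer needed
            remaining[parent] -= 1
            if remaining[parent] == 0:
                in_use -= size[parent]
    return peak


# Hypothetical DAG: c reads a, d reads b, e reads c and d.
deps = {"a": [], "b": [], "c": ["a"], "d": ["b"], "e": ["c", "d"]}
size = {"a": 4, "b": 4, "c": 1, "d": 1, "e": 1}

breadth_first = ["a", "b", "c", "d", "e"]  # both large objects live at once
depth_first = ["a", "c", "b", "d", "e"]    # frees a before allocating b

print(peak_memory(breadth_first, deps, size))  # 9
print(peak_memory(depth_first, deps, size))    # 6
```

Both orders are valid schedules of the same graph, yet the second needs less memory at its peak. On a parallel machine the memory-frugal order may expose less parallelism, which is the tension the abstract describes.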