Effective data distribution and parallelization of computations involving irregular data structures are challenging tasks. We address these twin problems in the context of computations involving block-sparse matrices. Our programming model provides a global view of a distributed block-sparse matrix, together with abstractions that let the user express the parallel tasks in the computation. The tasks are mapped onto processors to ensure load balance and locality. The abstractions are based on the Aggregate Remote Memory Copy Interface (ARMCI) and are interoperable with the Global Arrays programming suite and MPI. Results are presented that demonstrate the utility of the approach.
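To make the idea of mapping tasks onto processors for load balance more concrete, the following is a minimal sketch, not the method used in the paper. It assumes each task corresponds to one nonzero block of a block-sparse matrix and carries an estimated cost, and it uses a simple greedy longest-processing-time heuristic. The function name `greedy_map`, the task keys, and the cost values are all hypothetical, and locality is ignored here for brevity.

```python
import heapq


def greedy_map(tasks, num_procs):
    """Assign tasks to processors with a greedy longest-processing-time rule.

    tasks: dict mapping a hypothetical block index (row, col) to an
           estimated cost (for example, a flop count for that block).
    Returns (assignment, loads), where assignment maps each block index
    to a processor id and loads[p] is the total cost given to processor p.
    This balances load only; it does not model data locality.
    """
    # Min-heap of (current load, processor id): the least-loaded processor pops first.
    heap = [(0, p) for p in range(num_procs)]
    heapq.heapify(heap)

    assignment = {}
    # Place the most expensive tasks first so cheaper tasks can even out the loads.
    for block, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(heap)
        assignment[block] = proc
        heapq.heappush(heap, (load + cost, proc))

    loads = [0] * num_procs
    for block, cost in tasks.items():
        loads[assignment[block]] += cost
    return assignment, loads


if __name__ == "__main__":
    # Illustrative costs for five nonzero blocks of a 3x3 block-sparse matrix.
    demo_tasks = {(0, 0): 8, (0, 2): 7, (1, 1): 6, (2, 0): 5, (2, 2): 4}
    demo_assignment, demo_loads = greedy_map(demo_tasks, 2)
    print(demo_assignment)
    print(demo_loads)
```

A mapping that also accounts for locality would need to weigh where each block's data resides, which this sketch deliberately leaves out.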