Although message passing with MPI is the dominant model for parallel programming today, the significant effort required to develop high-performance MPI applications has prompted the development of several more convenient parallel programming models. Models such as Co-Array Fortran, Global Arrays, Titanium, and UPC give the programmer a global view of the data, but they face significant challenges in delivering high performance over a range of applications. Achieving high performance with global-address-space languages is particularly difficult for unstructured applications with irregular data structures. In this paper, we describe a global-address-space parallel programming framework with decoupled task and data abstractions. The framework centers on task pools, in which each task specifies its operands in a distributed, globally addressable pool of data chunks. The data chunks are addressed in a logical multidimensional "tuple" space and are distributed among the nodes of the system. Locality-aware load balancing of the tasks in a task pool is achieved through judicious mapping via hypergraph partitioning, as well as through dynamic task/data migration. The framework provides a transparent interface to out-of-core data, so the programmer does not have to orchestrate the movement of data between disk and memory explicitly. We illustrate the use of the framework by implementing parallel block-sparse tensor computations in the context of a quantum chemistry application.
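The decoupling of tasks from data described above can be pictured with a small sketch. It is not the framework's interface: the names `Task`, `owner_of`, and `assign_tasks`, the chunk distribution rule, and the per-node task cap are all hypothetical, and a simple greedy heuristic stands in for the hypergraph partitioning that the paper uses. The sketch only shows the idea that tasks name their operands by tuple-space keys and that the mapping tries to place each task where most of its operand data already resides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    operands: tuple  # tuple-space keys of the chunks this task touches


def owner_of(key, num_nodes):
    """Hypothetical static distribution: place a chunk by its coordinate sum."""
    return sum(key) % num_nodes


def assign_tasks(tasks, chunk_sizes, num_nodes, max_tasks_per_node):
    """Greedy locality-aware mapping (a stand-in for hypergraph partitioning).

    Each task goes to the node that owns the most bytes of its operands,
    unless that node has already reached max_tasks_per_node.
    """
    if len(tasks) > num_nodes * max_tasks_per_node:
        raise ValueError("not enough capacity for all tasks")
    load = [0] * num_nodes
    placement = {}
    for task in tasks:
        # Count how many operand bytes each node already holds for this task.
        local_bytes = [0] * num_nodes
        for key in task.operands:
            local_bytes[owner_of(key, num_nodes)] += chunk_sizes[key]
        # Prefer more local data, then lighter load, then lower node id.
        candidates = sorted(
            range(num_nodes), key=lambda n: (-local_bytes[n], load[n], n)
        )
        node = next(n for n in candidates if load[n] < max_tasks_per_node)
        placement[task.name] = node
        load[node] += 1
    return placement


# Four chunks of a 2 x 2 block grid, addressed by (row, column) tuples.
chunk_sizes = {(0, 0): 100, (0, 1): 50, (1, 0): 50, (1, 1): 100}
tasks = [
    Task("t0", ((0, 0), (0, 1))),
    Task("t1", ((1, 0), (0, 1))),
    Task("t2", ((1, 1), (0, 0))),
    Task("t3", ((1, 1), (1, 0))),
]
placement = assign_tasks(tasks, chunk_sizes, num_nodes=2, max_tasks_per_node=2)
```

In this toy run the cap forces `t3` away from the node that holds most of its data, which is the kind of trade-off between locality and balance that a partitioner, together with dynamic task/data migration, is meant to handle more carefully than a greedy pass can.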