This paper describes the Global Trees (GT) system, which provides a multi-layered interface to a global address space view of distributed tree data structures while delivering scalable performance on distributed-memory systems. GT uses coarse-grained data movement to enhance locality and communication efficiency. We describe the design and implementation of GT, illustrate its use in the context of a gravitational simulation application, and present experimental results that demonstrate the effectiveness of the approach. The key benefits of the system include efficient shared-memory-style programming of distributed trees, tree-specific optimizations for data access and computation, and the ability to customize many aspects of GT to tune application performance.