LAPI is a low-level, high-performance communication interface available on the IBM RS/6000 SP system. It provides an active-message-like interface along with remote memory copy and synchronization functionality. It is designed primarily for experienced programmers developing parallel subsystems, libraries, and tools, but we also expect power programmers to use it in end-user applications. IBM developed LAPI as part of a project with Pacific Northwest National Laboratory (PNNL) to optimize the performance of the Global Arrays (GA) toolkit and its applications on the IBM RS/6000 SP. We provide an overview of LAPI characteristics and discuss its differences from other models such as MPI-2. We present base performance parameters of LAPI, including latency and bandwidth, and compare them with the performance of MPI/MPL. The Global Arrays library from PNNL was ported to LAPI to exploit the performance benefits of this new interface. We describe our experience using LAPI to implement GA and report the performance of the resulting library.