FMPL is a fast message-passing library designed to maximize communication performance by exploiting general architectural communication support such as remote memory operations, and to maximize overall performance by eliminating dynamic communication overhead and overlapping communication with computation. FMPL provides low-cost, general-purpose point-to-point communication as well as collective operations such as broadcast, barrier synchronization, and reduction. On a Hitachi SR8000, FMPL achieves a latency of 12.8 μs for an 8-byte message, compared with 20 μs for MPI. FMPL is intended both as a foundation for building higher-level message-passing libraries such as BLACS and for applications that need maximum performance.